CN113239087A - Anti-electricity-stealing inspection monitoring method and system - Google Patents

Anti-electricity-stealing inspection monitoring method and system

Info

Publication number
CN113239087A
Authority
CN
China
Prior art keywords
electricity
user
stealing
data
sudden change
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110367202.XA
Other languages
Chinese (zh)
Inventor
杨艺宁
薛阳
徐英辉
王聪
杨柳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI filed Critical State Grid Corp of China SGCC
Priority to CN202110367202.XA priority Critical patent/CN113239087A/en
Publication of CN113239087A publication Critical patent/CN113239087A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/259 Fusion by voting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an anti-electricity-stealing inspection monitoring method and system. The method comprises the following steps: acquiring historical data of user electricity utilization, and preprocessing the historical data of the user electricity utilization to acquire processed data of the user electricity utilization; determining an abnormal event behavior record table based on different types of electricity stealing means, and determining characteristic data of a user according to the abnormal event behavior record table and the user electricity utilization processing data; constructing an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm, and performing model training and optimization by using the characteristic data of the user to determine a final anti-electricity-stealing inspection monitoring machine learning model; and judging the electricity utilization data of the user by using the final anti-electricity-stealing inspection monitoring machine learning model so as to monitor the electricity-stealing behavior of the user. The invention can provide an effective data basis for on-site inspection personnel, improve efficiency, greatly reduce the workload of front-line personnel and the operation cost, and help power enterprises recover large economic losses.

Description

Anti-electricity-stealing inspection monitoring method and system
Technical Field
The invention relates to the technical field of power industry, in particular to an anti-electricity-stealing inspection monitoring method and system.
Background
With the development of science and technology, electricity stealing by high-tech means is increasing. These technical means are highly concealed and difficult to control, the amount of electricity stolen by a user can be large, and investigation and handling are very difficult. Electricity stealing causes great losses to the national economy and poses a serious threat to the life and property safety of the public.
Therefore, an anti-electricity-stealing inspection monitoring method is needed.
Disclosure of Invention
The invention provides an anti-electricity-stealing inspection monitoring method and system, which aim to solve the problem of how to determine electricity-stealing behavior of a user.
In order to solve the above problems, according to an aspect of the present invention, there is provided an anti-electricity-stealing inspection monitoring method, the method including:
acquiring historical data of user electricity utilization, and preprocessing the historical data of the user electricity utilization to acquire processed data of the user electricity utilization; wherein the user history data comprises: electricity data of electricity stealing users and electricity data of normal users;
determining an abnormal event behavior record table based on different types of electricity stealing means, and determining characteristic data of a user according to the abnormal event behavior record table and the user electricity utilization processing data;
constructing an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm, performing model training and optimization by using the characteristic data of the user, enabling the model to automatically generate a plurality of decision trees through the random forest algorithm, and judging whether the electricity-stealing behavior of the user exists according to the voting result of each decision tree so as to determine the final anti-electricity-stealing inspection monitoring machine learning model;
and judging the power utilization data of the user by utilizing the final anti-electricity-stealing inspection monitoring machine learning model so as to monitor the electricity-stealing behavior of the user.
Preferably, preprocessing the user electricity utilization history data to obtain the user electricity utilization processing data includes:
converting each piece of user history data in the user electricity utilization history data into a binary file in .pkl format using pickle, to obtain the user electricity utilization processing data.
Preferably, wherein the different types of electricity stealing means comprise: short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing and meter-free electricity stealing.
Preferably, the determining the characteristic data of the user according to the abnormal event behavior record table and the electricity utilization processing data of the user comprises:
counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events;
calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event;
calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event;
counting the power consumption sudden change points over all time nodes to determine the number of power consumption sudden change points of the user;
calculating the line loss sudden change degree at the time of each power consumption sudden change point over all time nodes to determine the line loss sudden change degree at the moments of power consumption sudden change;
determining, according to the mean, variance, abnormal values, sudden change points, working-day electricity consumption and rest-day electricity consumption of the user: the null-value ratio of the time series, the zero-value ratio of the time series, the abnormal electricity ratio of the time series, the time-series median, the time-series variance, the normalized time-series median, the normalized time-series variance, the number of sudden change points, the maximum jump at a sudden change point, the working-day average electricity consumption, the rest-day average electricity consumption, the working-day average electricity consumption proportion, the rest-day average electricity consumption proportion, and the information entropy of the weekly electricity consumption proportion.
Preferably, the method further comprises:
determining the electricity stealing suspicion degree of the user by using the final anti-electricity-stealing inspection monitoring machine learning model, and outputting the electricity stealing suspicion degree in association with the user information of the user.
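By way of a non-limiting illustration, the suspicion degree may be read as the fraction of decision trees that vote "electricity stealing". The text does not define the suspicion degree, so this reading is an assumption, and the names `suspicion_degree` and `rank_users` and the record fields `user_id` and `features` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A "tree" here is any callable returning 1 (stealing) or 0 (normal).
Tree = Callable[[Sequence[float]], int]


def suspicion_degree(trees: List[Tree], features: Sequence[float]) -> float:
    """Fraction of trees voting 'stealing' (assumed definition, not from the text)."""
    if not trees:
        raise ValueError("at least one tree is required")
    return sum(tree(features) for tree in trees) / len(trees)


def rank_users(trees: List[Tree], users: List[Dict]) -> List[Dict]:
    """Attach a suspicion degree to each user record and sort highest first."""
    scored = [
        {**info, "suspicion": suspicion_degree(trees, info["features"])}
        for info in users
    ]
    return sorted(scored, key=lambda record: record["suspicion"], reverse=True)


# Toy trees that each threshold one feature (illustrative values only).
toy_trees = [
    lambda x: int(x[0] > 2),    # many abnormal events
    lambda x: int(x[1] > 0.5),  # large consumption sudden change degree
    lambda x: int(x[2] > 0.3),  # high zero-value ratio
]
users = [
    {"user_id": "U002", "features": [0, 0.1, 0.0]},
    {"user_id": "U001", "features": [5, 0.8, 0.1]},
]
report = rank_users(toy_trees, users)
```

Sorting by suspicion degree gives on-site inspectors a prioritized list in which each score stays attached to its user record.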
According to another aspect of the present invention, there is provided an anti-electricity-stealing inspection monitoring system, the system including:
the data preprocessing unit is used for acquiring historical data of user electricity utilization and preprocessing the historical data of the user electricity utilization to acquire processed data of the user electricity utilization; wherein the user history data comprises: electricity data of electricity stealing users and electricity data of normal users;
the characteristic data determining unit is used for determining an abnormal event behavior record table based on different types of electricity stealing means and determining the characteristic data of a user according to the abnormal event behavior record table and the user electricity utilization processing data;
the model determining unit is used for constructing an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm, performing model training and optimization by using the characteristic data of the user, enabling the model to automatically generate a plurality of decision trees through the random forest algorithm, and judging whether electricity-stealing behaviors exist in the user according to the voting result of each decision tree so as to determine the final anti-electricity-stealing inspection monitoring machine learning model;
and the judging unit is used for judging the power utilization data of the user by utilizing the final anti-electricity-stealing inspection monitoring machine learning model so as to monitor the electricity-stealing behavior of the user.
Preferably, the data preprocessing unit preprocessing the user electricity utilization history data to obtain the user electricity utilization processing data includes:
converting each piece of user history data in the user electricity utilization history data into a binary file in .pkl format using pickle, to obtain the user electricity utilization processing data.
Preferably, wherein the different types of electricity stealing means comprise: short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing and meter-free electricity stealing.
Preferably, the characteristic data determining unit determines the characteristic data of the user according to the abnormal event behavior record table and the user electricity utilization processing data, and includes:
counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events;
calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event;
calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event;
counting the power consumption sudden change points over all time nodes to determine the number of power consumption sudden change points of the user;
calculating the line loss sudden change degree at the time of each power consumption sudden change point over all time nodes to determine the line loss sudden change degree at the moments of power consumption sudden change;
determining, according to the mean, variance, abnormal values, sudden change points, working-day electricity consumption and rest-day electricity consumption of the user: the null-value ratio of the time series, the zero-value ratio of the time series, the abnormal electricity ratio of the time series, the time-series median, the time-series variance, the normalized time-series median, the normalized time-series variance, the number of sudden change points, the maximum jump at a sudden change point, the working-day average electricity consumption, the rest-day average electricity consumption, the working-day average electricity consumption proportion, the rest-day average electricity consumption proportion, and the information entropy of the weekly electricity consumption proportion.
Preferably, the system further comprises:
the output unit, which is used for determining the electricity stealing suspicion degree of the user by using the final anti-electricity-stealing inspection monitoring machine learning model, and outputting the electricity stealing suspicion degree in association with the user information of the user.
According to the anti-electricity-stealing inspection monitoring method and system provided by the invention, users suspected of electricity stealing are determined through the anti-electricity-stealing inspection monitoring machine learning model. This provides an effective data basis for on-site inspectors, improves efficiency, greatly reduces the workload of front-line workers and the operation cost, and helps power enterprises recover large economic losses.
Drawings
A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:
FIG. 1 is a flow chart of an anti-electricity-stealing audit monitoring method 100 according to an embodiment of the invention;
FIG. 2 is a schematic illustration of determining characteristic data according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating various data source association features according to an embodiment of the invention;
FIG. 4 is a schematic diagram of processing subscriber profile data according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of constructing a model data feature set according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of data classification for feature engineering according to an embodiment of the present invention;
FIG. 7 is a process diagram of determining an anti-electricity-stealing audit monitoring machine learning model according to an embodiment of the invention;
FIG. 8 is a schematic diagram of a tree in the random forest model for determining whether electricity is stolen according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an anti-electricity-stealing inspection monitoring system 900 according to an embodiment of the invention.
Detailed Description
The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided for a thorough and complete disclosure of the present invention and to fully convey the scope of the present invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to be limiting of the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
FIG. 1 is a flow chart of an anti-electricity-stealing inspection monitoring method 100 according to an embodiment of the invention. According to the anti-electricity-stealing inspection monitoring method provided by the embodiment of the invention, users suspected of electricity stealing are determined through the anti-electricity-stealing inspection monitoring machine learning model. This provides an effective data basis for on-site inspection personnel, improves efficiency, greatly reduces the workload of front-line staff and the operation cost, and helps power enterprises recover large economic losses. As shown in FIG. 1, the anti-electricity-stealing inspection monitoring method 100 provided by the embodiment of the present invention starts from step 101. In step 101, historical electricity consumption data of a user is obtained, and the historical electricity consumption data of the user is preprocessed to obtain electricity consumption processing data of the user; wherein the user history data comprises: electricity data of electricity stealing users and electricity data of normal users.
Preferably, preprocessing the user electricity utilization history data to obtain the user electricity utilization processing data includes:
converting each piece of user history data in the user electricity utilization history data into a binary file in .pkl format using pickle, to obtain the user electricity utilization processing data.
In the invention, data are accessed through the State Grid marketing business application system, the electricity utilization information acquisition system and the Oracle-based cx_Oracle interface. For the model input, 360 days of data (including electricity stealing users and non-electricity stealing users) are selected, and training uses 2W (20,000) positive samples and 5W (50,000) common electricity stealing users. During data preprocessing, the existing data are converted into a .pkl binary file using pickle, so that the model data can be stored for analysis and the running efficiency of the model is improved.
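As a minimal sketch of the pickle-based preprocessing described above, the following stores a list of user records in a .pkl file and reads it back. The record layout (`user_id`, `label`, `daily_kwh`) and the file name are hypothetical; the text does not specify them.

```python
import pickle
import tempfile
from pathlib import Path


def save_records(records, path):
    """Serialize user records to a binary .pkl file."""
    with open(path, "wb") as fh:
        pickle.dump(records, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_records(path):
    """Load user records back from a .pkl file (only load files you trust)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Hypothetical records: label 1 marks an electricity stealing user, 0 a normal user.
records = [
    {"user_id": "U001", "label": 1, "daily_kwh": [3.2, 0.0, None, 2.9]},
    {"user_id": "U002", "label": 0, "daily_kwh": [5.1, 4.8, 5.3, 5.0]},
]

with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "user_electricity.pkl"
    save_records(records, pkl_path)
    restored = load_records(pkl_path)
```

Caching the queried data this way avoids repeating the database access on every model run, which is one plausible reading of the efficiency gain mentioned above.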
In step 102, an abnormal event behavior record table is determined based on different types of electricity stealing means, and characteristic data of a user is determined according to the abnormal event behavior record table and the electricity utilization processing data of the user.
Preferably, wherein the different types of electricity stealing means comprise: short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing and meter-free electricity stealing.
Preferably, the determining the characteristic data of the user according to the abnormal event behavior record table and the electricity utilization processing data of the user comprises:
counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events;
calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event;
calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event;
counting the power consumption sudden change points over all time nodes to determine the number of power consumption sudden change points of the user;
calculating the line loss sudden change degree at the time of each power consumption sudden change point over all time nodes to determine the line loss sudden change degree at the moments of power consumption sudden change;
determining, according to the mean, variance, abnormal values, sudden change points, working-day electricity consumption and rest-day electricity consumption of the user: the null-value ratio of the time series, the zero-value ratio of the time series, the abnormal electricity ratio of the time series, the time-series median, the time-series variance, the normalized time-series median, the normalized time-series variance, the number of sudden change points, the maximum jump at a sudden change point, the working-day average electricity consumption, the rest-day average electricity consumption, the working-day average electricity consumption proportion, the rest-day average electricity consumption proportion, and the information entropy of the weekly electricity consumption proportion.
As shown in FIG. 2, in the invention, feature engineering is performed on the electricity utilization information of electricity stealing users and of normal users (non-electricity stealing users), the feature vectors of the electricity stealing users and of the normal users are determined, and the obtained feature vectors are input into the anti-electricity-stealing inspection monitoring machine learning model based on the random forest algorithm for training.
In the invention, the electricity stealing means are judged first. An abnormal event behavior record table can be obtained for short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing, meter-free electricity stealing and the like. The information recorded in this table carries a larger weight in the model data, and the behavior of an electricity stealing user is judged according to it. The time point at which each abnormal event occurs is taken, the user's electricity consumption and the line loss of the transformer area at the corresponding time are associated by user and occurrence time, the sudden change degree is observed, and the maximum value of the sudden change degree is taken. This comprises the following steps: counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events; calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event; calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event; counting the power consumption sudden change points over all time nodes to determine the number of power consumption sudden change points of the user; and calculating the line loss sudden change degree at the time of each power consumption sudden change point over all time nodes to determine the line loss sudden change degree at the moments of power consumption sudden change.
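The event-based features above can be sketched as follows. The text does not define how the sudden change degree is computed, so this sketch assumes the relative change between the mean of the `window` days before a time node and the mean of the `window` days starting at it. The function names, the threshold of 0.5 and the sample series are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence


def sudden_change_degree(series: Sequence[float], day: int, window: int = 3) -> float:
    """Relative change of the mean from before `day` to after it (assumed definition)."""
    before = series[max(0, day - window):day]
    after = series[day:day + window]
    if not before or not after:
        return 0.0
    base = mean(before)
    if base == 0:
        return 0.0
    return abs(mean(after) - base) / abs(base)


def change_points(series: Sequence[float], threshold: float = 0.5, window: int = 3) -> List[int]:
    """Days whose sudden change degree exceeds the (hypothetical) threshold."""
    return [
        day
        for day in range(window, len(series) - window + 1)
        if sudden_change_degree(series, day, window) > threshold
    ]


def event_features(event_days, consumption, line_loss, window: int = 3) -> Dict[str, float]:
    """Event count plus the maximum sudden change degree at the event time nodes."""
    cons = [sudden_change_degree(consumption, d, window) for d in event_days]
    loss = [sudden_change_degree(line_loss, d, window) for d in event_days]
    return {
        "event_count": len(event_days),
        "max_consumption_change": max(cons, default=0.0),
        "max_line_loss_change": max(loss, default=0.0),
    }


# Toy data: consumption drops and line loss rises on day 4, where an event is recorded.
consumption = [10.0, 10.0, 10.0, 10.0, 4.0, 4.0, 4.0, 4.0]
line_loss = [2.0, 2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 5.0]
features = event_features([4], consumption, line_loss)
points = change_points(consumption)
```

The same `sudden_change_degree` function serves both the event time nodes and the scan over all time nodes, which mirrors the two groups of steps listed above.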
Then, statistical information is derived from the daily electricity consumption of the user, such as its mean, variance, abnormal values, sudden change points, and working-day and rest-day electricity consumption. The statistics comprise: the null-value ratio of the time series, the zero-value ratio of the time series, the abnormal electricity ratio of the time series, the time-series median, the time-series variance, the normalized time-series median (= median / time-series maximum), the normalized time-series variance (= variance / time-series maximum), the number of sudden change points, the maximum jump at a sudden change point, the working-day average electricity consumption, the rest-day average electricity consumption, the working-day average electricity consumption proportion, the rest-day average electricity consumption proportion, and the information entropy of the weekly electricity consumption proportion.
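Several of these statistics can be sketched as below. The sketch makes assumptions not stated in the text: rest days are taken to be Saturday and Sunday (public holidays are ignored), the "proportion" features are taken as shares of total consumption, and the weekly entropy is taken as the Shannon entropy of each weekday's share of consumption. The abnormal electricity ratio and the sudden change point features are omitted, and the name `series_features` is hypothetical.

```python
import math
from statistics import median, pvariance


def series_features(daily_kwh, first_weekday=0):
    """daily_kwh: list of floats or None (missing); first_weekday: 0 = Monday for index 0."""
    n = len(daily_kwh)
    valid = [v for v in daily_kwh if v is not None]
    if not valid:
        raise ValueError("series has no valid readings")
    peak = max(valid)
    med, var = median(valid), pvariance(valid)

    weekday_totals = [0.0] * 7
    work, rest = [], []
    for i, v in enumerate(daily_kwh):
        if v is None:
            continue
        wd = (first_weekday + i) % 7
        weekday_totals[wd] += v
        (rest if wd >= 5 else work).append(v)  # assumption: Sat/Sun are rest days

    total = sum(valid)
    shares = [t / total for t in weekday_totals if t > 0] if total > 0 else []
    return {
        "null_ratio": (n - len(valid)) / n,
        "zero_ratio": sum(1 for v in valid if v == 0) / n,
        "median": med,
        "variance": var,
        "norm_median": med / peak if peak else 0.0,    # median / time-series maximum
        "norm_variance": var / peak if peak else 0.0,  # variance / time-series maximum
        "workday_avg": sum(work) / len(work) if work else 0.0,
        "restday_avg": sum(rest) / len(rest) if rest else 0.0,
        "workday_share": sum(work) / total if total else 0.0,
        "restday_share": sum(rest) / total if total else 0.0,
        "weekly_entropy": -sum(p * math.log(p) for p in shares),
    }


# One toy week starting on Monday, with one missing and one zero reading.
week = [4.0, 4.0, None, 0.0, 4.0, 2.0, 2.0]
stats = series_features(week)
```

Dividing the median and variance by the series maximum, as the text specifies, makes users with very different consumption levels comparable.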
In the invention, for the daily electricity consumption of users, the possible electricity stealing scenes corresponding to the characteristic 'average electricity consumption, variance electricity consumption, missing value ratio, zero value ratio, daily electricity consumption ratio and double-break day electricity consumption ratio' are respectively 'electricity stealing users stealing a large amount of electricity to cause the obvious reduction of electricity consumption, electricity stealing users stealing electricity to cause the lack of seasonal fluctuation of electricity consumption, electricity stealing users stealing electricity to cause the missing of electricity meter records, electricity stealing users stealing electricity to cause the no current of the electricity meter to pass through, electricity stealing users stealing electricity on working days and electricity stealing users on rest days'.
For the basic attributes of the user, the characteristic scenes corresponding to the characteristics of urban and rural categories, historical electricity stealing records and electricity utilization types are respectively 'difference between urban electricity stealing and rural electricity stealing, credit level of the user is evaluated, and difference between electricity stealing modes corresponding to different electricity utilization modes'.
For the station area daily line loss, the characteristic scenes corresponding to the characteristic line loss mean value and the line loss power utilization correlation are respectively 'whether electricity stealing happens to the whole station area or not and negative correlation exists between electricity stealing users and line loss'.
For abnormal events, the features "event type" and "occurrence frequency" correspond respectively to "different electricity-stealing methods trigger different types of abnormal events" and "the frequency of electricity stealing or of meter faults".
In the present invention, as shown in fig. 3, the daily power consumption and the station area line loss both include: data at the event occurrence nodes and data at all time nodes. As shown in fig. 4 and 5, the user profile information feature data set is determined by an association query over the user profile table, the metering points, the main table of measurement point data, the station area profile and the line profile. As shown in fig. 6, the data for feature engineering is divided into: time-series information, event information and static information. The time-series information includes: the daily power consumption of users and the daily line loss of station areas. The event information includes: abnormal electricity-use events of users. The static information includes: basic user information.
In step 103, an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm is constructed, model training and optimization are performed by using the characteristic data of the user, so that the model automatically generates a plurality of decision trees through the random forest algorithm, and whether electricity-stealing behaviors exist in the user is judged according to the voting result of each decision tree, so that the final anti-electricity-stealing inspection monitoring machine learning model is determined.
In step 104, the final electricity stealing prevention inspection and monitoring machine learning model is used for judging the electricity utilization data of the user so as to monitor the electricity stealing behavior of the user.
Preferably, wherein the method further comprises:
and determining the electricity stealing suspicion degree of the user by utilizing the final electricity stealing prevention inspection monitoring machine learning model, and performing correlation output on the electricity stealing suspicion degree and the user information of the user.
A Random Forest (RF) is a classifier comprising a plurality of decision trees. When a traditional decision tree selects a partition attribute, it chooses the optimal attribute from all candidate attributes (say d of them) of the current node. In RF, for each node of a base decision tree, a subset of k attributes is first selected at random from the node's candidate attribute set, and the optimal attribute for partitioning is then chosen from that subset. The parameter k controls how much randomness is introduced: if k = d, the base decision tree is constructed in the same way as a traditional decision tree; if k = 1, a single attribute is selected at random for the partition. The choice of k is therefore important, and $k = \log_2 d$ is generally recommended. The diversity of the base learners in a random forest thus comes from perturbing both the samples and the attributes, which further improves the generalization ability of the final ensemble. For decision trees in machine learning, if a set of items can be divided into multiple classes, the information of a class $x_i$ is defined as $I(X = x_i) = -\log_2 p(x_i)$, where $I(X)$ denotes the information of the random variable and $p(x_i)$ is the probability that $x_i$ occurs. Entropy measures uncertainty: the larger the entropy, the more uncertain $X = x_i$ is, and vice versa. In a classification problem, a larger entropy means greater uncertainty about the class. Information gain is the index used to select features in the decision tree algorithm; the larger the information gain, the better the feature discriminates.
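The entropy and information-gain quantities described above can be illustrated with a short sketch. The function names and the toy labels (1 = electricity-stealing user, 0 = normal user) are assumptions made for illustration.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy H = -sum p(x_i) * log2 p(x_i) of a label list."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels: Sequence[Hashable],
                     feature_values: Sequence[Hashable]) -> float:
    """Entropy reduction obtained by splitting the labels on a feature."""
    n = len(labels)
    groups: dict = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

With two stealing and two normal users, the label entropy is 1 bit. A feature that separates the two classes perfectly has an information gain of 1, while a feature that mixes them evenly has a gain of 0.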
Based on the features of electricity-stealing users and non-stealing users, the random forest algorithm uses only part of the features at each split within a subtree: it first draws a number of features at random from all features and then selects the splitting feature from among those drawn. This process is repeated for every tree, and the final classification is chosen by voting. The model improves the diversity of the anti-electricity-stealing inspection monitoring system, which improves the usability of the data and the correctness of the predictions while also guarding against overfitting. A random forest improves prediction accuracy without a significant increase in computation, which is why it is chosen here; it is a classifier that trains on and predicts samples using a plurality of trees. Each tree repeatedly splits the data in two for classification or regression, which greatly reduces the amount of computation. The use of variables (columns) and data (rows) is randomized to generate a plurality of classification trees, and the results of the classification trees are then aggregated.
With reference to fig. 7, in the present invention, after the feature data is determined, an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm is constructed and trained. The random forest algorithm automatically generates a plurality of decision trees, and whether electricity is being stolen is judged from the voting result of the decision trees. After training, a file containing the trained model is exported and is called directly during subsequent prediction. Testing and optimization are then performed using the test set: features are extracted from a test set whose users are known to be electricity-stealing or not, the extracted features are input into the model, the output for the test set is compared with the true result, and the model is optimized to determine the final anti-electricity-stealing inspection monitoring machine learning model.
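The train-then-vote procedure can be sketched in a few lines. This is a deliberately simplified illustration, not the patented model: each "tree" is a one-level decision stump, the threshold is chosen by error count rather than information gain, and all names (`train_stump`, `train_forest`, `predict`) and the toy data are assumptions.

```python
import random
from collections import Counter
from math import log2


def train_stump(rows, labels, feature_ids):
    """Pick the single threshold split with the fewest training errors."""
    majority = Counter(labels).most_common(1)[0][0]
    # Start from the error count of always predicting the majority class.
    best = (sum(y != majority for y in labels), None, None)
    for f in feature_ids:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((r[f] > t) != bool(y) for r, y in zip(rows, labels))
            if errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    if f is None:  # no split beats the majority class
        return lambda row: majority
    return lambda row: int(row[f] > t)


def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    d = len(rows[0])
    k = max(1, int(log2(d)))  # k = log2(d) candidate features per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap rows
        feats = rng.sample(range(d), k)                 # random columns
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Return the majority label and the share of trees voting 1."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0], votes[1] / len(forest)
```

The share of trees voting 1 can serve as a suspicion degree in the sense used in the text. An odd number of trees avoids tied votes in the two-class case.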
During monitoring, the feature sequence of the electricity-use data of each user to be monitored is input, user by user, into the final anti-electricity-stealing inspection monitoring machine learning model, which outputs the suspicion degree, that is, the probability that the suspected user is stealing electricity. The results are exported to a txt file, the file is imported into a temporary result database, the user's other information is associated with it, and the result is imported into the final database.
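The ranking and txt export step might look like the following sketch. The tab-separated layout, the four-decimal formatting and the function name `export_suspicion` are assumptions; the text only states that a txt file is exported and later joined with other user information.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple


def export_suspicion(scores: Dict[str, float],
                     out: TextIO) -> List[Tuple[str, float]]:
    """Write (user ID, suspicion degree) rows, most suspicious first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for user_id, score in ranked:
        writer.writerow([user_id, f"{score:.4f}"])
    return ranked


# Usage with an in-memory buffer; a real run would open a .txt file instead.
buffer = io.StringIO()
ranking = export_suspicion({"u1": 0.12, "u2": 0.91, "u3": 0.55}, buffer)
```

Keeping the user ID as the first column makes it straightforward to associate the scores with address and name information in a later database join.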
The method was verified in a Cangzhou county: partial features of some suspected users were analyzed, the users were ranked by suspicion degree according to the output results, users with a high suspicion degree were selected, and the detection results for these users met expectations.
In the invention, 360 days of data (covering both electricity-stealing and non-stealing users) are selected as model input, and 5 main features are extracted from the basic table. The training data comprise 20,000 (2W) positive samples and 50,000 (5W) common electricity stealing users, and the cross-validation accuracy is 80%. Fig. 8 shows one tree drawn at random from the model. The model contains more than 100 trees built from randomly drawn features; the model is fitted to the training data, and a new sample is then passed through the same process to obtain a result. When the model is trained, the first step is feature extraction. Before that, training samples (electricity-stealing users and users for whom no stealing has been found) must be drawn in a certain ratio, which is set manually. After the training samples are drawn, the data are handed to `feature_extract`, which extracts the data and calls methods in `feature_util` to convert them; the data are then integrated into training data (features) in `feature_combination`, and the model is trained in `model_train_script`. After training, a file containing the trained model is exported and is called directly during subsequent prediction.
Prediction starts by extracting the data to be predicted: a process script is executed to create the required temporary table, and the data to be predicted are taken out of the temporary table and handed to `feature_extract`, which extracts the data and calls methods in `feature_util` to convert them. The data are integrated into prediction data in `feature_combination`, the trained model file is opened through `model_prediction_util`, the data are fed into the model, and the result is predicted and output. A method in `output_database` is then called to export a txt file, a Hadoop command imports the file into a Hive temporary table, address and name information is associated according to the ID information in the table, and the result is imported into Oracle to complete the whole flow.
Fig. 9 is a schematic structural diagram of an anti-electricity-stealing inspection monitoring system 900 according to an embodiment of the invention. As shown in fig. 9, an anti-electricity-stealing inspection monitoring system 900 according to an embodiment of the present invention includes: a data preprocessing unit 901, a feature data determination unit 902, a model determination unit 903, and a judgment unit 904.
Preferably, the data preprocessing unit 901 is configured to acquire user electricity utilization history data and preprocess the user electricity utilization history data to acquire user electricity utilization processing data; wherein the user history data comprises: electricity data of electricity stealing users and electricity data of normal users.
Preferably, the data preprocessing unit 901, which preprocesses each piece of user history data in the user electricity consumption history data to obtain user electricity consumption processing data, includes:
and converting each piece of user historical data in the user electricity utilization historical data into a binary file in a pkl format based on the pickle to acquire user electricity utilization processing data.
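A minimal sketch of the pickle-based conversion is given below. The record layout (a dict with a `user_id` key) and the one-file-per-user naming are assumptions made for illustration; the text only specifies a binary file in pkl format.

```python
import pickle
from pathlib import Path


def save_user_history(record: dict, directory: Path) -> Path:
    """Serialize one user's history record to <user_id>.pkl."""
    path = directory / f"{record['user_id']}.pkl"
    with path.open("wb") as fh:
        pickle.dump(record, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_user_history(path: Path) -> dict:
    """Read a record back; only unpickle files from a trusted source."""
    with path.open("rb") as fh:
        return pickle.load(fh)
```

Because unpickling can execute arbitrary code, files in this format should only be loaded when they were produced by the same trusted pipeline.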
Preferably, the characteristic data determining unit 902 is configured to determine an abnormal event behavior record table based on different types of electricity stealing means, and determine the characteristic data of the user according to the abnormal event behavior record table and the user electricity utilization processing data.
Preferably, wherein the different types of electricity stealing means comprise: short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing and meter-free electricity stealing.
Preferably, the characteristic data determining unit determines the characteristic data of the user according to the abnormal event behavior record table and the user electricity utilization processing data, and includes:
counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events;
calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event;
calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event;
counting the power consumption mutation points at all times to determine the number of the power consumption mutation points of the user;
calculating the line loss sudden change degree corresponding to the time of the power consumption sudden change point at all the time to determine the line loss sudden change degree at the power consumption sudden change moment;
determining the empty value ratio of the time sequence, the time sequence 0 value ratio, the time sequence abnormal electricity ratio, the time sequence median, the time sequence variance, the time sequence standardization median, the time sequence standardization variance, the mutation point number, the mutation point transition maximum value, the working day average electricity consumption, the resting day average electricity consumption, the working day average electricity consumption proportion, the resting day average electricity consumption proportion and the information entropy of the week electricity consumption proportion according to the average value, the variance, the abnormal value, the mutation point, the working day electricity consumption and the resting day electricity consumption of the user.
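The sudden-change features above can be illustrated with a small sketch. The text does not define the sudden change degree numerically, so this sketch assumes it is the relative change against the previous day and that a mutation point is a day whose absolute relative change reaches a hypothetical threshold of 0.5.

```python
from typing import List, Sequence


def change_degree(series: Sequence[float], t: int) -> float:
    """Relative change at index t versus index t - 1 (assumed definition)."""
    prev, cur = series[t - 1], series[t]
    if prev == 0:
        return 0.0 if cur == 0 else float("inf")
    return (cur - prev) / prev


def mutation_points(series: Sequence[float],
                    threshold: float = 0.5) -> List[int]:
    """Indices whose absolute relative change reaches the threshold."""
    return [t for t in range(1, len(series))
            if abs(change_degree(series, t)) >= threshold]


# Toy data: consumption drops on day 2 while the line loss rises.
usage = [10.0, 10.0, 4.0, 4.0, 4.0]
line_loss = [5.0, 5.0, 8.0, 8.0, 8.0]
points = mutation_points(usage)
loss_change_at_points = [change_degree(line_loss, t) for t in points]
```

Here the number of consumption mutation points is `len(points)`, and `loss_change_at_points` gives the line loss sudden change degree at each consumption mutation moment.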
Preferably, the model determining unit 903 is configured to construct an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm, perform model training and optimization by using the feature data of the user, enable the model to automatically generate a plurality of decision trees through the random forest algorithm, and determine whether an electricity-stealing behavior exists in the user according to a voting result of each decision tree, so as to determine a final anti-electricity-stealing inspection monitoring machine learning model.
Preferably, the judging unit 904 is configured to judge the power consumption data of the user by using the final electricity stealing prevention inspection and monitoring machine learning model, so as to monitor the electricity stealing behavior of the user.
Preferably, wherein the system further comprises:
and the output unit is used for determining the electricity stealing suspicion degree of the user by utilizing the final electricity stealing prevention inspection monitoring machine learning model and performing correlation output on the electricity stealing suspicion degree and the user information of the user.
The anti-electricity-stealing inspection monitoring system 900 of the embodiment of the invention corresponds to the anti-electricity-stealing inspection monitoring method 100 of another embodiment of the invention, and is not described herein again.
The invention has been described with reference to a few embodiments. However, other embodiments of the invention than the one disclosed above are equally possible within the scope of the invention, as would be apparent to a person skilled in the art from the appended patent claims.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [ device, component, etc ]" are to be interpreted openly as referring to at least one instance of said device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. An anti-electricity-stealing inspection monitoring method is characterized by comprising the following steps:
acquiring historical data of user electricity utilization, and preprocessing the historical data of the user electricity utilization to acquire processed data of the user electricity utilization; wherein the user history data comprises: electricity data of electricity stealing users and electricity data of normal users;
determining an abnormal event behavior record table based on different types of electricity stealing means, and determining characteristic data of a user according to the abnormal behavior record table and user electricity utilization processing data;
constructing an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm, performing model training and optimization by using the characteristic data of the user, enabling the model to automatically generate a plurality of decision trees through the random forest algorithm, and judging whether the electricity-stealing behavior of the user exists according to the voting result of each decision tree so as to determine the final anti-electricity-stealing inspection monitoring machine learning model;
and judging the power utilization data of the user by utilizing the final anti-electricity-stealing inspection monitoring machine learning model so as to monitor the electricity-stealing behavior of the user.
2. The method of claim 1, wherein the preprocessing each piece of user electricity usage history data in the user electricity usage history data to obtain user electricity usage processing data comprises:
and converting each piece of user historical data in the user electricity utilization historical data into a binary file in a pkl format based on the pickle to acquire user electricity utilization processing data.
3. The method of claim 1, wherein the different types of electricity stealing means comprise: short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing and meter-free electricity stealing.
4. The method of claim 1, wherein determining the user profile data from the exception behavior log and the user power usage processing data comprises:
counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events;
calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event;
calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event;
counting the power consumption mutation points at all times to determine the number of the power consumption mutation points of the user;
calculating the line loss sudden change degree corresponding to the time of the power consumption sudden change point at all the time to determine the line loss sudden change degree at the power consumption sudden change moment;
determining the empty value ratio of the time sequence, the time sequence 0 value ratio, the time sequence abnormal electricity ratio, the time sequence median, the time sequence variance, the time sequence standardization median, the time sequence standardization variance, the mutation point number, the mutation point transition maximum value, the working day average electricity consumption, the resting day average electricity consumption, the working day average electricity consumption proportion, the resting day average electricity consumption proportion and the information entropy of the week electricity consumption proportion according to the average value, the variance, the abnormal value, the mutation point, the working day electricity consumption and the resting day electricity consumption of the user.
5. The method of claim 1, further comprising:
and determining the electricity stealing suspicion degree of the user by utilizing the final electricity stealing prevention inspection monitoring machine learning model, and performing correlation output on the electricity stealing suspicion degree and the user information of the user.
6. An anti-electricity-stealing audit monitoring system, the system comprising:
the data preprocessing unit is used for acquiring historical data of user electricity utilization and preprocessing the historical data of the user electricity utilization to acquire processed data of the user electricity utilization; wherein the user history data comprises: electricity data of electricity stealing users and electricity data of normal users;
the characteristic data determining unit is used for determining an abnormal event behavior record table based on different types of electricity stealing means and determining the characteristic data of a user according to the abnormal behavior record table and the user electricity utilization processing data;
the model determining unit is used for constructing an anti-electricity-stealing inspection monitoring machine learning model based on a random forest algorithm, performing model training and optimization by using the characteristic data of the user, enabling the model to automatically generate a plurality of decision trees through the random forest algorithm, and judging whether electricity-stealing behaviors exist in the user according to the voting result of each decision tree so as to determine the final anti-electricity-stealing inspection monitoring machine learning model;
and the judging unit is used for judging the power utilization data of the user by utilizing the final anti-electricity-stealing inspection monitoring machine learning model so as to monitor the electricity-stealing behavior of the user.
7. The system of claim 6, wherein the data preprocessing unit preprocesses each piece of user history data in the user electricity consumption history data to obtain user electricity consumption processing data, and comprises:
and converting each piece of user historical data in the user electricity utilization historical data into a binary file in a pkl format based on the pickle to acquire user electricity utilization processing data.
8. The system of claim 6, wherein the different types of electricity stealing means comprise: short circuit electricity stealing, electric energy meter stalling, electric energy meter reversal, under-voltage electricity stealing, under-current electricity stealing, phase-shift electricity stealing, differential electricity stealing and meter-free electricity stealing.
9. The system of claim 6, wherein the characteristic data determining unit determines the characteristic data of the user according to the abnormal event behavior record table and the user electricity utilization processing data, and comprises:
counting all the extracted abnormal events to determine the occurrence frequency of the abnormal events;
calculating the sudden change degree of the power consumption corresponding to the time node of the abnormal event to determine the power consumption sudden change degree at the moment of the abnormal event;
calculating the sudden change degree of the line loss of the transformer area corresponding to the time node of the abnormal event to determine the line loss sudden change degree at the time of the abnormal event;
counting the power consumption mutation points at all times to determine the number of the power consumption mutation points of the user;
calculating the line loss sudden change degree corresponding to the time of the power consumption sudden change point at all the time to determine the line loss sudden change degree at the power consumption sudden change moment;
determining the empty value ratio of the time sequence, the time sequence 0 value ratio, the time sequence abnormal electricity ratio, the time sequence median, the time sequence variance, the time sequence standardization median, the time sequence standardization variance, the mutation point number, the mutation point transition maximum value, the working day average electricity consumption, the resting day average electricity consumption, the working day average electricity consumption proportion, the resting day average electricity consumption proportion and the information entropy of the week electricity consumption proportion according to the average value, the variance, the abnormal value, the mutation point, the working day electricity consumption and the resting day electricity consumption of the user.
10. The system of claim 6, further comprising:
and the output unit is used for determining the electricity stealing suspicion degree of the user by utilizing the final electricity stealing prevention inspection monitoring machine learning model and performing correlation output on the electricity stealing suspicion degree and the user information of the user.
CN202110367202.XA 2021-04-06 2021-04-06 Anti-electricity-stealing inspection monitoring method and system Pending CN113239087A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110367202.XA CN113239087A (en) 2021-04-06 2021-04-06 Anti-electricity-stealing inspection monitoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110367202.XA CN113239087A (en) 2021-04-06 2021-04-06 Anti-electricity-stealing inspection monitoring method and system

Publications (1)

Publication Number Publication Date
CN113239087A true CN113239087A (en) 2021-08-10

Family

ID=77131238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110367202.XA Pending CN113239087A (en) 2021-04-06 2021-04-06 Anti-electricity-stealing inspection monitoring method and system

Country Status (1)

Country Link
CN (1) CN113239087A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113947504A (en) * 2021-11-11 2022-01-18 国网辽宁省电力有限公司营销服务中心 Electricity stealing analysis method and system based on random forest method
CN114154999A (en) * 2021-10-27 2022-03-08 国网河北省电力有限公司营销服务中心 Electricity stealing prevention method, device, terminal and storage medium
CN114648255A (en) * 2022-05-18 2022-06-21 国网浙江省电力有限公司 Inspection method and platform based on marketing business risk digital internal control system
CN116910518A (en) * 2023-09-14 2023-10-20 福州众点网络技术开发有限公司 Knowledge graph-based anti-electricity-stealing early warning method and system
WO2024103082A1 (en) * 2022-11-11 2024-05-16 Norman Frederick Parkin System and method for validating energy consumption data

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114154999A (en) * 2021-10-27 2022-03-08 国网河北省电力有限公司营销服务中心 Electricity stealing prevention method, device, terminal and storage medium
CN113947504A (en) * 2021-11-11 2022-01-18 国网辽宁省电力有限公司营销服务中心 Electricity stealing analysis method and system based on random forest method
CN114648255A (en) * 2022-05-18 2022-06-21 国网浙江省电力有限公司 Inspection method and platform based on marketing business risk digital internal control system
CN114648255B (en) * 2022-05-18 2022-08-16 国网浙江省电力有限公司 Inspection method and platform based on marketing business risk digital internal control system
WO2024103082A1 (en) * 2022-11-11 2024-05-16 Norman Frederick Parkin System and method for validating energy consumption data
CN116910518A (en) * 2023-09-14 2023-10-20 福州众点网络技术开发有限公司 Knowledge graph-based anti-electricity-stealing early warning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination