CN115033591A - Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment - Google Patents
- Publication number
- CN115033591A (application number CN202210617862.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- electricity charge
- abnormal
- charge data
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention relates to an intelligent detection method, system, storage medium and computer device for abnormal electricity charge data. The method comprises the following steps: S1, rule setting: dynamically setting the rules for electricity charge data anomaly detection; S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data from the database; S3, data processing: performing missing value processing, feature encoding and feature selection on the original electricity charge data; S4, data mining: mining hidden abnormal data and detecting the anomaly type; S5, model construction: training an algorithm model and dynamically adjusting the model parameters; and S6, model prediction: inputting data into the model for prediction to obtain the final electricity charge data anomaly detection result. The invention can efficiently detect and identify abnormal electricity charge data, effectively improve the hit rate on abnormal electricity charge data, and raise the level of intelligent detection of abnormal electricity charge data.
Description
Technical Field
The invention relates to the technical field of machine learning, in particular to an intelligent detection method, system, storage medium and computer equipment for electricity charge data abnormity.
Background
Traditional electricity charge data anomaly detection relies on screening rules distilled from manual experience. According to investigation, a power grid company currently maintains dozens to hundreds of such screening rules. If every rule has to be traversed each time erroneous electricity charge data is checked, the electricity marketing department bears a heavy workload and the operating efficiency of the power department falls. Some of the existing detection and accounting rules contain adjustable variables. Classified by adjustable variable, these rules fall into two types, 'reference electricity quantity' and 'fluctuation rate', both of which can be adjusted manually during power marketing. The applicability, effectiveness and reasonableness of these parameter settings directly affect the amount of abnormal electricity charge data generated by the power marketing system, and therefore the efficiency of electricity charge accounting. In addition, the detection effect of some existing rules varies from month to month. As a result, existing electricity charge data anomaly detection is time-consuming, has a low hit rate, and cannot meet the requirements of smart grid construction.
Disclosure of Invention
To solve the technical problems in the prior art, the invention provides an intelligent detection method, system, storage medium and computer device for abnormal electricity charge data. The invention can efficiently detect and identify abnormal electricity charge data, effectively improve the hit rate on abnormal electricity charge data, and thereby raise the power grid company's level of intelligent detection of abnormal electricity charge data.
The method of the invention is realized by the following technical scheme: an intelligent detection method for abnormal electricity charge data, comprising the following steps:
S1, rule setting: dynamically setting the rules for electricity charge data anomaly detection;
S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from the database;
S3, data processing: performing missing value processing, feature encoding and feature selection on the original electricity charge data;
S4, data mining: mining hidden abnormal data from the processed original electricity charge data and detecting the anomaly type of the electricity charge data;
S5, model construction: building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
S6, model prediction: after acquiring original electricity charge data, inputting the data into the intelligent detection model for abnormal electricity charge data obtained by model construction, and obtaining the final electricity charge data anomaly detection result.
The system of the invention is realized by the following technical scheme: an intelligent detection system for abnormal electricity charge data, comprising:
a rule setting module, used for dynamically setting the rules for electricity charge data anomaly detection;
a data acquisition module, used for exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from the database;
a data processing module, used for performing missing value processing, feature encoding and feature selection on the original electricity charge data;
a data mining module, used for mining hidden abnormal data from the processed original electricity charge data and detecting the anomaly type of the electricity charge data;
a model construction module, used for building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and a model prediction module, used for inputting acquired original electricity charge data into the intelligent detection model for abnormal electricity charge data obtained by model construction, and obtaining the final electricity charge data anomaly detection result.
The present invention also proposes a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the intelligent detection method of electricity charge data abnormality of the present invention.
The invention also provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the computer program, the intelligent detection method for the abnormal electricity charge data is realized.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. By applying a series of data processing steps to the electricity charge data (missing value processing, feature encoding, feature selection and the like) and combining them with data mining methods, the invention extracts the hidden information in abnormal electricity charge data and thereby improves the ability to detect abnormal electricity charge data.
2. The invention addresses the low hit rate of existing electricity charge data anomaly detection methods, so that the amount of suspected abnormal electricity charge data finally reported is greatly reduced. This greatly reduces the workload of front-line staff who recheck abnormal electricity charge data, lowers the operating cost of the electricity charge accounting department of a power grid company, and saves manpower and material resources.
3. The invention constructs a weighted residual deep forest model. The weighting reduces the differences among the deep forest subtrees trained on the electricity charge data set: subtrees that predict abnormal electricity charge data with high accuracy receive larger weights, which increases their role in the decision. This effectively improves the hit accuracy on abnormal electricity charge data, reduces the number of cascade forest layers and shortens training time. The model also compensates for the gradient vanishing or gradient explosion that may occur in the deep forest algorithm, so that as the number of cascade layers increases, the ability to learn the features of abnormal electricity charge data keeps growing while the effect of the earlier layers is retained.
4. The invention establishes a machine-learning-based intelligent detection model for abnormal electricity charge data. The model builds on machine learning algorithms such as the weighted residual deep forest model and XGBOOST; the algorithms are tuned and fused so that as much abnormal electricity charge data as possible is detected. Compared with existing rule-based detection methods, the model responds quickly, has a short detection time and a low miss rate, requires only a simple environment for on-site configuration and operation, offers high security, and has high practical application value.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a data processing flow diagram of the present invention;
FIG. 3 is a flow chart of the electricity charge data mining of the present invention;
FIG. 4 is a flow chart of the model construction of the present invention;
FIG. 5 is a residual forest flow diagram of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
As shown in fig. 1, the intelligent detection method for abnormal electricity charge data in this embodiment comprises the following steps:
S1, rule setting: dynamically setting the rules for electricity charge data anomaly detection;
S2, data acquisition: exporting the original electricity charge data and the abnormal electricity charge data of all electricity users for the previous month from the database;
S3, data processing: performing missing value processing, feature encoding, feature selection and other processing on the original electricity charge data;
S4, data mining: mining hidden abnormal data from the processed original electricity charge data and detecting the approximate anomaly type of the electricity charge data;
S5, model construction: building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
S6, model prediction: after acquiring original electricity charge data of the same version for the same month, inputting the data into the intelligent detection model for abnormal electricity charge data obtained by model construction, and obtaining the final electricity charge data anomaly detection result.
Specifically, as shown in fig. 2, the data processing in step S3 proceeds as follows:
S301, labelling the collected original electricity charge data and abnormal electricity charge data: a feature indicating whether a record is abnormal is added to all original electricity charge data, with abnormal electricity charge data labelled 1 and non-abnormal electricity charge data labelled 0;
S302, performing missing value processing on all labelled original electricity charge data: if more than 10 features of a row of electricity charge data have missing values, the row is deleted directly; all other missing values are filled with the value -1;
S303, performing feature encoding on the text characters in the original electricity charge data, where the encoding mode can be text encoding or one-hot encoding;
S304, ranking the features of the original electricity charge data by importance, where the machine learning algorithm used for the ranking can be random forest, XGBOOST, SVM and the like, finally obtaining the importance ranking of all electricity charge data features;
S305, performing feature selection on all features of the electricity charge data: with reference to the feature importance ranking, several of the lowest-ranked features are deleted, thereby improving the hit rate and efficiency of intelligent detection of abnormal electricity charge data.
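Steps S302 and S303 can be illustrated with a minimal sketch in plain Python. The row layout (one dict per record, with `None` standing for a missing value), the function names `clean_rows` and `one_hot`, and the sample column names are assumptions made for illustration only; the patent does not specify an implementation.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Optional[Any]]  # hypothetical record layout: None marks a missing value


def clean_rows(rows: List[Row], max_missing: int = 10, fill_value: Any = -1) -> List[Row]:
    """S302 sketch: drop rows with more than `max_missing` missing features, fill the rest."""
    cleaned = []
    for row in rows:
        n_missing = sum(1 for value in row.values() if value is None)
        if n_missing > max_missing:
            continue  # too many missing features: delete the whole row
        cleaned.append({k: (fill_value if v is None else v) for k, v in row.items()})
    return cleaned


def one_hot(rows: List[Row], column: str) -> List[Row]:
    """S303 sketch: replace a text column with one 0/1 indicator column per category."""
    categories = sorted({str(row[column]) for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if str(row[column]) == category else 0
        encoded.append(new_row)
    return encoded
```

With the default `max_missing=10` and `fill_value=-1` the helper follows the thresholds stated in S302; a filled text value simply becomes its own category `-1` when the column is one-hot encoded.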
Specifically, as shown in fig. 3, the data mining in step S4 proceeds as follows:
S401, scoring the degree of anomaly of the electricity charge data samples with the isolation forest algorithm, sorting the samples by anomaly score, and taking the samples ranked in the top 70% as high-reliability normal electricity charge data samples; screening out these high-reliability normal samples further eliminates the influence of untrustworthy electricity charge data on model construction;
S402, determining the number of clusters and clustering the majority-class samples of the original electricity charge data: the original electricity charge data is clustered with the K-means algorithm and the optimal number of clusters k for the majority-class samples is calculated, so that the majority-class samples are grouped into k clusters;
S403, after the majority-class samples are grouped into k clusters, sampling each cluster; the majority-class samples can be undersampled during sampling, and the undersampling algorithm can be random undersampling;
S404, performing data mining on the finally sampled electricity charge data by combining the existing electricity charge data anomaly detection rules with machine learning algorithms, thereby mining out hidden abnormal data and detecting the approximate anomaly type of the electricity charge data.
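The screening in S401 and the per-cluster undersampling in S403 can be sketched as follows. The sketch takes anomaly scores and cluster assignments as given inputs rather than computing them, and it assumes that a lower score means a more normal sample, which the text does not state. The function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def most_normal_indices(scores: Sequence[float], keep_fraction: float = 0.7) -> List[int]:
    """S401 sketch: keep the `keep_fraction` of samples with the lowest anomaly score.

    Assumption: lower score = more normal. Flip the sort if the scorer uses the opposite sign.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_keep = round(len(scores) * keep_fraction)
    return sorted(order[:n_keep])


def undersample_by_cluster(cluster_of: Dict[int, int], per_cluster: int, seed: int = 0) -> List[int]:
    """S403 sketch: randomly keep at most `per_cluster` majority-class samples per cluster."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    members = defaultdict(list)
    for index, cluster in cluster_of.items():
        members[cluster].append(index)
    kept: List[int] = []
    for cluster in sorted(members):
        pool = sorted(members[cluster])
        kept.extend(rng.sample(pool, min(per_cluster, len(pool))))
    return sorted(kept)
```

Sampling inside each cluster, rather than over the whole majority class, keeps every cluster represented after undersampling, which appears to be the purpose of clustering before sampling in S402 and S403.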
Specifically, as shown in fig. 4, the model construction in step S5 proceeds as follows:
S501, dividing the electricity charge data into a training set and a test set by random sampling, in a ratio of 7:3 or 8:2;
S502, performing feature interaction on the electricity charge data: algorithms such as random forest and XGBOOST are used to obtain training and prediction results for combined and derived features, and these are compared with the training and prediction results on the original data without feature interaction. The comparison indexes are the recall and precision for abnormal electricity charge data, where recall reflects how much abnormal electricity charge data is missed and precision reflects how much of the flagged data is truly abnormal. The feature interaction results are then combined to construct new features derived from multi-feature combinations;
S503, constructing the intelligent detection model for abnormal electricity charge data: a weighted residual deep forest model and machine learning algorithms such as decision tree, random forest, XGBOOST and CATBOOST are built, trained and used for prediction on the electricity charge data, yielding the trained intelligent detection model;
S504, tuning the parameters of the intelligent detection model, where the tuning method can be greedy tuning, grid tuning or Bayesian tuning, finally obtaining the optimal parameters of the intelligent detection model;
S505, fusing several tuned base algorithms, which can be weighted residual deep forest, decision tree, random forest, XGBOOST, CATBOOST and the like; the fused machine learning model is the final intelligent detection model for abnormal electricity charge data.
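The random split of S501 and the two comparison indexes of S502 can be sketched in plain Python. The helper names `split_indices` and `precision_recall` are illustrative, and label 1 is taken to mean abnormal, as in S301.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(n: int, train_ratio: float = 0.7, seed: int = 0) -> Tuple[List[int], List[int]]:
    """S501 sketch: randomly split n sample indices into training and test sets (7:3 by default)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_ratio)
    return indices[:cut], indices[cut:]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """S502 sketch: precision and recall for the abnormal class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # abnormal, flagged
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # normal, flagged
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # abnormal, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A low recall means abnormal records are being missed, while a low precision means reviewers receive many false alarms; comparing both before and after feature interaction shows whether a derived feature helps on either axis.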
Specifically, in this embodiment, the weighted residual deep forest model in step S503 is constructed as follows:
Let the electricity charge data set be $S = \{N_1, N_2, \ldots, N_m\}$ and the category set be $L = \{L_1, L_2\}$, where $L_1$ denotes abnormal electricity charge data and $L_2$ denotes non-abnormal electricity charge data. The prediction probability matrix of the weighted residual deep forest is expressed as follows:
$$T = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \\ \vdots & \vdots \\ T_{m1} & T_{m2} \end{bmatrix}$$
where $T_{ij}$ denotes the predicted probability that the i-th electricity charge record belongs to the j-th class. For each row of the prediction probability matrix of the weighted residual deep forest, the column index j of the row maximum is taken as the final predicted category of that record; the position of that value in the prediction probability matrix is marked 1 and the remaining values are marked 0, as follows:
$$T[i][j] = \begin{cases} 1, & j = \arg\max_{k} T_{ik} \\ 0, & \text{otherwise} \end{cases}$$
The accuracy of the weighted residual deep forest is then calculated according to the following formula:
$$P = \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{2} \left( A[i][j] \cap T[i][j] \right)$$
where m denotes the total number of electricity charge records, A[i][j] denotes the distribution matrix of the actual categories of the electricity charge data, and T[i][j] denotes the marked prediction probability matrix of the weighted residual deep forest. Taking the intersection gives the number of correctly predicted electricity charge records, and its ratio to the total number m gives the final prediction accuracy of each weighted residual deep forest;
Let the forests of the weighted residual deep forest be numbered $1, 2, \ldots, F$. A weight, denoted $\eta$, can be calculated from the accuracy of each forest and is expressed as follows:
where $P_i$ denotes the prediction accuracy of the i-th forest. The weighted prediction probability matrix of the i-th forest is then obtained as:
$T^{(i)} = T \times \eta$
The probabilities in the weighted prediction probability matrix of each forest are used as the input of the next cascade forest layer. Iteration stops when the maximum number of cascade forest layers is reached or the accuracy of the forest prediction no longer improves.
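The marking, accuracy and weighting steps can be sketched in plain Python. The formula for the weight η is not legible in this text, so `forest_weights` below normalises each forest's accuracy by the sum of all accuracies purely as an assumption; the patent's actual formula may differ. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def mark_predictions(probs: Matrix) -> List[List[int]]:
    """Mark the row maximum of the prediction probability matrix as 1, the rest as 0."""
    marked = []
    for row in probs:
        best = row.index(max(row))  # column index j of the row maximum
        marked.append([1 if j == best else 0 for j in range(len(row))])
    return marked


def accuracy(actual: List[List[int]], marked: List[List[int]]) -> float:
    """Share of records whose marked class matches the actual class (0/1 intersection)."""
    correct = sum(a * t for row_a, row_t in zip(actual, marked) for a, t in zip(row_a, row_t))
    return correct / len(actual)


def forest_weights(accuracies: List[float]) -> List[float]:
    """ASSUMPTION: eta_i = P_i / sum(P); the source does not show the real formula."""
    total = sum(accuracies)
    return [p / total for p in accuracies]


def weight_matrix(probs: Matrix, eta: float) -> Matrix:
    """Weighted prediction probability matrix of one forest: T x eta."""
    return [[value * eta for value in row] for row in probs]
```

Whatever the exact weight formula, the intent described in the text is that a forest with higher accuracy contributes larger values to the input of the next cascade layer.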
In this embodiment, as shown in fig. 5, to avoid gradient explosion or gradient vanishing as the number of forest layers in the deep forest increases, a structure similar to a residual network is adopted to form the weighted residual deep forest. The specific process is as follows:
The features of the electricity charge data are input, and the feature values obtained from the multi-grained scanning of the weighted deep forest are fed into a completely random forest and an extremely random forest. Because electricity charge data anomaly detection is a binary classification problem, each random forest finally produces two class results. These results are saved and, together with the multi-grained scanning result of the next forest layer, are input into the forests of every subsequent layer. Iteration stops when the maximum number of cascade forest layers is reached or the accuracy of the forest prediction no longer improves.
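The residual-style wiring and the stopping rule can be sketched with a stubbed layer trainer. Here `train_layer` is a hypothetical callable that stands in for training the two forests of one layer and returns that layer's class results together with its accuracy. The sketch assumes that the saved results of all earlier layers are concatenated with the scanned features, which is one reading of the description above.

```python
from typing import Callable, List, Tuple

Vector = List[float]
# Hypothetical layer trainer: (layer index, layer input) -> (class results, accuracy)
LayerTrainer = Callable[[int, Vector], Tuple[Vector, float]]


def grow_cascade(scanned: Vector, train_layer: LayerTrainer,
                 max_layers: int) -> Tuple[int, float, List[int]]:
    """Add cascade layers until max_layers is reached or accuracy stops improving.

    Returns the number of layers kept, the best accuracy, and each layer's input width.
    """
    saved: List[Vector] = []          # class results saved from earlier layers
    best_accuracy = float("-inf")
    input_widths: List[int] = []
    for depth in range(max_layers):
        # Residual-style input: scanned features plus every saved earlier result.
        layer_input = scanned + [p for output in saved for p in output]
        output, layer_accuracy = train_layer(depth, layer_input)
        if layer_accuracy <= best_accuracy:
            break                     # no improvement: stop iterating
        best_accuracy = layer_accuracy
        saved.append(output)
        input_widths.append(len(layer_input))
    return len(saved), best_accuracy, input_widths
```

With two forests and two classes per layer, each kept layer adds four values to the input of every later layer, so earlier results remain directly visible to deeper layers instead of being overwritten.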
Based on the same inventive concept, the invention also provides an intelligent detection system for abnormal electricity charge data, comprising:
a rule setting module, used for dynamically setting the rules for electricity charge data anomaly detection;
a data acquisition module, used for exporting the original electricity charge data and the abnormal electricity charge data of all electricity users from the database;
a data processing module, used for performing missing value processing, feature encoding and feature selection on the original electricity charge data;
a data mining module, used for mining hidden abnormal data from the processed original electricity charge data and detecting the anomaly type of the electricity charge data;
a model construction module, used for building a machine learning algorithm model according to the data analysis results of data mining, training the algorithm model, and dynamically adjusting the model parameters during training;
and a model prediction module, used for inputting acquired original electricity charge data into the intelligent detection model for abnormal electricity charge data obtained by model construction, and obtaining the final electricity charge data anomaly detection result.
In addition, the invention also provides a storage medium and computer equipment. Wherein the storage medium has stored thereon a computer program which, when executed by the processor, implements the steps S1-S6 of the electricity fee data abnormality intelligent detection method of the present invention. The computer device comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, and when the processor executes the computer program, the intelligent detection method for the electricity charge data abnormity of the invention is realized, namely the intelligent detection method comprises the processes of the steps S1-S6.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (8)
1. An intelligent detection method for abnormal electricity charge data, characterized by comprising the following steps:
S1, performing rule setting: dynamically setting the rules for electricity charge data abnormality detection;
S2, performing data acquisition: exporting the raw electricity charge data and the abnormal electricity charge data of all electricity users from the database;
S3, performing data processing: performing missing-value processing, feature encoding and feature selection on the raw electricity charge data;
S4, performing data mining: mining hidden abnormal data from the processed raw electricity charge data and detecting the abnormality types of the electricity charge data;
S5, performing model construction: building a machine learning algorithm model according to the data analysis results of the data mining, training the algorithm model, and dynamically adjusting the model parameters during model training;
and S6, performing model prediction: after raw electricity charge data are acquired, inputting the data into the intelligent electricity charge data abnormality detection model obtained by the model construction for prediction, to obtain the final electricity charge data abnormality detection result.
2. The intelligent detection method for electricity charge data abnormality according to claim 1, wherein the specific process of data processing in step S3 is as follows:
S301, labeling the collected raw electricity charge data and abnormal electricity charge data: a feature indicating whether the electricity charge data is abnormal is added and all raw electricity charge data are labeled, with abnormal electricity charge data labeled 1 and non-abnormal electricity charge data labeled 0;
S302, performing missing-value processing on all labeled raw electricity charge data: if more than 10 features of a row of electricity charge data have missing values, the row is deleted directly; all other missing values are filled, the filling value being -1 in every case;
S303, performing feature encoding on the text fields in the raw electricity charge data, the feature encoding modes being text encoding and one-hot encoding;
S304, ranking the features of the raw electricity charge data by importance using the random forest, XGBOOST and SVM machine learning algorithms, to finally obtain the importance ranking of all electricity charge data features;
S305, performing feature selection on all features of the electricity charge data: with reference to the feature importance ranking result, several features ranked last in the feature importance ranking are deleted.
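Steps S302, S303 and S305 can be illustrated with a minimal standard-library sketch. It is only an illustration under stated assumptions, not the claimed implementation: rows are assumed to be dictionaries with `None` marking a missing value, the function names are hypothetical, and the importance scores are assumed to be supplied by whichever ranking model is used in S304.

```python
MISSING_LIMIT = 10  # S302: rows with more than 10 missing features are deleted
FILL_VALUE = -1     # S302: every remaining missing value is filled with -1


def handle_missing(rows):
    """Drop rows with too many missing values and fill the rest with -1."""
    cleaned = []
    for row in rows:
        n_missing = sum(1 for value in row.values() if value is None)
        if n_missing > MISSING_LIMIT:
            continue  # too sparse to be useful, delete the whole row
        cleaned.append(
            {key: FILL_VALUE if value is None else value for key, value in row.items()}
        )
    return cleaned


def one_hot(rows, column):
    """S303: replace a text column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows}, key=str)
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


def drop_least_important(rows, importance, n_drop):
    """S305: delete the n_drop features at the bottom of the importance ranking.

    `importance` maps feature name -> score and is assumed to come from the
    ranking step (S304); features without a score are left untouched.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    n_drop = max(0, min(n_drop, len(ranked)))
    to_drop = set(ranked[len(ranked) - n_drop:])
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
```

Calling `handle_missing`, then `one_hot` on each text column, then `drop_least_important` mirrors the order of the steps in the claim; how many trailing features to delete is left as a parameter because the claim does not fix a number.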
3. The intelligent detection method for electricity charge data abnormality according to claim 1, wherein the specific process of data mining in step S4 is as follows:
S401, scoring the degree of abnormality of the electricity charge data samples using the isolation forest algorithm, sorting the samples by abnormality score, and taking the electricity charge data samples whose abnormality score ranks in the top 70% as highly reliable normal electricity charge data samples;
S402, determining the number of clusters for the majority-class samples of the raw electricity charge data and clustering them: the raw electricity charge data are clustered using the K-means algorithm, the optimal number of clusters k for the majority-class samples is calculated, and the majority-class samples are clustered into k clusters;
S403, after the majority-class samples are clustered into k clusters, sampling each cluster, the majority-class samples being undersampled during sampling using a random undersampling algorithm;
and S404, performing data mining on the finally sampled electricity charge data by combining the electricity charge data abnormality detection rules with a machine learning algorithm, mining hidden abnormal data and detecting the abnormality types of the electricity charge data.
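The selection in S401 and the per-cluster undersampling in S403 can be sketched as follows. This is a hedged illustration only: the abnormality scores and cluster labels are assumed to have been produced already by an isolation forest and by K-means, the function names and the `ratio` parameter are hypothetical, and the sketch assumes that a lower score means a less abnormal sample, which the claim text does not state explicitly.

```python
import random
from collections import defaultdict


def select_reliable_normals(samples, scores, keep_percent=70):
    """S401: keep the keep_percent% of samples ranked first by abnormality score.

    Assumption: lower score = less abnormal, so ranking is ascending.
    """
    order = sorted(range(len(samples)), key=lambda i: scores[i])
    n_keep = len(samples) * keep_percent // 100
    return [samples[i] for i in order[:n_keep]]


def undersample_by_cluster(samples, cluster_ids, ratio, seed=0):
    """S403: randomly keep a fraction `ratio` of the samples in every cluster."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    clusters = defaultdict(list)
    for sample, cluster_id in zip(samples, cluster_ids):
        clusters[cluster_id].append(sample)
    kept = []
    for cluster_id in sorted(clusters):
        members = clusters[cluster_id]
        n_keep = max(1, round(len(members) * ratio))  # never empty a cluster
        kept.extend(rng.sample(members, n_keep))
    return kept
```

Sampling inside each cluster rather than over the whole majority class keeps every cluster represented after undersampling, which appears to be the purpose of clustering first in S402.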
4. The intelligent detection method for electricity charge data abnormality according to claim 1, wherein the specific process of model construction in step S5 is as follows:
S501, dividing the electricity charge data into a training set and a test set by random sampling, in a ratio of 7:3 or 8:2;
S502, performing feature interaction processing on the electricity charge data: the random forest and XGBOOST algorithms are each used to obtain training and prediction results for combined and derived features, and these results are compared with the training and prediction results on the original data without feature interaction; the comparison indexes are the recall and the precision of electricity charge data abnormality detection, the recall reflecting missed detections of abnormal electricity charge data and the precision reflecting successful hits on abnormal electricity charge data; the feature interaction results are then combined to construct new features formed by multi-feature combination and derivation;
S503, constructing the intelligent electricity charge data abnormality detection model: a weighted residual deep forest model is constructed and, together with the decision tree, random forest, XGBOOST and CATBOOST machine learning algorithms, is used to train on and predict the electricity charge data, to obtain the trained intelligent electricity charge data abnormality detection model;
S504, tuning the parameters of the intelligent electricity charge data abnormality detection model, the parameter tuning method being greedy search, grid search or Bayesian optimization, to finally obtain the optimal parameters of the intelligent electricity charge data abnormality detection model;
and S505, performing algorithm fusion on the multiple base algorithms with tuned parameters, the base algorithms being the weighted residual deep forest, decision tree, random forest, XGBOOST and CATBOOST algorithms; the fused machine learning model is the final intelligent electricity charge data abnormality detection model.
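The random split of S501 and the two comparison indexes of S502 can be written down in a few lines. The sketch below uses only the standard library; the function names are hypothetical, and labels follow the convention of claim 2 (1 for abnormal, 0 for non-abnormal).

```python
import random


def train_test_split(rows, train_parts=7, test_parts=3, seed=0):
    """S501: random split in the ratio train_parts:test_parts (7:3 or 8:2)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


def recall_precision(y_true, y_pred):
    """S502: recall and precision for the abnormal class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # abnormal, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # abnormal, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, flagged
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

A low recall means abnormal records are being missed, while a low precision means many flagged records are in fact normal, so comparing both indexes with and without feature interaction shows whether the derived features help on either side.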
5. The intelligent detection method for electricity charge data abnormality according to claim 4, wherein the specific process of constructing the weighted residual deep forest model in step S503 is as follows:
let the electricity charge data set be $S = \{N_1, N_2, \ldots, N_m\}$ and the category set be $L = \{L_1, L_2\}$, where $L_1$ represents abnormal electricity charge data and $L_2$ represents non-abnormal electricity charge data; the prediction probability matrix of the weighted residual deep forest is expressed as follows:
$$T = \begin{bmatrix} T_{11} & T_{12} \\ \vdots & \vdots \\ T_{m1} & T_{m2} \end{bmatrix}$$
wherein $T_{ij}$ represents the predicted probability that the i-th electricity charge record is classified into the j-th category; the column index j of the maximum value in each row of the prediction probability matrix of the weighted residual deep forest is taken as the final predicted category of that record, the position of that maximum value in the prediction probability matrix is marked as 1, and the remaining values are marked as 0, as follows:
the accuracy of the weighted residual deep forest is then calculated according to the following formula:
wherein m represents the total number of electricity charge records, $A[i][j]$ represents the distribution matrix of the actual categories of the electricity charge data, and $T[i][j]$ represents the prediction probability matrix of the weighted residual deep forest; the number of correctly predicted electricity charge records is obtained by taking the intersection of the two, and its ratio to the total number m of electricity charge records gives the final prediction accuracy of each weighted residual deep forest;
assuming that the weighted residual deep forest is $F = \{1, 2, \ldots, F\}$, a weight is calculated from the accuracy of each forest; the weight is defined as $\eta$ and is expressed as follows:
wherein $P_i$ represents the prediction accuracy of the i-th forest; the weighted prediction probability matrix of the i-th forest is obtained as follows:
$$T^{(i)} = T \times \eta$$
the weighted probability results of the prediction probability matrix of each forest are used as the input of the next cascade layer of forests, and iteration stops when the maximum number of cascade forest layers is reached or the accuracy of the forest prediction results no longer improves;
the weighted residual deep forest is formed using the structure of a residual network, specifically: the features of the electricity charge data are input, and the feature values obtained after multi-granularity scanning by the weighted deep forest are input into a completely random forest and an extreme random forest; the two classification results generated by each random forest are stored and, together with the multi-granularity scanning result of the next forest layer, are input into every subsequent forest layer, and iteration stops when the maximum number of cascade forest layers is reached.
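The matrix operations described in this claim (marking the row maximum, counting correct predictions by intersection, and scaling by the weight) can be sketched with plain lists. This is an illustration under assumptions, not the claimed formulas: the function names are hypothetical, the actual-category matrix A is assumed to be one-hot, and the weight $\eta$ is taken as an input because its defining expression is not reproduced in the text above.

```python
def binarize(T):
    """Mark the largest probability in each row as 1 and all others as 0."""
    result = []
    for row in T:
        j_max = max(range(len(row)), key=lambda j: row[j])  # first max on ties
        result.append([1 if j == j_max else 0 for j in range(len(row))])
    return result


def accuracy(A, T):
    """Share of records whose predicted category matches the actual one.

    A is the one-hot matrix of actual categories and T the prediction
    probability matrix; the "intersection" is the number of positions
    where both A and the binarized T contain a 1.
    """
    B = binarize(T)
    correct = sum(a * b for a_row, b_row in zip(A, B) for a, b in zip(a_row, b_row))
    return correct / len(A)


def weight_matrix(T, eta):
    """Weighted prediction probability matrix T x eta for one forest.

    eta is supplied by the caller; how it is derived from the forest
    accuracies P_i is not assumed here.
    """
    return [[p * eta for p in row] for row in T]
```

For example, with four records of which three are predicted correctly, `accuracy` returns 0.75; that value would play the role of $P_i$ for the forest in question, and the result of `weight_matrix` is what the text describes as the input passed to the next cascade layer.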
6. An intelligent detection system for abnormal electricity charge data, comprising:
the rule setting module is used for dynamically setting the rules for electricity charge data abnormality detection;
the data acquisition module is used for exporting the raw electricity charge data and the abnormal electricity charge data of all electricity users from the database;
the data processing module is used for performing missing-value processing, feature encoding and feature selection on the raw electricity charge data;
the data mining module is used for mining hidden abnormal data from the processed raw electricity charge data and detecting the abnormality types of the electricity charge data;
the model construction module is used for building a machine learning algorithm model according to the data analysis results of the data mining, training the algorithm model, and dynamically adjusting the model parameters during model training;
and the model prediction module is used for inputting newly acquired raw electricity charge data into the intelligent electricity charge data abnormality detection model obtained by the model construction, and performing prediction to obtain the final electricity charge data abnormality detection result.
7. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the electricity charge data abnormality intelligent detection method according to any one of claims 1 to 5.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the intelligent electricity charge data abnormality detection method according to any one of claims 1 to 5 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210617862.3A CN115033591A (en) | 2022-06-01 | 2022-06-01 | Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210617862.3A CN115033591A (en) | 2022-06-01 | 2022-06-01 | Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115033591A (en) | 2022-09-09 |
Family
ID=83123415
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210617862.3A Pending CN115033591A (en) | 2022-06-01 | 2022-06-01 | Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115033591A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115905319A (en) * | 2022-11-16 | 2023-04-04 | 国网山东省电力公司营销服务中心(计量中心) | Automatic identification method and system for abnormal electricity charges of massive users |
CN115905319B (en) * | 2022-11-16 | 2024-04-19 | 国网山东省电力公司营销服务中心(计量中心) | Automatic identification method and system for abnormal electricity fees of massive users |
CN116048912A (en) * | 2022-12-20 | 2023-05-02 | 中科南京信息高铁研究院 | Cloud server configuration anomaly identification method based on weak supervision learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022110557A1 (en) | Method and device for diagnosing user-transformer relationship anomaly in transformer area | |
CA3088899C (en) | Systems and methods for preparing data for use by machine learning algorithms | |
CN115033591A (en) | Intelligent detection method and system for electricity charge data abnormity, storage medium and computer equipment | |
CN106095639A (en) | A kind of cluster subhealth state method for early warning and system | |
CN107103332A (en) | A kind of Method Using Relevance Vector Machine sorting technique towards large-scale dataset | |
CN113792754A (en) | Method for processing DGA (differential global alignment) online monitoring data of converter transformer by removing different elements and then repairing | |
CN105871879A (en) | Automatic network element abnormal behavior detection method and device | |
CN109711707B (en) | Comprehensive state evaluation method for ship power device | |
CN106060008A (en) | Network invasion abnormity detection method | |
CN112363896A (en) | Log anomaly detection system | |
CN109255029A (en) | A method of automatic Bug report distribution is enhanced using weighted optimization training set | |
CN111507504A (en) | Adaboost integrated learning power grid fault diagnosis system and method based on data resampling | |
CN112464996A (en) | Intelligent power grid intrusion detection method based on LSTM-XGboost | |
CN117520954A (en) | Abnormal data reconstruction method and system based on isolated forest countermeasure network | |
CN117556369B (en) | Power theft detection method and system for dynamically generated residual error graph convolution neural network | |
CN108830407B (en) | Sensor distribution optimization method in structure health monitoring under multi-working condition | |
CN113743453A (en) | Population quantity prediction method based on random forest | |
CN116365519B (en) | Power load prediction method, system, storage medium and equipment | |
CN113112067A (en) | Method for establishing TFRI weight calculation model | |
CN116663972A (en) | Visual analysis method for weight of food adulterants based on feature selection | |
Dong et al. | Research on academic early warning model based on improved SVM algorithm | |
CN115758462A (en) | Method, device, processor and computer readable storage medium for realizing sensitive data identification in trusted environment | |
CN115392582A (en) | Crop yield prediction method based on incremental fuzzy rough set attribute reduction | |
CN115422821A (en) | Data processing method and device for rock mass parameter prediction | |
CN111654853B (en) | Data analysis method based on user information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||