CN116186630A - Abnormal leakage current data identification method and related device - Google Patents

Abnormal leakage current data identification method and related device

Info

Publication number
CN116186630A
Authority
CN
China
Prior art keywords
data
time sequence
determining
current
metering data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211151491.0A
Other languages
Chinese (zh)
Inventor
宋如楠
杨艺宁
张蓬鹤
薛阳
陈昊
郑安刚
吴忠强
秦译为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Marketing Service Center of State Grid Hebei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Marketing Service Center of State Grid Hebei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, Marketing Service Center of State Grid Hebei Electric Power Co Ltd
Priority to CN202211151491.0A
Publication of CN116186630A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/50 Testing of electric apparatus, lines, cables or components for short-circuits, continuity, leakage current or incorrect line connections
    • G01R31/52 Testing for short-circuits, leakage current or ground faults
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00002 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuit breaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by monitoring
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Power Engineering (AREA)
  • Testing Of Short-Circuits, Discontinuities, Leakage, Or Incorrect Line Connections (AREA)

Abstract

The invention discloses an abnormal leakage current data identification method and a related device. The method comprises the following steps: acquiring time sequence metering data of a preset number of times before the current time of a user sampling point; calculating the time sequence metering data by utilizing a pre-trained cluster analysis model, and determining abnormal value time sequence metering data; predicting the abnormal value time sequence metering data by utilizing a pre-trained neural network model, and determining predicted metering data at the current moment; and comparing the predicted metering data with the real metering data at the current moment, determining a comparison error, and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.

Description

Abnormal leakage current data identification method and related device
Technical Field
The invention relates to the technical field of metering data identification, in particular to an abnormal leakage current data identification method and a related device.
Background
The digital transformation of the power grid and its equipment causes both normal and fault leakage on customer-side lines to exhibit complex nonlinear characteristics; the traditional power-frequency residual current detection method is no longer applicable, which makes residual current monitoring of low-voltage distribution systems very difficult. Moreover, compared with traditional power-frequency electrical equipment, power electronic equipment operating in a high-frequency switching state more readily produces leakage current to ground. As a result, leakage information from different areas, such as customer electrical equipment and power supply lines, becomes more complex and variable, hidden dangers become harder to detect, and false tripping or failure to trip of residual current protection may be aggravated. Given the huge data volume, judging and identifying data item by item using manual or statistical methods cannot guarantee accuracy and is very inefficient. To improve the speed and accuracy of residual current monitoring, a scientific and efficient data analysis method is needed to judge whether abnormal data exists in the metering data of electricity meters.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an abnormal leakage current data identification method and a related device.
According to an aspect of the present invention, there is provided an abnormal leakage current data identification method including:
acquiring time sequence metering data of a preset number of times before the current time of a user sampling point;
calculating the time sequence metering data by utilizing a pre-trained cluster analysis model, and determining abnormal value time sequence metering data;
predicting the abnormal value time sequence metering data by utilizing a pre-trained neural network model, and determining predicted metering data at the current moment;
and comparing the predicted metering data with the real metering data at the current moment, determining a comparison error, and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.
Optionally, the operation of obtaining the time sequence metering data of the previous preset number of time instants of the current time instant of the user sampling point includes:
zero line current time sequence data and live line current time sequence data of the preset number of times before the current time of the user sampling point are collected.
Optionally, the operation of calculating the time sequence metering data by using a pre-trained cluster analysis model to determine the abnormal value time sequence metering data comprises the following steps:
cleaning zero line current time sequence data and live line current time sequence data, and eliminating invalid current data;
determining a residual current value according to the zero line current time sequence data and the live line current time sequence data;
carrying out normalization processing on the residual current value, and determining a normalization result;
and calculating a normalization result and a clustering center in the clustering analysis model by using a Euclidean distance formula, and determining abnormal value time sequence metering data.
Optionally, the operation of determining whether the real measurement data at the current moment is leakage current data according to the comparison error includes:
and when the error value exceeds a preset error threshold value, determining the real metering data at the current moment as leakage current data.
Optionally, the method further comprises: training a cluster analysis model by:
performing cluster analysis on the acquired historical time sequence sample metering data by using a Canopy algorithm, and determining the number of clusters and an initial cluster center of each category;
performing iterative computation on the initial cluster center of each category by using K-means to determine the final cluster center of each category;
and determining a cluster analysis model according to the number of clusters and the final cluster center of each category.
Optionally, the method further comprises: the neural network model is determined by the following method:
determining the number of neurons of the neural network model according to the number of sampling points of the historical time sequence sample metering data;
setting an activation function of the neural network model as a sigmoid function;
the output activation function of the neural network model is set to the tanh function.
According to another aspect of the present invention, there is provided an abnormal leakage current data identification apparatus comprising:
the acquisition module is used for acquiring time sequence metering data of a preset number of times before the current time of the user sampling point;
the first determining module is used for calculating the time sequence metering data by utilizing a pre-trained cluster analysis model and determining abnormal value time sequence metering data;
the second determining module is used for predicting the abnormal value time sequence metering data by utilizing a pre-trained neural network model and determining the predicted metering data at the current moment;
and the judging module is used for comparing the predicted metering data with the real metering data at the current moment, determining a comparison error and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.
According to a further aspect of the present invention there is provided a computer readable storage medium storing a computer program for performing the method according to any one of the above aspects of the present invention.
According to still another aspect of the present invention, there is provided an electronic device including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method according to any of the above aspects of the present invention.
Thus, the Canopy+K-means clustering algorithm addresses the difficulty of choosing the initial K value and the cluster centers in the traditional K-means algorithm: Canopy quickly finds initial cluster centers for a large number of samples, and K-means clustering then preliminarily screens out the data sequences that contain abnormal data. The calculation is simple and the clustering precision is high. The ability of the long short-term memory (LSTM) neural network to process long sequences is applied to abnormal data identification: the error between the predicted value and the measured data is checked against a preset comparison error threshold, so that measured data differing substantially from the predicted value, that is, abnormal data in the time sequence, are effectively identified. The method provided by the invention can therefore effectively reduce the risk of false tripping or failure to trip of distribution line leakage protection devices caused by abnormal leakage data, and is of some significance for maintaining normal operation of low-voltage distribution lines.
Drawings
Exemplary embodiments of the present invention may be more completely understood in consideration of the following drawings:
FIG. 1 is a flowchart of an abnormal leakage current data identification method according to an exemplary embodiment of the present invention;
FIG. 2 is a flowchart of a clustering algorithm based on Canopy+K-means provided by an exemplary embodiment of the present invention;
FIG. 3 is a diagram of an LSTM internal neuron structure provided by an exemplary embodiment of the present invention;
FIG. 4 is a flowchart for identifying abnormal leakage based on a clustering algorithm and an LSTM neural network, according to an exemplary embodiment of the present invention;
FIG. 5 is a flowchart illustrating an embodiment of a method for identifying abnormal leakage current data according to an exemplary embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an abnormal leakage current data identification apparatus according to an exemplary embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device provided in an exemplary embodiment of the present invention.
Detailed Description
Hereinafter, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
It will be appreciated by those of skill in the art that the terms "first," "second," etc. in embodiments of the present invention are used merely to distinguish between different steps, devices or modules, etc., and do not represent any particular technical meaning nor necessarily logical order between them.
It should also be understood that in embodiments of the present invention, "plurality" may refer to two or more, and "at least one" may refer to one, two or more.
It should also be appreciated that any component, data, or structure referred to in an embodiment of the invention may be generally understood as one or more without explicit limitation or the contrary in the context.
In addition, the term "and/or" in the present invention merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. In the present invention, the character "/" generally indicates that the objects before and after it are in an "or" relationship.
It should also be understood that the description of the embodiments of the present invention emphasizes the differences between the embodiments, and that the same or similar features may be referred to each other, and for brevity, will not be described in detail.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, the techniques, methods, and apparatus should be considered part of the specification.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations with electronic devices, such as terminal devices, computer systems, servers, etc. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with the terminal device, computer system, server, or other electronic device include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, mainframe computer systems, and distributed cloud computing technology environments that include any of the foregoing, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment in which tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
Exemplary method
FIG. 1 is a flowchart of an abnormal leakage current data identification method according to an exemplary embodiment of the present invention. The embodiment can be applied to an electronic device. As shown in FIG. 1, the abnormal leakage current data identification method 100 includes the following steps:
Step 101, acquiring time sequence metering data of a preset number of times before the current time of a user sampling point;
For example, 24 data points are collected each day, that is, data are collected once per hour. If the current time is 9 o'clock, the preset number of preceding times may be 6, 7 and 8 o'clock; other numbers of preceding times may also be used and are not limited herein.
Step 102, calculating time sequence metering data by utilizing a pre-trained cluster analysis model, and determining abnormal value time sequence metering data;
wherein the cluster analysis model may be a Canopy+K-means clustering algorithm.
Step 103, predicting the abnormal value time sequence metering data by utilizing a pre-trained neural network model, and determining predicted metering data at the current moment;
for example, the data at 9 o'clock may be predicted from the data at 6, 7 and 8 o'clock to obtain the predicted data at 9 o'clock, and the predicted data may then be compared with the real data acquired at 9 o'clock. The neural network model may be a long short-term memory (LSTM) neural network.
Step 104, comparing the predicted metering data with the real metering data at the current moment, determining a comparison error, and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.
Thus, the Canopy+K-means clustering algorithm addresses the difficulty of choosing the initial K value and the cluster centers in the traditional K-means algorithm: Canopy quickly finds initial cluster centers for a large number of samples, and K-means clustering then preliminarily screens out the data sequences that contain abnormal data. The calculation is simple and the clustering precision is high. The ability of the long short-term memory (LSTM) neural network to process long sequences is applied to abnormal data identification: the error between the predicted value and the measured data is checked against a preset comparison error threshold, so that measured data differing substantially from the predicted value, that is, abnormal data in the time sequence, are effectively identified. The method provided by the invention can therefore effectively reduce the risk of false tripping or failure to trip of distribution line leakage protection devices caused by abnormal leakage data, and is of some significance for maintaining normal operation of low-voltage distribution lines.
Optionally, the operation of obtaining the time sequence metering data of the previous preset number of time instants of the current time instant of the user sampling point includes:
zero line current time sequence data and live line current time sequence data of the preset number of times before the current time of the user sampling point are collected.
Optionally, the operation of calculating the time sequence metering data by using a pre-trained cluster analysis model to determine the abnormal value time sequence metering data comprises the following steps:
cleaning zero line current time sequence data and live line current time sequence data, and eliminating invalid current data;
determining a residual current value according to the zero line current time sequence data and the live line current time sequence data;
carrying out normalization processing on the residual current value, and determining a normalization result;
and calculating a normalization result and a clustering center in the clustering analysis model by using a Euclidean distance formula, and determining abnormal value time sequence metering data.
Specifically, the current data are cleaned: acquisition points at which the zero line (neutral) current and the live line current are both zero, or at which the meter recorded no data, are removed. The normal leakage current characteristics of the low-voltage distribution line are then analyzed from the live line and zero line current metering data of the single-phase user meter. The residual current in the line can be represented indirectly by the difference between the phase (L) and neutral (N) currents, i.e.
$I_\Delta = |I_L - I_N|$
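The cleaning and residual-current steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `residual_current` is hypothetical, and treating a sample as invalid when either reading is missing or when both readings are zero is an assumed reading of the cleaning rule.

```python
import numpy as np

def residual_current(i_live, i_neutral):
    """Return |I_L - I_N| per sample, with invalid samples marked as NaN.

    Assumption: a sample is invalid when either reading is missing (NaN)
    or when both the live and neutral readings are exactly zero.
    """
    i_live = np.asarray(i_live, dtype=float)
    i_neutral = np.asarray(i_neutral, dtype=float)
    missing = np.isnan(i_live) | np.isnan(i_neutral)
    both_zero = (i_live == 0.0) & (i_neutral == 0.0)
    residual = np.abs(i_live - i_neutral)
    residual[missing | both_zero] = np.nan  # drop invalid acquisition points
    return residual
```

Marking invalid points as NaN rather than deleting them keeps the hourly positions aligned; a caller could drop or interpolate them afterwards.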
To accelerate the convergence of the Canopy+K-means clustering algorithm, the sample data must be normalized before clustering. Max-min normalization is adopted, with the following formula:
$$y = \frac{(y_{\max} - y_{\min})(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min}$$
where $y_{\min}$ and $y_{\max}$ are the minimum and maximum values after normalization, taken as 0 and 1 respectively; $x$ is the variable to be normalized; $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the variable to be normalized; and $y$ is the normalization result.
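Max-min normalization to a target range of 0 to 1 can be sketched as below. The function name `min_max_normalize` is hypothetical, and the handling of a constant series (every value mapped to the lower bound) is an assumption, since the text does not cover that case.

```python
import numpy as np

def min_max_normalize(x, y_min=0.0, y_max=1.0):
    """Linearly map x from [x_min, x_max] onto [y_min, y_max]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = np.nanmin(x), np.nanmax(x)  # ignore NaN placeholders
    if x_max == x_min:
        # Assumption: a constant series maps to the lower bound.
        return np.where(np.isnan(x), np.nan, y_min)
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min
```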
Optionally, the operation of determining whether the real measurement data at the current moment is leakage current data according to the comparison error includes:
and when the error value exceeds a preset error threshold value, determining the real metering data at the current moment as leakage current data.
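The threshold decision can be sketched as follows. The text does not define how the comparison error is computed, so both the absolute and the relative form shown here are assumptions, as are the function names.

```python
def comparison_error(predicted, actual, relative=False):
    """Absolute error by default; relative error if requested (both assumed forms)."""
    error = abs(actual - predicted)
    if relative:
        # Assumption: relative error is taken with respect to the predicted value.
        return error / abs(predicted) if predicted != 0 else float("inf")
    return error

def is_leakage_anomaly(predicted, actual, threshold, relative=False):
    """True when the comparison error exceeds the preset error threshold."""
    return comparison_error(predicted, actual, relative) > threshold
```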
Optionally, the method further comprises: training a cluster analysis model by:
performing cluster analysis on the acquired historical time sequence sample metering data by using a Canopy algorithm, and determining the number of clusters and an initial cluster center of each category;
performing iterative computation on the initial cluster center of each category by using K-means to determine the final cluster center of each category;
and determining a cluster analysis model according to the number of clusters and the final cluster center of each category.
The invention provides an abnormal leakage data identification method, which mainly comprises three parts: leakage data cluster analysis, LSTM neural network prediction, and abnormal data identification.
Leakage data cluster analysis mainly comprises collecting electric energy metering data, preprocessing the leakage data, and performing Canopy+K-means cluster analysis. Preliminary analysis of the electric energy metering data collected on low-voltage distribution lines shows that the leakage current is calculated from the collected phase line and neutral line currents, and that the electricity usage habits of different users strongly influence the leakage current. Cluster analysis groups leakage data with the same change pattern, giving a preliminary identification of abnormal leakage. The Canopy clustering algorithm addresses the difficulty of determining the K value and the K initial cluster centers in the traditional K-means algorithm. Compared with other clustering algorithms, Canopy is less accurate but much faster, so it is suitable as a preprocessing step for K-means: it supplies the initial cluster centers and determines the K value, after which K-means performs the finer clustering. The algorithm flow is as follows, with a schematic diagram shown in FIG. 2.
(1) The leakage current sample data set is sorted and stored in the computer as a list; the distances between the leakage current samples are calculated and their average is obtained; the data point closest to the average distance is set as the first Canopy; and two distance thresholds $T_1$ and $T_2$ are selected, where $T_1 > T_2$.
(2) Select an object P from the leakage data set and calculate its distance to every Canopy produced so far. If the distance from P to all Canopy centers is greater than $T_1$, P becomes a new Canopy and is deleted from the data set. If the distance from P to some Canopy is greater than $T_2$ and less than $T_1$, P is added to that Canopy but is not deleted from the data set, i.e., the data object still participates in the next round of clustering. If the distance from P to some Canopy center is less than $T_2$, P is added to that Canopy, deleted from the data set, and can no longer serve as the center of another Canopy.
(3) Repeat the above process until the data set is empty, at which point the Canopy algorithm ends; that is, K samples have been selected as the initial cluster centers $C = \{c_1, c_2, \ldots, c_k\}$. Then calculate the shortest distance $\mathrm{dist}(x_i, C_j)$, $i \in [1, n]$, $j \in [1, k]$, between each of the other data points and the determined cluster centers. The invention uses the Euclidean distance, which keeps the computation simple; the calculation formula is as follows:
$$\mathrm{dist}(x_i, x_p) = \sqrt{\sum_{d=1}^{D} (x_{i,d} - x_{p,d})^2} \quad (1)$$
where $i$ and $p$ both denote row indices of the samples, and $D$ is the dimension of the data sample;
(4) For each category $c_k$, select a new cluster center as the mean of all objects in the category; the number of cluster centers remains K. The calculation formula is shown below. The K-means algorithm stops when the cluster centers no longer change or a specified number of iterations is reached.
$$C_k = \frac{1}{|c_k|} \sum_{x_i \in c_k} x_i$$
where $C_k$ is the new center of the kth class, $c_k$ denotes the kth class, and $|c_k|$ denotes the number of data objects in the kth class.
(5) The cluster centers of each class of leakage current data obtained by the Canopy+K-means clustering algorithm can serve as the characteristic data of each cluster. Using formula (1), the Euclidean distance F between each class's cluster center and the other leakage current data in that class is calculated in turn, and the distances within each class are arranged from small to large. When abnormal leakage data exists in a class, its change pattern differs substantially from that of the rest of the class, so the calculated Euclidean distance is also large. A judgment threshold $F_k$ can therefore be set; when a distance value is greater than $F_k$, it can be preliminarily determined that abnormal leakage data exists in the class.
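Steps (1) to (5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the Canopy pass follows the classic formulation (each new center is simply the first remaining point, points within $T_2$ of it are removed, points within $T_1$ are loose members), which differs from the first-Canopy selection and bookkeeping described above; a single threshold `f_k` is applied to all classes; and all function names are hypothetical.

```python
import numpy as np

def canopy(X, t1, t2):
    """Classic Canopy pass: returns initial centers (their count gives K)
    and, per canopy, the indices of loosely bound members (within T1)."""
    if not t1 > t2 > 0:
        raise ValueError("expected T1 > T2 > 0")
    remaining = list(range(len(X)))
    centers, members = [], []
    while remaining:
        center = X[remaining[0]]                 # first remaining point becomes a center
        d = np.linalg.norm(X - center, axis=1)   # Euclidean distance to every sample
        centers.append(center)
        members.append(np.flatnonzero(d < t1))
        # Points within T2 are tightly bound and leave the candidate list;
        # points farther away stay for later rounds.
        remaining = [i for i in remaining if d[i] >= t2]
    return np.array(centers), members

def kmeans(X, centers, max_iter=100):
    """Lloyd iterations starting from the Canopy centers."""
    centers = centers.copy()
    for _ in range(max_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                        else centers[k]          # keep the old center if a class is empty
                        for k in range(len(centers))])
        if np.allclose(new, centers):            # centers no longer change
            break
        centers = new
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    return centers, labels, d[np.arange(len(X)), labels]

def flag_outliers(X, t1, t2, f_k):
    """Flag samples whose distance to their class center exceeds f_k."""
    X = np.asarray(X, dtype=float)
    init_centers, _ = canopy(X, t1, t2)
    centers, labels, dist = kmeans(X, init_centers)
    return centers, labels, dist > f_k
```

In practice each row of `X` would be one normalized residual-current sequence (for example 24 hourly values), and `t1`, `t2` and `f_k` would need tuning on historical data.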
Optionally, the method further comprises: the neural network model is determined by the following method:
determining the number of neurons of the neural network model according to the number of sampling points of the historical time sequence sample metering data;
setting an activation function of the neural network model as a sigmoid function;
the output activation function of the neural network model is set to the tanh function.
Specifically, LSTM neural network prediction relies on the network's memory unit combined with the forget gate, input gate and output gate in the network structure, so that information in the original data can be selectively retained or discarded and long-term dependencies can be learned well; the internal neuron structure is shown in FIG. 3. The metering data are classified with the Canopy+K-means clustering algorithm to pre-screen the data containing anomalies; the time sequence samples containing anomalies are taken as the input of the LSTM neural network; the sample data are divided into a training set and a test set; and the network is trained through the forward computation and backpropagation processes of the neural network.
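The gate structure described above can be illustrated with one forward step of a standard LSTM cell, in which the three gates use the sigmoid function and the candidate state and output use tanh. This is the textbook formulation rather than a detail confirmed by the source; the stacked weight layout (`W` of shape 4H by H+D, gate order forget, input, output, candidate) and the function names are assumptions made for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    x: input of size D; h_prev, c_prev: previous hidden and cell state of size H.
    W: stacked weights of shape (4*H, H+D); b: stacked biases of shape (4*H,).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0:H])            # forget gate: how much old memory to keep
    i = sigmoid(z[H:2 * H])        # input gate: how much new information to admit
    o = sigmoid(z[2 * H:3 * H])    # output gate: how much memory to expose
    g = np.tanh(z[3 * H:4 * H])    # candidate memory content
    c = f * c_prev + i * g         # updated cell (memory) state
    h = o * np.tanh(c)             # updated hidden state / output
    return h, c
```

A trained model would apply this step across the n preceding samples and map the final hidden state to the predicted value; training itself (backpropagation through time) is usually left to a deep learning framework.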
Abnormal data identification is performed through LSTM neural network prediction: the next data point is predicted from the preceding n data points of the time sequence, the predicted value is taken as the accurate (reference) value of that data point, and the real metering data is compared against it. An error threshold range is set: if the error is within the threshold range, the current leakage current is judged to be a normal value; if the error exceeds the threshold, abnormal data is judged to have occurred. The abnormal data is removed from the samples and replaced with the predicted value, and model prediction continues until the entire time sequence has been processed. The specific flow is shown in FIG. 4.
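The rolling identify-and-replace loop can be sketched as follows. The predictor is passed in as a callable so that any trained model can be plugged in; the function name, the use of absolute error, and the choice to accept the first n values without checking are assumptions of this sketch.

```python
from typing import Callable, List, Sequence, Tuple

def identify_anomalies(
    series: Sequence[float],
    n: int,
    predict: Callable[[Sequence[float]], float],
    threshold: float,
) -> Tuple[List[int], List[float]]:
    """Return the indices flagged as abnormal and the repaired series."""
    cleaned = list(series[:n])      # assumption: the first n values are accepted as-is
    flagged = []
    for t in range(n, len(series)):
        predicted = predict(cleaned[-n:])        # predict from the last n repaired values
        if abs(series[t] - predicted) > threshold:
            flagged.append(t)
            cleaned.append(predicted)            # replace the abnormal value with the prediction
        else:
            cleaned.append(series[t])
    return flagged, cleaned
```

For example, with a moving-average stand-in for the trained network, `identify_anomalies([1.0, 1.0, 1.0, 5.0, 1.0], 3, lambda w: sum(w) / len(w), 0.5)` flags index 3 and replaces it with 1.0.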
In addition, referring to Fig. 5, the method of the present invention can be used to identify whether abnormal data exists in the electricity meter metering data of ordinary residential users, enabling online monitoring of electricity consumption information. Data preprocessing: a total of 1250 sets of data from 210 users are selected, comprising 625 sets each of neutral current data and live current data. Each set contains 24 current values, one per acquisition point, i.e. the meter collects current data once every hour of each day.
The current data of the 210 users is cleaned to eliminate acquisition points at which the neutral current and live current are zero or at which the meter recorded no data, and the normal leakage current characteristics of the low-voltage distribution line are analyzed based on the live current metering data of the single-phase user meters. The residual current in the line can be represented indirectly by the difference between the phase (L) and neutral (N) currents, i.e.
$I_\Delta = |I_L - I_N|$
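A minimal sketch of the cleaning step and the residual current formula is given below. Treating a point as invalid when either reading is missing (`None`) or when both currents are zero is this sketch's reading of the cleaning rule, and the function names and sample readings are illustrative assumptions.

```python
def residual_current(live, neutral):
    """Return |I_L - I_N| per acquisition point, or None for invalid points.

    A point is treated as invalid when either reading is missing (None) or
    when both currents are zero; this interpretation is an assumption.
    """
    if len(live) != len(neutral):
        raise ValueError("live and neutral series must have the same length")
    result = []
    for i_l, i_n in zip(live, neutral):
        if i_l is None or i_n is None or (i_l == 0 and i_n == 0):
            result.append(None)
        else:
            result.append(abs(i_l - i_n))
    return result


def is_complete(residuals):
    """True when every acquisition point produced a valid residual current."""
    return all(r is not None for r in residuals)


live = [1.50, 0.0, None, 2.00]
neutral = [1.48, 0.0, 0.90, 2.03]
residuals = residual_current(live, neutral)
print(residuals, is_complete(residuals))
```

A daily series for which `is_complete` is false could be dropped before clustering, so that every retained series has the full set of acquisition points.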
To accelerate the convergence of the Canopy+K-means clustering algorithm, the sample data must be normalized before clustering. Min-max normalization is adopted, as given by the following formula.
$y = \dfrac{(y_{max} - y_{min})(x - x_{min})}{x_{max} - x_{min}} + y_{min}$
where $y_{min}$ and $y_{max}$ are the minimum and maximum values after normalization, taken as 0 and 1 respectively; $x$ is the variable to be normalized; $x_{min}$ and $x_{max}$ are the minimum and maximum values of the variable to be normalized; and $y$ is the normalization result.
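The min-max normalization described above can be sketched in a few lines of Python. The function name is illustrative, and mapping a constant series to the lower bound (to avoid dividing by zero) is an assumption of this sketch rather than something the specification addresses.

```python
def min_max_normalize(values, y_min=0.0, y_max=1.0):
    """Linearly rescale values so the smallest maps to y_min and the largest to y_max."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Constant series: the formula would divide by zero, so map all to y_min.
        return [y_min for _ in values]
    scale = (y_max - y_min) / (x_max - x_min)
    return [y_min + (x - x_min) * scale for x in values]


print(min_max_normalize([2.0, 4.0, 6.0]))  # prints [0.0, 0.5, 1.0]
```

Rescaling every series to the same range keeps the Euclidean distances used in clustering from being dominated by users with larger absolute currents.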
Data identification: the clustering-based LSTM neural network abnormal data identification comprises three parts: Canopy+K-means cluster analysis, LSTM neural network prediction, and abnormal data identification. The specific flow is as follows:
(1) The calculated residual current is input as raw data into the Canopy+K-means clustering algorithm. First, the Canopy algorithm is used to obtain the number of clusters and the initial cluster centers; K-means then obtains the next cluster centers by computing the mean of each cluster, and this operation is repeated until the cluster centers no longer change, at which point the clustering algorithm is complete.
(2) In daily life, some residents share the same household electricity habits and patterns, while others differ greatly. Classifying users with the same variation pattern together during abnormal data identification and extraction allows the residual current data to be divided more finely. The cluster center of each class of data extracted by the cluster analysis algorithm is taken as the most representative data for that class of users. The distance between each of the other data and each cluster center is calculated in turn using the Euclidean distance formula; when the distance exceeds a threshold F_k, it can be preliminarily inferred that the time series data (24 acquisition points) contains abnormal data.
(3) The time series data containing abnormal values screened out by the cluster analysis is used as the input data of the LSTM neural network. Since the number m of data sampling points in the time series is 24, the number n (n < m) of LSTM input neurons can be set to 3 and the number of output neurons to 1, i.e. the data at the next time is predicted from the data at the previous 3 times. The activation function of the LSTM neural network is set to the sigmoid function, and the output activation function to the tanh function. In the forward propagation of the LSTM neural network, the forget gate, input gate, state unit, and output gate are continuously updated using these different activation functions. In backpropagation, the main idea is to update all parameters iteratively by gradient descent, the key being to compute the partial derivative of the loss function with respect to each parameter. Backpropagation can be implemented with the help of an algorithm library; the mathematical derivation involved is lengthy and is not explained in detail here.
(4) During the LSTM neural network's prediction of the leakage current time series, each predicted leakage current is compared with the metering data and a reasonable error threshold is set, so that real leakage data is retained and abnormal leakage data is identified and replaced. This continues until the time series prediction is finished; the final identification result is the leakage current data that differs greatly from the value predicted by the LSTM neural network.
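Step (1) of the flow above can be sketched as follows. This is a simplified illustration, not the specification's implementation: the Canopy pass here uses only a tight threshold `t2` (the loose threshold of the full Canopy algorithm is omitted because only the number and position of the centers are needed to seed K-means), and the threshold value and sample points are made up.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_centers(points, t2):
    """Simplified Canopy pass: take a point as a center, drop every remaining
    point within the tight threshold t2, and repeat until none are left."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        remaining = [p for p in remaining if distance(p, center) > t2]
    return centers


def k_means(points, centers, max_iter=100):
    """Recompute each center as its cluster mean until the centers stop changing."""
    centers = [list(c) for c in centers]
    groups = []
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: distance(p, centers[k]))
            groups[nearest].append(p)
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[k]
            for k, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, groups


# Made-up two-point profiles; real profiles would have 24 normalized values.
points = [[0.10, 0.10], [0.12, 0.09], [0.11, 0.10],
          [0.80, 0.82], [0.79, 0.80], [0.81, 0.81]]
initial = canopy_centers(points, t2=0.3)
final, groups = k_means(points, initial)
print(len(initial), [len(g) for g in groups])
```

The number of clusters falls out of the Canopy pass instead of being chosen by hand, which is the point of combining the two algorithms; the choice of `t2` still has to be tuned to the data.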
Thus, the Canopy+K-means clustering algorithm addresses the difficulty of determining the initial K value and cluster centers in the traditional K-means algorithm: Canopy can quickly find initial cluster centers for a large number of sample data, and K-means clustering can preliminarily screen out data sequences containing abnormal data; the calculation is simple and the clustering precision is high. The long-sequence processing capability of the long short-term memory (LSTM) neural network is applied to abnormal data identification: by setting a threshold for the error between the predicted value and the metering data, metering data that differs greatly from the predicted value, i.e. the abnormal data of the time series, is effectively identified. The method provided by the invention can effectively reduce the risk of maloperation or failure to operate of the distribution line leakage protection device caused by abnormal leakage data, and is of some significance for maintaining the normal operation of low-voltage distribution lines.
Exemplary apparatus
Fig. 6 is a schematic structural diagram of an abnormal leakage current data identification apparatus according to an exemplary embodiment of the present invention. As shown in fig. 6, the apparatus 600 includes:
an obtaining module 610, configured to obtain time sequence measurement data of a predetermined number of times before a current time of a user sampling point;
a first determining module 620, configured to calculate the timing measurement data using a pre-trained cluster analysis model, and determine outlier timing measurement data;
a second determining module 630, configured to predict the outlier time-series metering data by using a pre-trained neural network model, and determine predicted metering data at the current time;
and the judging module 640 is used for comparing the predicted metering data with the real metering data at the current moment, determining a comparison error and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.
Optionally, the time sequence metering data includes neutral line current time sequence data and live line current time sequence data, and the obtaining module 610 includes:
and the obtaining sub-module is used for obtaining neutral line current time sequence data and live line current time sequence data of the preset number of times before the current time of the user sampling point.
Optionally, the first determining module 620 includes:
the eliminating sub-module is used for cleaning the neutral line current time sequence data and the live line current time sequence data and eliminating invalid current data;
the first determining submodule is used for determining a residual current value according to the neutral line current time sequence data and the live line current time sequence data;
the second determining submodule is used for carrying out normalization processing on the residual current value and determining a normalization result;
and the third determining submodule is used for calculating the normalization result and the clustering center in the clustering analysis model by using the Euclidean distance formula and determining abnormal value time sequence metering data.
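As a rough sketch of the screening performed by the third determining submodule, the following compares each normalized series with the cluster centers using the Euclidean distance. Using one threshold per cluster and testing only the nearest center is an assumed interpretation of the threshold F_k; the function name, centers, thresholds, and profiles are illustrative values.

```python
import math


def screen_profiles(profiles, centers, thresholds):
    """Return indices of profiles whose distance to their nearest cluster
    center exceeds that cluster's threshold (assumed meaning of F_k)."""
    flagged = []
    for idx, profile in enumerate(profiles):
        dists = [math.dist(profile, c) for c in centers]
        k = dists.index(min(dists))
        if dists[k] > thresholds[k]:
            flagged.append(idx)
    return flagged


centers = [[0.10, 0.10], [0.80, 0.80]]
thresholds = [0.2, 0.2]
profiles = [[0.12, 0.10], [0.50, 0.45], [0.79, 0.82]]
print(screen_profiles(profiles, centers, thresholds))  # prints [1]
```

Only the flagged series would then be passed to the neural network stage, which keeps the more expensive point-by-point prediction off series that already sit close to a typical usage pattern.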
Optionally, the judging module 640 includes:
and the fourth determining submodule is used for determining that the real metering data at the current moment is leakage current data under the condition that the comparison error exceeds a preset error threshold value.
Optionally, the apparatus 600 further comprises: the first training module is used for training the cluster analysis model by the following method:
a fifth determining submodule, configured to perform cluster analysis on the acquired historical time sequence sample measurement data by using a Canopy algorithm, and determine a cluster number and an initial cluster center of each category;
a sixth determining submodule, configured to perform iterative computation on the initial cluster center of each category by using K-means, and determine a final cluster center of each category;
and a seventh determining sub-module, configured to determine a cluster analysis model according to the number of clusters and the final cluster center of each category.
Optionally, the apparatus 600 further comprises: the second training module is used for determining a neural network model by the following method:
an eighth determining submodule, configured to determine the number of neurons of the neural network model according to the number of sampling points of the historical time sequence sample measurement data;
the first setting module is used for setting the activation function of the neural network model as a sigmoid function;
and the second setting module is used for setting the output activation function of the neural network model as a tanh function.
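To illustrate how sigmoid gate activations and a tanh output activation fit together, here is a minimal forward step of a single-unit LSTM cell in its standard textbook form. It is a sketch under assumptions: the weight layout, the name `lstm_step`, and all numeric values are made up, and it does not reproduce the specific network of Fig. 3.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One forward step of a single-unit LSTM cell.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # share of the old cell state to keep
    i = sigmoid(pre("input"))         # share of the candidate to write
    g = math.tanh(pre("candidate"))   # candidate cell content
    o = sigmoid(pre("output"))        # share of the cell state to expose
    c = f * c_prev + i * g            # updated cell state
    h = o * math.tanh(c)              # hidden output, bounded in (-1, 1)
    return h, c


# Made-up weights; a trained model would learn these by backpropagation.
weights = {name: (0.5, 0.5, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:  # the preceding 3 readings
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

In practice the cell would have many units and a final linear layer mapping the hidden state to the single predicted value; a library implementation would normally be used rather than hand-written code.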
Exemplary electronic device
Fig. 7 shows the structure of an electronic device provided in an exemplary embodiment of the present invention. As shown in Fig. 7, the electronic device 70 includes one or more processors 71 and memory 72.
The processor 71 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
Memory 72 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 71 to implement the methods of the software programs of the various embodiments of the present invention described above and/or other desired functions. In one example, the electronic device may further include an input device 73 and an output device 74, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
In addition, the input device 73 may also include, for example, a keyboard, a mouse, and the like.
The output device 74 can output various information to the outside. The output device 74 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, only some of the components of the electronic device relevant to the present invention are shown in fig. 7 for simplicity, components such as buses, input/output interfaces, etc. being omitted. In addition, the electronic device may include any other suitable components depending on the particular application.
Exemplary computer program product and computer readable storage Medium
In addition to the methods and apparatus described above, embodiments of the invention may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform steps in a method according to various embodiments of the invention described in the "exemplary methods" section of this specification.
The computer program product may include program code for performing operations of embodiments of the present invention, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present invention may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps in a method according to various embodiments of the present invention described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present invention are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present invention. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the invention is not necessarily limited to practice with the above described specific details.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. The system embodiments are described relatively briefly because they essentially correspond to the method embodiments; for relevant points, refer to the description of the method embodiments.
The block diagrams of the devices, apparatuses, and systems according to the present invention are merely illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, and systems may be connected, arranged, or configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including but not limited to," and are used interchangeably therewith. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."
The method and system of the present invention may be implemented in a number of ways. For example, the methods and systems of the present invention may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present invention are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present invention may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.
It is also noted that in the systems, devices and methods of the present invention, components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present invention. The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the invention to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (12)

1. An abnormal leakage current data identification method, characterized by comprising the following steps:
acquiring time sequence metering data of a preset number of times before the current time of a user sampling point;
calculating the time sequence metering data by utilizing a pre-trained cluster analysis model, and determining abnormal value time sequence metering data;
predicting the abnormal value time sequence metering data by utilizing a pre-trained neural network model, and determining predicted metering data at the current moment;
and comparing the predicted metering data with the real metering data at the current moment, determining the comparison error, and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.
2. The method of claim 1, wherein the time sequence metering data comprises neutral line current time sequence data and live line current time sequence data, and wherein the operation of acquiring time sequence metering data of a preset number of times before the current time of a user sampling point comprises:
and acquiring neutral line current time sequence data and live line current time sequence data of the preset number of times before the current time of the user sampling point.
3. The method of claim 2, wherein the operation of calculating the time sequence metering data using a pre-trained cluster analysis model to determine abnormal value time sequence metering data comprises:
cleaning the neutral line current time sequence data and the live line current time sequence data, and eliminating invalid current data;
determining a residual current value according to the neutral line current time sequence data and the live line current time sequence data;
normalizing the residual current value to determine the normalization result;
and calculating the normalization result and a clustering center in the clustering analysis model by using a Euclidean distance formula, and determining abnormal value time sequence metering data.
4. The method of claim 1, wherein judging whether the real metering data at the current moment is leakage current data according to the comparison error comprises:
and when the comparison error exceeds a preset error threshold value, determining that the real metering data at the current moment is leakage current data.
5. The method as recited in claim 1, further comprising: training the cluster analysis model by:
performing cluster analysis on the acquired historical time sequence sample metering data by using a Canopy algorithm, and determining the number of clusters and an initial cluster center of each category;
performing iterative computation on the initial cluster center of each category by using K-means to determine a final cluster center of each category;
and determining the cluster analysis model according to the number of clusters and the final cluster center of each category.
6. The method as recited in claim 5, further comprising: determining the neural network model by:
determining the number of neurons of the neural network model according to the number of sampling points of the historical time sequence sample metering data;
setting an activation function of the neural network model as a sigmoid function;
and setting an output activation function of the neural network model as a tanh function.
7. An abnormal leakage current data identification device, comprising:
the acquisition module is used for acquiring time sequence metering data of a preset number of times before the current time of the user sampling point;
the first determining module is used for calculating the time sequence metering data by utilizing a pre-trained cluster analysis model and determining abnormal value time sequence metering data;
the second determining module is used for predicting the outlier time sequence metering data by utilizing a pre-trained neural network model and determining the predicted metering data at the current moment;
and the judging module is used for comparing the predicted metering data with the real metering data at the current moment, determining the comparison error and judging whether the real metering data at the current moment is leakage current data or not according to the comparison error.
8. The apparatus of claim 7, wherein the time sequence metering data comprises neutral line current time sequence data and live line current time sequence data, and the acquisition module comprises:
and the acquisition sub-module is used for acquiring neutral line current time sequence data and live line current time sequence data of the preset number of times before the current time of the user sampling point.
9. The apparatus of claim 8, wherein the first determining module comprises:
the eliminating sub-module is used for cleaning the neutral line current time sequence data and the live line current time sequence data and eliminating invalid current data;
the first determining submodule is used for determining a residual current value according to the neutral line current time sequence data and the live line current time sequence data;
the second determining submodule is used for carrying out normalization processing on the residual current value and determining the normalization result;
and the third determining submodule is used for calculating the normalization result and the clustering center in the clustering analysis model by using a Euclidean distance formula and determining abnormal value time sequence metering data.
10. The apparatus of claim 7, wherein the judging module comprises:
and the fourth determining submodule is used for determining that the real metering data at the current moment is leakage current data under the condition that the comparison error exceeds a preset error threshold value.
11. A computer readable storage medium, characterized in that the storage medium stores a computer program for executing the method of any of the preceding claims 1-6.
12. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any of the preceding claims 1-6.
CN202211151491.0A 2022-09-21 2022-09-21 Abnormal leakage current data identification method and related device Pending CN116186630A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211151491.0A CN116186630A (en) 2022-09-21 2022-09-21 Abnormal leakage current data identification method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211151491.0A CN116186630A (en) 2022-09-21 2022-09-21 Abnormal leakage current data identification method and related device

Publications (1)

Publication Number Publication Date
CN116186630A true CN116186630A (en) 2023-05-30

Family

ID=86438976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211151491.0A Pending CN116186630A (en) 2022-09-21 2022-09-21 Abnormal leakage current data identification method and related device

Country Status (1)

Country Link
CN (1) CN116186630A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992389A (en) * 2023-09-26 2023-11-03 河北登浦信息技术有限公司 False data detection method and system for Internet of things
CN117890825A (en) * 2024-03-15 2024-04-16 深圳永贵技术有限公司 Leakage current testing method, device and equipment of charging gun and storage medium
CN117970182A (en) * 2024-03-28 2024-05-03 国网山东省电力公司曲阜市供电公司 Electric leakage early warning method and system based on DTW algorithm
CN118070080A (en) * 2024-04-17 2024-05-24 山东中电仪表有限公司 Intelligent analysis method and system for user electricity consumption data of multifunctional electric energy meter
CN118070080B (en) * 2024-04-17 2024-07-30 山东中电仪表有限公司 Intelligent analysis method and system for user electricity consumption data of multifunctional electric energy meter

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992389A (en) * 2023-09-26 2023-11-03 河北登浦信息技术有限公司 False data detection method and system for Internet of things
CN116992389B (en) * 2023-09-26 2023-12-29 河北登浦信息技术有限公司 False data detection method and system for Internet of things
CN117890825A (en) * 2024-03-15 2024-04-16 深圳永贵技术有限公司 Leakage current testing method, device and equipment of charging gun and storage medium
CN117890825B (en) * 2024-03-15 2024-05-14 深圳永贵技术有限公司 Leakage current testing method, device and equipment of charging gun and storage medium
CN117970182A (en) * 2024-03-28 2024-05-03 国网山东省电力公司曲阜市供电公司 Electric leakage early warning method and system based on DTW algorithm
CN118070080A (en) * 2024-04-17 2024-05-24 山东中电仪表有限公司 Intelligent analysis method and system for user electricity consumption data of multifunctional electric energy meter
CN118070080B (en) * 2024-04-17 2024-07-30 山东中电仪表有限公司 Intelligent analysis method and system for user electricity consumption data of multifunctional electric energy meter

Similar Documents

Publication Publication Date Title
CN116186630A (en) Abnormal leakage current data identification method and related device
CN110826648B (en) Method for realizing fault detection by utilizing time sequence clustering algorithm
CN110990461A (en) Big data analysis model algorithm model selection method and device, electronic equipment and medium
CN109934301B (en) Power load cluster analysis method, device and equipment
CN111738331A (en) User classification method and device, computer-readable storage medium and electronic device
CN112149873A (en) Low-voltage transformer area line loss reasonable interval prediction method based on deep learning
CN111738348A (en) Power data anomaly detection method and device
CN113704389A (en) Data evaluation method and device, computer equipment and storage medium
CN115392592A (en) Storage product parameter configuration recommendation method, device, equipment and medium
Lawrence et al. Explaining neural matrix factorization with gradient rollback
CN113642727B (en) Training method of neural network model and processing method and device of multimedia information
CN117236571B (en) Planning method and system based on Internet of things
CN113094448B (en) Analysis method and analysis device for residence empty state and electronic equipment
CN117034149A (en) Fault processing strategy determining method and device, electronic equipment and storage medium
Chen et al. Machine learning-based anomaly detection of ganglia monitoring data in HEP Data Center
Wang et al. A Novel Multi‐Input AlexNet Prediction Model for Oil and Gas Production
CN114580534A (en) Industrial data anomaly detection method and device, electronic equipment and storage medium
CN116842936A (en) Keyword recognition method, keyword recognition device, electronic equipment and computer readable storage medium
CN113743004A (en) Quantum Fourier transform-based full-factor productivity calculation method
CN114861800B (en) Model training method, probability determining device, model training equipment, model training medium and model training product
CN112308338A (en) Power data processing method and device
Lu et al. Time series power anomaly detection based on Light Gradient Boosting Machine
CN118133058B (en) Voltage stability monitoring method for small direct current bus series micro-grid power distribution
CN117332280A (en) Power consumption behavior clustering method, device, equipment and medium for low-voltage user side
CN114861800A (en) Model training method, probability determination method, device, equipment, medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination