CN114580829A - Power utilization safety sensing method, equipment and medium based on random forest algorithm - Google Patents

Info

Publication number
CN114580829A
Authority
CN
China
Prior art keywords
current
value
total
model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111634082.1A
Other languages
Chinese (zh)
Inventor
肖宇
叶志
肖湘奇
胡军华
徐正义
刘小平
肖湘晨
卿曦
吴文娴
罗宇剑
黄瑞
贺星
刘谋海
杨茂涛
黄燕娇
熊德智
彭沛
邹晟
曾娟
周滨
王庭婷
陈浩
余敏琪
叶浏青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Hunan Electric Power Co Ltd
Metering Center of State Grid Hunan Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Hunan Electric Power Co Ltd
Metering Center of State Grid Hunan Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Hunan Electric Power Co Ltd, Metering Center of State Grid Hunan Electric Power Co Ltd
Priority to CN202111634082.1A
Publication of CN114580829A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Strategic Management (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Biology (AREA)
  • Marketing (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a power utilization safety sensing method, equipment and medium based on a random forest algorithm. It provides a comprehensive model that requires no feature screening and integrates the detection of fault arcs, electric leakage and short circuits. The model aims to quickly sense and identify fault arcs, leakage and short circuits, improve fault sensing accuracy for low-voltage users, and protect their lives and property.

Description

Power utilization safety sensing method, equipment and medium based on random forest algorithm
Technical Field
The invention relates to the field of power utilization safety monitoring, in particular to a power utilization safety sensing method, equipment and medium based on a random forest algorithm.
Background
With the progress of science and technology, new technologies keep emerging and electrified products have become widespread. Low-voltage electrical appliances are now used in every area of production and daily life, and they bring certain safety hazards. Electrical safety accidents caused by aging or improperly configured electrical lines are rising year by year, and low-voltage fault arcs are one of the main causes of electrical fires. Low-voltage arc faults in the form of short circuits still occur in low-voltage distribution devices; the resulting losses are severe and pose a serious hidden danger to people's lives and property. In addition, gas explosion accidents caused by electric leakage have led to serious casualties among workers. A method suited to sensing the electricity safety of low-voltage customers is therefore needed to detect fault arcs, leakage and short-circuit faults in low-voltage systems in real time.
A fault arc is a gas ionization discharge caused by a carbonization path or a point contact. When aging and breakage of the insulating material, or overload heating of the line, produces a leakage current, a carbonization path forms on the surface and an arc results. Arc detection extracts features from raw current data (in both the normal working state and the fault state), uses those features as the arc identification conditions and as the input of a computational model, and identifies whether an arc exists in the line with the model once it has been trained on a training set. Common artificial neural network models include the BP neural network, the convolutional neural network and the spiking neural network. Leakage protection works as follows: when a phase line of the protected line is connected to earth directly or through an unintended load, a residual current arises that is approximately sinusoidal and whose effective value changes slowly; when this current exceeds a set value, the protector cuts off the line. The leakage protection measure in common use today is to sense the residual current in the circuit with a leakage protector and to trip based on the fault characteristic value. Short-circuit fault current arises when the insulation of an electrical appliance or distribution line in an electrical control circuit is damaged, when a load is short-circuited, or when wiring is wrong. Short-circuit protection requires the power supply to be cut off within a very short time after the fault occurs.
At present, common methods for intelligently sensing the electricity safety of low-voltage customers include a low-voltage arc detection technique based on a convolutional neural network and a leakage protection method based on PSO-VMD. Their principles and steps are as follows.
Low-voltage arc detection technology based on a convolutional neural network
The principle is as follows: a convolutional neural network has three typical characteristics, namely sparse connections, weight sharing and translation invariance. During the convolution operation, each node of a convolutional layer is connected only to some of the nodes of the previous layer and learns only local features; this is sparse connection, which greatly reduces the number of parameters and the computational complexity of the network. To reduce the number of parameters further, all nodes of the same convolutional layer share their parameters; this is weight sharing. The third characteristic is translation invariance: in image classification, for example, if the target moves by some distance in some direction within the image, the classification result does not change. This follows mainly from the behaviour of the convolution operation itself: when the target is translated, its representation on the feature map is translated in the same way.
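The weight sharing and translation behaviour described above can be illustrated on a one-dimensional signal. The following is a minimal sketch, not the network of the cited scheme: the kernel values, the pulse shape and the helper name `conv1d_valid` are hypothetical and chosen only for illustration.

```python
def conv1d_valid(signal, kernel):
    """Slide one shared kernel over the signal (no padding, stride 1).

    Every output node uses the same kernel (weight sharing) and sees only
    len(kernel) neighbouring inputs (sparse connection).
    """
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

kernel = [-1.0, 0.0, 1.0]               # hypothetical difference kernel
pulse = [0, 0, 1, 2, 1, 0, 0, 0, 0]     # hypothetical current pulse
shifted = [0, 0] + pulse[:-2]           # the same pulse, two samples later

out = conv1d_valid(pulse, kernel)
out_shifted = conv1d_valid(shifted, kernel)

# Shifting the input by two samples shifts the feature map by two samples.
print(out)
print(out_shifted)
```

A pooling or global-maximum stage placed after such a layer is what turns this shifted feature map into an unchanged classification result.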
The method comprises the following steps:
(1) Collecting current data: current data are acquired with a current transformer on a purpose-built arc experiment platform.
(2) Preprocessing the data: half-cycle current signals are selected as fault arc data samples and extracted with a time window. To speed up model training and improve model accuracy, standard normalization is applied to the collected current data.
(3) Dividing the experimental data set: a data set containing single-load samples and multi-load samples is constructed. For single loads, samples are taken both during normal operation (negative samples) and during a fault arc (positive samples). For multiple loads, only normal-operation samples are collected, and the loads are combined at random during collection.
(4) Extracting the fault arc current features: the time-domain features of the half-cycle current signal, the frequency-domain features extracted by FFT and the signal detail features extracted by WT form a 19-dimensional feature vector, which is input into a classification model.
(5) Selecting and optimizing a classification model: an artificial neural network is selected as the classification model, standard normalization is applied to each extracted arc current feature, and the normalized feature vector is used as the input of the artificial neural network. The artificial neural network uses a cross-entropy loss function as its objective function.
(6) Designing a convolutional neural network: a fault arc detection model is designed on the basis of a convolutional neural network, and the network model is optimized with techniques such as residual connections, dilated convolution and multi-scale feature fusion.
(7) Optimizing the network parameters: the network parameters are optimized with the Adam algorithm. To avoid overfitting, a dropout layer is added to the network and early stopping is used in training.
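The preprocessing in step (2) can be sketched as follows. This is an illustration under stated assumptions, not the cited scheme's code: "standard normalization" is read here as zero-mean, unit-variance scaling applied per window, the windows are taken as non-overlapping, and the sampling density of 40 samples per cycle and both function names are hypothetical.

```python
import math
import statistics

def half_cycle_windows(samples, samples_per_cycle):
    """Cut a current record into consecutive, non-overlapping half-cycle windows."""
    half = samples_per_cycle // 2
    return [samples[s:s + half] for s in range(0, len(samples) - half + 1, half)]

def standardize(window):
    """Scale one window to zero mean and unit variance (population statistics)."""
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    if std == 0.0:
        return [0.0 for _ in window]   # flat window: nothing to scale
    return [(x - mean) / std for x in window]

samples_per_cycle = 40                 # assumed sampling density
current = [10.0 * math.sin(2 * math.pi * n / samples_per_cycle) for n in range(80)]

windows = half_cycle_windows(current, samples_per_cycle)
normalized = [standardize(w) for w in windows]
print(len(windows), len(windows[0]))
```

Whether the scaling statistics are computed per window, per load or over the whole data set is a design choice the text does not specify.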
PSO-VMD-based leakage protection method
The principle is as follows:
VMD is an adaptive multi-component signal decomposition method that decomposes a complex signal made up of several frequencies into a number of intrinsic mode components. The VMD decomposition process consists mainly of constructing and solving a variational problem. When the VMD algorithm decomposes a signal, two parameters must be set manually in advance: the number of decomposition modes K and the penalty factor α. K is the number of modal components obtained after decomposition, and α determines the bandwidth of the modal components: the larger α is, the smaller the bandwidth, and the smaller α is, the larger the bandwidth. These two parameters are usually set by hand from experience, so whether the combination is optimal depends heavily on that choice, and the setting of the VMD parameter combination affects whether the useful information in the signal can be extracted accurately. To reduce the adverse effects of manual parameter selection, a particle swarm optimization (PSO) algorithm is used to optimize the VMD parameter set. Like the genetic algorithm, PSO is an iterative optimization algorithm: the system initializes a set of random solutions and searches for the optimum by repeated iteration. Unlike the genetic algorithm, which relies on crossover and mutation, PSO lets the particles search the solution space by following the best particles found so far.
The method comprises the following steps:
(1) Initialize the particle swarm.
(2) Update the positions of the particles.
(3) Find the corresponding integer positions.
(4) Apply VMD to the signal at each particle position and calculate the fitness value min E_p of each particle. For each particle, if its fitness value is better than its previous individual extremum P_best, set the current fitness value as the new individual extremum P_best.
(5) Find the global extremum g_best from the individual extremum P_best of each particle.
(6) Update the positions of the particles again.
(7) When the iteration stop condition is reached, stop the procedure.
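The loop described in these steps can be sketched in a few lines. The sketch below is hedged: it does not run VMD. The function `toy_fitness` is a hypothetical stand-in for the VMD-based fitness min E_p, with an arbitrary minimum placed at K = 5 and α = 2000, and the inertia weight `w`, the coefficients `c1` and `c2`, the bounds and the swarm size are all assumptions.

```python
import random

def toy_fitness(k, alpha):
    """Hypothetical stand-in for the VMD-based fitness; NOT a real VMD evaluation."""
    return (k - 5) ** 2 + ((alpha - 2000.0) / 1000.0) ** 2

def pso(fitness, bounds, n_particles=20, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialize the swarm with random positions and zero velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    def evaluate(p):
        # K must be an integer, so the first coordinate is rounded.
        return fitness(int(round(p[0])), p[1])

    pbest = [p[:] for p in pos]
    pbest_val = [evaluate(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = evaluate(pos[i])
            if val < pbest_val[i]:              # update the individual extremum
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:             # update the global extremum
                    gbest, gbest_val = pos[i][:], val
    return int(round(gbest[0])), gbest[1], gbest_val

search_bounds = [(2, 10), (500.0, 5000.0)]      # assumed ranges for K and alpha
best_k, best_alpha, best_val = pso(toy_fitness, search_bounds)
print(best_k, best_alpha, best_val)
```

In a real setting the call to `toy_fitness` would be replaced by a function that runs VMD with the candidate K and α and returns the chosen fitness measure, which makes each evaluation far more expensive than in this sketch.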
The schemes above require feature screening, which increases the computational load and memory footprint of the model and slows fault detection. In addition, the features differ greatly in importance and need to be evaluated, but evaluating feature importance is still difficult with current techniques, which reduces the robustness and generalization ability of the intelligent sensing model.
Moreover, because detection indexes are constructed with subjective choices and threshold-based judgments, anti-interference capability is weak, and misjudgments or missed detections can occur during fault detection, lowering the accuracy with which safety problems of low-voltage users are identified.
For fault arc detection, the arc is usually detected from its physical and waveform characteristics, either by setting a slope threshold or by judging how long the current stays within a certain range near the zero crossing. This lowers the accuracy and efficiency of fault detection, and fault arcs cannot be identified promptly and accurately.
Finally, because the algorithm model has only one output, the probability of overfitting increases.
In summary, traditional artificial-intelligence-based methods for intelligently sensing the electricity safety of low-voltage customers often suffer from weak model generalization, a tendency to fall into local extrema during learning, and difficulty in evaluating and selecting feature importance. As a result, model robustness is low, under-fitting occurs easily during training, and the identification accuracy for safety problems of low-voltage users is low.
In arc detection, zero-crossing retention is a distinctive characteristic of the fault arc, but quickly identifying its occurrence, and thereby quickly detecting the fault arc, remains a technical problem. Traditional methods mostly identify zero-crossing retention by setting a slope threshold or by judging how long the current stays within a certain range near zero. In practice, however, the slope in the zero-crossing retention interval is not exactly zero: a non-zero slope remains after amplification, and because the retention time differs between loads, the slopes of different loads are spread over a range. Defining zero-crossing retention by a fixed slope value or a fixed zero-crossing duration can therefore produce large deviations.
Disclosure of Invention
To solve the technical problems of current intelligent sensing schemes for low-voltage customer electricity safety, namely inaccurate feature values, limited detection indexes, difficulty in identifying fault arcs accurately and an excessive probability of overfitting, the invention provides a comprehensive model that requires no feature screening and integrates the detection of fault arcs, electric leakage and short circuits. The power utilization safety sensing method, equipment and medium based on the random forest algorithm aim to quickly sense and identify fault arcs, leakage and short circuits, improve fault sensing accuracy for low-voltage users, and protect their lives and property.
To achieve this technical purpose, the technical scheme of the invention is as follows.
A power utilization safety sensing method based on a random forest algorithm comprises the following steps:
Step one, collecting electricity consumption data and extracting the following as feature indexes: current peak-to-peak value, current amplitude, current integral and variance, current kurtosis, current power information entropy, signal proportions of different current frequency bands, current form factor, current peak index, current pulse index, and fault arc zero-crossing retention probability index;
Step two, inputting the feature indexes into the trained random forest algorithm model, in which every decision tree outputs probability values for the different electricity consumption conditions represented by the electricity consumption data; the probability values are averaged over all decision trees, and the electricity consumption condition with the highest average probability is taken as the current electricity consumption condition result.
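The decision rule of step two, averaging the per-tree probabilities and taking the most probable condition, can be sketched as below. The class labels (including a "normal" class), the three probability vectors and the function name `forest_predict` are hypothetical illustration values, not outputs of the patented model.

```python
def forest_predict(tree_probabilities, labels):
    """Average per-tree class probabilities and pick the most probable class."""
    n_trees = len(tree_probabilities)
    mean = [
        sum(p[c] for p in tree_probabilities) / n_trees
        for c in range(len(labels))
    ]
    best = max(range(len(labels)), key=lambda c: mean[c])
    return labels[best], mean

# Assumed label set; the text names fault arc, leakage and short circuit.
labels = ["normal", "fault arc", "leakage", "short circuit"]

# One probability vector per decision tree (made-up numbers).
tree_probabilities = [
    [0.10, 0.70, 0.15, 0.05],
    [0.20, 0.50, 0.20, 0.10],
    [0.60, 0.20, 0.10, 0.10],
]

condition, mean_probabilities = forest_predict(tree_probabilities, labels)
print(condition, mean_probabilities)
```

Note that the third tree votes for "normal", yet the averaged result is still "fault arc"; this is the sense in which the result does not depend on any single tree.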
In the method, in step one, the feature indexes include:
Current peak-to-peak value: the total residual current peak-to-peak value Pvd_i is the difference between the most recent peak and the most recent inverse peak of the total residual current at sampling point i; the total current peak-to-peak value Pva_i is the difference between the most recent peak and the most recent inverse peak of the total current at sampling point i;
Current amplitude: the residual current amplitude Ad_i is the amplitude of the residual current at sampling point i; the total current amplitude Aa_i is the amplitude of the total current at sampling point i;
Current integral and variance: the residual current integral INTGd_i is the sum of the residual current samples over the most recent power frequency period T at sampling point i; the residual current variance Dd_i is the variance of the residual current samples over the most recent power frequency period T at sampling point i; the total current integral INTGa_i is the sum of the total current samples over the most recent power frequency period T at sampling point i; the total current variance Da_i is the variance of the total current samples over the most recent power frequency period T at sampling point i;
current peak value: residual current kurtosis KdiThe kurtosis is obtained by the current at a sampling point i for a sampling value in a recent power frequency period T; total current kurtosis KaiThe kurtosis is obtained by the total current at a sampling point i for a sampling value in a recent power frequency period T;
Current power information entropy: the power information entropy Pse_i is the power spectrum information entropy of the current samples over the most recent period T at sampling point i;
Signal proportions of different current frequency bands: the residual current low-band signal proportion Rdl is the share of the total signal quantity that lies in the low band, 0-8 kHz; the residual current mid-band signal proportion Rdm is the share that lies in the mid band, 8 kHz-40 kHz; the residual current high-band signal proportion Rdh is the share that lies in the high band, 40 kHz-80 kHz; the total current low-band signal proportion Ral is the share that lies in the low band, 0-8 kHz; the total current mid-band signal proportion Rad is the share that lies in the mid band, 8 kHz-40 kHz; the total current high-band signal proportion Rah is the share that lies in the high band, 40 kHz-80 kHz;
Current form factor: the total residual current form factor Sd_i is the ratio, at sampling point i, of the root mean square value of the total residual current to the mean absolute value of the total residual current over the most recent power frequency period; the total current form factor Sa_i is the ratio, at sampling point i, of the root mean square value of the total current to the mean absolute value of the total current over the most recent power frequency period;
Current peak index: the total residual current peak index Cd_i is the ratio, at the i-th sampling point, of the maximum value of the residual current over the most recent power frequency period to the effective value of the residual current over that period; the total current peak index Ca_i is the ratio, at the i-th sampling point, of the maximum value of the total current over the most recent power frequency period to the effective value of the total current over that period;
Current pulse index: the residual current pulse index Sd_i is the ratio, at the i-th sampling point, of the maximum value of the residual current over the most recent power frequency period to the absolute average value of the residual current over that period; the total current pulse index Sa_i is the ratio, at the i-th sampling point, of the maximum value of the total current over the most recent power frequency period to the absolute average value of the total current over that period;
Fault arc zero-crossing retention probability index: calculated from the historical data sequence by a long short-term memory (LSTM) artificial neural network.
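Most of the feature indexes listed above are simple statistics of one window of samples, so they can be sketched in plain Python. The sketch rests on several assumptions that the text does not settle: the peak-to-peak value is approximated by the maximum minus the minimum of the window, the form factor is taken as RMS divided by mean absolute value, "signal quantity" is read as spectral power, and the power information entropy is read as the Shannon entropy of the normalized power spectrum. The naive DFT, the function names and the test signals are illustrative only, and the frequency-domain demo uses a short 2 ms window instead of a full power frequency period to keep it fast.

```python
import math

def time_domain_features(x):
    """Statistics of one window of current samples x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    mean_abs = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    return {
        "peak_to_peak": max(x) - min(x),   # assumed reading of peak minus inverse peak
        "integral": sum(x),                # sum of the samples over the window
        "variance": var,
        "kurtosis": m4 / var ** 2,         # undefined for a flat window (var == 0)
        "form_factor": rms / mean_abs,     # assumed definition
        "peak_index": peak / rms,          # maximum over effective value
        "pulse_index": peak / mean_abs,    # maximum over absolute average
    }

def power_spectrum(x, fs):
    """One-sided power spectrum via a naive DFT: list of (frequency_hz, power)."""
    n = len(x)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append((k * fs / n, re * re + im * im))
    return spec

def band_share(spec, lo_hz, hi_hz):
    """Share of the total spectral power in the band lo_hz <= f < hi_hz."""
    total = sum(p for _, p in spec)
    return sum(p for f, p in spec if lo_hz <= f < hi_hz) / total

def spectral_entropy(spec):
    """Shannon entropy (natural log) of the normalized power spectrum."""
    total = sum(p for _, p in spec)
    return -sum((p / total) * math.log(p / total) for _, p in spec if p > 0)

# Time-domain demo: one period of a unit sine, 64 samples.
sine = [math.sin(2 * math.pi * n / 64) for n in range(64)]
feats = time_domain_features(sine)

# Frequency-domain demo: 5 kHz and 20 kHz tones sampled at an assumed 160 kHz.
fs = 160_000
tone = [math.sin(2 * math.pi * 5_000 * t / fs)
        + 0.5 * math.sin(2 * math.pi * 20_000 * t / fs) for t in range(320)]
spec = power_spectrum(tone, fs)
low = band_share(spec, 0, 8_000)
mid = band_share(spec, 8_000, 40_000)
high = band_share(spec, 40_000, 80_000)
entropy = spectral_entropy(spec)
print(feats, low, mid, high, entropy)
```

For a pure sine the peak index comes out as the square root of 2 and the kurtosis as 1.5, which gives a quick sanity check before applying the functions to measured current. A production implementation would use an FFT rather than the quadratic-time DFT shown here.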
In the method, the fault arc zero-crossing retention probability index is obtained as follows:
the total current value change data before the current time i are input into the trained long short-term memory artificial neural network model, which outputs the probability value PA_i that a fault arc occurs at the current time i.
The total current value change data are obtained by dividing the sampling points within one power frequency period of the current evenly, in time order, into several sub data sets, and taking the sub data sets adjacent in time before the current time i as the input.
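The construction of the model input described here can be sketched as follows. The number of sub data sets per period, the number of adjacent sub data sets fed to the model and both function names are hypothetical; the text fixes none of them.

```python
def split_period(samples, n_subsets):
    """Divide one period's samples evenly, in time order, into n_subsets chunks."""
    if len(samples) % n_subsets != 0:
        raise ValueError("samples must divide evenly into the requested subsets")
    size = len(samples) // n_subsets
    return [samples[k * size:(k + 1) * size] for k in range(n_subsets)]

def recent_subsets(history, i, samples_per_period, n_subsets, n_steps):
    """Return the n_steps adjacent sub data sets that end just before sample index i."""
    size = samples_per_period // n_subsets
    start = i - n_steps * size
    if start < 0:
        raise ValueError("not enough history before index i")
    return [history[start + k * size:start + (k + 1) * size] for k in range(n_steps)]

# Stand-in history: sample values equal to their own index, two periods of 20 samples.
history = list(range(40))

one_period = split_period(history[:20], n_subsets=5)
model_input = recent_subsets(history, i=32, samples_per_period=20,
                             n_subsets=5, n_steps=3)
print(one_period)
print(model_input)
```

Each inner list is one time step of the sequence passed to the LSTM model, so the sub data set size determines the input dimension of the network.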
In the method, the long short-term memory artificial neural network model is trained through the following steps:
Step 1, establishing a long short-term memory artificial neural network model;
Step 2, collecting historical current data in units of the power frequency period, dividing the data into a training set and a test set, and then dividing the sampling points within each power frequency period of the current evenly, in time order, into several sub data sets;
Step 3, inputting the sub data sets into the long short-term memory artificial neural network model to output the probability value that a fault arc occurs at time i, and then optimizing the weight parameters of the model with the Adam algorithm; step 3 is repeated until the training objective function comes close to the preset limit, which completes the training;
Step 4, inputting the test set into the trained model; if no overfitting occurs, the test of the model is complete.
In the method, in step 1, the number of units in the fully connected neural network layers of the long short-term memory artificial neural network model, that is, the number of hidden layer units, is 40; the learning rate is set to 0.0006; and the number of model layers is set to 3.
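To make the stated hyperparameters concrete, the forward pass of such a model can be sketched in plain Python. This is an untrained illustration with random weights, not the patented model: reading "3 layers" as three stacked LSTM layers, the single sigmoid output unit, the weight initialization and the input dimension of 4 are all assumptions, and the Adam training loop that would use the 0.0006 learning rate is not shown.

```python
import math
import random

HIDDEN_UNITS = 40       # from the text
NUM_LAYERS = 3          # from the text; read here as three stacked LSTM layers
LEARNING_RATE = 0.0006  # from the text; would be passed to an Adam trainer (not shown)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class LSTMLayer:
    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        scale = 1.0 / math.sqrt(n_hidden)
        # Rows are grouped by gate: input, forget, cell candidate, output.
        self.w = [[rng.uniform(-scale, scale) for _ in range(n_in + n_hidden)]
                  for _ in range(4 * n_hidden)]
        self.b = [0.0] * (4 * n_hidden)

    def run(self, sequence):
        """Process a sequence of input vectors; return the hidden state at each step."""
        n = self.n_hidden
        h = [0.0] * n
        c = [0.0] * n
        outputs = []
        for x in sequence:
            xh = list(x) + h
            z = [sum(wi * v for wi, v in zip(row, xh)) + b
                 for row, b in zip(self.w, self.b)]
            i_g = [sigmoid(v) for v in z[:n]]
            f_g = [sigmoid(v) for v in z[n:2 * n]]
            g_g = [math.tanh(v) for v in z[2 * n:3 * n]]
            o_g = [sigmoid(v) for v in z[3 * n:]]
            c = [f * cv + i * g for f, cv, i, g in zip(f_g, c, i_g, g_g)]
            h = [o * math.tanh(cv) for o, cv in zip(o_g, c)]
            outputs.append(h)
        return outputs

def build_model(n_in, rng):
    layers = []
    for _ in range(NUM_LAYERS):
        layers.append(LSTMLayer(n_in, HIDDEN_UNITS, rng))
        n_in = HIDDEN_UNITS
    head = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_UNITS)]  # assumed output unit
    return layers, head

def arc_probability(layers, head, sequence):
    """Run the stacked layers and squash the last hidden state into a probability."""
    for layer in layers:
        sequence = layer.run(sequence)
    return sigmoid(sum(w * h for w, h in zip(head, sequence[-1])))

rng = random.Random(0)
layers, head = build_model(n_in=4, rng=rng)
sub_data_sets = [[0.1, 0.3, 0.2, 0.0], [0.0, -0.1, -0.3, -0.2], [0.0, 0.0, 0.01, 0.0]]
pa_i = arc_probability(layers, head, sub_data_sets)
print(pa_i)
```

Because the weights are random, the printed value has no diagnostic meaning; the sketch only shows how a sequence of sub data sets is reduced to a single probability.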
In the method, in step two, the random forest algorithm model is trained through the following steps:
Step (1), establishing a random forest algorithm model;
Step (2), acquiring historical data including total voltage, total current and total residual current, and extracting features to obtain a feature set;
Step (3), training the random forest algorithm model with a training set generated from the feature set by the bagging method;
Step (4), testing the random forest algorithm model with a test set generated from the feature set by the bagging method, and finishing training once the test is passed.
In the method, in step (1), when the random forest algorithm model is established, the model is initialized, that is, the number of decision trees in the model is fixed and the structure of the model is fixed.
In the method, in step (3), the bagging method comprises the following steps:
Step 1) drawing one sample from the historical electricity consumption data set containing N samples and recording the number of draws j;
Step 2) copying the sample and putting it back into the historical electricity consumption data set;
Step 3) repeating steps 1) and 2) until j equals N, which yields a bootstrap set feature_k of the same size as the historical electricity consumption data set, used to train the k-th subtree;
Step 4) repeating steps 1), 2) and 3) until k equals the number of decision trees in the model, which completes the generation of the training set; the samples in the historical electricity consumption data set that were never drawn are used as the test set.
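The bagging steps above amount to sampling with replacement. The following is a minimal sketch that works on sample indices rather than on real feature rows; the function name, the seed and the small sizes are hypothetical.

```python
import random

def bootstrap_sets(n_samples, n_trees, seed=0):
    """Draw one bootstrap index set per tree; return the sets and the never-drawn indices."""
    rng = random.Random(seed)
    bags = []
    drawn = set()
    for _ in range(n_trees):
        # N draws with replacement give a set the same size as the data set.
        bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        bags.append(bag)
        drawn.update(bag)
    # Samples that no tree ever drew form the test set described in step 4).
    never_drawn = [i for i in range(n_samples) if i not in drawn]
    return bags, never_drawn

bags, test_indices = bootstrap_sets(n_samples=10, n_trees=3)
print(bags)
print(test_indices)
```

With many trees the set of never-drawn samples shrinks quickly, so a common alternative, which is an option and not something the text states, is to evaluate each tree on the samples missing from its own bootstrap set.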
An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method.
A computer-readable medium storing a computer program which, when executed by a processor, implements the method.
The technical effects of the invention are as follows. The fault arc zero-crossing retention probability index provided by the invention gives early warning of a fault arc: as soon as the first zero-crossing retention begins to emerge, the model that calculates the index senses the trend and outputs the corresponding probability to the random forest low-voltage customer electricity safety sensing model as a reference. This helps improve the accuracy of the sensing model, shorten the reaction time for identifying fault arcs, find users' potential safety hazards earlier and improve the electricity safety of low-voltage users.
The invention combines a large number of relevant features (such as peak-to-peak value, current amplitude, integral and variance, kurtosis, power information entropy and the signal proportions of different frequency bands) and integrates the detection of fault arcs, short circuits and leakage in one comprehensive model. The input data of each decision tree in the forest model are drawn at random from the original feature data set without feature screening, which removes the time and memory cost of feature screening and reduces the risk that weakly correlated features have a negative influence.
The method does not depend on the output of a single model: it takes the average of the outputs of all decision trees as the final sensing result for the low-voltage customer. This greatly reduces the probability of the overfitting that traditional neural networks are prone to, improves model building efficiency and gives strong generalization ability.
The method senses the electricity safety of low-voltage users with a random forest algorithm. Each decision tree in the random forest model outputs a fault occurrence probability, and the outputs of all decision trees are averaged to give the output of the whole model. This overcomes the tendency of a single classifier, such as a traditional neural network, to fall into a local extremum during training: the ensemble classifier considers many cases at once and effectively avoids interference from local extrema.
The invention will be further explained with reference to the drawings.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention;
FIG. 2 is a diagram illustrating a format of raw data collected according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the operation mechanism of the zero-crossing retention judgment model in the present invention;
FIG. 4 is a schematic diagram of a method for calculating an arc occurrence probability indicator according to the present invention;
FIG. 5 is a schematic diagram of the sliding window selection for selecting sampling points in the calculation of the arc occurrence probability indicator according to the present invention;
FIG. 6 is a schematic structural diagram of an LSTM model for calculating the probability of arc occurrence according to the present invention;
FIG. 7 is a schematic diagram of a low-voltage user security perception feature set of the present invention;
FIG. 8 is a schematic diagram of the bagging method adopted by the present invention.
Detailed Description
The overall architecture diagram of the embodiment is shown in fig. 1, and includes two parts, the first part is a random forest model generation part, and the part mainly describes the establishment process of a random forest model; the second part is a random forest model using part which mainly shows how the random forest model realizes the intelligent perception of the low-voltage customer electricity utilization safety.
The raw data acquisition and feature extraction methods are described below.
Various power consumption fault scenarios and the normal operation state of low-voltage customers are simulated, and in the simulation experiment a wave recording device is used to acquire the residual current data Id_i and the total current data Ia_i. The sampling frequency is set to 1 MHz and the power frequency is 50 Hz, so each power frequency period contains 20000 sampling points. These data form the original data set matrix OriginalData. The matrix has N rows and 5 columns, where the ith row represents the data at the ith sampling point time, the 1st column is the sampling point number, the 2nd column is the residual current data, the 3rd column is the total current data, the 4th column is the voltage between the live wire and the neutral wire, and the 5th column is the label value (normal operation: '0', short circuit: '1', electric leakage: '2', fault arc: '3'). The original data format is shown schematically in FIG. 2.
Next, the following features are extracted from the raw data table. The features are calculated starting from the second power frequency period of the raw data (the power frequency period is denoted by T, and its value is 20 ms):
(1) Peak-to-peak value
The peak-to-peak value at the current sampling point i is the difference between the most recent peak and the most recent inverse peak. The peak-to-peak values of the residual current and the total current are recorded as Pvd_i and Pva_i, respectively.
(2) Amplitude of current
The current amplitude is taken at the current sampling point i: the amplitude of the residual current at time i is recorded as Ad_i, and the amplitude of the total current at time i is recorded as Aa_i.
(3) Integral and variance
The integral and the variance at the current sampling point i are obtained by computing the integral and the variance of the sampled values within the most recent power frequency period T ending at sampling point i. The integral and the variance of the residual current data Id_i and the total current data Ia_i are recorded as INTGd_i, Dd_i, INTGa_i and Da_i, respectively. The specific calculation formulas are as follows:
Figure RE-RE-GDA0003628617320000081
Figure RE-RE-GDA0003628617320000082
Figure RE-RE-GDA0003628617320000091
Figure RE-RE-GDA0003628617320000092
where Id_mean and Ia_mean are the average values of the residual current and the total current, respectively, within the most recent power frequency period T.
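The integral and variance features can be sketched in code as below. This is a minimal illustration, not the patent's formula: it assumes the integral is the plain sum of the samples in the most recent period (as claim 2 describes it) and that the variance is the population variance; the function name `integral_and_variance` and the synthetic sine signal are hypothetical.

```python
import numpy as np

SAMPLES_PER_PERIOD = 20000  # 1 MHz sampling rate / 50 Hz power frequency


def integral_and_variance(signal, i, window=SAMPLES_PER_PERIOD):
    """Return (sum, variance) of the `window` samples ending at index i.

    Assumptions: the integral is the plain sum of the samples and the
    variance is the population variance (divide by the window length).
    """
    segment = np.asarray(signal[i - window + 1 : i + 1], dtype=float)
    integral = segment.sum()
    variance = np.mean((segment - segment.mean()) ** 2)
    return integral, variance


# Synthetic 50 Hz sine with amplitude 10 (placeholder, not measured data).
t = np.arange(2 * SAMPLES_PER_PERIOD) / 1e6
ia = 10.0 * np.sin(2 * np.pi * 50 * t)
intg, var = integral_and_variance(ia, i=len(ia) - 1)
```

Over one full period of a pure sine the sum is close to zero and the variance is close to half the squared amplitude, which gives a quick sanity check for the windowing.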
(4) Kurtosis
Kurtosis is a characteristic quantity used to characterize the degree of concentration of a data distribution; in particular, it characterizes how sharply peaked the probability density curve is. The kurtosis at the current sampling point i is obtained by computing the kurtosis of the sampled values within the most recent period T ending at sampling point i. The kurtosis values of the residual current data Id_i and the total current data Ia_i are denoted Kd_i and Ka_i, respectively, and are calculated as follows:
Figure RE-RE-GDA0003628617320000093
Figure RE-RE-GDA0003628617320000094
The kurtosis of a normal distribution is 3. A calculated kurtosis greater than 3 indicates a data sequence with a sharper peak and heavier tails than the normal distribution, and a value less than 3 indicates a flatter peak and lighter tails.
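As an illustration of the kurtosis feature, the sketch below computes the fourth standardized moment of one window of samples, under which a normal distribution yields 3. The exact normalization in the patent's formula is not reproduced here, so the population-moment form and the helper name `kurtosis` are assumptions; the signals are synthetic.

```python
import numpy as np


def kurtosis(segment):
    """Fourth standardized moment (non-excess form): normal distribution -> 3."""
    x = np.asarray(segment, dtype=float)
    centered = x - x.mean()
    second_moment = np.mean(centered ** 2)
    fourth_moment = np.mean(centered ** 4)
    return fourth_moment / second_moment ** 2


# One period of a pure sine, and the same sine with a short spike added.
k = np.arange(20000)
sine = np.sin(2 * np.pi * k / 20000)
spiky = sine.copy()
spiky[5000:5010] += 20.0  # illustrative impulse, not measured data

k_sine = kurtosis(sine)
k_spiky = kurtosis(spiky)
```

A pure sine over whole periods has a kurtosis of 1.5, while a short impulse pushes the value far above 3, which is why the feature responds to spiky fault waveforms.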
(5) Entropy of power information
Information entropy, also called Shannon entropy, represents the uncertainty of a random variable. The larger the information entropy, the more dispersed the information sequence and the stronger the uncertainty; conversely, the smaller the entropy, the smaller the uncertainty. The power spectrum information entropy Pse_i is obtained from the sampled values within the most recent period T ending at the current sampling point i. The calculated total current power spectrum information entropy Pse_i is expressed as:
Figure RE-RE-GDA0003628617320000095
where P_i represents the total power of the low-voltage user.
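A possible way to compute a power spectrum entropy is sketched below. It assumes the commonly used definition, in which the power spectrum is normalized into a probability distribution and the Shannon entropy of that distribution is taken; the patent's own expression may differ in normalization or logarithm base. The function name and the test signals are hypothetical.

```python
import numpy as np


def power_spectrum_entropy(segment):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    spectrum = np.abs(np.fft.rfft(np.asarray(segment, dtype=float))) ** 2
    total = spectrum.sum()
    if total == 0.0:
        return 0.0
    p = spectrum / total
    p = p[p > 0]  # terms with p == 0 contribute nothing to the entropy
    return float(-np.sum(p * np.log(p)))


rng = np.random.default_rng(0)
k = np.arange(20000)
clean = np.sin(2 * np.pi * k / 20000)              # single spectral line
noisy = clean + 0.5 * rng.standard_normal(k.size)  # broadband noise added

h_clean = power_spectrum_entropy(clean)
h_noisy = power_spectrum_entropy(noisy)
```

A clean sine concentrates its power in one frequency bin and gives an entropy near zero, while broadband disturbance spreads the power and raises the entropy.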
(6) Specific gravity of signals in different frequency bands
At the ith sampling point time, a fast Fourier transform is applied to the residual current data Id_i and the total current data Ia_i to obtain their corresponding frequency-domain forms Id_m and Ia_m. The transformation is as follows:
Figure RE-RE-GDA0003628617320000101
Figure RE-RE-GDA0003628617320000102
In the frequency domain, low, intermediate and high frequency bands are divided empirically, and the proportion of the signal in each band relative to the total signal is calculated; fault signal data and normal signal data can be distinguished by these proportions. Low-frequency band: 0-8 kHz; intermediate-frequency band: 8 kHz-40 kHz; high-frequency band: 40 kHz-80 kHz. The expressions for the residual current low-frequency band signal proportion Rdl, intermediate-frequency band signal proportion Rdm and high-frequency band signal proportion Rdh, and for the total current low-frequency band signal proportion Ral, intermediate-frequency band signal proportion Rad and high-frequency band signal proportion Rah, are as follows:
Figure RE-RE-GDA0003628617320000103
Figure RE-RE-GDA0003628617320000104
Figure RE-RE-GDA0003628617320000105
Figure RE-RE-GDA0003628617320000106
Figure RE-RE-GDA0003628617320000107
Figure RE-RE-GDA0003628617320000108
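The band proportions can be illustrated with the following sketch. It assumes that the "total signal" is the summed spectral amplitude over the three stated bands (0-80 kHz) and that amplitude rather than power is summed; both are assumptions, as is the helper name `band_ratios`. The band edges and the 1 MHz sampling frequency come from the text.

```python
import numpy as np

FS = 1_000_000  # sampling frequency in Hz, as stated in the text
BANDS = {"low": (0, 8_000), "mid": (8_000, 40_000), "high": (40_000, 80_000)}


def band_ratios(segment, fs=FS):
    """Share of spectral amplitude in each band, relative to 0-80 kHz."""
    x = np.asarray(segment, dtype=float)
    amplitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    sums = {
        name: amplitude[(freqs >= lo) & (freqs < hi)].sum()
        for name, (lo, hi) in BANDS.items()
    }
    total = sum(sums.values())
    return {name: (v / total if total > 0 else 0.0) for name, v in sums.items()}


# Synthetic window: 50 Hz fundamental plus a weaker 20 kHz component.
t = np.arange(20000) / FS
ia = np.sin(2 * np.pi * 50 * t) + 0.2 * np.sin(2 * np.pi * 20_000 * t)
ratios = band_ratios(ia)
```

With a 20000-point window the bin spacing is 50 Hz, so both components fall exactly on bins and the low and intermediate proportions come out at about 5/6 and 1/6.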
(7) Form factor
The form factor reflects how much the waveform differs from a standard sine wave. It is defined, at the ith sampling point time, as the ratio of the root mean square value over the most recent power frequency period to the absolute mean value over the same period. The residual current form factor Sd_i and the total current form factor Sa_i are expressed as follows:
Figure RE-RE-GDA0003628617320000111
Figure RE-RE-GDA0003628617320000112
(8) Peak index
The peak index is defined as the ratio of the peak value to the effective value. It indicates whether an impact event occurs in the time section under study and reflects the relative height of the peak. Specifically, at the ith sampling point, Cd_i and Ca_i are the ratios of the maximum values of the residual current and the total current within the most recent power frequency period to the effective values of the residual current and the total current within that same period, respectively:
Figure RE-RE-GDA0003628617320000113
Figure RE-RE-GDA0003628617320000114
(9) Pulse index
The pulse index is defined as the ratio of the peak value to the absolute mean value of the sampling sequence and reflects the impact degree of the waveform. At the ith sampling time, Sd_i and Sa_i are the ratios of the maximum values of the residual current and the total current within the most recent power frequency period to the absolute mean values of the residual current and the total current within that same period, respectively:
Figure RE-RE-GDA0003628617320000115
Figure RE-RE-GDA0003628617320000116
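The form factor, peak index and pulse index defined in items (7) to (9) can be sketched together, since all three are ratios of the peak, the effective (RMS) value and the absolute mean of one window. The sketch assumes the peak is taken on the absolute value of the samples; the function name `waveform_indices` is hypothetical.

```python
import numpy as np


def waveform_indices(segment):
    """Form factor, peak index and pulse index of one window of samples."""
    x = np.asarray(segment, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))   # effective value
    abs_mean = np.mean(np.abs(x))    # absolute mean value
    peak = np.max(np.abs(x))         # assumption: peak taken on |x|
    return {
        "form_factor": rms / abs_mean,
        "peak_index": peak / rms,
        "pulse_index": peak / abs_mean,
    }


# One period of a pure sine (placeholder, not measured data).
k = np.arange(20000)
sine = np.sin(2 * np.pi * k / 20000)
idx = waveform_indices(sine)
```

For a pure sine these evaluate to roughly 1.111, 1.414 and 1.571, so departures from those values indicate a distorted or impulsive waveform.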
(10) Fault arc zero-crossing retention probability index
Conventional zero-crossing retention judgment methods mostly decide whether zero-crossing retention has occurred from the length of time the current amplitude stays near zero, or from whether the slope of the sampling point sequence exceeds a fixed critical value. Methods that measure the dwell time near zero can only identify the fault after zero-crossing retention has already occurred, so the fault arc judgment reaction time tends to be too long; a fixed critical value cannot adapt to the variety of complex real conditions, so missed or erroneous judgments occur easily. To address these problems, the invention provides a method that requires no setting value and can predict a fault arc while zero-crossing retention is still at its incipient stage; the method finally outputs the probability PA_i that an arc occurs at time i. The judgment mechanism of the model is shown schematically in FIG. 3. The left half of the figure shows the normal condition: the model outputs the probability PA_i that zero-crossing retention occurs at time i according to the changing trend of the data sequence in the period before time i, and this value should be low. The right half of the figure shows the initial stage of a fault: at the onset of the first zero-crossing retention, the model can tell from the changing trend of the sequence that the rise after the zero crossing is gradually diminishing, and it outputs the probability PA_i of zero-crossing retention at that time, which should be high.
Therefore, the arc occurrence probability PA_i can be used as a key index for judging the occurrence of an arc. The specific calculation method is as follows:
Constructing the feature data set
First, the sampling point column, the total current column and the label column are extracted from the original data set OriginalData to form a sub-data set OriginalDataChild of size N x 3. Second, the feature set corresponding to OriginalDataChild is constructed. Because the sampling rate is 1 MHz, each period contains 20000 sampling points; based on experience of how long zero-crossing retention lasts relative to the whole period, 1000 points are taken as the width of the sliding window in this method, that is, the feature set corresponding to the sub-data set is constructed starting from the 1000th sampling point of the original data. As shown in FIG. 5, the total current data sequence is acquired in time order, and the sliding window advances one point at a time over this time-ordered sequence to construct the feature data set, of which the first 70% is used as the training set and the last 30% as the test set. A long short-term memory (LSTM) artificial neural network is used to learn from the training set, so that the LSTM model can judge the state at the current time from the changing trend of the recent historical current data sequence. At the initial stage of a fault arc, the recent historical current data already contains the trend toward zero-crossing retention; the LSTM, being sensitive to this trend, extracts the important information from the historical data sequence and outputs the probability PA_i that a fault arc occurs at the current time.
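The sliding-window construction and the 70/30 chronological split can be sketched as follows. The window width of 1000, the one-point step and the split ratio come from the text; pairing each window with the label of its last sample is an assumption, and the helper names and placeholder data are hypothetical.

```python
import numpy as np

WINDOW = 1000  # sliding-window width stated in the text


def make_windows(total_current, labels, window=WINDOW):
    """Build (X, y): X[j] holds `window` consecutive samples and y[j] is the
    label of the last sample in that window (assumed pairing); the window
    advances one point at a time."""
    x = np.asarray(total_current, dtype=float)
    y = np.asarray(labels)
    X = np.lib.stride_tricks.sliding_window_view(x, window)
    return X, y[window - 1 :]


def chronological_split(X, y, train_fraction=0.7):
    """First 70% of the windows for training, last 30% for testing."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


current = np.arange(5000, dtype=float)  # placeholder data, not measurements
labels = np.zeros(5000, dtype=int)
X, y = make_windows(current, labels)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Splitting by position rather than at random keeps the test windows strictly later in time than the training windows, which suits time-ordered current data.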
Training of LSTM model
Using Python 3.8 under the TensorFlow 2.0 framework, an LSTM unit is created by calling tf.compat.v1.nn.rnn_cell.BasicLSTMCell, and the model hyperparameters are set empirically as shown in Table 1 below:
TABLE 1 LSTM model hyper-parameter settings
Figure RE-RE-GDA0003628617320000121
The number of hidden layer units, that is, the number of units of the fully connected neural network layer inside the LSTM unit, is 40.
The input dimension is the number of features: 1000 consecutive historical sampling points are input at one time, and the model mines the pattern of data change from the sampling point sequence.
The learning rate controls the magnitude of the parameter adjustments the model makes during training and is set to 0.0006.
The number of LSTM layers is set to 3.
Batchsize is the size of each training batch; in the invention each batch contains 10 power frequency periods.
The time step indicates how much history memory is referenced each time a calculation is made.
Weight optimization specifies the method used to optimize the weight parameters of the model during LSTM training; the general-purpose Adam algorithm is used.
The input is 1000 historical data points and the output is 1 probability value, namely the probability that a fault arc occurs at time i. The constructed LSTM model structure is shown in FIG. 6.
The constructed LSTM model is trained on the training set data using the Adam algorithm, and the reduce_mean function of the TensorFlow framework is called to form the training objective function. When the value of the objective function has decreased to near a preset limit, training is considered complete and the test stage begins.
Wherein the preset limit value is determined by:
1) Start training the model and observe the trend of the output value of the objective function, which is expressed as:
Figure RE-RE-GDA0003628617320000131
where y'_i represents the ith value in the model output vector, y_i represents the label value of the test data, and N represents the length of the model output vector. The loss combines the differences between each output value and its label value in the model output vector by averaging and taking a root; the result reflects how well the model has been trained at the current stage, and the smaller the value, the closer the model is to the target.
2) Set the limit value error to 0 so that the model remains in the training state, and observe the relationship between the output value of the objective function loss and the number of training iterations j.
3) When the condition
Figure RE-RE-GDA0003628617320000132
is satisfied, the training effect at the current iteration j differs little from that of 100 iterations earlier, so the current training task is judged to have reached its limit, and the current loss_j can be output as the preset limit value for model training.
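The plateau test used to pick the preset limit can be illustrated as below. The condition itself is given only as a figure, so this sketch assumes it compares the loss at iteration j with the loss 100 iterations earlier against a small tolerance; the tolerance value, the function name `find_preset_limit` and the synthetic loss curve are all hypothetical.

```python
def find_preset_limit(losses, lookback=100, tolerance=1e-4):
    """Return (j, loss_j) for the first iteration j whose loss differs from
    the loss `lookback` iterations earlier by less than `tolerance`;
    return None if the loss never plateaus."""
    for j in range(lookback, len(losses)):
        if abs(losses[j] - losses[j - lookback]) < tolerance:
            return j, losses[j]
    return None


# Synthetic loss curve decaying towards 0.05 (illustrative values only).
losses = [0.05 + 0.95 * (0.99 ** j) for j in range(2000)]
result = find_preset_limit(losses)
```

The returned loss value would then serve as the preset limit for later training runs, so that training stops once the objective reaches the neighbourhood of that value.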
Testing model
The trained model is tested with the test set data to verify its generalization capability; if no overfitting appears, the model has passed the test. Here, overfitting means that the model performs well on the training set but poorly on the test set, that is, its generalization performance is poor. Specifically, when testing with the test set data, overfitting may be considered not to have occurred as long as the model performance is not lower than its performance on the training set.
The output value PA_i of the model at time i represents the probability of a fault arc at the current time. It serves as a characteristic index and as an input to the low-voltage user safety perception model.
The following gives the data set construction process used for random forest algorithm model training and testing.
The total feature dimension participating in calculation is 24 and the label output is 1-dimensional. A low-voltage user electricity utilization safety perception feature set is established with the sampling point as the minimum time unit; the feature set structure is shown in FIG. 7. The in-bag data of the feature set is used for training and the out-of-bag data is used for testing. This data set is named CustomFeature.
Then establishing a random forest model through the following steps:
Using Python 3.8 under the TensorFlow 2.0 framework, the random forest model is built through the following steps.
(1) Importing related python packages
The RandomForestClassifier class is imported from the sklearn.ensemble package and is used to construct the random forest; the train_test_split function is imported from the sklearn.model_selection package and is used to divide the data set; the read_excel function is imported from the pandas package and is used to read data.
(2) Importing a data set
The previously organized CustomFeature data set is imported from the Excel data sheet. The CustomFeature data set is read with the read_excel function of the pandas package and assigned to the variable dataset_CF.
(3) Establishing random forest
When the random forest model of this embodiment is established, n_estimators, the number of decision trees, is set to 20. This value needs to suit the hardware environment: generally, the higher the value, the higher the memory overhead. In a specific application it may be set according to the accuracy requirements and hardware conditions.
In this embodiment, the parameter random_state is set to 2 (any fixed value may be used), so that each run of the algorithm generates the same random forest model. random_state is a random seed; it appears as a parameter of any class or function that involves randomness and controls the random pattern, so that once random_state takes a particular value the pattern is determined. Because building a random forest is a random process, if random_state is not set, each model built is random and the experimental results cannot be reproduced. That is: when the value of random_state is unchanged, building a random forest model from the same training set gives the same model and the same predictions on the test set; when the value is changed, the resulting models differ; if the parameter is not set, the function selects a random pattern automatically and the result differs on each run. Adding the random_state parameter and setting it to a fixed value therefore controls the random state.
In this embodiment, the parameter bootstrap is set to True, that is, the random forest is trained with the bagging method, which ensures that the decision tree models finally obtained in the forest differ from each other. With this parameter, the bagging method draws sub-data sets from the data set randomly and with replacement, and each sub-data set is used to train a single decision tree in the forest. Each decision tree is thus guaranteed to use a different data set, so the trained trees are similar but not identical. When the overall distribution of the samples is unknown and cannot be sampled exactly, bootstrap resamples the available samples to describe the overall distribution.
The statistical principle of Bootstrap is as follows:
Suppose there is a sample X = {x_1, x_2, ..., x_i} whose true distribution is P. The empirical distribution of X is then:
Figure RE-RE-GDA0003628617320000151
(A belongs to the overall sample space). In the case where there is no other information,
Figure RE-RE-GDA0003628617320000152
is a nonparametric maximum likelihood estimate of P.
Specifically, the bagging method comprises the following steps:
1) Draw one sample from the original data set CustomFeature, which contains N samples, and record the draw count j.
2) Copy the sample and put it back into the original data set so that the original data set remains unchanged.
3) Repeat steps 1) and 2) until j equals N, obtaining a bootstrap set CustomFeature_k of the same size as the original data set CustomFeature, which is used to train the kth subtree.
4) Repeat step 3) until k equals n_estimators, the number of decision trees in the model, and then stop.
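The bagging steps listed above can be sketched with the standard library alone. The sketch works on sample indices rather than on the CustomFeature rows themselves; the function names, the data set size of 10000 and the seed are hypothetical choices for illustration.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one bootstrap set)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def bagging_split(n_samples, n_estimators, seed=2):
    """For each tree k, return (in_bag, out_of_bag) index collections."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_estimators):
        in_bag = bootstrap_indices(n_samples, rng)
        out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
        splits.append((in_bag, out_of_bag))
    return splits


splits = bagging_split(n_samples=10_000, n_estimators=20)
oob_fraction = len(splits[0][1]) / 10_000
```

Each bootstrap set has the same size as the original data set but contains repeats, and roughly a third of the indices never appear in a given set; those form the out-of-bag data for that tree.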
The principle of the bagging method is shown schematically in FIG. 8. Each bootstrap set is drawn from the sample set and may contain repeated samples; the bootstrap sets differ from each other, so the decision trees built from them also differ from each other after training. Because of the nature of random sampling with replacement, when the sample set is large enough a portion of the data is never drawn. As in equation (22), the probability that each sample is drawn is Pslt; as N approaches infinity, Pslt converges to 1 - (1/e), which is about 0.632. That is, about 37% of the data does not take part in training; this portion, called out-of-bag data, is used in the testing phase.
Figure RE-RE-GDA0003628617320000153
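For reference, the standard bootstrap result behind the 0.632 figure can be written out as follows. This is the textbook derivation and is presumed, not confirmed, to match the form of equation (22): each of the N draws misses a given sample with probability 1 - 1/N, so the sample is drawn at least once with probability

```latex
P_{slt} = 1 - \left(1 - \frac{1}{N}\right)^{N},
\qquad
\lim_{N \to \infty} P_{slt} = 1 - \frac{1}{e} \approx 0.632
```

The complement, about 0.368, is the expected fraction of out-of-bag samples for each tree.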
Setting the parameter oob_score to True specifies that the out-of-bag data is used for testing.
After the parameters are set, the fit method is called with the training data as its argument to enter the training stage, and the trained random forest model RF is finally output.
To evaluate the classification capability of the random forest model RF, the score method is called to evaluate the model on the test set, and the resulting value is output as the test index of the model's classification capability. When score is greater than 0.95, the classification capability of the random forest model RF is qualified, and the model can be used for intelligent perception of low-voltage customer electricity utilization safety.
To output a result for real-time data, the features corresponding to the real-time data are input into the model RF. Each decision tree in the random forest model outputs a probability, and the output values of all decision trees are averaged to obtain the output of the whole model; the predict method is called with the test set inputs as its argument to check the output. The corresponding category is then obtained (normal: '0', short circuit: '1', leakage: '2', arc: '3').
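The parameter settings and the tree-averaging step can be sketched with scikit-learn as below. The values n_estimators=20, random_state=2, bootstrap=True and oob_score=True come from the text; the data are a synthetic stand-in for the CustomFeature set (24 features, labels 0 to 3), generated only so that the example runs, and do not represent measured results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for CustomFeature: 24 features, labels 0-3 (assumption).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 100)
X = rng.standard_normal((y.size, 24)) + y[:, None]  # class-dependent shift

rf = RandomForestClassifier(
    n_estimators=20,   # number of decision trees
    random_state=2,    # fixed seed so the same forest is rebuilt each run
    bootstrap=True,    # bagging: sample with replacement
    oob_score=True,    # evaluate on out-of-bag samples
)
rf.fit(X, y)

# Average the per-tree class probabilities, then pick the most likely class.
tree_probs = np.stack([tree.predict_proba(X) for tree in rf.estimators_])
mean_probs = tree_probs.mean(axis=0)
predicted = rf.classes_[mean_probs.argmax(axis=1)]
oob_accuracy = rf.oob_score_
```

Averaging the per-tree probabilities by hand reproduces what `predict_proba` returns for the forest, which makes the "average of all decision trees" step in the text concrete; `oob_score_` gives the accuracy estimated on the out-of-bag samples.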
Further, the embodiment also provides an electronic device and a computer readable medium.
Wherein electronic equipment includes:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the aforementioned methods.
In specific use, a user can use an electronic device serving as a terminal device to interact over a network with a server, which is also an electronic device, in order to receive or send messages and perform similar functions. The terminal device is generally any of various electronic devices that have a display and are operated through a human-computer interface, including but not limited to a smart phone, a tablet computer, a notebook computer, a desktop computer, and the like. Various application software can be installed on the terminal device as needed, including but not limited to web browser software, instant messaging software, social platform software, shopping software and the like.
The server is a network server that provides various services, such as a background server that provides computing services for the power consumption data received from the terminal device. The server analyzes the power utilization condition of the received data using the trained model and returns the final analysis result to the terminal device.
The power consumption security sensing method provided by the embodiment is generally executed by the server; in practical applications, the terminal device can also execute the method directly when the necessary conditions are met.
Similarly, the present embodiment provides a computer-readable medium on which a computer program is stored; when executed by a processor, the computer program implements the electricity consumption safety perception method of the present embodiment.

Claims (10)

1. A power utilization safety sensing method based on a random forest algorithm is characterized by comprising the following steps:
step one, collecting power consumption data, and extracting a current peak-to-peak value, a current amplitude, a current integral and variance, a current kurtosis, a current power information entropy, signal proportions of different current frequency bands, a current form factor, a current peak index, a current pulse index and a fault arc zero-crossing retention probability index as characteristic indexes;
and step two, inputting the characteristic indexes into the trained random forest algorithm model, respectively outputting probability values of different power utilization conditions represented by the power utilization data by all decision trees in the random forest algorithm model, then taking the average of all the probability values, and taking the power utilization condition with the maximum probability value in the average as the current power utilization condition result.
2. The method according to claim 1, wherein in the first step, the characteristic index includes:
current peak-to-peak value: the residual current peak-to-peak value Pvd_i is the difference between the most recent peak and the most recent inverse peak of the residual current at sampling point i; the total current peak-to-peak value Pva_i is the difference between the most recent peak and the most recent inverse peak of the total current at sampling point i;
current amplitude: the residual current amplitude Ad_i is the amplitude of the residual current at sampling point i; the total current amplitude Aa_i is the amplitude of the total current at sampling point i;
current integral and variance: the residual current integral INTGd_i is the sum of the sampled values of the residual current within the most recent power frequency period T at sampling point i; the residual current variance Dd_i is the variance of the sampled values of the residual current within the most recent power frequency period T at sampling point i; the total current integral INTGa_i is the sum of the sampled values of the total current within the most recent power frequency period T at sampling point i; the total current variance Da_i is the variance of the sampled values of the total current within the most recent power frequency period T at sampling point i;
current kurtosis: the residual current kurtosis Kd_i is the kurtosis of the sampled values of the residual current within the most recent power frequency period T at sampling point i; the total current kurtosis Ka_i is the kurtosis of the sampled values of the total current within the most recent power frequency period T at sampling point i;
current power information entropy: the power information entropy Pse_i is the power spectrum information entropy of the sampled values of the current within the most recent period T at sampling point i;
signal proportions of different current frequency bands: the residual current low-frequency band signal proportion Rdl is the proportion of the signal in the low-frequency band, 0-8 kHz, to the total signal; the residual current intermediate-frequency band signal proportion Rdm is the proportion of the signal in the intermediate-frequency band, 8 kHz-40 kHz, to the total signal; the residual current high-frequency band signal proportion Rdh is the proportion of the signal in the high-frequency band, 40 kHz-80 kHz, to the total signal; the total current low-frequency band signal proportion Ral is the proportion of the signal in the low-frequency band, 0-8 kHz, to the total signal; the total current intermediate-frequency band signal proportion Rad is the proportion of the signal in the intermediate-frequency band, 8 kHz-40 kHz, to the total signal; the total current high-frequency band signal proportion Rah is the proportion of the signal in the high-frequency band, 40 kHz-80 kHz, to the total signal;
current form factor: the residual current form factor Sd_i is the ratio of the root mean square value of the residual current over the most recent power frequency period at sampling point i to its absolute mean value over the same period; the total current form factor Sa_i is the ratio of the root mean square value of the total current over the most recent power frequency period at sampling point i to its absolute mean value over the same period;
current peak index: the residual current peak index Cd_i is the ratio of the maximum value of the residual current within the most recent power frequency period to the effective value of the residual current within that period at the ith sampling point time; the total current peak index Ca_i is the ratio of the maximum value of the total current within the most recent power frequency period to the effective value of the total current within that period at the ith sampling point time;
current pulse index: the residual current pulse index Sd_i is the ratio of the maximum value of the residual current within the most recent power frequency period to the absolute mean value of the residual current within that period at the ith sampling time; the total current pulse index Sa_i is the ratio of the maximum value of the total current within the most recent power frequency period to the absolute mean value of the total current within that period at the ith sampling time;
fault arc zero-crossing retention probability index: calculated from the historical data sequence using a long short-term memory artificial neural network.
3. The method according to claim 1, wherein the fault arc zero-crossing retention probability indicator is obtained by the following steps:
inputting the total current value change data preceding the current time i into the trained long short-term memory artificial neural network model, the model outputting the probability value PA_i that a fault arc occurs at the current time i;
wherein the total current value change data is obtained by evenly dividing the sampling points within one current power frequency period into a plurality of sub data sets in time order, and taking the sub data sets at the adjacent times preceding the current time i as the total current value change data.
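The data preparation described in claim 3 can be sketched as follows. This is only an illustration: the function names, the `history` parameter (how many preceding sub data sets are fed to the model) and the handling of leftover samples are assumptions, since the claim does not fix them.

```python
def split_period(samples, num_subsets):
    """Split one power frequency period into equal, time-ordered sub data sets.

    Trailing samples that do not fill a whole sub data set are dropped
    (an assumption; the claim only says the split is even).
    """
    size = len(samples) // num_subsets
    if size == 0:
        raise ValueError("more sub data sets than samples")
    return [samples[k * size:(k + 1) * size] for k in range(num_subsets)]


def model_input(subsets, i, history):
    """Return the `history` sub data sets immediately preceding index i."""
    if i < history:
        raise ValueError("not enough history before index i")
    return subsets[i - history:i]


# Example: 12 samples in one period, 4 sub data sets, 2 steps of history.
period = list(range(12))
subsets = split_period(period, 4)
window = model_input(subsets, i=3, history=2)
```

Here `window` holds the two sub data sets that precede sub data set 3, in time order, which is the shape of input a sequence model expects.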
4. The method of claim 3, wherein the long short-term memory artificial neural network model is trained by:
step 1, establishing a long short-term memory artificial neural network model;
step 2, collecting historical current data in units of one power frequency period, dividing the data into a training set and a test set, and then evenly dividing the sampling points within each power frequency period into a plurality of sub data sets in time order;
step 3, inputting the sub data sets into the long short-term memory artificial neural network model so as to output a probability value of a fault arc occurring at time i, and then optimizing the weight parameters of the model with the Adam algorithm; repeating step 3 until the training objective function approaches the preset limit, thereby completing the training;
step 4, inputting the test set into the trained model, and if no overfitting occurs, completing the test of the model.
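The loop in step 3 (update the weights with Adam, repeat until the objective is close to a preset limit) can be sketched on a toy problem. The one-parameter quadratic loss below is a stand-in for the network's training objective, and the tolerance, step cap and function names are assumptions made for illustration; only the Adam update rule itself and the learning rate of 0.0006 (from claim 5) are taken as given.

```python
import math


def adam_minimize(grad, loss, w, lr=0.0006, tol=1e-4, max_steps=20000,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Run Adam on a scalar parameter until loss(w) <= tol or max_steps is hit."""
    m = v = 0.0
    step = 0
    while loss(w) > tol and step < max_steps:
        step += 1
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** step)        # bias correction
        v_hat = v / (1 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, step


def toy_loss(w):
    """Hypothetical stand-in objective with its minimum at w = 0.3."""
    return (w - 0.3) ** 2


def toy_grad(w):
    return 2 * (w - 0.3)


w_final, steps_used = adam_minimize(toy_grad, toy_loss, w=0.0)
```

The `max_steps` cap is a safeguard added here so the loop always terminates; the claim itself only states the "near the preset limit" stopping condition.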
5. The method according to claim 4, wherein in step 1, the number of units in the fully connected layer of the long short-term memory artificial neural network model, namely the number of hidden-layer units, is 40; the learning rate is set to 0.0006; and the number of model layers is set to 3.
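To make the hyperparameters of claim 5 concrete, the following is a minimal, untrained forward pass of a stacked LSTM with 3 layers of 40 hidden units, followed by a sigmoid output that yields a probability. It is a sketch under several assumptions: the gate layout, the random initialization, the single sigmoid output unit and all function names are illustrative choices, and a real implementation would use a deep-learning framework with trained weights.

```python
import math
import random

HIDDEN_UNITS = 40       # from claim 5
NUM_LAYERS = 3          # from claim 5
LEARNING_RATE = 0.0006  # from claim 5; only used by the optimizer during training


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_hidden, rng):
    """Weights for 4 gates (input, forget, candidate, output) over [x, h, 1]."""
    scale = 1.0 / math.sqrt(n_hidden)
    return [[rng.uniform(-scale, scale) for _ in range(n_in + n_hidden + 1)]
            for _ in range(4 * n_hidden)]


def lstm_step(W, x, h, c):
    """One LSTM cell update; returns the new hidden and cell states."""
    n = len(h)
    z = x + h + [1.0]  # concatenate input, previous hidden state and bias term
    pre = [sum(w * v for w, v in zip(row, z)) for row in W]
    i = [sigmoid(p) for p in pre[:n]]
    f = [sigmoid(p) for p in pre[n:2 * n]]
    g = [math.tanh(p) for p in pre[2 * n:3 * n]]
    o = [sigmoid(p) for p in pre[3 * n:]]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def arc_probability(sequence, layers, w_out):
    """Run the stacked LSTM over a sequence and map the last state to (0, 1)."""
    n = len(w_out) - 1
    states = [([0.0] * n, [0.0] * n) for _ in layers]
    for x in sequence:
        inp = list(x)
        for k, W in enumerate(layers):
            h, c = lstm_step(W, inp, *states[k])
            states[k] = (h, c)
            inp = h  # the output of one layer feeds the next
    top = states[-1][0]
    return sigmoid(sum(w * v for w, v in zip(w_out, top + [1.0])))


rng = random.Random(0)
n_in = 3  # hypothetical size of one sub data set
layers = [init_layer(n_in if k == 0 else HIDDEN_UNITS, HIDDEN_UNITS, rng)
          for k in range(NUM_LAYERS)]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_UNITS + 1)]
p = arc_probability([[0.1, 0.2, 0.3], [0.0, -0.1, 0.2]], layers, w_out)
```

With random weights the value of `p` carries no meaning; the sketch only shows how the stated layer count and hidden size shape the computation.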
6. The method as claimed in claim 1, wherein in the second step, the random forest algorithm model is trained by the following steps:
step (i), establishing a random forest algorithm model;
step (ii), acquiring historical data including total voltage, total current and total residual current, and extracting features to obtain a feature set;
step (iii), training the random forest algorithm model with a training set generated from the feature set by the bagging method;
step (iv), testing the random forest algorithm model with a test set generated from the feature set by the bagging method, and finishing training once the test is passed.
7. The method as claimed in claim 6, wherein in step (i), when the random forest algorithm model is established, the model is initialized, that is, the number of decision trees in the model is fixed and the structure of the model is fixed.
8. The method of claim 6, wherein the bagging method comprises the following steps:
step 1) drawing one sample from the historical electricity consumption data set containing N samples, and recording the draw count j;
step 2) copying the sample and putting it back into the historical electricity consumption data set;
step 3) repeating steps 1) and 2) until j equals N, obtaining the k-th bootstrap set, of the same size as the historical electricity consumption data set, for training the k-th sub-tree;
step 4) repeating steps 1), 2) and 3) until k equals the number of decision trees in the model, thereby completing the generation of the training set; the samples in the historical electricity consumption data set that were never drawn are used as the test set.
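The bagging procedure of claim 8 can be sketched in a few lines. The function name, the `seed` parameter and the use of integer placeholders for samples are assumptions for illustration; the sampling logic follows the steps above (N draws with replacement per tree, never-drawn samples kept as the test set).

```python
import random


def bagging(dataset, num_trees, seed=0):
    """Build one bootstrap set per tree and collect the never-drawn samples."""
    rng = random.Random(seed)
    n = len(dataset)
    bootstrap_sets = []
    drawn = set()
    for _ in range(num_trees):
        # Steps 1) to 3): N draws with replacement for the k-th tree.
        indices = [rng.randrange(n) for _ in range(n)]
        drawn.update(indices)
        bootstrap_sets.append([dataset[j] for j in indices])
    # Step 4): samples never drawn for any tree form the test set.
    test_set = [dataset[j] for j in range(n) if j not in drawn]
    return bootstrap_sets, test_set


# Example with 100 placeholder samples and 3 trees.
data = list(range(100))
train_sets, test_set = bagging(data, num_trees=3, seed=1)
```

Because each tree leaves out roughly a third of the samples, the share of samples never drawn by any tree shrinks quickly as the number of trees grows. A common variant therefore evaluates each tree on its own out-of-bag samples instead of a single shared test set.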
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs which,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN202111634082.1A 2021-12-29 2021-12-29 Power utilization safety sensing method, equipment and medium based on random forest algorithm Pending CN114580829A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111634082.1A CN114580829A (en) 2021-12-29 2021-12-29 Power utilization safety sensing method, equipment and medium based on random forest algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111634082.1A CN114580829A (en) 2021-12-29 2021-12-29 Power utilization safety sensing method, equipment and medium based on random forest algorithm

Publications (1)

Publication Number Publication Date
CN114580829A (en) 2022-06-03

Family

ID=81769892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111634082.1A Pending CN114580829A (en) 2021-12-29 2021-12-29 Power utilization safety sensing method, equipment and medium based on random forest algorithm

Country Status (1)

Country Link
CN (1) CN114580829A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115409134A (en) * 2022-11-02 2022-11-29 湖南一二三智能科技有限公司 User electricity utilization safety detection method, system, equipment and storage medium
CN115907567A (en) * 2023-02-21 2023-04-04 浙江大学 Load event detection method and system based on robustness random segmentation forest algorithm
CN116990648A (en) * 2023-09-26 2023-11-03 北京科技大学 Fault arc detection method based on one-dimensional cavity convolutional neural network
CN117929952A (en) * 2024-03-21 2024-04-26 国网(山东)电动汽车服务有限公司 Novel arc fault detection method for electric automobile charging pile
CN118091332A (en) * 2024-04-28 2024-05-28 烟台淼盾物联技术有限公司 Electrical fire monitoring system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062571A (en) * 2017-12-27 2018-05-22 福州大学 Diagnosing failure of photovoltaic array method based on differential evolution random forest grader
CN108303632A (en) * 2017-12-14 2018-07-20 佛山科学技术学院 Circuit failure diagnosis method based on random forests algorithm
CN110930198A (en) * 2019-12-05 2020-03-27 佰聆数据股份有限公司 Electric energy substitution potential prediction method and system based on random forest, storage medium and computer equipment
US20210303970A1 (en) * 2020-03-31 2021-09-30 Sap Se Processing data using multiple neural networks

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115409134A (en) * 2022-11-02 2022-11-29 湖南一二三智能科技有限公司 User electricity utilization safety detection method, system, equipment and storage medium
CN115409134B (en) * 2022-11-02 2023-02-03 湖南一二三智能科技有限公司 User electricity utilization safety detection method, system, equipment and storage medium
CN115907567A (en) * 2023-02-21 2023-04-04 浙江大学 Load event detection method and system based on robustness random segmentation forest algorithm
CN115907567B (en) * 2023-02-21 2023-05-09 浙江大学 Load event detection method and system based on robust random segmentation forest algorithm
CN116990648A (en) * 2023-09-26 2023-11-03 北京科技大学 Fault arc detection method based on one-dimensional cavity convolutional neural network
CN116990648B (en) * 2023-09-26 2023-12-19 北京科技大学 Fault arc detection method based on one-dimensional cavity convolutional neural network
CN117929952A (en) * 2024-03-21 2024-04-26 国网(山东)电动汽车服务有限公司 Novel arc fault detection method for electric automobile charging pile
CN117929952B (en) * 2024-03-21 2024-05-28 国网(山东)电动汽车服务有限公司 Novel arc fault detection method for electric automobile charging pile
CN118091332A (en) * 2024-04-28 2024-05-28 烟台淼盾物联技术有限公司 Electrical fire monitoring system and method

Similar Documents

Publication Publication Date Title
CN114580829A (en) Power utilization safety sensing method, equipment and medium based on random forest algorithm
CN110082640B (en) Distribution network single-phase earth fault identification method based on long-time memory network
CN112149873B (en) Low-voltage station line loss reasonable interval prediction method based on deep learning
Li et al. A power system disturbance classification method robust to PMU data quality issues
CN105208040A (en) Network attack detection method and device
Jain et al. Rule‐based classification of energy theft and anomalies in consumers load demand profile
Qu et al. False data injection attack detection in power systems based on cyber-physical attack genes
CN110852441B (en) Fire disaster early warning method based on improved naive Bayes algorithm
CN111027697A (en) Genetic algorithm packaged feature selection power grid intrusion detection method
CN109934303A (en) A kind of non-invasive household electrical appliance load recognition methods, device and storage medium
CN110619182A (en) Power transmission line parameter identification and power transmission network modeling method based on WAMS big data
CN109829627A (en) A kind of safe confidence appraisal procedure of Electrical Power System Dynamic based on integrated study scheme
CN116401532B (en) Method and system for recognizing frequency instability of power system after disturbance
CN116008731A (en) Power distribution network high-resistance fault identification method and device and electronic equipment
Fahim et al. An unsupervised protection scheme for overhead transmission line with emphasis on situations during line and source parameter variation
Gong et al. Series arc fault identification method based on multi-feature fusion
Guo et al. A data-enhanced high impedance fault detection method under imbalanced sample scenarios in distribution networks
CN114386024A (en) Power intranet terminal equipment abnormal attack detection method based on ensemble learning
Wang et al. The Cable Fault Diagnosis for XLPE Cable Based on 1DCNNs‐BiLSTM Network
CN114202174A (en) Electricity price risk grade early warning method and device and storage medium
CN117034149A (en) Fault processing strategy determining method and device, electronic equipment and storage medium
CN116298735A (en) AC arc fault detection method and related device for low-voltage distribution network
Anderson et al. Detect and identify topology change in power distribution systems using graph signal processing
CN103513168B (en) GIS and cable local discharge comprehensive judging method
Wu et al. Identification and correction of abnormal measurement data in power system based on graph convolutional network and gated recurrent unit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination