CN116578551A

CN116578551A - GRU-GAN-based power grid data restoration method

Info

Publication number: CN116578551A
Application number: CN202310338909.7A
Authority: CN
Inventors: 罗弦; 郭兆丰; 孙明; 廖荣涛; 刘芬; 郭岳; 杨荣浩; 姚渭箐; 黄俊东; 胡欢君; 李想; 张岱; 李磊; 叶宇轩; 王敬靖; 袁翔宇; 王博涛
Original assignee: Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Current assignee: Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Priority date: 2023-03-31
Filing date: 2023-03-31
Publication date: 2023-08-11

Abstract

The application relates to a GRU-GAN-based power grid data restoration method comprising the following steps. Step 1: data acquisition and preprocessing, in which a complete data set X, cleaned and with abnormal values removed, is obtained from voltage history data collected by a TTU, and the data set is normalized so that it conforms to a normal distribution. Step 2: constructing a generative adversarial network (GAN) based on the gated recurrent unit (GRU). Step 3: training the GRU-GAN neural network so that the distribution $p_g(z)$ of the data generated from random noise approaches the distribution $p_r(x)$ of the real data. Step 4: repairing the missing data. Step 5: error analysis of the missing data, in which the performance of the model is evaluated with the root mean square error (RMSE) index. The method can generate predicted values that conform to the distribution of the original time-series data and thereby supplement the sample data, effectively addressing the lack and loss of high-frequency sensing data in new energy power grids.

Description

GRU-GAN-based power grid data restoration method

Technical Field

The application belongs to the technical field of novel power systems, and particularly relates to a grid data restoration method based on GRU-GAN.

Background

Building a new type of power system dominated by new energy is a main goal of the transformation and upgrading of the national power grid. The sensing layer, which is responsible for data sensing and acquisition, sits at the bottom of the digital technical support architecture, and the measurement data it collects is the foundation of the whole system. Grid connection of new energy power places higher demands on the accuracy and timeliness of measurement data acquisition, and many devices require high-frequency acquisition at the minute level. However, various factors can cause data to be lost during sensing, transmission and processing. Power grid data is typical time-series data in which each value depends strongly on the values before and after it, so missing segments greatly hinder time-series modeling. If the missing data can be recovered from the intrinsic characteristics of the data, data integrity is preserved and the value of the data is improved.

At present, effective methods for repairing power grid data have been proposed at home and abroad. The main ones are the historical averaging method, the adjacent data interpolation method, and the long short-term memory neural network (LSTM) based on historical data.

The historical averaging method averages historical data related to several features according to the historical data distribution. However, this requires stable equipment operation, whereas a new power system dominated by new energy is characterized by unstable power output and large data variation.

The adjacent data interpolation method uses only local data information. When the correlation between adjacent data is weak, the error of the repaired data increases, and the interpolation error is particularly large when data is missing over a long period (15 minutes or more).

The long short-term memory neural network (LSTM) handles short-term prediction well, but its accuracy drops markedly for long-term prediction.

The time-series characteristics, correlations and load variation patterns of the measurement data in a power grid system can serve as an important basis for reconstructing missing data. The difficulty is that complex spatio-temporal relationships exist among these factors, which are hard to describe with an explicit mathematical model.

Disclosure of Invention

The embodiments of the application aim to provide a GRU-GAN-based power grid data restoration method that can generate predicted values conforming to the distribution of the original time-series data and thereby supplement the sample data, effectively addressing problems such as the lack and loss of high-frequency sensing data in new energy power grids.

In order to achieve the above purpose, the present application provides the following technical solutions:

the embodiment of the application provides a GRU-GAN-based power grid data restoration method, which comprises the following specific steps:

Step 1: data acquisition and preprocessing,

based on voltage history data collected by a TTU, obtaining a complete data set X that has been cleaned and had abnormal values removed, normalizing the collected data set so that the normalized data set conforms to a normal distribution, and assuming its spatial distribution to be $p_r(x)$;

Step 2: constructing a generative adversarial network (GAN) based on the gated recurrent unit (GRU),

designing the internal structures of the generator and the discriminator respectively, training the generator and the discriminator, and optimizing the hidden variables of the GAN to obtain a stable GRU generator;

Step 3: training the GRU-GAN neural network,

taking the GRU as the generator G of the GAN, with the discriminator D classifying generated false data as 0 and true data as 1; optimizing the models of the generator G and the discriminator D, where the loss of D converges quickly with iteration and G is trained after D reaches its optimum, until Nash equilibrium is finally reached, so that the spatial distribution $p_g(z)$ of the data G(z) generated from random noise z approaches $p_r(x)$;

Step 4: repairing the missing data,

generating a group of predicted data G(z) from random noise data z, and obtaining the content loss Lt from the difference between G(z) and the values that are not missing in the real measurement; inputting G(z) into the discriminator D to obtain the prior loss Lp; using the two losses to optimize the random noise data z in reverse until their sum reaches a minimum; and inputting the resulting z into the generator G, whose output is regarded as the optimal repair value for the missing values;

Step 5: error analysis of the missing data,

evaluating the performance of the model with the root mean square error (RMSE) index, calculated as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\hat{y}_k - y_k\right)^2},$$

where N is the number of missing data points, and $\hat{y}_k$ and $y_k$ are the k-th repaired (predicted) value and the corresponding real value, respectively.

The training of the GRU-GAN neural network in step 3 is specifically as follows:

Training the discriminator D: the random noise data z is passed through the generator G to obtain the output G(z), which is labeled 0 and whose spatial distribution is $p_g(z)$; the real data samples $x_r$ are labeled 1. The goal of the discriminator D is to distinguish well between true samples $x_r$ and false samples $x_f$. The loss function must account for both the ability to recognize true data and the ability to recognize false data, so the loss of the discriminator is the sum of the two. The classification problem of D is therefore a binary classification problem, and the loss function is defined as:

$$L = -\sum_{x_r \sim p_r} \log D_\theta(x_r) - \sum_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right),$$

the optimization objective of the discrimination network D is therefore:

$$\theta^* = \arg\min_\theta \left[ -\sum_{x_r \sim p_r} \log D_\theta(x_r) - \sum_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right) \right],$$

converting the minimization into a maximization problem and writing it in expectation form:

$$\theta^* = \arg\max_\theta \; \mathbb{E}_{x_r \sim p_r} \log D_\theta(x_r) + \mathbb{E}_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right),$$

where θ is the parameter set of the discrimination network D, and the parameter θ is optimized with a gradient ascent algorithm,

Training the generator G: the random noise data z is passed through the generator G to obtain the output G(z), and the network parameters of the discriminator D are fixed. The false samples $x_f = G(z)$ are expected to fool the discrimination network D, that is, the closer the output of the discrimination network for a false sample is to "true", the better. When training the generation network, the output D(G(z)) of the discrimination network is therefore expected to approach 1, so the label is 1, and the loss between D(G(z)) and 1 is minimized:

$$\min_\phi L = \mathrm{CE}\left(D(G_\phi(z)), 1\right) = -\log D(G_\phi(z)),$$

which is equivalent to:

$$\phi^* = \arg\min_\phi \; \mathbb{E}_{z \sim p_z} \log\left(1 - D(G_\phi(z))\right),$$

where CE denotes the cross-entropy, $\mathbb{E}$ the expectation, $p_z$ is the spatial distribution of the noise data z, and φ is the parameter set of the generation network G; a gradient descent algorithm can be used to optimize the parameter φ.

The above steps aim to make the distribution $p_g(z)$ of the data G(z) generated from the noise data z gradually fit the spatial distribution $p_r(x)$ of the sample data $x_r$; the generator keeps trying to generate data close to the real data distribution, so that the discriminator cannot judge whether the data comes from the real data;

after the above two steps are repeated many times, Nash equilibrium is finally reached, the generator learns the distribution of the real data without supervision, and the trained generator G is output.

The repairing of the missing data in step 4 is specifically as follows:

the prior loss is Lp, with Lp = D(G(z)), where D is the discriminator and G is the generator,

when processing the missing data set, a mask array Ms is generated from the real measurement data, with each entry determined by whether the corresponding position has a value, forming an array of dimension K, where K is the dimension of the measurement data,

，

the content loss Lt is computed from the difference on the values that are not missing in the real measurement data, so that the predicted data G(z) is sufficiently similar to the real data containing missing values,

the content loss is defined as: $L_t = \lVert M_s \odot (G(z) - I) \rVert_1$,

where Ms is the corresponding mask array, I is the measurement data containing missing values, $\odot$ denotes the element-wise product of vectors, and the difference is measured with the 1-norm,

the optimization objective for the missing measurement data is: $LL = L_t + L_p$,

and with LL as the optimization objective, an Adam optimizer is used so that the generated G(z) is as close as possible to the missing measurement values; after training, the finally repaired measurement data consists of two parts: the part that is not missing in the original measurement data, and the part of G(z) at the positions where the corresponding mask array Ms is 1.

Compared with the prior art, the application has the following beneficial effects. The internal network structure is built with a gated recurrent neural network, which is well suited to modeling sequence data and effectively mines the spatio-temporal features among power grid measurement data; at the same time, the discriminator further improves the prediction accuracy of the network at the sequence level and reduces the influence of error accumulation on prediction performance. In the GAN model, the generator network and the discriminator network play against each other, and network training is completed after many rounds of this game. Once training is complete, the generator can generate predicted values that conform to the distribution of the original time-series data and thereby supplement the sample data, effectively addressing problems such as the lack and loss of high-frequency sensing data in new energy power grids.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a method according to an embodiment of the present application;

FIG. 2 is a block diagram of the GAN of the present application;

FIG. 3 is a block diagram of a GRU of the application;

FIG. 4 is a diagram of a missing data repair architecture of the GRU-GAN of the application;

FIG. 5 is a graph of measured and predicted values after the iterative training of the GAN of the present application;

FIG. 6 is an interpolation fill map of the present application;

FIG. 7 is a graph comparing the effects of using long-term history data according to the present application;

FIG. 8 is a graph showing the comparison of recent historical data effects of the present application;

FIG. 9 is a graph comparing effects of using very little historical data according to the present application;

fig. 10 is a graph comparing the effect of not using historical data (data on the same day) according to the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.

The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Referring to FIG. 1, the application provides a GRU-GAN-based power grid data restoration method comprising the following specific steps:

Step 1: data acquisition and preprocessing,

Step 3: training the GRU-GAN neural network,

Step 4: repairing the missing data,

Step 5: error analysis of the missing data,

，

Examples:

Data acquisition and preprocessing:

Based on the voltage history data collected by the TTU, a complete data set X that has been cleaned and had abnormal values removed is obtained. A TTU device collects 30 days of voltage data at a 15-minute sampling interval, 2880 points in total. The collected data set is normalized with the StandardScaler method so that the normalized data set conforms to a normal distribution, and its spatial distribution is assumed to be $p_r(x)$. The training set contains 2600 points and the test set 280 points.

The sample noise vector z uses normally distributed random noise (interval 0, 0.1), which generates voltage data better.
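The preprocessing described above can be sketched roughly as follows. This is a minimal illustration rather than the applicant's implementation: the voltage series is synthetic, the function names are hypothetical, the standardization mimics a z-score scaler using only the standard library, and "(interval 0, 0.1)" is read here as a normal distribution with mean 0 and standard deviation 0.1, which is an assumption.

```python
import random
import statistics

def standardize(values):
    """Z-score standardization: subtract the mean, divide by the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values], mean, std

def split_series(values, n_train):
    """Chronological split: the first n_train points train, the rest test."""
    return values[:n_train], values[n_train:]

def sample_noise(length, sigma=0.1, seed=None):
    """Draw a noise vector z from N(0, sigma^2) (assumed reading of the text)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(length)]

# Synthetic stand-in for 30 days of 15-minute voltage readings (96 per day).
rng = random.Random(0)
voltage = [220.0 + rng.gauss(0.0, 2.0) for _ in range(30 * 96)]

scaled, mean, std = standardize(voltage)
train, test = split_series(scaled, 2600)   # 2600 training points, 280 test points
z = sample_noise(16, sigma=0.1, seed=1)    # latent length 16 is an arbitrary choice
```

Keeping `mean` and `std` makes it possible to map repaired values back to volts with `v * std + mean`.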

Building the GRU-based generative adversarial network:

First, the internal structures of the generator and the discriminator are designed and trained. Adam is selected as the optimizer to optimize the hidden variables of the GAN, finally yielding a stable GRU generator.

A latent-space vector z (i.e. random noise) is input and mapped to the real sample space x; the gated recurrent unit (GRU) generates and outputs "false data" with the same probability distribution as the original data. The mapping function is written G(z) = x and is fitted by the GRU network. The GAN structure is shown in FIG. 2.

Generator G structure definition:

In the table above, the gated recurrent unit (GRU) has been shown to be useful for learning long-term dependencies in sequence modeling. Both the GRU and the LSTM overcome the vanishing-gradient problem of the conventional RNN, and the structure of the GRU is simpler and more efficient. The core of the GRU consists of a reset gate and an update gate, which aim to preserve and pass on more hidden information from the input data. The internal structure of the GRU is shown in FIG. 3.

Here x denotes the input data, h the output of the GRU unit, r the reset gate and z the update gate. The update gate combines the input at the current time, $x_t$, with the GRU output at the previous time, $h_{t-1}$, to determine how much information from the previous time is retained:

$$z_t = \sigma(W_z x_t + U_z h_{t-1}),$$

where $z_t$ is determined by $x_t$ and $h_{t-1}$, $\sigma$ is the sigmoid activation function, and $W_z$ and $U_z$ are weight matrices. The reset gate combines $x_t$ and $h_{t-1}$ to control how much of the state information from the previous time is ignored:

$$r_t = \sigma(W_r x_t + U_r h_{t-1}),$$

where $W_r$ and $U_r$ are weight matrices. New memory information $\tilde{h}_t$ is generated based on the reset gate:

$$\tilde{h}_t = \tanh\left(W_h x_t + U_h (r_t \odot h_{t-1})\right),$$

where $W_h$ and $U_h$ are weight matrices. The output at the current time, $h_t$, is

$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t.$$

Specifically, the GRU obtains the two gating states from the state at the previous time, $h_{t-1}$, and the input of the current node, $x_t$. For time-series data in actual prediction, the GRU can capture the distribution information in the time series.
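As a rough illustration of the gating described above, the following sketch runs a GRU with a scalar input and a scalar hidden state. It assumes the common formulation in which the update gate weights the previous state, $h_t = z_t h_{t-1} + (1 - z_t)\tilde{h}_t$; the parameter names and values are hypothetical and bias terms are omitted for brevity.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU step for a scalar input and a scalar hidden state."""
    z_t = sigmoid(p["W_z"] * x_t + p["U_z"] * h_prev)   # update gate
    r_t = sigmoid(p["W_r"] * x_t + p["U_r"] * h_prev)   # reset gate
    # New memory: the reset gate scales how much of h_prev is used.
    h_cand = math.tanh(p["W_h"] * x_t + p["U_h"] * (r_t * h_prev))
    # Output: blend of the previous state and the new memory.
    return z_t * h_prev + (1.0 - z_t) * h_cand

def gru_run(xs, p, h0=0.0):
    """Apply gru_step along a sequence and collect the hidden states."""
    h = h0
    outputs = []
    for x_t in xs:
        h = gru_step(x_t, h, p)
        outputs.append(h)
    return outputs

# Hypothetical weights chosen only for demonstration.
params = {"W_z": 0.5, "U_z": -0.3, "W_r": 0.8, "U_r": 0.1, "W_h": 1.2, "U_h": 0.7}
hidden = gru_run([0.1, -0.2, 0.05, 0.3], params)
```

A practical generator would use vector-valued states and learned weight matrices, typically through a deep learning framework rather than hand-written steps.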

The structural definition of the discriminator D:

Training process of the GRU-GAN neural network:

The GRU is taken as the generator G of the GAN, and the discriminator D is responsible for classifying generated "false data" as 0 and "true data" as 1. The models of G and D are optimized through Min (D loss) Max (G loss). The loss of the discriminator converges quickly with iteration, the generator is trained after the discriminator reaches its optimum, and Nash equilibrium is finally reached, so that the distribution $p_g(z)$ approaches $p_r(x)$.

Objective: the distribution $p_g(z)$ approaches $p_r(x)$.

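The implementation steps are as follows:

Training the discriminator D: the random noise data z is passed through the generator G to obtain the output G(z), which is labeled 0 and whose spatial distribution is $p_g(z)$; the real data samples $x_r$ are labeled 1. The goal of the discriminator D is to distinguish well between true samples $x_r$ and false samples $x_f$. The loss function must account for both the ability to recognize true data and the ability to recognize false data, so the loss of the discriminator is the sum of the two. The classification problem of D is therefore a binary classification problem, and the loss function is defined as:

$$L = -\sum_{x_r \sim p_r} \log D_\theta(x_r) - \sum_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right),$$

the optimization objective of the discrimination network D is therefore:

$$\theta^* = \arg\min_\theta \left[ -\sum_{x_r \sim p_r} \log D_\theta(x_r) - \sum_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right) \right],$$

where θ is the parameter set of the discrimination network D. Converting the minimization into the equivalent maximization of $\mathbb{E}_{x_r \sim p_r} \log D_\theta(x_r) + \mathbb{E}_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right)$, the parameter θ is optimized with a gradient ascent algorithm.

For the generator G, with the parameters of D fixed, the loss between D(G(z)) and the label 1 is minimized:

$$\min_\phi L = \mathrm{CE}\left(D(G_\phi(z)), 1\right) = -\log D(G_\phi(z)),$$

which is equivalent to:

$$\phi^* = \arg\min_\phi \; \mathbb{E}_{z \sim p_z} \log\left(1 - D(G_\phi(z))\right),$$

The above steps aim to make the distribution $p_g(z)$ of the data G(z) generated from the noise z gradually fit the spatial distribution $p_r(x)$ of the sample data. The generator keeps trying to generate data close to the real data distribution, so that the discriminator cannot judge whether the data comes from the real data.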
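The two training objectives can be illustrated with a small sketch that computes the losses from discriminator outputs. It assumes the standard binary cross-entropy form (real samples labeled 1, generated samples labeled 0) and averages over the batch; the function names and the sample scores are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: real samples labeled 1, generated samples labeled 0."""
    real_term = -sum(math.log(d + eps) for d in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - d + eps) for d in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake, eps=1e-12):
    """Cross-entropy between D(G(z)) and the label 1."""
    return -sum(math.log(d + eps) for d in d_fake) / len(d_fake)

# A discriminator that separates the two classes well has a low loss ...
good_d = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# ... while one that outputs 0.5 everywhere (cannot tell them apart) does not.
fooled_d = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

When the discriminator outputs 0.5 for every sample, its loss equals 2 log 2, which is the value usually associated with the equilibrium in which generated and real data are indistinguishable.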

Repair of missing data

The random noise data z is passed through the trained generator G to obtain predicted data, which in theory can be used as repair data. However, many random noise vectors z satisfy the distribution of the measurement data, and the one whose output is closest to the real data must be selected. In this process, the network weights of both the generator G and the discriminator D are fixed, and the random noise z is trained through the loss function.

Referring to FIG. 4, when the real measurement data has missing values, a group of predicted data G(z) is generated from random noise z, and the content loss Lt is obtained from the difference between G(z) and the values that are not missing in the real measurement. G(z) is input to the discriminator D to obtain the prior loss Lp. The two losses are used to optimize the random noise z in reverse until their sum is minimized; this z is then input to the generator G, and the resulting output is regarded as the best repair value for the missing values.

When processing the missing data set, a mask array Ms is generated from the real measurement data, with each entry determined by whether the corresponding position has a value, forming an array of dimension K, where K is the dimension of the measurement data.

，

The content loss Lt is computed from the difference on the values that are not missing in the real measurement data, so that the predicted data G(z) is sufficiently similar to the real data containing missing values.

The content loss is defined as: $L_t = \lVert M_s \odot (G(z) - I) \rVert_1$,

where Ms is the corresponding mask array, I is the measurement data containing missing values, $\odot$ denotes the element-wise product of vectors, and the difference is measured with the 1-norm.
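A minimal sketch of the mask and the content loss follows. The mask convention is an assumption made for illustration: entries are 1 where a measurement is present and 0 where it is missing, with missing readings encoded as `None`. The helper names are hypothetical.

```python
def build_mask(measured):
    """1 where a measurement is present, 0 where it is missing (None)."""
    return [0 if v is None else 1 for v in measured]

def content_loss(generated, measured, mask):
    """1-norm of the masked difference between G(z) and the measurement I."""
    return sum(m * abs(g - (v if v is not None else 0.0))
               for g, v, m in zip(generated, measured, mask))

def merge_repaired(generated, measured, mask):
    """Keep observed values; take generated values where the measurement is missing."""
    return [v if m == 1 else g for g, v, m in zip(generated, measured, mask)]

measured = [1.0, None, 0.8, None, 1.1]     # I, with two missing points
generated = [0.9, 1.0, 0.8, 0.95, 1.3]     # a candidate G(z)
mask = build_mask(measured)
loss = content_loss(generated, measured, mask)
repaired = merge_repaired(generated, measured, mask)
```

Only the three observed positions contribute to `loss`, so the generated values at the missing positions are constrained indirectly, through the generator and the prior loss.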

In summary, the optimization objective for the missing measurement data is: $LL = L_t + L_p$,

With LL as the optimization objective, an Adam optimizer is used so that the generated G(z) is as close as possible to the missing measurement values. After training, the finally repaired measurement data consists of two parts: the part that is not missing in the original measurement data, and the part of G(z) at the positions where the corresponding mask array Ms is 1.
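The latent-vector search can be sketched as below, under strong simplifying assumptions: the "generator" and "discriminator" are fixed toy functions standing in for trained networks, the objective is taken as the plain sum Lt + Lp with Lp = D(G(z)), gradients are estimated by finite differences, and the Adam update is written out by hand. All names are hypothetical.

```python
import math

def generator(z):
    """Toy fixed 'generator': maps a 2-d latent vector to a 5-point series."""
    return [z[0] + z[1] * math.sin(0.5 * i) for i in range(5)]

def discriminator(x):
    """Toy fixed 'discriminator' returning a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-sum(x) / len(x)))

def total_loss(z, measured):
    g = generator(z)
    l_t = sum(abs(gi - v) for gi, v in zip(g, measured) if v is not None)  # content loss
    l_p = discriminator(g)                                                 # prior loss
    return l_t + l_p

def numerical_grad(f, z, h=1e-5):
    """Central-difference estimate of the gradient of f at z."""
    grad = []
    for i in range(len(z)):
        up = z[:i] + [z[i] + h] + z[i + 1:]
        down = z[:i] + [z[i] - h] + z[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def optimize_latent(measured, steps=500, lr=0.02, b1=0.9, b2=0.999, eps=1e-8):
    """Adam on z only; generator and discriminator stay fixed."""
    z = [0.0, 0.0]
    m = [0.0] * len(z)
    v = [0.0] * len(z)
    f = lambda zz: total_loss(zz, measured)
    best_z, best_loss = list(z), f(z)
    for t in range(1, steps + 1):
        g = numerical_grad(f, z)
        for i in range(len(z)):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] ** 2
            m_hat = m[i] / (1 - b1 ** t)
            v_hat = v[i] / (1 - b2 ** t)
            z[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        loss = f(z)
        if loss < best_loss:            # remember the best latent vector seen
            best_z, best_loss = list(z), loss
    return best_z, best_loss

measured = [1.0, None, 1.3, None, 1.1]
initial_loss = total_loss([0.0, 0.0], measured)
z_star, final_loss = optimize_latent(measured)
g_star = generator(z_star)
# Observed points are kept; missing points are filled from G(z*).
repaired = [v if v is not None else g for v, g in zip(measured, g_star)]
```

With real networks the gradient with respect to z would normally come from automatic differentiation, and the relative weighting of the two losses is a design choice that the text does not specify.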

Missing data error analysis

The performance of the model is evaluated with the root mean square error (RMSE) index, calculated as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\hat{y}_k - y_k\right)^2},$$

where N is the number of missing data points, and $\hat{y}_k$ and $y_k$ are the k-th repaired (predicted) value and the corresponding real value, respectively.
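For reference, the RMSE over the repaired points can be computed as in the short sketch below; the function name and the sample values are illustrative only.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between repaired values and the true values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))

# Three missing points: two repaired exactly, one off by 2.0.
score = rmse([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```

Only the positions that were missing should be passed in, since N in the formula counts the missing data points rather than the whole series.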

Analysis 1: GRU-GAN repair vs. interpolation repair

FIG. 5 shows the measured and predicted voltage values after 901 GAN training iterations; the data repair accuracy is about 80%. The interpolation filling result for the long-term missing data set is shown in FIG. 6.

For large-scale missing data lasting more than 15 minutes (for example, 1 to 6 hours of data loss), the filling and repair performance of the GAN adversarial neural network algorithm is clearly superior to that of the other methods.

Analysis 2: influence of long-term historical data and same-day data on training results

(1) As shown in FIG. 7, using long-term historical data (52 electricity meters, 740 pieces of data in total) with 20% of the data missing, RMSE performance: 0.1622.

(2) As shown in FIG. 8, using recent historical data (52 electricity meters, 256 pieces of data in total) with 20% of the data missing, RMSE performance: 0.1534.

(3) As shown in FIG. 9, using very little historical data (52 electricity meters, 128 pieces of data) with 20% of the data missing, RMSE performance: 0.2957.

(4) As shown in FIG. 10, without historical data (same-day data only) with 20% of the data missing, RMSE performance: 0.1567.

For voltage data, learning the distribution from medium- and long-term data or from same-day data as samples yields good filling results.

In the GRU-GAN-based power grid data restoration method, the GRU is used as the generator G of the GAN, so that the GRU can fully mine the spatio-temporal features hidden in the time series, and the mutual confrontation and optimization between the generator G and the discriminator D help the two reach Nash equilibrium quickly. The hidden variable z is optimized under the prior loss constraint and the true loss constraint, so that the generator G can predict high-precision repair data. The method is unsupervised and entirely data-driven, involves no explicit modeling step, and directly makes the generated data distribution $p_g(z)$ approach the real data distribution $p_r(x)$. It achieves higher repair accuracy when a large amount of measurement data is missing, and the repair effect is especially pronounced for continuous long-term missing data.

The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. The GRU-GAN-based power grid data restoration method is characterized by comprising the following specific steps of:

Step 1: data acquisition and preprocessing,

Step 3: training the GRU-GAN neural network,

the GRU is taken as the generator G of the GAN, and the discriminator D is responsible for classifying generated false data as 0 and true data as 1; the models of the generator G and the discriminator D are optimized, the loss of the discriminator D converges quickly with iteration, the generator G is trained after the discriminator D reaches its optimum, and Nash equilibrium is finally reached, so that the spatial distribution $p_g(z)$ of the data G(z) generated from random noise z approaches $p_r(x)$;

Step 4: repairing the missing data,

Step 5: error analysis of the missing data,

，

2. The GRU-GAN-based power grid data restoration method according to claim 1, wherein the training of the GRU-GAN neural network in step 3 is specifically as follows:

training the discriminator D: the random noise data z is passed through the generator G to obtain the output G(z), which is labeled 0 and whose spatial distribution is $p_g(z)$; the real data samples $x_r$ are labeled 1; the goal of the discriminator D is to distinguish well between true samples $x_r$ and false samples $x_f$; the loss function accounts for both the ability to recognize true data and the ability to recognize false data, so the loss of the discriminator is the sum of the two; the classification problem of D is therefore a binary classification problem, and the loss function is defined as:

$$L = -\sum_{x_r \sim p_r} \log D_\theta(x_r) - \sum_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right),$$

the optimization objective of the discrimination network D is therefore:

$$\theta^* = \arg\min_\theta \left[ -\sum_{x_r \sim p_r} \log D_\theta(x_r) - \sum_{x_f \sim p_g} \log\left(1 - D_\theta(x_f)\right) \right],$$

for the generator G, the equivalent objective is:

$$\phi^* = \arg\min_\phi \; \mathbb{E}_{z \sim p_z} \log\left(1 - D(G_\phi(z))\right),$$

where CE denotes the cross-entropy, $\mathbb{E}$ the expectation, $p_z$ is the spatial distribution of the noise data z, and φ is the parameter set of the generation network G; a gradient descent algorithm can be used to optimize the parameter φ,

the above steps aim to make the distribution $p_g(z)$ of the data G(z) generated from the noise data z gradually fit the spatial distribution $p_r(x)$ of the sample data $x_r$; the generator keeps trying to generate data close to the real data distribution, so that the discriminator cannot judge whether the data comes from the real data;

3. The grid data restoration method based on the GRU-GAN according to claim 1, wherein the restoration of the missing data in the step 4 is specifically:

，

the content loss is defined as: $L_t = \lVert M_s \odot (G(z) - I) \rVert_1$,

the optimization objective for the missing measurement data is: $LL = L_t + L_p$,