CN117421971A

CN117421971A - Wind-light-load power short-term prediction method based on multi-task progressive learning

Info

Publication number: CN117421971A
Application number: CN202311189052.3A
Authority: CN
Inventors: 李丹; 梁云嫣; 甘月琳; 杨帆; 方泽仁; 胡越; 贺帅
Original assignee: China Three Gorges University CTGU
Current assignee: China Three Gorges University CTGU
Priority date: 2023-09-15
Filing date: 2023-09-15
Publication date: 2024-01-19

Abstract

A short-term wind-light-load power prediction method based on multi-task progressive learning comprises the following steps: step 1: collecting prediction input data, namely hour-level historical wind, photovoltaic and load power data and multidimensional weather forecast data for a target area; step 2: normalizing each type of data used as an input or output variable according to its characteristics; step 3: dividing the data set into a training set, a validation set and a test set; step 4: establishing a multi-task progressive learning model based on a deep spatio-temporal fusion network (MTPL-DSTFN); step 5: setting the model hyperparameters, initializing the weights and biases, setting a loss function, training the MTPL-DSTFN model to obtain the optimal weight and bias parameters, and using grid search on the validation-set samples to find the optimal hyperparameters; step 6: inputting the test samples into the MTPL-DSTFN model with the optimal hyperparameters, and inverse-normalizing the output to obtain the wind-light-load power prediction for each time step of the prediction day.

Description

Wind-light-load power short-term prediction method based on multi-task progressive learning

Technical Field

The invention belongs to the field of renewable energy power generation and comprehensive accommodation, and particularly relates to a short-term wind-light-load power prediction method based on multi-task progressive learning.

Background

With rapid industrialization, rising energy consumption and environmental deterioration threaten the sustainable development of human society. The traditional energy production and consumption system centered on fossil energy is difficult to sustain, and building an energy supply and demand system based on multi-energy optimization and complementarity, thereby guiding the overall transformation of the energy industry, has become a top priority for the development of China's energy industry. Clean energy sources such as wind and solar energy are renewable, clean, pollution-free and technologically mature; they have become important pillars of the energy transition and are being extensively developed and utilized. However, wind and photovoltaic output is affected by geographical location and climatic conditions and therefore exhibits marked volatility, randomness and uncertainty. As the scale of grid-connected new energy grows, the uncertainty of wind and photovoltaic output inevitably has a large impact on the operation and management of the power grid; to mitigate this impact, it is important to know accurately the output of each wind or photovoltaic generation system on the source side of the power system. Similarly, as the share of new types of load on the demand side, such as electric vehicles and high-speed rail, keeps increasing, the randomness and uncertainty of the load also inevitably increase.

Because they are driven by the same meteorological processes, the output powers of sources and loads of the same type are spatio-temporally correlated, and those of different types are spatially correlated. Most existing studies separate the different types of source and load and predict each independently, ignoring the rich shared information among regional sources and loads, which is a significant limitation. Therefore, when predicting source and load power, comprehensively considering the correlations among them is expected to improve the accuracy of joint multi-source-load prediction, which is the starting point of this research. For example, the document "Comprehensive energy system load and wind and light resource prediction method based on ARIMA-LSTM model" discloses load and wind/solar resource prediction models that combine an autoregressive integrated moving average (ARIMA) model with a long short-term memory (LSTM) network, improving prediction performance through the complementary advantages of the different models. Although this method can produce power predictions for various sources and loads, it does not sufficiently consider the coupling among them; parameters are shared during training, so task-specific information is difficult to extract and high prediction accuracy cannot be obtained. Multi-task learning (MTL) can learn multiple tasks simultaneously in one model to improve learning efficiency; through a partial parameter-sharing mechanism, it can take the coupling information of related tasks into account while extracting the information unique to each task, so researchers have introduced the MTL idea into source-load prediction. However, when the actual tasks are only loosely correlated, or even conflict, performance may degrade, a phenomenon known as negative transfer. Moreover, when task dependencies are complex, existing MTL models usually ignore the dependencies between samples and improve some tasks at the expense of others, failing to improve all tasks simultaneously relative to the corresponding single-task models; this is called the "seesaw phenomenon".

To solve these problems, the invention introduces an improved multi-task learning mechanism for joint wind-light-load prediction. A network structure and a loss function are designed according to the characteristics of the different types of source and load, yielding a multi-task progressive learning method for short-term wind-light-load power prediction that accounts for spatio-temporal correlation. The method progressively learns the specific properties of heterogeneous sources and loads while sharing information among them, thereby improving the performance of joint short-term multi-source-load power prediction.

Disclosure of Invention

The invention aims to account for both the similarities and the differences in the temporal evolution of wind, photovoltaic and load power. It designs a multi-task progressive learning framework in which a shared-information subnet and specific-information subnets are separated and then fused and, through a deep spatio-temporal fusion network, progressively extracts the shared and specific spatio-temporal information of wind, light and load from shallow to deep, so as to predict short-term wind-light-load power accurately and effectively.

In order to solve the technical problems, the invention adopts the following technical scheme:

a wind-light-load power short-term prediction method based on multi-task progressive learning comprises the following steps:

step 1: collecting prediction input data, namely hour-level historical wind, photovoltaic and load power data and multidimensional weather forecast data for a target area;

step 2: data preprocessing, namely normalizing each type of data used as an input or output variable according to its characteristics;

step 3: dividing the data set into a training set, a validation set and a test set; first extracting multi-periodic features from the historical wind, light and load data of the training set using the Fourier transform, and recording the more prominent common period lengths $T_{01}, T_{02}, \dots, T_{0u}$ and the period lengths specific to wind, light and load, $T_{W1}, T_{W2}, \dots, T_{Wu}$, $T_{P1}, T_{P2}, \dots, T_{Pu}$ and $T_{L1}, T_{L2}, \dots, T_{Lu}$, which determine the convolution kernel sizes of the multi-kernel convolution layer in each task subnet;

step 4: forming the shared-information subnet input matrix $X_0$ from the wind-light-load historical power data and the common meteorological data, forming the specific-information subnet input matrices $X_1, X_2, X_3$ from the historical power sequence of each source or load, taking the wind-light-load power values at the $s$ prediction moments as output variables, and establishing a multi-task progressive learning model based on a deep spatio-temporal fusion network (MTPL-DSTFN);

step 5: setting the model hyperparameters, initializing the weights and biases, setting a loss function, training the MTPL-DSTFN model to obtain the optimal weight and bias parameters, and using grid search on the validation-set samples to find the optimal hyperparameters;

step 6: inputting the test samples into the MTPL-DSTFN model with the optimal hyperparameters, and inverse-normalizing the output to obtain the wind-light-load power prediction for each time step of the prediction day.

In step 1, the inputs are the collected wind, photovoltaic and load historical power data and the weather forecast data. Each prediction subtask has the shared input vector $x_{t0} = [P_{t0-d}, P_{t0-d+1}, \dots, P_{t0-1}, Q_{t0}, Q_{t0+1}, \dots, Q_{t0+s-1}] \in \mathbb{R}^{l}$, where $P_{t0-d}, P_{t0-d+1}, \dots, P_{t0-1}$ are the wind, photovoltaic and load powers at the previous $d$ historical moments, and $Q_{t0}, Q_{t0+1}, \dots, Q_{t0+s-1}$ are row vectors of multidimensional common weather forecast data (wind speed, irradiation intensity and air temperature) at the following $s$ prediction moments. The wind power prediction branch has the specific input vector $x_{t1} = [P_{t1-d}, P_{t1-d+1}, \dots, P_{t1-1}]$, where $P_{t1-d}, P_{t1-d+1}, \dots, P_{t1-1}$ are the wind power values at the previous $d$ moments; the specific input vectors of the photovoltaic and load prediction branches are $x_{t2} = [P_{t2-d}, P_{t2-d+1}, \dots, P_{t2-1}]$ and $x_{t3} = [P_{t3-d}, P_{t3-d+1}, \dots, P_{t3-1}]$, respectively.
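The construction of the shared and specific input vectors can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names `build_shared_input` and `build_specific_input`, the flat list layout and the time index `t` are assumptions made for the example.

```python
from typing import List, Sequence


def build_specific_input(power: Sequence[float], t: int, d: int) -> List[float]:
    """Branch-specific vector: the d power values immediately before index t."""
    if t - d < 0:
        raise ValueError("not enough history before t")
    return list(power[t - d:t])


def build_shared_input(
    wind: Sequence[float],
    pv: Sequence[float],
    load: Sequence[float],
    weather: Sequence[Sequence[float]],
    t: int,
    d: int,
    s: int,
) -> List[float]:
    """Shared vector [P_{t-d}, ..., P_{t-1}, Q_t, ..., Q_{t+s-1}], flattened.

    Each P_i holds (wind, pv, load) power at hour i; each Q_j is the row of
    forecast weather features (e.g. wind speed, irradiance, temperature).
    """
    if t - d < 0 or t + s > len(weather):
        raise ValueError("window falls outside the available data")
    x: List[float] = []
    for i in range(t - d, t):        # d historical moments
        x.extend([wind[i], pv[i], load[i]])
    for j in range(t, t + s):        # s forecast moments
        x.extend(weather[j])
    return x
```

With three weather features per hour, the shared vector has length 3d + 3s in this layout.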

In step 2, the wind, photovoltaic and load power data and the meteorological data collected in step 1 are normalized separately. Wind and photovoltaic power are normalized to the interval [0,1] with the rated capacities of the wind farm and the photovoltaic station as references; load power and meteorological data such as wind speed, irradiance and temperature use min-max normalization; wind direction uses sin/cos trigonometric normalization. The specific normalization formulas are as follows:

$$\hat{x}_1 = \frac{x_1}{x_N}, \qquad \hat{x}_2 = \frac{x_2 - x_{\min}}{x_{\max} - x_{\min}}$$

where $x_1$ and $\hat{x}_1$ are the wind and photovoltaic power before and after normalization; $x_2$ and $\hat{x}_2$ are the load power, wind speed, irradiance, temperature and similar data before and after normalization; $x_{\max}$ and $x_{\min}$ are the maximum and minimum of the load power, wind speed, irradiance, temperature and similar samples; and $x_N$ is the rated capacity of the wind or photovoltaic plant.
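The normalization of step 2, and the inverse normalization later applied in step 6, might look like the sketch below. The function names are hypothetical, and encoding wind direction as a (sin, cos) pair of the angle in degrees is an assumption about what "sin/cos trigonometric normalization" means here.

```python
import math
from typing import Tuple


def normalize_by_capacity(x: float, rated_capacity: float) -> float:
    """Wind/PV power scaled by rated capacity, giving a value in [0, 1]."""
    return x / rated_capacity


def inverse_by_capacity(x_norm: float, rated_capacity: float) -> float:
    """Undo normalize_by_capacity."""
    return x_norm * rated_capacity


def min_max_normalize(x: float, x_min: float, x_max: float) -> float:
    """Min-max scaling for load power, wind speed, irradiance, temperature."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x - x_min) / (x_max - x_min)


def inverse_min_max(x_norm: float, x_min: float, x_max: float) -> float:
    """Undo min_max_normalize."""
    return x_norm * (x_max - x_min) + x_min


def encode_wind_direction(degrees: float) -> Tuple[float, float]:
    """Assumed encoding: map an angle to (sin, cos) so 0 and 360 coincide."""
    rad = math.radians(degrees)
    return math.sin(rad), math.cos(rad)
```

The extrema and rated capacities would normally be taken from the training set only and reused unchanged for the validation and test sets.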

In step 3, a Fourier transform is applied to each of the measured wind, photovoltaic and load power series of the target area in the training set, and the amplitude-frequency curves are plotted. Frequency components corresponding to larger-amplitude points in an amplitude-frequency curve account for a larger share of the original sequence. These frequency points are converted, according to the sampling frequency, into the shared and specific multi-periodic features of wind, light and load, and the more prominent common period lengths $T_{01}, T_{02}, \dots, T_{0u}$, the period lengths specific to wind power $T_{W1}, T_{W2}, \dots, T_{Wu}$, the period lengths specific to photovoltaic power $T_{P1}, T_{P2}, \dots, T_{Pu}$ and the period lengths specific to load $T_{L1}, T_{L2}, \dots, T_{Lu}$ are recorded. This provides a reference for setting the model hyperparameters later.
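The period extraction of step 3 can be illustrated with a small discrete Fourier transform. This is a hedged sketch, not the patent's procedure: it uses a naive O(N²) DFT from the standard library, simply picks the largest-amplitude bins, and the names `dft_amplitudes` and `dominant_periods` are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def dft_amplitudes(series: Sequence[float]) -> List[float]:
    """Amplitudes of frequency bins 0..N//2 of the mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    amps = []
    for k in range(n // 2 + 1):
        acc = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        amps.append(abs(acc) / n)
    return amps


def dominant_periods(series: Sequence[float], top: int) -> List[float]:
    """Period lengths (in samples) of the `top` largest-amplitude bins."""
    n = len(series)
    amps = dft_amplitudes(series)
    # Skip bin 0 (the mean); bin k corresponds to a period of n / k samples.
    bins = sorted(range(1, len(amps)), key=lambda k: amps[k], reverse=True)
    return [n / k for k in bins[:top]]


# Synthetic hourly series with a daily (24 h) and a weekly (168 h) cycle.
hours = range(336)
signal = [
    math.sin(2 * math.pi * t / 24) + 0.5 * math.sin(2 * math.pi * t / 168)
    for t in hours
]
periods = dominant_periods(signal, top=2)
```

On measured data, neighbouring bins of one strong peak can crowd out weaker peaks, so a peak-picking rule or an FFT routine from a numerical library would likely replace the plain top-k selection used here.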

In step 4, after the input and output data are determined, the multi-task progressive learning model based on the deep spatio-temporal fusion network is built. Step 4 comprises the following sub-steps:

step 4.1: the shared-information subnet is the branch that mainly extracts the information shared by wind, light and load. Its input comprises the wind-light-load historical power data and the common meteorological data, which form the input matrix $X_0$. Wind-light-load time-series information is first extracted by a shared GRU unit. At time $t$, the shared GRU receives the current input $x_t$ and the hidden state $h_{t-1}$ of the previous moment; the network output $h_t$ is formed under the dynamic control of the update gate and the reset gate. Define the input weight matrices $W_r, W_u, W_z$, the recurrent weight matrices $R_r, R_u, R_z$ and the bias vectors $b_r, b_u, b_z$; $\sigma$ is the sigmoid activation function, $\tanh$ is the hyperbolic tangent function, and $\odot$ denotes the element-wise product. The GRU first obtains two gating states from the hidden state $h_{t-1}$ of the previous moment and the input $x_t$ of the current moment, where $r_t$ is the reset gate and $z_t$ is the update gate:

$$r_t = \sigma(W_r x_t + R_r h_{t-1} + b_r)$$

$$z_t = \sigma(W_z x_t + R_z h_{t-1} + b_z)$$

After the gating signals are obtained, the reset gate is used to obtain the reset hidden state, which is combined with the input and squashed to the range [-1,1] by the activation function $\tanh$ to obtain the candidate hidden state $\tilde{h}_t$. In the standard GRU form:

$$\tilde{h}_t = \tanh(W_u x_t + R_u (r_t \odot h_{t-1}) + b_u)$$

Then the hidden state passed from the previous moment is selectively forgotten and the candidate hidden state containing the information of the current moment is selectively memorized to obtain $h_t$. The update expression is:

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
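A single GRU time step consistent with the gate definitions of step 4.1 can be sketched in plain Python. The candidate-state and update expressions follow the standard GRU form, which is an assumption wherever the original formulas are not legible; the dictionary keys `"r"`, `"z"`, `"u"` and the function name `gru_step` are illustrative only.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(
    x_t: Vector,
    h_prev: Vector,
    W: Dict[str, Matrix],
    R: Dict[str, Matrix],
    b: Dict[str, Vector],
) -> Vector:
    """One GRU step; W, R, b hold parameters for gates 'r', 'z' and candidate 'u'."""

    def affine(key: str, h: Vector) -> Vector:
        wx = matvec(W[key], x_t)
        rh = matvec(R[key], h)
        return [p + q + c for p, q, c in zip(wx, rh, b[key])]

    r_t = [sigmoid(a) for a in affine("r", h_prev)]       # reset gate
    z_t = [sigmoid(a) for a in affine("z", h_prev)]       # update gate
    reset_h = [r * h for r, h in zip(r_t, h_prev)]        # r_t (.) h_{t-1}
    h_cand = [math.tanh(a) for a in affine("u", reset_h)]  # candidate state
    # Assumed convention: z_t weights the candidate, (1 - z_t) the old state.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_cand)]
```

With all parameters set to zero, both gates equal 0.5 and the candidate state is zero, so the step simply halves the previous hidden state, which gives a quick sanity check. Note that some libraries use the opposite convention for the roles of $z_t$ and $1 - z_t$.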

step 4.2: the row vectors of the hidden-state matrix $[h_{t-w+1}, h_{t-w+2}, \dots, h_{t-1}]$ obtained from the shared GRU ($w$ is the time-window length) are processed by a shared multi-kernel convolution layer to extract the common multi-periodic features of wind, light and load. First, convolution kernels of $u$ sizes are applied in $k$ channels to the row vectors of the hidden-state matrix $h$, giving $k$ different feature maps $f_{map}$. The formula is as follows:

where $i$ denotes the $i$-th row vector of matrix $h$, concat denotes the feature concatenation operation, and $K$ denotes a convolution kernel whose subscripts indicate its size ($T_{01}, T_{02}, \dots, T_{0u}$ are the kernel lengths and 1 is the kernel width) and its channel. A downsampling operation (down) is then applied, through a sliding window, to the sub-blocks into which each feature map is divided, giving a new feature map $f'_{map}(k)$. The calculation formula is as follows:

To fuse the feature maps produced by the multi-channel, multi-size convolutions, the output vectors of the pooling layer are concatenated along the channel direction. The calculation formula is as follows:

The last step of the multi-kernel convolution layer is to map the concatenated feature map linearly to the row vector $H_i$ of the new state matrix $H$. The formula is as follows:

$W_f$ and $b_f$ denote the weights and biases of the mapping.
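The sequence of operations in step 4.2 (convolution with several kernel lengths, downsampling, concatenation, linear mapping) can be illustrated with a deliberately simplified sketch. It is not the patent's formula: it uses a single channel, global max pooling in place of the sliding-window downsampling, and hypothetical names `conv1d_valid`, `multi_kernel_features` and `linear_map`.

```python
from typing import List, Sequence


def conv1d_valid(row: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """'Valid' 1-D convolution (cross-correlation, as in deep learning layers)."""
    n, k = len(row), len(kernel)
    if k > n:
        raise ValueError("kernel longer than input row")
    return [sum(row[i + j] * kernel[j] for j in range(k)) for i in range(n - k + 1)]


def multi_kernel_features(
    row: Sequence[float], kernels: Sequence[Sequence[float]]
) -> List[float]:
    """Convolve with each kernel length, max-pool each map, concatenate."""
    return [max(conv1d_valid(row, kern)) for kern in kernels]


def linear_map(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Linear mapping of the concatenated features (roles of W_f and b_f)."""
    return [
        sum(w * f for w, f in zip(w_row, features)) + b_j
        for w_row, b_j in zip(weights, bias)
    ]


# One hidden-state row and two kernels of lengths 2 and 3.
row = [1.0, 2.0, 3.0, 4.0]
pooled = multi_kernel_features(row, [[1.0, 1.0], [1.0, 1.0, 1.0]])
H_i = linear_map(pooled, [[1.0, 0.0], [0.5, 0.5]], [0.0, 1.0])
```

In the method described here the kernel lengths would be set to the recorded period lengths (for example 24 for a daily cycle), so each kernel spans one full period of the hidden-state sequence.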

Step 4.3: after steps 4.1 and 4.2 have produced the temporal-evolution pattern information, comprising the wind-light-load time-series and multi-periodic features, a temporal pattern attention mechanism is applied to the new hidden-state matrix $H$ obtained from the shared GRU and the shared multi-kernel convolution layer, in order to extract the correlations among spatial variables with different temporal-evolution patterns. First, a scoring function $f$ that evaluates correlation is defined and the resulting attention weights are normalized. The calculation formulas are as follows:

$$f(H_i, h_t) = (H_i)^{T} W_a h_t$$

$$a_i = \sigma(f(H_i, h_t))$$

where $W_a$ is a weight matrix learned through neural network training and $\sigma$ is the sigmoid activation function. The $i$-th row vectors of the hidden-state matrix $H$ (which contain the temporal-evolution pattern information) are then weighted by the attention weights $a_i$ and summed, giving the weighted sum vector $v_0$, which contains the wind-light-load shared information and accounts for the wind-light-load spatio-temporal correlation:

$$v_0 = \sum_{i=1}^{m} a_i H_i$$

where $m$ is the number of hidden-layer neurons.
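The scoring and weighting of step 4.3 can be sketched as below. The scoring function and the sigmoid weights follow the formulas given in the text; forming the output as the sum of the rows $H_i$ weighted by $a_i$ is how the weighted summation is read here, and the function name `attention_pool` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def attention_pool(
    H: Sequence[Sequence[float]],
    W_a: Sequence[Sequence[float]],
    h_t: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Return (attention weights a_i, weighted sum of the rows of H)."""
    # W_a h_t, computed once and reused for every row.
    proj = [sum(w * h for w, h in zip(w_row, h_t)) for w_row in W_a]
    # a_i = sigmoid(H_i^T W_a h_t)
    weights = [
        1.0 / (1.0 + math.exp(-sum(a * p for a, p in zip(H_i, proj))))
        for H_i in H
    ]
    dim = len(H[0])
    v = [sum(weights[i] * H[i][j] for i in range(len(H))) for j in range(dim)]
    return weights, v
```

Because the weights come from a sigmoid rather than a softmax, they need not sum to one, so several rows can receive high weight at the same time.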

Step 4.4: the specific-information subnets take the historical power sequences $X_1$, $X_2$, $X_3$ of each source or load as input. As in steps 4.1 to 4.3, time-series information is first extracted by a GRU unit; the hidden-state matrices $h_1$, $h_2$ and $h_3$ containing the time-series information are then fed into the multi-kernel convolution layer, whose kernel sizes $T_W \times 1$, $T_P \times 1$ and $T_L \times 1$ are designed according to the specific period lengths $T_W$, $T_P$ and $T_L$ of wind, light and load. The multi-kernel convolution outputs new hidden-state matrices $H_1$, $H_2$ and $H_3$, in which different rows are the vectors produced by different kernel sizes and different columns are the feature vectors of different time steps. A feature attention mechanism then focuses on the influence of the different columns of the matrix, that is, of each input feature, and outputs the weighted sum vectors $v_1$, $v_2$ and $v_3$ containing the information specific to wind power, photovoltaic power and load, respectively.

Step 4.5: in the final stage of the model, the feature vectors $v_0$, $v_1$, $v_2$ and $v_3$ obtained by the shared-information subnet and the specific-information subnets are fused, realizing joint prediction that takes into account the complete information both within and between tasks. Finally, the short-term predicted values of wind power, photovoltaic power and load power are output through fully connected layers, according to the following formula:

where $v_0$, $v_1$, $v_2$ and $v_3$ are the feature vectors, and the outputs are the short-term predicted values of wind power, photovoltaic power and load power, respectively.
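One possible reading of the fusion stage is sketched below, purely for illustration. The text does not specify the fusion operator, so concatenating the shared vector with one task's specific vector and applying a single linear layer per task is an assumption, as are the names `fuse_and_predict`, `weights` and `bias`.

```python
from typing import List, Sequence


def fuse_and_predict(
    v_shared: Sequence[float],
    v_specific: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Assumed fusion: concatenate [v_0; v_i], then one fully connected layer.

    Each row of `weights` produces one predicted time step for the task.
    """
    fused = list(v_shared) + list(v_specific)
    return [
        sum(w * f for w, f in zip(w_row, fused)) + b_j
        for w_row, b_j in zip(weights, bias)
    ]


# Toy call: 2-dim shared vector, 1-dim wind-specific vector, 2 output steps.
wind_forecast = fuse_and_predict(
    [1.0, 2.0], [3.0], [[1.0, 1.0, 1.0], [0.0, 0.0, 2.0]], [0.0, 0.5]
)
```

Under this reading the same call would be repeated with $v_2$ and $v_3$, each with its own layer parameters, to obtain the photovoltaic and load forecasts.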

In step 5, after the multi-task progressive learning model has been built in step 4, the hyperparameters are set, such as the number of neurons $m$ in each branch, the sample time-window length $w$, the number of GRU layers $g$ of the time-series network, the number of channels $k$ of the multi-kernel convolution layer and the convolution kernel sizes. The weights and biases are initialized, training samples are selected, the mean square error is taken as the loss function, and the model is trained with the SGD optimization algorithm to obtain the optimal weight and bias parameters. The validation-set samples are then fed into the trained deep spatio-temporal fusion model, and grid search is used to select the optimal hyperparameters according to the validation error. The search ranges of the main hyperparameters are as follows: number of neurons $m$: {16, 32, 64, 100, 128, 200, 300}; number of convolution channels $k$: {16, 24, 32, 48, 64}; number of time-series network layers $g$: {1, 2, 3, 4, 5, 6}; convolution kernel length $T_u$: [1, 168].
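The grid search over the listed ranges can be sketched as follows. The search space copies the sets given for $m$, $k$ and $g$; the kernel length $T_u$ is left out for brevity, and `toy_validation_error` is a placeholder standing in for "train the model and return its validation error", not a real training routine.

```python
import itertools
from typing import Callable, Dict, List, Tuple

SEARCH_SPACE: Dict[str, List[int]] = {
    "m": [16, 32, 64, 100, 128, 200, 300],  # neurons per branch
    "k": [16, 24, 32, 48, 64],              # convolution channels
    "g": [1, 2, 3, 4, 5, 6],                # GRU layers
}


def grid_search(
    space: Dict[str, List[int]],
    validation_error: Callable[[Dict[str, int]], float],
) -> Tuple[Dict[str, int], float]:
    """Evaluate every combination and keep the one with the lowest error."""
    names = list(space)
    best_params: Dict[str, int] = {}
    best_err = float("inf")
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        err = validation_error(params)
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err


def toy_validation_error(params: Dict[str, int]) -> float:
    """Placeholder objective; a real run would train and validate the model."""
    return float(
        abs(params["m"] - 64) + abs(params["k"] - 32) + abs(params["g"] - 2)
    )


best, err = grid_search(SEARCH_SPACE, toy_validation_error)
```

These three sets alone give 7 × 5 × 6 = 210 combinations, each requiring a full training run, so the kernel lengths are better fixed from the Fourier analysis than searched exhaustively over [1, 168].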

In step 6, the test samples are fed into the multi-task progressive learning model with the optimal hyperparameters, and the output is inverse-normalized to obtain the wind-light-load power prediction for each time step of the prediction day, where $s$ is the number of time steps predicted ahead.

Compared with the prior art, the invention has the following technical effects:

1) The invention introduces an improved multi-task learning mechanism for joint wind-light-load prediction, designs a network structure and a loss function suited to the characteristics of the different types of source and load, and provides a multi-task progressive learning method for short-term wind-light-load power prediction that accounts for spatio-temporal correlation; it progressively learns the specific properties of heterogeneous sources and loads while sharing information among them, thereby improving the performance of joint short-term multi-source-load power prediction;

2) The invention takes the spatio-temporal correlation of wind-light-load power into account and outputs short-term power predictions of wind power, photovoltaic power and load over multiple time steps, which supports the grid-connected operation of large-scale wind and photovoltaic power and benefits the safe and stable operation of the power grid;

3) The invention explicitly separates the shared and specific information of wind, light and load, and mitigates the negative transfer and "seesaw" phenomena of single-task and traditional multi-task learning models through the separation and interaction of each task's specific and shared information;

4) The invention designs a deep spatio-temporal fusion network that progressively extracts the spatio-temporal features of wind, light and load; it can extract both the distinct temporal-evolution features of wind, light and load and the dynamic spatial features among them;

5) The invention introduces an improved multi-task learning mechanism that handles multiple prediction tasks together, offers higher prediction accuracy and greater stability, and is well suited to joint wind-light-load power prediction;

6) The results of a practical case study show that the proposed method is reasonably structured, performs well on an actual wind-light-load data set, and achieves higher prediction efficiency and better prediction accuracy and robustness than other currently popular prediction models.

Drawings

The invention is further illustrated by the following examples in conjunction with the accompanying drawings:

FIG. 1 is a flowchart of short-term wind-light-load power prediction based on multi-task progressive learning with a deep spatio-temporal fusion network;

FIG. 2 is the Fourier transform amplitude-frequency plot of wind power;

FIG. 3 is the Fourier transform amplitude-frequency plot of photovoltaic power;

FIG. 4 is the Fourier transform amplitude-frequency plot of the load;

FIG. 5 is the framework for separating and fusing task-specific and shared information across tasks;

FIG. 6 is the deep learning progressive feature extraction architecture;

FIG. 7 compares the load power prediction curves of different models in an embodiment of the invention;

FIG. 8 compares the wind power prediction curves of different models in an embodiment of the invention;

FIG. 9 compares the photovoltaic power prediction curves of different models in an embodiment of the invention;

FIG. 10 compares the load prediction errors for different periodic features in an embodiment of the invention;

FIG. 11 compares the wind power prediction errors for different periodic features in an embodiment of the invention;

FIG. 12 compares the photovoltaic prediction errors for different periodic features in an embodiment of the invention;

FIG. 13 compares the errors of the load ablation study results under different prediction methods in an embodiment of the invention;

FIG. 14 compares the errors of the wind power ablation study results under different prediction methods in an embodiment of the invention;

FIG. 15 compares the errors of the photovoltaic ablation study results under different prediction methods in an embodiment of the invention.

Detailed Description

The invention discloses a short-term wind-light-load power prediction method based on multi-task progressive learning, which mainly comprises four modules. The input module collects and preprocesses the data, namely the historical power and weather forecast data of wind, light and load in the target area. The specific-shared information separation and joint learning module uses several subnets to process input data with different characteristics independently, which prevents the subnets from learning weakly related information that would impair the learning of each subtask. The progressive feature extraction module, building on the lateral separation of the shared and specific information extraction tasks, progressively extracts higher-level wind-light-load spatio-temporal features through a multi-stage deep learning network. Finally, the output module outputs the day-ahead wind-light-load power prediction. With this method, the specific properties of heterogeneous sources and loads are learned progressively while information is shared among them, the negative transfer and seesaw phenomena of existing multi-task models are mitigated, and prediction accuracy and robustness are improved.

As shown in fig. 1, the wind-light-load short-term power prediction method based on deep space-time fusion network multi-task progressive learning comprises the following steps:

step 1: collecting hour-level historical wind, photovoltaic and load power data and multidimensional weather forecast data for the target area. Each prediction subtask has the shared input vector $x_{t0} = [P_{t0-d}, P_{t0-d+1}, \dots, P_{t0-1}, Q_{t0}, Q_{t0+1}, \dots, Q_{t0+s-1}]$, where $P_{t0-d}, P_{t0-d+1}, \dots, P_{t0-1}$ are the wind, photovoltaic and load powers at the previous $d$ historical moments, and $Q_{t0}, Q_{t0+1}, \dots, Q_{t0+s-1}$ are row vectors of multidimensional common weather forecast data (wind speed, irradiation intensity and air temperature) at the following $s$ prediction moments. The wind power prediction branch has the specific input vector $x_{t1} = [P_{t1-d}, P_{t1-d+1}, \dots, P_{t1-1}]$, where $P_{t1-d}, P_{t1-d+1}, \dots, P_{t1-1}$ are the wind power values at the previous $d$ moments; the specific input vectors of the photovoltaic and load prediction branches are $x_{t2} = [P_{t2-d}, P_{t2-d+1}, \dots, P_{t2-1}]$ and $x_{t3} = [P_{t3-d}, P_{t3-d+1}, \dots, P_{t3-1}]$, respectively.

Step 2: the wind, photovoltaic and load power data and the meteorological data collected in step 1 are normalized. Wind and photovoltaic power are normalized to the interval [0,1] with the rated capacities of the wind farm and the photovoltaic station as references; load power and meteorological data such as wind speed, irradiance and temperature use min-max normalization; wind direction uses sin/cos trigonometric normalization. Let $x_1$ and $\hat{x}_1$ be the wind and photovoltaic power before and after normalization, $x_2$ and $\hat{x}_2$ the load power, wind speed, irradiance, temperature and similar data before and after normalization, $x_{\max}$ and $x_{\min}$ the maximum and minimum of the load power, wind speed, irradiance, temperature and similar samples, and $x_N$ the rated capacity of the wind or photovoltaic plant. The specific normalization formulas are as follows:

$$\hat{x}_1 = \frac{x_1}{x_N}, \qquad \hat{x}_2 = \frac{x_2 - x_{\min}}{x_{\max} - x_{\min}}$$

step 3: the input data are divided into a training set, a validation set and a test set in the proportions 70%, 15% and 15%. A Fourier transform is applied to each of the measured wind, photovoltaic and load power series of the target area in the training set, and the amplitude-frequency curves are plotted, as shown in FIGS. 2, 3 and 4. Frequency components corresponding to larger-amplitude points in an amplitude-frequency curve account for a larger share of the original sequence. These frequency points are converted, according to the sampling frequency, into the shared and specific multi-periodic features of wind, light and load, and the more prominent common period lengths $T_{01}, T_{02}, \dots, T_{0u}$, the period lengths specific to wind power $T_{W1}, T_{W2}, \dots, T_{Wu}$, the period lengths specific to photovoltaic power $T_{P1}, T_{P2}, \dots, T_{Pu}$ and the period lengths specific to load $T_{L1}, T_{L2}, \dots, T_{Lu}$ are recorded. This provides a reference for setting the model hyperparameters later.

Step 4: after the input and output data are determined, the multi-task progressive learning model based on the deep spatio-temporal fusion network is built. The model design proceeds along two lines. On the one hand, an MTL architecture that first separates and then jointly learns the shared and specific information of the tasks is designed according to the associations among the tasks, realizing the separation and interaction of multi-task information and capturing the inherently complex correlations among the tasks. On the other hand, a suitable deep learning network is introduced as the tool for progressively extracting wind-light-load spatio-temporal features; it mines the hidden source-load features in depth, makes full use of the latent information of each task, and extracts the shared and specific spatio-temporal information of wind, light and load progressively from shallow to deep, thereby improving the accuracy of joint wind-light-load prediction. Step 4 comprises the following sub-steps:

step 4.1: the shared-information subnet is the branch that mainly extracts the information shared by wind, light and load. Its input comprises the wind-light-load historical power data and the common meteorological data, which form the input matrix $X_0$. Wind-light-load time-series information is first extracted by a shared GRU unit. At time $t$, the shared GRU receives the current input $x_t$ and the hidden state $h_{t-1}$ of the previous moment; the network output $h_t$ is formed under the dynamic control of the update gate and the reset gate. Define the input weight matrices $W_r, W_u, W_z$, the recurrent weight matrices $R_r, R_u, R_z$ and the bias vectors $b_r, b_u, b_z$; $\sigma$ is the sigmoid activation function, $\tanh$ is the hyperbolic tangent function, and $\odot$ denotes the element-wise product. The GRU first obtains two gating states from the hidden state $h_{t-1}$ of the previous moment and the input $x_t$ of the current moment, where $r_t$ is the reset gate and $z_t$ is the update gate:

$$r_t = \sigma(W_r x_t + R_r h_{t-1} + b_r)$$

$$z_t = \sigma(W_z x_t + R_z h_{t-1} + b_z)$$

The reset gate is then used to form the candidate hidden state $\tilde{h}_t$, which in the standard GRU form is:

$$\tilde{h}_t = \tanh(W_u x_t + R_u (r_t \odot h_{t-1}) + b_u)$$

Then the hidden state passed from the previous moment is selectively forgotten and the candidate hidden state containing the information of the current moment is selectively memorized to obtain $h_t$. The update expression is:

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

step 4.2: the row vectors of the hidden-state matrix $[h_{t-w+1}, h_{t-w+2}, \dots, h_{t-1}]$ obtained from the shared GRU ($w$ is the time-window length) are processed by a shared multi-kernel convolution layer to extract the common multi-periodic features of wind, light and load. First, convolution kernels of $u$ sizes are applied in $k$ channels to the row vectors of the hidden-state matrix $h$, giving $k$ different feature maps $f_{map}$. The formula is as follows:

$W_f$ and $b_f$ denote the weights and biases of the mapping.

$$f(H_i, h_t) = (H_i)^{T} W_a h_t$$

$$a_i = \sigma(f(H_i, h_t))$$

$m$ is the number of hidden-layer neurons.

Step 4.4: in the final stage of the model, the feature vectors $v_0$, $v_1$, $v_2$ and $v_3$ obtained by the shared-information subnet and the specific-information subnets are fused, realizing joint prediction that takes into account the complete information both within and between tasks. Finally, the short-term predicted values of wind power, photovoltaic power and load power are output through fully connected layers.

Step 5: After the multi-task progressive learning model is built, set the hyperparameters, such as the number of neurons $m$ of each branch, the sample time window length $w$, the number of GRU layers $g$ of the temporal network, the number of channels $k$ of the multi-kernel convolution layer, the convolution kernel size $T_u$ and the loss function weights $\lambda$, $\gamma$, $\mu$. Initialize the weights and biases, select the training samples, take the mean square error as the loss function, and train the model with the SGD optimization algorithm to obtain the optimal weight and bias parameters. Then input the validation set samples into the trained depth space-time fusion model and use grid search to select the optimal hyperparameters according to the validation error. The search ranges of some main hyperparameters are: number of neurons $m$: {16, 32, 64, 100, 128, 200, 300}; number of convolution channels $k$: {16, 24, 32, 48, 64}; number of temporal network layers $g$: {1, 2, 3, 4, 5, 6}; convolution kernel length $T_u$: [1, 168].
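The grid search over the listed ranges can be sketched as below. The search space for $m$, $k$ and $g$ is copied from the text; `validation_error` is a hypothetical stand-in that would, in practice, train the model on the training set and return its error on the validation set. The kernel length $T_u$ is left out to keep the example short.

```python
import itertools

# Search ranges for m, k and g as listed in step 5.
search_space = {
    "m": [16, 32, 64, 100, 128, 200, 300],
    "k": [16, 24, 32, 48, 64],
    "g": [1, 2, 3, 4, 5, 6],
}

def validation_error(m, k, g):
    """Hypothetical placeholder for 'train, then score on the validation set'."""
    return abs(m - 64) / 64 + abs(k - 32) / 32 + abs(g - 2) / 2

def grid_search(space, score):
    """Evaluate every combination and keep the one with the lowest error."""
    names = list(space)
    best_params, best_err = None, float("inf")
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        err = score(**params)
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err

best_params, best_err = grid_search(search_space, validation_error)
```

With these three ranges the search visits 7 × 5 × 6 = 210 combinations, so each added hyperparameter multiplies the training cost.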

Step 6: The test samples are input into the multi-task progressive learning model with the optimal hyperparameters, and the output predictions are inverse-normalized to obtain the wind, photovoltaic and load power predictions at each time of the prediction day, where $s$ is the number of time steps predicted ahead.
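A minimal sketch of the inverse normalization, assuming that wind and photovoltaic power were scaled by rated capacity and that load power was min-max scaled, as the normalization step of the method describes. The function names and the numeric values are hypothetical.

```python
def normalize_by_capacity(power, rated_capacity):
    """Scale wind or photovoltaic power to [0, 1] by rated capacity."""
    return [p / rated_capacity for p in power]

def denormalize_by_capacity(scaled, rated_capacity):
    """Inverse of normalize_by_capacity."""
    return [s * rated_capacity for s in scaled]

def minmax_normalize(values, lo, hi):
    """Min-max scaling, used here for load power."""
    return [(v - lo) / (hi - lo) for v in values]

def minmax_denormalize(scaled, lo, hi):
    """Inverse of minmax_normalize; lo and hi must be the training-time bounds."""
    return [s * (hi - lo) + lo for s in scaled]

# Round trip on toy values (capacity and bounds are made up).
wind = [0.0, 25.0, 50.0]
wind_back = denormalize_by_capacity(normalize_by_capacity(wind, 100.0), 100.0)
load = [200.0, 300.0, 600.0]
load_back = minmax_denormalize(minmax_normalize(load, 200.0, 600.0), 200.0, 600.0)
```

The bounds and capacities used for the inverse step have to be the ones fixed during normalization, otherwise the recovered power values are shifted or rescaled.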

The data used in this example are the minute-level measured power of wind, photovoltaic and load, together with meteorological data such as wind speed, irradiance and temperature, for the whole of 2016 in a region of northern China. Anomalies such as missing values inevitably arise during data acquisition and storage, so the data set must be preprocessed before data mining. First, quality checks are run on the data set to detect abnormal and missing values; second, the detected abnormal data are corrected. To simplify the input data, the raw data are processed into a new data set with hourly resolution, giving 8784 samples for each source or load. The prediction model takes the historical wind-light-load power of the preceding 3 days as input and outputs the wind-light-load power curves of the next day. The prediction task is run under Python 3.7 on hardware with a 3.60 GHz Intel Core i9-9900KF CPU, an NVIDIA GeForce RTX 2070 SUPER GPU and 32 GB of memory. The training, validation and test sets account for 70%, 15% and 15% of the data, respectively: the model is trained on the training set, the optimal hyperparameters are determined through repeated trials on the validation set, and the test set is used to show the prediction performance. To eliminate the influence of the random initialization of the neural network parameters, each reported result is the average of 20 repeated experiments.
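The sample construction described above can be sketched as follows, assuming one sample per day (72 hourly inputs followed by 24 hourly targets) and a chronological 70/15/15 split. The daily stride, the chronological order of the split and the synthetic series are assumptions made for illustration.

```python
def make_daily_samples(series, input_hours=72, output_hours=24, stride=24):
    """Pair each 3-day input window with the following day as target."""
    samples = []
    last_start = len(series) - input_hours - output_hours
    for start in range(0, last_start + 1, stride):
        x = series[start:start + input_hours]
        y = series[start + input_hours:start + input_hours + output_hours]
        samples.append((x, y))
    return samples

def split_chronologically(samples, train=0.70, val=0.15):
    """Split into training, validation and test sets without shuffling."""
    n_train = int(len(samples) * train)
    n_val = int(len(samples) * val)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

# Synthetic hourly series with the same length as the 2016 data set.
hourly_power = [float(hour % 24) for hour in range(8784)]
samples = make_daily_samples(hourly_power)
train_set, val_set, test_set = split_chronologically(samples)
```

With 8784 hourly values (366 days), this windowing yields 363 samples, because the first three days serve only as input.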

This embodiment uses the minute-level measured power of wind, photovoltaic and load, together with meteorological data such as wind speed, irradiance and temperature, for a full year in the target area. First, quality checks are run on the data set to detect abnormal and missing values; second, the detected abnormal data are corrected. To simplify the input data, the raw data are processed into a new data set with hourly resolution, giving 8784 samples for each source or load. The prediction model takes the historical wind-light-load power of the preceding 3 days as input and outputs the wind-light-load power curves of the next day. The prediction task is run under Python 3.7 on hardware with a 3.60 GHz Intel Core i9-9900KF CPU, an NVIDIA GeForce RTX 2070 SUPER GPU and 32 GB of memory. To eliminate the influence of the random initialization of the neural network parameters, each reported result is the average of 20 repeated experiments. The prediction accuracy is evaluated with the mean absolute percentage error ($X_{MAPE}$) and the root mean square error ($X_{RMSE}$), calculated as follows:

where $n$ is the number of test samples, and $y_i$ and $\hat{y}_i$ are the actual and predicted source-load power at the $i$-th sampling point of the prediction day, respectively.
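A minimal sketch of the two metrics, assuming their standard definitions: $X_{MAPE}$ as the mean of $|y_i - \hat{y}_i| / |y_i|$ expressed in percent, and $X_{RMSE}$ as the square root of the mean squared error. The function names and sample values are hypothetical, and the percentage form assumes nonzero actual values.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must be nonzero."""
    total = sum(abs(y - p) / abs(y) for y, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the power values."""
    total = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return math.sqrt(total / len(y_true))

# Toy values: each prediction is off by 10 % of the actual value.
y_true = [100.0, 200.0]
y_pred = [110.0, 180.0]
x_mape = mape(y_true, y_pred)
x_rmse = rmse(y_true, y_pred)
```

The two metrics are complementary: the percentage error is scale-free, while the root mean square error penalizes large deviations more heavily.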

Figs. 7, 8 and 9 compare the predicted power with the true values for load, wind power and photovoltaic, respectively, under different prediction models over one week of the test set. Viewed across all source-load predictions, load and photovoltaic power show strong regularity, whereas wind power is more random and volatile, which reflects the distinct nature of heterogeneous sources and loads. The proposed MTPL-DSTFN model performs best in all source-load predictions and fits the true values most closely; it tracks the fluctuation trend accurately even for wind power, whose output fluctuates strongly. The load comparison in Fig. 7 uses load data samples from 14 to 19 November 2016. All models track the overall trend of the load fairly accurately, but the enlarged views of the peaks and troughs show that the fit is relatively weak there. Time-series models based on improved CNNs, such as TCN, fit the peaks well but deviate more at the troughs, so their prediction performance is unstable. The combined model GRU-CNN fits the source-load power curves better than a single model, but its prediction at the peaks and troughs is still slightly weak. Compared with the other models, MTPL-DSTFN tracks the trend at peaks and troughs more closely and is more stable. Fig. 8 shows the actual and predicted curves for the wind power data samples from 10 to 16 November. Because wind power is strongly volatile and uncertain, its prediction error is large relative to that of the load curve, and short-term prediction accuracy still has considerable room for improvement. The overall trend of the prediction curves shows that the predicted values of all models are higher than the true values. Although the models differ little in load prediction, the power-tracking capability of the proposed model stands out more clearly in wind power prediction, where fluctuations are larger, which further demonstrates its superior performance. According to the photovoltaic comparison curves in Fig. 9, photovoltaic output resembles load in showing strong regularity. The differences between the models appear mainly at the peaks, and all models track the photovoltaic power trend well in the remaining periods, which shows that the characteristics of the predicted object strongly influence prediction difficulty.

Figs. 10, 11 and 12 compare, for load, wind power and photovoltaic respectively, the proposed model with three schemes: scheme one does not distinguish shared from specific periodic features, scheme two considers only the shared periodic features, and scheme three considers only the specific periodic features. As Figs. 10 to 12 show, the proposed MTPL-DSTFN model has the smallest prediction error for every source or load. A comparison across the three shows that the improvement is most pronounced in wind power prediction: compared with schemes one, two and three, $X_{MAPE}$ is reduced by 2.96%, 1.99% and 1.82%, respectively, and $X_{RMSE}$ by 12.68%, 7.58% and 4.09%, respectively, which shows that properly accounting for multiple periodicities is highly necessary for data such as wind power. Load and photovoltaic data are regular, and extracting temporal information with the temporal network already gives good predictions, so the benefit of refining the periodic features is less pronounced for them.
Schemes two and three both predict better than scheme one, which shows that distinguishing shared from specific periods improves prediction accuracy. Schemes two and three perform differently on different types of source and load: scheme two does better in load and photovoltaic prediction, while scheme three does better in wind power prediction. Both shared and specific information are therefore essential for multi-task source-load prediction, and the fusion method captures the shared and specific features of the multiple sources and loads comprehensively, which yields higher prediction accuracy and improves the stability of the model.

Figs. 13, 14 and 15 show histograms of the ablation-study errors for load, wind power and photovoltaic under different prediction models. The first two models adopt single-task learning; DSTFM-S outperforms both in load prediction, which shows that its deep learning architecture mines the temporal evolution of the load effectively. DSTFM is a multivariable prediction model with excellent overall performance on objects with clear spatio-temporal correlation, such as the multiple wind farms of the study area in the third chapter, and it aims to optimize the average prediction performance over multiple input variables. According to the comparison, its average $X_{MAPE}$ is 10.49%, lower than that of the univariate DSTFM-S. DSTFM performs best in wind power prediction, with an $X_{MAPE}$ even lower than that of the model proposed here and an improvement of up to 3.39% over DSTFM-S. However, it has an obvious defect: its load and photovoltaic predictions are poor, with an $X_{MAPE}$ even higher than that of the single model LSTM. This is the seesaw phenomenon mentioned above, so the third-chapter model DSTFM is not used for joint wind-light-load prediction here. Next, for MTL-DSTFM, the model with the shared branch removed, the source-load prediction errors are reduced to varying degrees compared with the single-task model DSTFM-S, and the average load-prediction $X_{MAPE}$ is reduced by 1.98% compared with the multivariable model DSTFM, which reflects the necessity of introducing a multi-task learning mechanism.
The proposed model introduces an improved multi-task learning mechanism with progressive feature extraction that separates shared and specific information. Compared with MTL-DSTFM, its average source-load $X_{MAPE}$ improves by 0.68% and its average $X_{RMSE}$ by 6.12%, indicating that MTPL-DSTFN performs better in terms of both prediction accuracy and robustness.

Claims

1. The wind-light-load power short-term prediction method based on multi-task progressive learning is characterized by comprising the following steps of:

step 4: forming the common-information subnetwork input matrix $X_0$ from the wind-light-load historical power data and the common meteorological data, forming the specific-information subnetwork input matrices $X_1$, $X_2$, $X_3$ from the historical power sequence of each source or load, taking the wind-light-load power values at the $s$ prediction moments as output variables, and establishing a multi-task progressive learning model MTPL-DSTFN based on a depth space-time fusion network;

2. The method according to claim 1, wherein in step 1 the collected wind power, photovoltaic and load historical power data and weather forecast data are used as input; each prediction subtask has the shared input vector $x_{t0} = [P_{t0-d}, P_{t0-d+1}, \dots, P_{t0-1}, Q_{t0}, Q_{t0+1}, \dots, Q_{t0+s-1}] \in R^{l}$, where $P_{t0-d}, P_{t0-d+1}, \dots, P_{t0-1}$ are the wind power, photovoltaic and load power at the preceding $d$ historical moments and $Q_{t0}, Q_{t0+1}, \dots, Q_{t0+s-1}$ are row vectors of multidimensional common weather forecast data at the following $s$ prediction moments; the specific input vector of the wind power prediction branch is $x_{t1} = [P_{t1-d}, P_{t1-d+1}, \dots, P_{t1-1}]$, where $P_{t1-d}, P_{t1-d+1}, \dots, P_{t1-1}$ are the wind power values at the preceding $d$ moments; and the specific input vectors of the photovoltaic and load prediction branches are $x_{t2} = [P_{t2-d}, P_{t2-d+1}, \dots, P_{t2-1}]$ and $x_{t3} = [P_{t3-d}, P_{t3-d+1}, \dots, P_{t3-1}]$, respectively.

3. The method according to claim 1, wherein step 2 normalizes the wind power, photovoltaic and load power data and the meteorological data collected in step 1, wherein the wind power and photovoltaic power data are normalized to the interval [0, 1] based on the rated capacities of the wind farm and the photovoltaic power station, the load power and meteorological data such as wind speed, irradiance and temperature use min-max normalization, and the wind direction uses sin/cos trigonometric normalization; the specific normalization formulas are as follows:

4. The method according to claim 1, characterized in that: in step 3, Fourier transforms are applied to the measured wind power, photovoltaic and load power data of the target area in the training set and the amplitude-frequency curves are plotted; the frequency components corresponding to the points of larger amplitude in an amplitude-frequency curve account for a larger proportion of the original sequence; these frequency points are converted, according to the sampling frequency, into the shared and specific multi-periodic features of wind-light-load, and the more pronounced common period lengths $T_{01}, T_{02}, \dots, T_{0u}$, the wind power specific period lengths $T_{W1}, T_{W2}, \dots, T_{Wu}$, the photovoltaic specific period lengths $T_{P1}, T_{P2}, \dots, T_{Pu}$ and the load specific period lengths $T_{L1}, T_{L2}, \dots, T_{Lu}$ are recorded, which provides a reference for the later setting of the model hyperparameters.

5. The method according to claim 1, characterized in that: in step 4, after the input and output data are determined, the multi-task progressive learning model based on the depth space-time fusion network is built, and step 4 comprises the following sub-steps:

step 4.1: the shared-information subnetwork is the branch that mainly extracts the information shared by wind, photovoltaic and load; its inputs are the wind-light-load historical power data and the common meteorological data, which together form the input matrix $X_0$; wind-light-load temporal information is first extracted by a shared GRU unit; at time $t$, the shared GRU receives the current input $x_t$ and the hidden state $h_{t-1}$ of the previous time step, and the network output $h_t$ is formed under the dynamic control of the update gate and the reset gate; the input weight matrices $W_r, W_u, W_z$, the recurrent weight matrices $R_r, R_u, R_z$ and the bias vectors $b_r, b_u, b_z$ are defined, $\sigma$ is the sigmoid activation function, $\tanh$ is the hyperbolic tangent function, and $\odot$ denotes the element-wise product; the GRU first computes two gating states from the previous hidden state $h_{t-1}$ and the current input $x_t$, where $r_t$ is the reset gate and $z_t$ is the update gate:

$$r_t = \sigma(W_r x_t + R_r h_{t-1} + b_r)$$

$$z_t = \sigma(W_z x_t + R_z h_{t-1} + b_z)$$

step 4.2: the shared multi-kernel convolution layer processes the row vectors of the hidden state matrix $[h_{t-w+1}, h_{t-w+2}, \dots, h_{t-1}]$ output by the shared GRU ($w$ is the time window length) to extract the multi-periodic features common to wind, photovoltaic and load; convolution kernels of $u$ sizes are applied in $k$ channels to the row vectors of the hidden state matrix $h$, yielding $k$ different feature maps $f_{map}$; the formula is as follows:

where $i$ denotes the $i$-th row vector of matrix $h$, concat denotes the feature concatenation operation, and $K$ denotes the convolution kernel, whose subscripts indicate the kernel size ($T_{01}, T_{02}, \dots, T_{0u}$ are the kernel lengths and 1 is the kernel width) and the channel; a downsampling operation (down) is then applied through a sliding window to the sub-blocks into which the feature map is divided, giving the new feature map $f'_{map}(k)$; the formula is as follows:

$W_f$ and $b_f$ denote the weights and biases of the mapping process;

$$f(H_i, h_t) = (H_i)^{T} W_a h_t$$

$$a_i = \sigma(f(H_i, h_t))$$

m represents the number of hidden layer neurons;

step 4.4: the specific-information subnetworks take the historical power sequences $X_1$, $X_2$, $X_3$ of each source or load as input; as in steps 4.1-4.3, temporal information is extracted by a GRU unit, and the hidden state matrices $h_1$, $h_2$ and $h_3$ containing the temporal information are then fed into the multi-kernel convolution layer, whose kernel sizes $T_W \times 1$, $T_P \times 1$ and $T_L \times 1$ are designed according to the specific period lengths $T_W$, $T_P$ and $T_L$ of wind, photovoltaic and load; the multi-kernel convolution outputs the new hidden state matrices $H_1$, $H_2$ and $H_3$, in which different rows are vectors obtained with different kernel sizes and different columns are feature vectors at different time steps; a feature attention mechanism then weighs the influence of the different columns of each matrix, that is, of each input feature, and outputs the weighted-sum vectors $v_1$, $v_2$ and $v_3$ containing the wind power, photovoltaic and load specific information, respectively;

6. The method according to claim 1 or 5, characterized in that: in step 5, after the multi-task progressive learning model is built in step 4, the hyperparameters are set, such as the number of neurons $m$ of each branch, the sample time window length $w$, the number of GRU layers $g$ of the temporal network, the number of channels $k$ of the multi-kernel convolution layer and the convolution kernel size; the weights and biases are then initialized, the training samples are selected, the mean square error is taken as the loss function, and the model is trained with the SGD optimization algorithm to obtain the optimal weight and bias parameters; the validation set samples are input into the trained depth space-time fusion model, and grid search is used to select the optimal hyperparameters according to the validation error, the search ranges of some main hyperparameters being: number of neurons $m$: {16, 32, 64, 100, 128, 200, 300}; number of convolution channels $k$: {16, 24, 32, 48, 64}; number of temporal network layers $g$: {1, 2, 3, 4, 5, 6}; convolution kernel length $T_u$: [1, 168].

7. The method according to claim 1, characterized in that: in step 6, the test samples are input into the multi-task progressive learning model with the optimal hyperparameters, and the output predictions are inverse-normalized to obtain the wind-light-load power predictions at each time of the prediction day, where $s$ is the number of time steps predicted ahead.