CN116960991B - Probability-oriented power load prediction method based on graph convolution network model - Google Patents

Probability-oriented power load prediction method based on graph convolution network model

Info

Publication number
CN116960991B
CN116960991B (application CN202311222388.5A)
Authority
CN
China
Prior art keywords
matrix
self
graph
head
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311222388.5A
Other languages
Chinese (zh)
Other versions
CN116960991A (en)
Inventor
何州
裘一蕾
陈细平
宋小波
陈卫强
姚家渭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Half Cloud Technology Co ltd
Original Assignee
Hangzhou Half Cloud Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Half Cloud Technology Co ltd filed Critical Hangzhou Half Cloud Technology Co ltd
Priority to CN202311222388.5A priority Critical patent/CN116960991B/en
Publication of CN116960991A publication Critical patent/CN116960991A/en
Application granted granted Critical
Publication of CN116960991B publication Critical patent/CN116960991B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/003Load forecast, e.g. methods or systems for forecasting future load demand
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/10Power transmission or distribution systems management focussing at grid-level, e.g. load flow analysis, node profile computation, meshed network optimisation, active network management or spinning reserve management
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Power Engineering (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a probability-oriented power load prediction method based on a graph convolution network model. Historical power load sequence data is decomposed into components that are processed by dual convolution channels and fused by a gating mechanism to obtain a space-time feature matrix. The space-time feature matrix and an adaptive adjacency matrix are passed through a stacked graph convolution network to obtain a graph convolution feature matrix. Finally, the Q, K and V matrices of a self-attention mechanism are obtained from the graph convolution feature matrix by linear transformation, the self-attention mechanism is performed to obtain a self-attention feature matrix, the self-attention feature matrix is connected to the graph convolution feature matrix by a residual connection, and normalization is performed to obtain a predicted load value matrix. The method extracts spatial correlation information at both the geographic and semantic levels, enriching the receptive field and capturing multi-scale enhanced spatial features, thereby obtaining more accurate prediction results.

Description

Probability-oriented power load prediction method based on graph convolution network model
Technical Field
The application belongs to the technical field of power load prediction, and particularly relates to a probability-oriented power load prediction method based on a graph convolution network model.
Background
In recent years, power load prediction has attracted wide attention because of the economic and social benefits it can provide. However, as modernization accelerates, the scale of power systems is expanding rapidly, and the complexity and difficulty of power load prediction are growing with it. Accurate power load prediction has therefore become a challenging task. Probabilistic power load prediction is a well-known method that effectively addresses this challenge: it accounts for the uncertainty and variability of power load data and gives a prediction interval. Probabilistic power load prediction can improve the accuracy of load prediction and provide a scientific basis for power dispatching plans.
Most existing research adopts a time-series decomposition strategy to process the temporal characteristics of load data, but ignores the differing importance of the decomposition components, so the captured temporal correlation information is incomplete. In addition, prior research mainly attends to the geographic correlation between adjacent load areas and neglects the semantic correlation between load areas, resulting in insufficient capability to extract spatial correlation information.
Disclosure of Invention
The purpose of the application is to provide a probability-oriented power load prediction method based on a graph convolution network model, so as to solve the problem that the extraction capacity of spatial correlation information is insufficient in the prior art.
In order to achieve the above purpose, the technical scheme of the application is as follows:
a probabilistic power load prediction method based on a graph rolling network model comprises the following steps:
the historical power load sequence data of the area to be predicted is decomposed into a trend component, a seasonal component and a residual component, the trend component and the seasonal component respectively pass through a global convolution channel, then are spliced to obtain a global time sequence feature matrix, and the residual component passes through a local convolution channel to obtain a local time sequence feature matrix;
processing the local time sequence feature matrix and the global time sequence feature matrix by adopting a gating mechanism, and obtaining a space-time feature matrix through full connection processing;
the space-time feature matrix and an adaptive adjacency matrix are passed through a stacked graph convolution network to obtain a graph convolution feature matrix;
obtaining the Q, K and V matrices of a self-attention mechanism from the graph convolution feature matrix by linear transformation, and then performing the self-attention mechanism to obtain a self-attention feature matrix;
and carrying out residual connection on the self-attention characteristic matrix and the graph convolution characteristic matrix, and then carrying out normalization processing to obtain a predicted load value matrix.
Further, obtaining the Q, K and V matrices of the self-attention mechanism from the graph convolution feature matrix by linear transformation and then performing the self-attention mechanism to obtain the self-attention feature matrix comprises:
obtaining the Q and K matrices of a multi-head attention mechanism from the graph convolution feature matrix by linear transformation, and obtaining the V matrix from the graph convolution feature matrix by linear transformation; passing each head of the Q and K matrices through a multi-scale convolutional neural network to obtain each head of the multi-scale Q and K matrices; performing the self-attention mechanism between each head of the multi-scale Q and K matrices and the V matrix; and finally splicing the self-attention results corresponding to the heads to obtain the self-attention feature matrix.
Further, the global convolution channel comprises a convolution layer and a full connection layer.
Further, the local convolution channels include a convolution layer, a pooling layer, and a full connection layer.
Further, the processing of the local time sequence feature matrix and the global time sequence feature matrix by a gating mechanism comprises:
performing an activation operation on the local time sequence feature matrix with an activation function, and then adding the result to the global time sequence feature matrix.
Further, the Q, K and V matrices of the self-attention mechanism are obtained from the graph convolution feature matrix by linear transformation as follows:

$Q = X_{GCN} W^{Q}$, $K = X_{GCN} W^{K}$, $V = X_{GCN} W^{V}$

wherein $Q$ is the query matrix, $K$ is the key matrix, $V$ is the value matrix, $W^{Q}$, $W^{K}$ and $W^{V}$ are the learnable parameter matrices for generating the query, key and value matrices respectively, and $X_{GCN}$ is the graph convolution feature matrix.
Further, the Q and K matrices of the multi-head attention mechanism are obtained from the graph convolution feature matrix by linear transformation as follows:

$Q_i = X_{GCN} W_i^{Q}$, $K_i = X_{GCN} W_i^{K}$, $i = 1, \dots, h$

wherein the input matrix is divided into $h$ heads, $Q_i$ is the $i$-th head of the query matrix, $K_i$ is the $i$-th head of the key matrix, $W_i^{Q}$ and $W_i^{K}$ are the learnable parameter matrices of the $i$-th head of the query matrix and the key matrix respectively, and $X_{GCN}$ is the graph convolution feature matrix;

the V matrix is obtained from the graph convolution feature matrix by linear transformation as follows:

$V = X_{GCN} W^{V}$

wherein $V$ is the value matrix and $W^{V}$ is the learnable parameter matrix for generating the value matrix;

each head of the Q and K matrices is passed through the multi-scale convolutional neural network to obtain each head of the multi-scale Q and K matrices as follows:

$\hat{Q}_i = \mathrm{MSCN}(Q_i)$, $\hat{K}_i = \mathrm{MSCN}(K_i)$, with $\mathrm{MSCN}(X_{in}) = \sum_{k=1}^{n} \alpha_k (W_k * X_{in} + b_k)$

wherein $\hat{Q}_i$ and $\hat{K}_i$ are the $i$-th heads of the multi-scale query matrix and key matrix obtained by the multi-scale convolutional neural network, $n$ is the number of convolution layers, $\alpha_k$ is the weight of the $k$-th convolution layer, $W_k$ is the learnable parameter matrix of the $k$-th convolution layer, $X_{in}$ is the input matrix of the MSCN, and $b_k$ is the bias coefficient of the $k$-th convolution layer.
Further, performing the self-attention mechanism between each head of the multi-scale Q and K matrices and the V matrix, and splicing the self-attention results corresponding to the heads to obtain the self-attention feature matrix, comprises:

$F_{att} = \mathrm{Concat}\left( \mathrm{SoftMax}\left( \frac{\hat{Q}_i \hat{K}_i^{T}}{\sqrt{d_k}} \right) V \right)_{i=1}^{h}$

wherein $F_{att}$ is the self-attention feature matrix, $T$ denotes the transpose operation, $\mathrm{Concat}$ is the splicing function and $\mathrm{SoftMax}$ is the normalization function.
According to the probabilistic power load prediction method based on the graph convolution network model, the different decomposition components are processed separately by a network model with dual convolution channels, which improves the model's ability to extract temporal correlation information. On this basis, a novel graph convolution network enhanced with a multi-scale self-attention mechanism is provided, which can extract spatial correlation information at both the geographic and semantic levels, enriching the receptive field and capturing multi-scale enhanced spatial features, thereby obtaining better prediction results.
Drawings
Fig. 1 is a flowchart of a probabilistic power load prediction method based on a graph convolutional network model.
Fig. 2 is a schematic diagram of a prediction network model constructed in the present application.
Fig. 3 is a schematic diagram of space-time feature matrix extraction based on improved STL decomposition according to an embodiment of the present application.
Fig. 4 is a schematic diagram of a multi-scale convolutional neural network according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Example 1:
Fig. 1 shows a probabilistic power load prediction method based on a graph convolution network model, comprising:
s1, historical power load sequence data of a region to be predicted is decomposed into a trend component, a season component and a residual component, the trend component and the season component respectively pass through a global convolution channel, then are spliced to obtain a global time sequence feature matrix, and the residual component passes through a local convolution channel to obtain a local time sequence feature matrix.
The object of the present application is to obtain a mapping function G that predicts the load values at the future T times, as shown in formula (1):

$(\hat{Y}_{t+1}, \dots, \hat{Y}_{t+T}) = G(X_{t-H+1}, \dots, X_{t})$ (1)

wherein $X_t$ is the space-time feature matrix at time t, H is the length of the history data, and T is the length of the predicted data.
For this purpose, a prediction network model is constructed as shown in fig. 2: a graph convolution network model based on improved STL decomposition and multi-scale self-attention enhancement (MSGCN-ISTL for short).
When training the network model, the input sequence data is taken from an existing data set. When the trained network model is used for prediction, the input sequence data is the historical power load sequence data of the area to be predicted.
In this embodiment, STL decomposition is first performed on the input sequence data, decomposing it into a trend component, a seasonal component and a residual component, as shown in formula (2):

$x_t = T_t + S_t + R_t$ (2)

wherein $x_t$ is the input sequence data at time t, $T_t$ is the trend component, $S_t$ is the seasonal component and $R_t$ is the residual component. Seasonal-trend decomposition (STL) is an advantageous preprocessing step that reduces the complexity of the input sequence data; it is a common time-series decomposition method with good robustness.
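By way of illustration only (the patent does not prescribe any particular library), the STL step of formula (2) can be sketched in Python with statsmodels; the random data and the daily period of 24 are assumptions:

```python
# Illustrative sketch of the STL decomposition step (formula (2)); the data,
# the hourly sampling and the period of 24 are assumptions.
import numpy as np
from statsmodels.tsa.seasonal import STL

load = np.random.rand(24 * 30)                   # hypothetical 30 days of hourly load
result = STL(load, period=24, robust=True).fit()
trend, seasonal, residual = result.trend, result.seasonal, result.resid
# x_t = T_t + S_t + R_t holds: trend + seasonal + residual reconstructs load
```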
The individual components differ in importance when representing the input sequence data. The trend component facilitates analysis of long-term changes in the load data, and the seasonal component facilitates analysis of periodic changes. The residual component contains less regular information and is less important than the other two components in capturing the temporal characteristics of the data. This embodiment therefore uses the GDCNN approach to improve STL into ISTL, with the aim of applying targeted strategies to the different decomposition components, as shown in fig. 3.
As shown in fig. 3, this embodiment first inputs the trend and seasonal components into the global convolution channel to capture the key features representing trend changes and seasonal changes in the load data, as shown in formula (3):

$F_{glo} = \mathrm{Concat}\big(\mathrm{FC}(W_T * T_t + b_T),\ \mathrm{FC}(W_S * S_t + b_S)\big)$ (3)

wherein $F_{glo}$ is the global time sequence feature matrix obtained by the global convolution channel, which comprises a convolution layer and a fully connected layer; $\mathrm{Concat}$ is the splicing function; $W_T$ and $b_T$ are the learnable parameter matrix and bias coefficient for the trend component, and $W_S$ and $b_S$ are the learnable parameter matrix and bias coefficient for the seasonal component. That is, the trend component and the seasonal component each pass through a global convolution channel and are then spliced to obtain the global time sequence feature matrix $F_{glo}$.
Since the residual component is less regular and of smaller magnitude than the other two components, a pooling layer is used in the local convolution channel to further eliminate the redundant information of this component, as shown in formula (4):

$F_{loc} = \mathrm{FC}\big(\mathrm{Pool}(W_R * R_t + b_R)\big)$ (4)

wherein $F_{loc}$ is the local time sequence feature matrix obtained by the local convolution channel, which comprises a convolution layer, a pooling layer and a fully connected layer; $\mathrm{Pool}$ is the pooling operation, and $W_R$ and $b_R$ are the learnable parameter matrix and bias coefficient for the residual component.
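A minimal PyTorch sketch of the dual convolution channels of formulas (3) and (4) follows; the channel width, kernel size and pooling choice are assumptions, since the patent only specifies convolution plus fully connected layers for the global channel, and convolution, pooling and fully connected layers for the local channel:

```python
# Sketch of the global and local convolution channels (formulas (3)-(4));
# layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GlobalChannel(nn.Module):
    def __init__(self, seq_len: int, hidden: int):
        super().__init__()
        self.conv = nn.Conv1d(1, 8, kernel_size=3, padding=1)  # convolution layer
        self.fc = nn.Linear(8 * seq_len, hidden)               # fully connected layer
    def forward(self, x):                      # x: (batch, 1, seq_len)
        return self.fc(self.conv(x).flatten(1))

class LocalChannel(nn.Module):
    def __init__(self, seq_len: int, hidden: int):
        super().__init__()
        self.conv = nn.Conv1d(1, 8, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(2)            # pooling removes redundant residual info
        self.fc = nn.Linear(8 * (seq_len // 2), hidden)
    def forward(self, x):
        return self.fc(self.pool(self.conv(x)).flatten(1))

# F_glo = Concat(global(T), global(S)); F_loc = local(R)
```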
And S2, processing the local time sequence feature matrix and the global time sequence feature matrix by adopting a gating mechanism, and obtaining a space-time feature matrix through full connection processing.
This embodiment uses a gating mechanism to control the data flow of $F_{loc}$ and $F_{glo}$: an activation operation is performed on $F_{loc}$ and the result is added to $F_{glo}$, as shown in formula (5):

$F_{gate} = F_{glo} + \sigma(F_{loc})$ (5)

wherein $F_{gate}$ is the time sequence feature matrix obtained by the gating mechanism and $\sigma$ is the activation function.
Then, a fully connected layer is used to perform dimension conversion on $F_{gate}$ to obtain the space-time feature matrix $X$, as shown in formula (6):

$X = W_{fc} F_{gate} + b_{fc}$ (6)

wherein $X$ is the space-time feature matrix, and $W_{fc}$ and $b_{fc}$ are the learnable parameter matrix and bias coefficient of the fully connected layer.
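The gating fusion of formulas (5) and (6) then reduces to a few lines; treating the activation as a Sigmoid is an assumption of this sketch, since the specification leaves the function unnamed:

```python
# Sketch of the gating mechanism (formula (5)) and the fully connected
# dimension conversion (formula (6)); the Sigmoid gate is an assumption.
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.act = nn.Sigmoid()
        self.fc = nn.Linear(dim_in, dim_out)
    def forward(self, f_glo, f_loc):
        f_gate = f_glo + self.act(f_loc)   # formula (5): activate local, add global
        return self.fc(f_gate)             # formula (6): convert to space-time matrix X
```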
And S3, passing the space-time feature matrix and the adaptive adjacency matrix through a stacked graph convolution network to obtain a graph convolution feature matrix.
This embodiment adopts an adaptive adjacency matrix generation method to generate an adjacency matrix that is not predefined and can be updated adaptively. The adjacency matrix $\tilde{A}$ is defined as shown in formula (7):

$\tilde{A} = \mathrm{SoftMax}\big(\mathrm{ReLU}(E_s E_d^{T})\big)$ (7)

wherein $\mathrm{SoftMax}$ and $\mathrm{ReLU}$ are activation functions used to normalize the adjacency matrix; $E_s$ and $E_d$ are randomly initialized source-node and target-node embeddings, and $d$ is the depth of the node embeddings. As the graph convolutions iterate, the adjacency matrix is updated adaptively to extract the spatial correlation between the load areas.
This embodiment uses an unweighted graph $G = (V, E)$ to represent the spatial structure of the load area network. The graph regards each load area in the network as a node, where $V$ is the set of load area nodes, $|V| = N$, $N$ is the number of nodes in the load area network, and $E$ is the set of edges in the network. The adjacency matrix is then defined as $A \in \mathbb{R}^{N \times N}$ to represent the associations between the load areas. The adjacency matrix uses 0 and 1 to represent the association relationship between load areas, where 0 indicates no association between two load areas and 1 indicates an association between them.
Based on the load network graph $G$, the space-time feature matrix $X$ and the adjacency matrix $\tilde{A}$, the graph convolution network GCN may be constructed by stacking several convolution layers, as shown in formula (8):

$H_l = \tilde{A}_l H_{l-1} W_l, \quad H_0 = X, \quad l = 1, \dots, L$ (8)

wherein $H_l$ is the output matrix of the $l$-th GCN layer, $L$ is the number of stacked GCN layers, $\tilde{A}_l$ is the adjacency matrix of the $l$-th GCN layer, and $W_l$ is the learnable parameter matrix of the $l$-th GCN layer. That is, the graph convolution feature matrix $X_{GCN}$ is obtained through the stacked graph convolution network.
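The adaptive adjacency matrix of formula (7) and the stacked graph convolution of formula (8) can be sketched together; the dimensions are illustrative, and the absence of inter-layer nonlinearity simply mirrors formula (8) as written:

```python
# Sketch of the adaptive adjacency matrix (formula (7)) and the stacked GCN
# (formula (8)); node count, embedding depth and feature width are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveGCN(nn.Module):
    def __init__(self, num_nodes: int, embed_depth: int, feat_dim: int, num_layers: int = 2):
        super().__init__()
        self.e_src = nn.Parameter(torch.randn(num_nodes, embed_depth))  # source embeddings
        self.e_dst = nn.Parameter(torch.randn(num_nodes, embed_depth))  # target embeddings
        self.weights = nn.ModuleList(
            [nn.Linear(feat_dim, feat_dim, bias=False) for _ in range(num_layers)])
    def forward(self, x):                  # x: (batch, num_nodes, feat_dim)
        adj = F.softmax(F.relu(self.e_src @ self.e_dst.T), dim=-1)  # formula (7)
        h = x
        for w in self.weights:             # formula (8): H_l = A~_l H_{l-1} W_l
            h = adj @ w(h)
        return h                           # graph convolution feature matrix X_GCN
```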
S4, obtaining the Q, K and V matrices of the self-attention mechanism from the graph convolution feature matrix by linear transformation, and then performing the self-attention mechanism to obtain the self-attention feature matrix.
This embodiment performs a linear transformation on $X_{GCN}$ to obtain the query matrix and key matrix, as shown in formulas (9) and (10):

$Q = X_{GCN} W^{Q}$ (9)

$K = X_{GCN} W^{K}$ (10)

wherein $Q$ is the query matrix, $K$ is the key matrix, and $W^{Q}$ and $W^{K}$ are the learnable parameter matrices for generating the query matrix and the key matrix respectively.
Next, a linear transformation is performed on $X_{GCN}$ to obtain the value matrix, as shown in formula (11):

$V = X_{GCN} W^{V}$ (11)

wherein $V$ is the value matrix and $W^{V}$ is the learnable parameter matrix for generating the value matrix.
Then, the self-attention mechanism is performed to obtain the self-attention feature matrix, as shown in formula (12):

$F_{att} = \mathrm{SoftMax}\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V$ (12)

wherein $F_{att}$ is the calculated self-attention feature matrix, $T$ denotes the transpose operation and $\mathrm{SoftMax}$ is the normalization function.
It should be noted that the self-attention mechanism is implemented by a self-attention neural network, which is a relatively mature technology in the art and is not described further here.
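A compact sketch of formulas (9) to (12), with the standard scaled dot-product convention:

```python
# Single-scale self-attention over the graph convolution feature matrix
# (formulas (9)-(12)); the sqrt(d_k) scaling is the usual convention.
import torch
import torch.nn.functional as F

def self_attention(x_gcn, w_q, w_k, w_v):
    q, k, v = x_gcn @ w_q, x_gcn @ w_k, x_gcn @ w_v        # formulas (9)-(11)
    scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v                    # formula (12)
```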
And S5, carrying out residual connection on the self-attention characteristic matrix and the graph convolution characteristic matrix, and then carrying out normalization processing to obtain a predicted load value matrix.
Residual connection is a widely used technique in deep network training and has proven very effective for transferring information between neural network layers. Therefore, this embodiment adds the self-attention feature matrix $F_{att}$ to $X_{GCN}$ and inputs the result to a batch normalization layer. Finally, the predicted load value matrix $\hat{Y}$ of the model, i.e. the prediction result, is obtained as shown in formula (13):

$\hat{Y} = \mathrm{BN}(F_{att} + X_{GCN})$ (13)

wherein $\mathrm{BN}$ is the batch normalization function.
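Formula (13) corresponds to a residual addition followed by batch normalization; normalizing across the feature dimension is one plausible reading of the "batch normalization layer":

```python
# Sketch of the residual connection and batch normalization (formula (13));
# the feature width of 64 and the normalized axis are assumptions.
import torch.nn as nn

bn = nn.BatchNorm1d(64)
def predict(f_att, x_gcn):                 # both: (batch, num_nodes, 64)
    out = f_att + x_gcn                    # residual connection
    return bn(out.transpose(1, 2)).transpose(1, 2)
```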
Example 2:
Unlike embodiment 1, this embodiment proposes a multi-scale convolutional neural network (abbreviated as MSCN) and improves the self-attention mechanism into a multi-scale self-attention mechanism (abbreviated as MSSA).
The spatial correlation between load areas is reflected not only at the geographical level of the load network but also at the semantic level of the load network. However, existing self-attention mechanisms can only work on a single scale, and cannot effectively extract complex spatial correlation features.
For this reason, this embodiment proposes a multi-scale convolutional neural network (abbreviated as MSCN) and improves the self-attention mechanism into a multi-scale self-attention mechanism (abbreviated as MSSA). The MSCN comprises multiple convolution layers with kernels of different scales, giving the MSSA a rich receptive field and enabling it to comprehensively extract multi-scale enhanced spatial features at both the geographic and semantic levels.
In step S4 of this embodiment, the Q and K matrices of the multi-head attention mechanism are obtained from the graph convolution feature matrix by linear transformation, and the V matrix is obtained from the graph convolution feature matrix by linear transformation; each head of the Q and K matrices is passed through the multi-scale convolutional neural network to obtain each head of the multi-scale Q and K matrices; the self-attention mechanism is performed between each head of the multi-scale Q and K matrices and the V matrix; and finally the self-attention results corresponding to the heads are spliced to obtain the self-attention feature matrix.
The framework of the multi-scale self-attention mechanism MSSA of this embodiment is shown in fig. 2. First, a multi-head mechanism is adopted to improve the accuracy of the model's attention weight allocation. Next, a linear transformation is performed on $X_{GCN}$ to obtain the query matrix and key matrix, as shown in formulas (14) and (15):

$Q_i = X_{GCN} W_i^{Q}$ (14)

$K_i = X_{GCN} W_i^{K}$ (15)

wherein the input matrix is divided into $h$ heads, $Q_i$ is the $i$-th head of the query matrix, $K_i$ is the $i$-th head of the key matrix, and $W_i^{Q}$ and $W_i^{K}$ are the learnable parameter matrices for generating the $i$-th head of the query matrix and the key matrix respectively.
Next, a linear transformation is performed on $X_{GCN}$ to obtain the value matrix, as shown in formula (16):

$V = X_{GCN} W^{V}$ (16)

wherein $V$ is the value matrix and $W^{V}$ is the learnable parameter matrix for generating the value matrix.
In this embodiment, each head of the query matrix and the key matrix is passed through the multi-scale convolutional neural network MSCN. The computation function of the MSCN is shown in formula (17):

$\mathrm{MSCN}(X_{in}) = \sum_{k=1}^{n} \alpha_k (W_k * X_{in} + b_k)$ (17)

wherein $n$ is the number of convolution layers, $\alpha_k$ is the weight of the $k$-th convolution layer, $W_k$ is the learnable parameter matrix of the $k$-th convolution layer, $X_{in}$ is the input matrix of the MSCN, and $b_k$ is the bias coefficient of the $k$-th convolution layer. The multi-scale convolutional neural network MSCN in this embodiment comprises multiple convolution layers with kernels of different scales, as shown in fig. 4: each convolution layer performs a convolution operation on the input matrix to obtain results under different receptive fields, and the results of the different convolution layers are then summed with weights to obtain the final processing result.
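A sketch of formula (17) as parallel 1-D convolutions with learnable combination weights; the kernel sizes (1, 3, 5) are assumptions, since fig. 4 only indicates kernels of different scales:

```python
# Sketch of the multi-scale convolutional network MSCN (formula (17));
# kernel sizes are illustrative assumptions.
import torch
import torch.nn as nn

class MSCN(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(channels, channels, k, padding=k // 2) for k in kernel_sizes])
        self.alpha = nn.Parameter(torch.ones(len(kernel_sizes)))  # layer weights a_k
    def forward(self, x):                  # x: (batch, channels, length)
        # weighted sum of convolution outputs under different receptive fields
        return sum(a * conv(x) for a, conv in zip(self.alpha, self.convs))
```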
According to the input matrix used in formula (17), the computation of the multi-scale query matrix and key matrix by the MSCN is shown in formulas (18) and (19) respectively:

$\hat{Q}_i = \mathrm{MSCN}(Q_i)$ (18)

$\hat{K}_i = \mathrm{MSCN}(K_i)$ (19)

wherein $\hat{Q}_i$ and $\hat{K}_i$ are the $i$-th heads of the multi-scale query matrix and key matrix obtained by the MSCN method.
Then, the processing result of the multi-scale self-attention mechanism is shown in formula (20):

$F_{att} = \mathrm{Concat}\left( \mathrm{SoftMax}\left( \frac{\hat{Q}_i \hat{K}_i^{T}}{\sqrt{d_k}} \right) V \right)_{i=1}^{h}$ (20)

wherein $F_{att}$ is the weighted load matrix (self-attention feature matrix), $T$ denotes the transpose operation, $\mathrm{Concat}$ is the splicing function and $\mathrm{SoftMax}$ is the normalization function.
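Assembling formulas (18) to (20), one head at a time; the head layout and the shared MSCN instance are assumptions of this sketch:

```python
# Sketch of the multi-scale self-attention MSSA (formulas (18)-(20)).
import torch
import torch.nn.functional as F

def mssa(heads_q, heads_k, v, mscn):
    # heads_q, heads_k: lists of (batch, N, d_k) tensors; v: (batch, N, d_v)
    outs = []
    for q_i, k_i in zip(heads_q, heads_k):
        q_i = mscn(q_i.transpose(1, 2)).transpose(1, 2)     # formula (18)
        k_i = mscn(k_i.transpose(1, 2)).transpose(1, 2)     # formula (19)
        att = F.softmax(q_i @ k_i.transpose(-2, -1) / (q_i.shape[-1] ** 0.5), dim=-1)
        outs.append(att @ v)
    return torch.cat(outs, dim=-1)                          # formula (20): Concat
```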
And S5, performing residual connection between the self-attention feature matrix and the graph convolution feature matrix, and then performing normalization to obtain the predicted load value matrix.
Residual connection is a widely used technique in deep network training and has proven very effective for transferring information between neural network layers. Therefore, this embodiment adds the weighted load matrix $F_{att}$ to $X_{GCN}$ and inputs the result to the batch normalization layer. Finally, the predicted load value matrix $\hat{Y}$ of the model is obtained, as shown in formula (21):

$\hat{Y} = \mathrm{BN}(F_{att} + X_{GCN})$ (21)

wherein $\mathrm{BN}$ is the batch normalization function.
The present application verifies the above technical solution through experiments on two public data sets, GEFCom2012 and GEFCom2017. The GEFCom2012 data set comprises 32944 load records from 20 load areas in the United States collected from January 2004 to June 2008. The GEFCom2017 data set comprises 397464 load records from hundreds of load areas in the United States collected from January 2005 to December 2011.
The proposed model is compared with several typical models to verify its superiority in probabilistic load prediction. First, models that extract only the temporal correlation of the load data are selected, including Q-LSTM, CNN-LSTM, DA-QLSTM, CNN-BiLSTM and GDCNN-AR-AMPO. Second, a model that extracts the temporal-spatial correlations of the load data is selected, namely Ada-GWN. The details of these models are as follows:
(1) Q-LSTM: a hybrid model combining the LSTM model with the pinball loss function.
(2) CNN-LSTM: a hybrid model combining the CNN and LSTM models.
(3) DA-QLSTM: a hybrid model combining a dual-stage attention mechanism with the LSTM model.
(4) CNN-BiLSTM: a hybrid model combining the CNN and BiLSTM models.
(5) GDCNN-AR-AMPO: a hybrid model combining a gated dual convolutional neural network, an attention mechanism and a pooling operation.
(6) Ada-GWN: a spatio-temporal graph neural network hybrid model.
In the experiments, the number of GCN layers is set to 2, the embedding depth of the adjacency matrix nodes is set to 10, and the number of heads in the multi-head mechanism is set to 8. When training the model, the initial learning rate is set to 0.005, the optimizer is Adam, and the batch size is 512.
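For reference, the reported training configuration gathered into one place (the dictionary form itself is just a convenience, not from the patent):

```python
# Experimental settings as reported in the text.
config = {
    "gcn_layers": 2,            # stacked GCN layers
    "node_embed_depth": 10,     # adjacency embedding depth d
    "attention_heads": 8,       # heads in the multi-head mechanism
    "learning_rate": 0.005,     # initial learning rate
    "optimizer": "Adam",
    "batch_size": 512,
}
```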
For a comprehensive evaluation of the model, four evaluation indexes are used in the application: prediction interval coverage probability (PICP), mean prediction interval width (MPIW), Winkler score (WS) and pinball loss (PL). A larger PICP value indicates better performance, while smaller values of the other three indexes indicate better performance.
(1) PICP: represents the coverage of the prediction interval (PI) over the actual load values, as shown in formula (22):

$\mathrm{PICP} = \frac{1}{T} \sum_{t=1}^{T} c_t, \quad c_t = \begin{cases} 1, & L_t \le y_t \le U_t \\ 0, & \text{otherwise} \end{cases}$ (22)

wherein $T$ is the length of the prediction period, $y_t$, $L_t$ and $U_t$ are the actual load, the lower bound and the upper bound of the PI at time t respectively, and $c_t$ is an intermediate transition variable.
(2) MPIW: represents the width of the PI, as shown in formula (23):

$\mathrm{MPIW} = \frac{1}{T} \sum_{t=1}^{T} (U_t - L_t)$ (23)
(3) WS: represents both the coverage and the width of the PI, as shown in formula (24):

$\mathrm{WS} = \frac{1}{T} \sum_{t=1}^{T} w_t, \quad w_t = \begin{cases} \delta_t + \frac{2}{\alpha}(L_t - y_t), & y_t < L_t \\ \delta_t, & L_t \le y_t \le U_t \\ \delta_t + \frac{2}{\alpha}(y_t - U_t), & y_t > U_t \end{cases}, \quad \delta_t = U_t - L_t$ (24)

wherein $\alpha$ is the confidence interval level and $w_t$ is an intermediate transition variable.
(4) PL: the loss function of the probability prediction, as shown in formula (25):

$PL_t = \begin{cases} (y_t - \hat{y}_{t,q})\, q, & y_t \ge \hat{y}_{t,q} \\ (\hat{y}_{t,q} - y_t)(1 - q), & y_t < \hat{y}_{t,q} \end{cases}$ (25)

wherein $PL_t$ is the PL value at time t, $q$ is the quantile of the confidence interval, and $\hat{y}_{t,q}$ is the predicted quantile at time t.
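The four indexes of formulas (22) to (25) are straightforward to compute; a NumPy sketch:

```python
# Sketches of PICP, MPIW, WS and PL (formulas (22)-(25)); y is the actual
# load, (lower, upper) the prediction interval, alpha the interval level,
# q the quantile and y_q the predicted quantile series.
import numpy as np

def picp(y, lower, upper):                 # formula (22)
    return np.mean((y >= lower) & (y <= upper))

def mpiw(lower, upper):                    # formula (23)
    return np.mean(upper - lower)

def winkler(y, lower, upper, alpha):       # formula (24)
    width = upper - lower
    penalty = 2.0 / alpha * (np.maximum(lower - y, 0) + np.maximum(y - upper, 0))
    return np.mean(width + penalty)

def pinball(y, y_q, q):                    # formula (25)
    diff = y - y_q
    return np.mean(np.maximum(q * diff, (q - 1) * diff))
```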
Tables 1 and 2 show the comparative experimental results under different experimental parameter settings. PICP and MPIW are commonly used evaluation indexes in probabilistic power load prediction, while WS and PL consider both the coverage and the width of the PI and therefore evaluate the overall performance of a prediction model more scientifically. The experimental results show that the model of the application is the best model and Ada-GWN is the second-best.
TABLE 1
TABLE 2
Here, quantile refers to the quantile of the confidence interval in the probability prediction, i.e. the confidence level. Table 1 shows the experimental results on the GEFCom2012 data set and table 2 the results on the GEFCom2017 data set. From the experimental data of the four evaluation indexes in tables 1 and 2, the following conclusions can be drawn:
(1) Ada-GWN and the MSGCN-ISTL proposed herein have significant advantages over the other five baseline models in every set of comparative experiments, especially on the WS and PL indexes. This shows that capturing only the temporal correlation information in the historical data is not enough: the spatial correlation information between load areas is also an important support for accurate probabilistic power load prediction.
(2) In most comparative experiments, the MSGCN-ISTL model is evaluated more favourably than Ada-GWN, indicating that improving the GCN with the ISTL and MSCN methods captures the correlation information between load areas more completely. In addition, under different experimental parameters, the overall performance of MSGCN-ISTL is superior to all baseline models, which shows that the model has good robustness.
(3) Comparing Q-LSTM and DA-QLSTM shows that DA-QLSTM performs better, indicating that the attention mechanism is effective in extracting the characteristic information of the load data. Furthermore, the performance of CNN-LSTM is superior to Q-LSTM, indicating that the CNN layer is efficient in feature extraction.
The application also verifies the effectiveness of each module in the MSGCN-ISTL model by an ablation experiment. The ablation experiment was mainly performed by comparing MSGCN-ISTL with its three simplified models as follows:
(1) MSGCN-ISTL w/o ISTL: the model removes ISTL modules on the basis of MSGCN-ISTL, i.e., does not contain an improved STL decomposition strategy.
(2) MSGCN-ISTL w/o MSSA: the model removes the MSSA module on the basis of MSGCN-ISTL, i.e., does not include a self-attention mechanism.
(3) MSGCN-ISTL w/o MSCN: the model removes the MSCN method on the basis of MSGCN-ISTL, i.e. the self-attention mechanism of the model is single-scale.
Tables 3 and 4 show the results of ablation experiments on the data sets GEFCom2012 and GEFCom2017, respectively.
TABLE 3
TABLE 4
From the experimental data of the four evaluation indexes in tables 3 and 4, the following conclusions can be drawn:
(1) MSGCN-ISTL achieves the best performance under the different experimental conditions of both data sets, indicating that each module of the model is effective.
(2) Compared with MSGCN-ISTL w/o ISTL, MSGCN-ISTL performs better, especially on the MPIW, WS and PL indexes, showing that using the GDCNN method to separately process decomposition components of different importance helps capture temporal correlation information more comprehensively.
(3) The prediction performance of MSGCN-ISTL and MSGCN-ISTL w/o MSCN is superior to MSGCN-ISTL w/o MSSA, which shows that using the self-attention mechanism to capture the association information between load areas effectively improves the prediction performance of the model.
(4) MSGCN-ISTL gives better results than MSGCN-ISTL w/o MSCN. This shows that the MSSA method is effective: it enriches the receptive field and can comprehensively extract multi-scale enhanced spatial features at the geographic and semantic levels, thereby realizing comprehensive extraction of the spatial correlation features.
The application provides a new probabilistic power load prediction model, MSGCN-ISTL, which aims to capture space-time correlation information from load data more comprehensively so as to achieve accurate probabilistic power load prediction. Specifically, the STL decomposition strategy is improved into ISTL so that decomposition components of different importance are handled by different processing methods, improving the model's ability to extract temporal features from the load data stream. In addition, the application provides a new space-time correlation extraction method, MSGCN, to enrich the receptive field and further improve the extraction of multi-scale enhanced spatial features at the geographic and semantic levels. Compared with the existing baseline models, MSGCN-ISTL shows superiority and robustness across the different experimental groups.
The above examples merely represent a few embodiments of the present application; they are described in detail but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the spirit of the present application, and these fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (6)

1. A probability-oriented power load prediction method based on a graph convolution network model, characterized by comprising the following steps:
the historical power load sequence data of the area to be predicted is decomposed into a trend component, a seasonal component and a residual component, the trend component and the seasonal component respectively pass through a global convolution channel, then are spliced to obtain a global time sequence feature matrix, and the residual component passes through a local convolution channel to obtain a local time sequence feature matrix;
processing the local time sequence feature matrix and the global time sequence feature matrix by adopting a gating mechanism, and obtaining a space-time feature matrix through full connection processing;
the space-time feature matrix and an adaptive adjacency matrix are passed through a stacked graph convolution network to obtain a graph convolution feature matrix;
obtaining the Q, K and V matrices of a self-attention mechanism from the graph convolution feature matrix by linear transformation, and then performing the self-attention mechanism to obtain a self-attention feature matrix;
residual connection is carried out on the self-attention characteristic matrix and the graph convolution characteristic matrix, and then normalization processing is carried out, so that a predicted load value matrix is obtained;
wherein obtaining the Q, K and V matrices of the self-attention mechanism from the graph convolution feature matrix by linear transformation and then performing the self-attention mechanism to obtain the self-attention feature matrix comprises:
obtaining the Q and K matrices of a multi-head attention mechanism from the graph convolution feature matrix by linear transformation, and obtaining the V matrix from the graph convolution feature matrix by linear transformation; passing each head of the Q and K matrices through a multi-scale convolutional neural network to obtain each head of the multi-scale Q and K matrices; performing the self-attention mechanism between each head of the multi-scale Q and K matrices and the V matrix; and finally splicing the self-attention results corresponding to the heads to obtain the self-attention feature matrix;
wherein the Q and K matrices of the multi-head attention mechanism are obtained from the graph convolution feature matrix by linear transformation as follows:

$Q_i = X_{GCN} W_i^{Q}$, $K_i = X_{GCN} W_i^{K}$, $i = 1, \dots, h$

wherein the input matrix is divided into $h$ heads, $Q_i$ is the $i$-th head of the query matrix, $K_i$ is the $i$-th head of the key matrix, $W_i^{Q}$ and $W_i^{K}$ are the learnable parameter matrices of the $i$-th head for generating the query matrix and the key matrix respectively, and $X_{GCN}$ is the graph convolution feature matrix;

the V matrix is obtained from the graph convolution feature matrix by linear transformation as follows:

$V = X_{GCN} W^{V}$

wherein $V$ is the value matrix and $W^{V}$ is the learnable parameter matrix for generating the value matrix;

each head of the Q and K matrices is passed through the multi-scale convolutional neural network to obtain each head of the multi-scale Q and K matrices as follows:

$\hat{Q}_i = \mathrm{MSCN}(Q_i)$, $\hat{K}_i = \mathrm{MSCN}(K_i)$, with $\mathrm{MSCN}(X_{in}) = \sum_{k=1}^{n} \alpha_k (W_k * X_{in} + b_k)$

wherein $\hat{Q}_i$ and $\hat{K}_i$ are the $i$-th heads of the multi-scale query matrix and key matrix obtained by the multi-scale convolutional neural network, $n$ is the number of convolution layers, $\alpha_k$ is the weight of the $k$-th convolution layer, $W_k$ is the learnable parameter matrix of the $k$-th convolution layer, $X_{in}$ is the input matrix of the MSCN, and $b_k$ is the bias coefficient of the $k$-th convolution layer.
2. The probabilistic power load prediction method based on a graph convolution network model according to claim 1, wherein the global convolution channel comprises a convolution layer and a fully connected layer.
3. The probabilistic power load prediction method based on a graph convolution network model according to claim 1, wherein the local convolution channel comprises a convolution layer, a pooling layer and a fully connected layer.
4. The probabilistic power load prediction method based on a graph convolution network model according to claim 1, wherein processing the local time sequence feature matrix and the global time sequence feature matrix by a gating mechanism comprises:
performing an activation operation on the local time sequence feature matrix with an activation function, and then adding the result to the global time sequence feature matrix.
5. The probabilistic power load prediction method based on a graph convolution network model according to claim 1, wherein the Q, K and V matrices of the self-attention mechanism are obtained from the graph convolution feature matrix by linear transformation as follows:

$Q = X_{GCN} W^{Q}$, $K = X_{GCN} W^{K}$, $V = X_{GCN} W^{V}$

wherein $Q$ is the query matrix, $K$ is the key matrix, $V$ is the value matrix, $W^{Q}$, $W^{K}$ and $W^{V}$ are the learnable parameter matrices for generating the query, key and value matrices respectively, and $X_{GCN}$ is the graph convolution feature matrix.
6. The probabilistic power load prediction method based on a graph convolution network model according to claim 1, wherein performing the self-attention mechanism between each head of the multi-scale Q and K matrices and the V matrix, and splicing the self-attention results corresponding to the heads to obtain the self-attention feature matrix, comprises:

$F_{att} = \mathrm{Concat}\left( \mathrm{SoftMax}\left( \frac{\hat{Q}_i \hat{K}_i^{T}}{\sqrt{d_k}} \right) V \right)_{i=1}^{h}$

wherein $F_{att}$ is the self-attention feature matrix, $T$ denotes the transpose operation, $\mathrm{Concat}$ is the splicing function and $\mathrm{SoftMax}$ is the normalization function.
CN202311222388.5A 2023-09-21 2023-09-21 Probability-oriented power load prediction method based on graph convolution network model Active CN116960991B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311222388.5A CN116960991B (en) 2023-09-21 2023-09-21 Probability-oriented power load prediction method based on graph convolution network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311222388.5A CN116960991B (en) 2023-09-21 2023-09-21 Probability-oriented power load prediction method based on graph convolution network model

Publications (2)

Publication Number Publication Date
CN116960991A (en) 2023-10-27
CN116960991B (en) 2023-12-29

Family

ID=88458790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311222388.5A Active CN116960991B (en) 2023-09-21 2023-09-21 Probability-oriented power load prediction method based on graph convolution network model

Country Status (1)

Country Link
CN (1) CN116960991B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117709394A (en) * 2024-02-06 2024-03-15 华侨大学 Vehicle track prediction model training method, multi-model migration prediction method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991090A (en) * 2021-02-05 2021-06-18 江南大学 Photovoltaic power prediction method based on Transformer model
CN114707772A (en) * 2022-06-06 2022-07-05 山东大学 Power load prediction method and system based on multi-feature decomposition and fusion
CN115578851A (en) * 2022-07-14 2023-01-06 西北师范大学 Traffic prediction method based on MGCN
CN115600744A (en) * 2022-10-20 2023-01-13 中国烟草总公司重庆市公司(Cn) Method for predicting population quantity of shared space-time attention convolutional network based on mobile phone data
WO2023030513A1 (en) * 2021-09-05 2023-03-09 汉熵通信有限公司 Internet of things system
CN115862324A (en) * 2022-11-24 2023-03-28 南京邮电大学 Space-time synchronization graph convolution neural network for intelligent traffic and traffic prediction method
CN116258260A (en) * 2023-02-20 2023-06-13 浙江财经大学 Probability power load prediction method based on gating double convolution neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991090A (en) * 2021-02-05 2021-06-18 江南大学 Photovoltaic power prediction method based on Transformer model
WO2023030513A1 (en) * 2021-09-05 2023-03-09 汉熵通信有限公司 Internet of things system
CN114707772A (en) * 2022-06-06 2022-07-05 山东大学 Power load prediction method and system based on multi-feature decomposition and fusion
CN115578851A (en) * 2022-07-14 2023-01-06 西北师范大学 Traffic prediction method based on MGCN
CN115600744A (en) * 2022-10-20 2023-01-13 中国烟草总公司重庆市公司(Cn) Method for predicting population quantity of shared space-time attention convolutional network based on mobile phone data
CN115862324A (en) * 2022-11-24 2023-03-28 南京邮电大学 Space-time synchronization graph convolution neural network for intelligent traffic and traffic prediction method
CN116258260A (en) * 2023-02-20 2023-06-13 浙江财经大学 Probability power load prediction method based on gating double convolution neural network

Also Published As

Publication number Publication date
CN116960991A (en) 2023-10-27

Similar Documents

Publication Publication Date Title
Zhang et al. A novel combination forecasting model for wind power integrating least square support vector machine, deep belief network, singular spectrum analysis and locality-sensitive hashing
CN110379506B (en) Arrhythmia detection method using binarization neural network for electrocardiogram data
CN116960991B (en) Probability-oriented power load prediction method based on graph convolution network model
CN112633478A (en) Construction of graph convolution network learning model based on ontology semantics
CN113673775A (en) Time-space combination prediction method based on CNN-LSTM and deep learning
CN109492748B (en) Method for establishing medium-and-long-term load prediction model of power system based on convolutional neural network
CN103440525B (en) Lake and reservoir water bloom emergency treatment decision-making method based on Vague value similarity measurement improved algorithm
CN101324926B (en) Method for selecting characteristic facing to complicated mode classification
Patel et al. An algorithm to construct decision tree for machine learning based on similarity factor
CN106779219A (en) A kind of electricity demand forecasting method and system
Zuo et al. Representation learning of knowledge graphs with entity attributes and multimedia descriptions
CN112182221A (en) Knowledge retrieval optimization method based on improved random forest
CN113268370A (en) Root cause alarm analysis method, system, equipment and storage medium
CN111062511B (en) Aquaculture disease prediction method and system based on decision tree and neural network
CN115115113A (en) Equipment fault prediction method and system based on graph attention network relation embedding
Hu et al. Lightweight multi-scale network with attention for facial expression recognition
CN113918727A (en) Construction project knowledge transfer method based on knowledge graph and transfer learning
CN116401561B (en) Time-associated clustering method for equipment-level running state sequence
CN116842848A (en) Industrial process soft measurement modeling method and system based on hybrid search evolution optimization
CN114819253A (en) Urban crowd gathering hotspot area prediction method, system, medium and terminal
Zhang et al. Compressing knowledge graph embedding with relational graph auto-encoder
Wu et al. Lightweight compressed depth neural network for tomato disease diagnosis
Ren et al. Recognition of common pests in agriculture and forestry based on convolutional neural networks
CN110188692A (en) A kind of reinforcing that effective target quickly identifies circulation Cascading Methods
CN114343676B (en) Electroencephalogram emotion recognition method and device based on self-adaptive hierarchical graph neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant