CN115034426B - Rolling load prediction method based on phase space reconstruction and multi-model fusion Stacking integrated learning mode - Google Patents


Info

Publication number
CN115034426B
CN115034426B CN202210270326.0A CN202210270326A CN115034426B CN 115034426 B CN115034426 B CN 115034426B CN 202210270326 A CN202210270326 A CN 202210270326A CN 115034426 B CN115034426 B CN 115034426B
Authority
CN
China
Prior art keywords
model
load
phase space
data
space reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210270326.0A
Other languages
Chinese (zh)
Other versions
CN115034426A (en
Inventor
侯慧
刘超
吴细秀
石英
谢长君
唐金锐
黄亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT filed Critical Wuhan University of Technology WUT
Priority to CN202210270326.0A priority Critical patent/CN115034426B/en
Publication of CN115034426A publication Critical patent/CN115034426A/en
Application granted granted Critical
Publication of CN115034426B publication Critical patent/CN115034426B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/08Computing arrangements based on specific mathematical models using chaos models or non-linear system models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/003Load forecast, e.g. methods or systems for forecasting future load demand
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00Details relating to the application field
    • G06F2113/04Power grid distribution networks
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention provides a rolling load prediction method based on phase space reconstruction and a multi-model fusion Stacking integrated learning mode. First, a phase space reconstruction technique is used to process the historical load time series, so that key information such as temperature and humidity does not need to be collected. Next, a rolling load prediction method is established in which predicted data are used as training values for the next round of training, allowing the load to be predicted over a longer horizon. Finally, a multi-model fusion Stacking integrated learning model predicts the load for the next 24 hours, and its results are compared with those of 5 other single models to verify the soundness and effectiveness of the proposed method. The method allows the power load to be predicted accurately even when key information such as temperature and humidity is missing.

Description

Rolling load prediction method based on phase space reconstruction and multi-model fusion Stacking integrated learning mode
Technical Field
The invention relates to the field of power load prediction, in particular to a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode.
Background
Power load prediction is an important routine task for the power generation, planning, and dispatching departments of a power system. Accurate power load prediction provides decision support for grid dispatching and improves the reliability and safety of power system operation. The rapid development of artificial intelligence and machine learning provides new solutions to the load prediction problem.
At present, power load prediction typically requires key feature information such as temperature, humidity, and wind speed, but specific data for these features are often difficult to obtain. In addition, when the load is predicted over a long horizon, the prediction results vary considerably with the length of the prediction horizon. Load prediction with single models such as back propagation (BP) is mature, but single models are sensitive to the data and limited in computing capability. Single-model algorithms place high demands on data quality; when the data quality does not meet the model's requirements, problems such as overfitting and underfitting can arise and reduce the accuracy of power load prediction. Prediction accuracy therefore needs further improvement.
Disclosure of Invention
The invention provides a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode, which solves, or at least partially solves, the technical problem of low load prediction accuracy in the prior art.
In order to solve the technical problems, the invention provides a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode, which comprises the following steps:
S1: acquiring historical load data in time-series form and processing it with a phase space reconstruction technique;
S2: adopting a rolling load prediction method, in which the prediction length is determined in advance and the load over a longer period is predicted by shifting the training data;
S3: establishing a load prediction model in the multi-model fusion Stacking integrated learning mode, and importing the processed historical load data into the load prediction model to predict the future load.
In one embodiment, the processing of the historical load data in step S1 using a phase space reconstruction technique includes:
S1.1: a chaotic time series of historical load data $x_1, x_2, \dots, x_{N-1}, x_N$ is obtained, and each $x_t$ ($t = 1, 2, \dots, N-(m-1)\tau$) is transformed as follows:
$$X(t) = \left(x_t,\; x_{t+\tau},\; x_{t+2\tau},\; \dots,\; x_{t+(m-1)\tau}\right)^T$$
where $\tau$ is the delay time and $m$ is the embedding dimension;
S1.2: the chaotic time series $x_1, x_2, \dots, x_{N-1}, x_N$ is converted by phase space reconstruction into a new data space with time delay $\tau$ and dimension $m$, namely:
$$X = \begin{bmatrix} x_1 & x_2 & \cdots & x_{N-(m-1)\tau} \\ x_{1+\tau} & x_{2+\tau} & \cdots & x_{N-(m-2)\tau} \\ \vdots & \vdots & & \vdots \\ x_{1+(m-1)\tau} & x_{2+(m-1)\tau} & \cdots & x_N \end{bmatrix}$$
where each column represents a vector, or phase point; for $N$ data points with delay time $\tau$ and embedding dimension $m$, $N-(m-1)\tau$ vectors or phase points are reconstructed;
S1.3: determining a delay time;
S1.4: the embedding dimension is determined.
In one embodiment, step S1.3 uses the mutual information method to determine the delay time. The mutual information of the chaotic time series $x_1, x_2, \dots, x_{N-1}, x_N$ as a function of the delay time is given by the following formula, and the first minimum of the mutual information function is taken as the delay time:
$$I(\tau) = \sum_{t=1}^{N} P(x_t, x_{t+\tau}) \log \frac{P(x_t, x_{t+\tau})}{P(x_t)\,P(x_{t+\tau})}$$
where $P(x_t)$ is the probability that $x_t$ occurs in the time series $x_1, x_2, \dots, x_{N-1}, x_N$, $P(x_{t+\tau})$ is the probability that $x_{t+\tau}$ occurs in the delayed time series $x_{1+\tau}, x_{2+\tau}, \dots, x_{N-1+\tau}, x_{N+\tau}$, and $P(x_t, x_{t+\tau})$ is the joint probability that $x_t$ and $x_{t+\tau}$ occur simultaneously in the original and delayed series, respectively.
In one embodiment, step S1.4 determines the embedding dimension from how much the distance between a phase point and its nearest neighbor changes in the reconstructed phase space of the chaotic time series, specifically:
S1.4.1: after phase space reconstruction of the chaotic time series, the distance $R_m(t)$ between a phase point $X(t)$ and its nearest neighbor $X_j(t)$ in the space is obtained, where $X(t) = (x_t, x_{t+\tau}, x_{t+2\tau}, \dots, x_{t+(m-1)\tau})$ and $R_m(t) = \|X(t) - X_j(t)\|$;
S1.4.2: when the dimension increases to $m+1$, the distance $R_{m+1}(t)$ between the phase point $X(t)$ and the same nearest neighbor is obtained;
S1.4.3: whether the ratio of $R_{m+1}(t)$ to $R_m(t)$ is larger than a threshold is judged, and if so, the value corresponding to $m$ is taken as the embedding dimension.
In one embodiment, step S2 includes:
the functional relationship between the reconstructed phase space of the original time sequence and the time sequence subjected to the predicted periodic displacement is obtained as follows:
where $N$ is the number of phase points obtained after phase space reconstruction, $N+(m-1)\tau$ is the length of the time series $x_i$, and $\psi$ is the transition function of the phase space reconstruction, which acts on the reconstructed phase points;
the predicted values of the previous stage are appended to the end of the historical data, the two together serve as the training values for the next stage, and a machine learning model is used to predict the load of the next stage.
In one embodiment, the load prediction model in the multi-model fusion Stacking integrated learning mode in step S3 comprises two layers, a base learning layer and a meta learning layer. The base learning layer comprises random forest (RF), adaptive boosting (AdaBoost), gradient boosting regression (GBR), decision tree (DT), and extreme gradient boosting (XGBoost) models; the meta learning layer uses a long short-term memory (LSTM) model.
In one embodiment, the method further comprises evaluating the prediction result of the integrated model.
The above technical solutions in the embodiments of the present application at least have one or more of the following technical effects:
The invention provides a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode. On the one hand, for processing and using historical load data, a phase space reconstruction technique based on chaos theory is adopted, which offers a new way to use time-series load data when key information such as temperature, humidity, and wind speed is missing. On the other hand, the effect of the prediction horizon on long-term load prediction results is taken into account, and a rolling load prediction method is established. For the prediction model, a Stacking integrated learning load prediction model fusing the RF, AdaBoost, GBR, DT, and XGBoost models is established, further improving the accuracy of load prediction.
Finally, the prediction results of the integrated model are compared with those of 5 single models, namely LSTM, BP, extreme learning machine (ELM), least absolute shrinkage and selection operator (Lasso), and RF, using the mean absolute error, root mean square error, and mean absolute percentage error as evaluation indexes of prediction performance. The invention can predict the power load accurately even when key information such as temperature, humidity, and wind speed is missing, improving prediction accuracy. The method provides a new approach to improving the data quality for load prediction under limited information and has broad application prospects in industrial practice.
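As a minimal sketch, the three evaluation indexes named above can be computed as follows. The function names and the sample values are illustrative assumptions and do not come from the patent; MAPE is reported in percent and assumes that no actual value is zero.

```python
import math

def mae(actual, predicted):
    """Mean absolute error, in the same unit as the load."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the load."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values only, not data from the patent.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
scores = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE carry the unit of the load, while MAPE is unit-free, which is why the two kinds of index are usually reported together.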
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an overall flowchart of a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode in an embodiment of the invention;
FIG. 2 is a schematic diagram of a rolling load prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a load prediction model of a multi-model fusion Stacking integrated learning mode in an embodiment of the invention;
FIG. 4 is a schematic diagram of the mean absolute error of each model in an embodiment of the present invention;
FIG. 5 is a graph showing the root mean square error of each model in an embodiment of the present invention;
FIG. 6 is a graph showing the mean absolute percentage error of each model in an embodiment of the present invention;
FIG. 7 is a graph showing the load prediction results of each model according to an embodiment of the present invention.
Detailed Description
In view of the technical problems in the prior art, the invention uses the phase space reconstruction technique from chaos theory to address the problem of missing key feature information. Stacking ensemble learning is a model integration technique that combines the information from multiple prediction models to generate a new model. By combining different machine learning algorithms so that the strengths of one offset the weaknesses of another, it achieves performance superior to that of any single algorithm, which lets the Stacking integrated learning model reach the best prediction effect. The invention aims to predict the power load accurately even when key feature information such as temperature and humidity is missing.
The main inventive concept of the present invention comprises:
First, a phase space reconstruction technique is used to process the historical load time series, so that key information such as temperature and humidity does not need to be collected. Next, a rolling load prediction method is established in which predicted data are used as training values for the next round of training, allowing the load to be predicted over a longer horizon. Finally, a multi-model fusion Stacking integrated learning model predicts the load for the next 24 hours, and its results are compared with those of 5 other single models to verify the soundness and effectiveness of the proposed method. The method allows the power load to be predicted accurately even when key information such as temperature and humidity is missing.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode, which comprises the following steps:
S1: acquiring historical load data in time-series form and processing it with a phase space reconstruction technique;
S2: adopting a rolling load prediction method, in which the prediction length is determined in advance and the load over a longer period is predicted by shifting the training data;
S3: establishing a load prediction model in the multi-model fusion Stacking integrated learning mode, and importing the processed historical load data into the load prediction model to predict the future load.
Aiming at the defects and optimization requirements of the existing research, the invention provides a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode.
Specifically, step S1 determines parameters of a phase space reconstruction model based on a chaos theory according to the characteristics of time sequence load, and prepares for later data processing;
Step S2: a rolling load prediction method is established, in which the prediction length is determined in advance and the load over a longer period is predicted by shifting the training data.
Step S3: a load prediction model in the multi-model fusion Stacking integrated learning mode is established, and the reconstructed historical load data are imported into the integrated model to predict future loads.
In one embodiment, the processing of the historical load data in step S1 using a phase space reconstruction technique includes:
S1.1: a chaotic time series of historical load data $x_1, x_2, \dots, x_{N-1}, x_N$ is obtained, and each $x_t$ ($t = 1, 2, \dots, N-(m-1)\tau$) is transformed as follows:
$$X(t) = \left(x_t,\; x_{t+\tau},\; x_{t+2\tau},\; \dots,\; x_{t+(m-1)\tau}\right)^T$$
where $\tau$ is the delay time and $m$ is the embedding dimension;
S1.2: the chaotic time series $x_1, x_2, \dots, x_{N-1}, x_N$ is converted by phase space reconstruction into a new data space with time delay $\tau$ and dimension $m$, namely:
$$X = \begin{bmatrix} x_1 & x_2 & \cdots & x_{N-(m-1)\tau} \\ x_{1+\tau} & x_{2+\tau} & \cdots & x_{N-(m-2)\tau} \\ \vdots & \vdots & & \vdots \\ x_{1+(m-1)\tau} & x_{2+(m-1)\tau} & \cdots & x_N \end{bmatrix}$$
where each column represents a vector, or phase point; for $N$ data points with delay time $\tau$ and embedding dimension $m$, $N-(m-1)\tau$ vectors or phase points are reconstructed;
S1.3: determining a delay time;
S1.4: the embedding dimension is determined.
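The reconstruction in S1.1 and S1.2 can be sketched in a few lines of Python. The function name `delay_embed` and the sample series are illustrative assumptions; the code only shows how $N$ scalar values yield $N-(m-1)\tau$ phase points once $\tau$ and $m$ are known.

```python
def delay_embed(series, m, tau):
    """Return the N - (m - 1) * tau phase points of a scalar series.

    Each phase point is [x_t, x_{t+tau}, ..., x_{t+(m-1)tau}].
    """
    n_points = len(series) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for the chosen m and tau")
    return [[series[t + k * tau] for k in range(m)] for t in range(n_points)]

# Illustrative load values only, not data from the patent.
load = [10, 12, 15, 14, 13, 16, 18, 17]
points = delay_embed(load, m=3, tau=2)
print(len(points))  # 8 - (3 - 1) * 2 = 4 phase points
print(points[0])    # [10, 15, 13]
```

Each returned row corresponds to one column of the reconstructed matrix, so the rows can be fed to a regressor directly as feature vectors.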
In one embodiment, step S1.3 uses the mutual information method to determine the delay time. The mutual information of the chaotic time series $x_1, x_2, \dots, x_{N-1}, x_N$ as a function of the delay time is given by the following formula, and the first minimum of the mutual information function is taken as the delay time:
$$I(\tau) = \sum_{t=1}^{N} P(x_t, x_{t+\tau}) \log \frac{P(x_t, x_{t+\tau})}{P(x_t)\,P(x_{t+\tau})}$$
where $P(x_t)$ is the probability that $x_t$ occurs in the time series $x_1, x_2, \dots, x_{N-1}, x_N$, $P(x_{t+\tau})$ is the probability that $x_{t+\tau}$ occurs in the delayed time series $x_{1+\tau}, x_{2+\tau}, \dots, x_{N-1+\tau}, x_{N+\tau}$, and $P(x_t, x_{t+\tau})$ is the joint probability that $x_t$ and $x_{t+\tau}$ occur simultaneously in the original and delayed series, respectively.
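One way to estimate the mutual information in practice is with a histogram, sketched below. The equal-width binning, the bin count of 8, the natural logarithm, and the fallback to the global minimum are all assumptions made for illustration; the patent does not state how the probabilities are estimated.

```python
import math
from collections import Counter

def mutual_information(series, tau, bins=8):
    """Histogram estimate of I(tau) between x_t and x_{t+tau} (natural log)."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_of(value):
        return min(int((value - lo) / width), bins - 1)

    a = [bin_of(v) for v in series[:-tau]]  # x_t
    b = [bin_of(v) for v in series[tau:]]   # x_{t+tau}
    n = len(a)
    p_a, p_b, p_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log((c / n) / ((p_a[i] / n) * (p_b[j] / n)))
               for (i, j), c in p_ab.items())

def first_minimum_delay(series, max_tau, bins=8):
    """Return the first local minimum of I(tau) for tau = 1..max_tau."""
    mi = [mutual_information(series, tau, bins) for tau in range(1, max_tau + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] <= mi[k + 1]:
            return k + 1  # mi[k] belongs to tau = k + 1
    return mi.index(min(mi)) + 1  # assumption: fall back to the global minimum

# Synthetic series with a 24-sample period, standing in for hourly load.
series = [math.sin(2 * math.pi * t / 24) for t in range(240)]
tau = first_minimum_delay(series, max_tau=12)
```

The estimate depends on the bin count, so it is worth checking that the chosen delay is stable across a few values of `bins`.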
In one embodiment, step S1.4 determines the embedding dimension from how much the distance between a phase point and its nearest neighbor changes in the reconstructed phase space of the chaotic time series, specifically:
S1.4.1: after phase space reconstruction of the chaotic time series, the distance $R_m(t)$ between a phase point $X(t)$ and its nearest neighbor $X_j(t)$ in the space is obtained, where $X(t) = (x_t, x_{t+\tau}, x_{t+2\tau}, \dots, x_{t+(m-1)\tau})$ and $R_m(t) = \|X(t) - X_j(t)\|$;
S1.4.2: when the dimension increases to $m+1$, the distance $R_{m+1}(t)$ between the phase point $X(t)$ and the same nearest neighbor is obtained;
S1.4.3: whether the ratio of $R_{m+1}(t)$ to $R_m(t)$ is larger than a threshold is judged, and if so, the value corresponding to $m$ is taken as the embedding dimension.
Specifically, after the phase space of the chaotic time series is reconstructed, each phase point $X(t) = (x_t, x_{t+\tau}, x_{t+2\tau}, \dots, x_{t+(m-1)\tau})$ in the space has a nearest neighbor $X_j(t)$, and the distance between the two phase points is $R_m(t) = \|X(t) - X_j(t)\|$. When the dimension increases to $m+1$, the distance between the two points changes to $R_{m+1}(t)$. If $R_{m+1}(t)$ is much larger than $R_m(t)$, the two points are taken to be non-adjacent points in the high-dimensional phase space that became adjacent only through projection into the low-dimensional space.
Thus, setting
$$S_m = \frac{R_{m+1}(t)}{R_m(t)},$$
if $S_m > S$, then $X(t)$ and $X_j(t)$ are false nearest neighbors, where $S$ is a threshold.
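The ratio test can be sketched as follows: for every phase point, find its nearest neighbor in dimension $m$, measure the same pair in dimension $m+1$, and count how often the ratio $R_{m+1}(t)/R_m(t)$ exceeds the threshold. The function names, the Euclidean distance, and the default threshold of 10 are assumptions for illustration. Picking the smallest $m$ at which the returned fraction drops close to zero is a common convention, not something the patent states.

```python
import math

def delay_embed(series, m, tau):
    """Phase points [x_t, x_{t+tau}, ..., x_{t+(m-1)tau}] of a scalar series."""
    n_points = len(series) - (m - 1) * tau
    return [[series[t + k * tau] for k in range(m)] for t in range(n_points)]

def false_neighbor_fraction(series, m, tau, threshold=10.0):
    """Fraction of phase points whose nearest neighbor in dimension m
    moves away by more than `threshold` times in dimension m + 1."""
    n_next = len(series) - m * tau               # points that exist in dimension m + 1
    low = delay_embed(series, m, tau)[:n_next]   # dimension m
    high = delay_embed(series, m + 1, tau)       # dimension m + 1
    false_count = 0
    counted = 0
    for t in range(n_next):
        # nearest neighbor of point t in dimension m (excluding itself)
        j = min((i for i in range(n_next) if i != t),
                key=lambda i: math.dist(low[t], low[i]))
        r_m = math.dist(low[t], low[j])
        if r_m == 0.0:
            continue                             # identical points give no usable ratio
        counted += 1
        if math.dist(high[t], high[j]) / r_m > threshold:
            false_count += 1
    return false_count / counted if counted else 0.0

# A linear ramp has no false neighbors: adding a coordinate barely changes distances.
ramp = [float(t) for t in range(20)]
ramp_fraction = false_neighbor_fraction(ramp, m=1, tau=1)

# Synthetic oscillation standing in for load data.
wave = [math.sin(0.3 * t) for t in range(120)]
wave_fraction = false_neighbor_fraction(wave, m=1, tau=5)
```

The brute-force neighbor search is quadratic in the number of phase points, which is acceptable for a sketch but would usually be replaced by a spatial index on long load records.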
In one embodiment, step S2 includes:
the functional relationship between the reconstructed phase space of the original time sequence and the time sequence subjected to the predicted periodic displacement is obtained as follows:
where $N$ is the number of phase points obtained after phase space reconstruction, $N+(m-1)\tau$ is the length of the time series $x_i$, and $\psi$ is the transition function of the phase space reconstruction, which acts on the reconstructed phase points;
the predicted values of the previous stage are appended to the end of the historical data, the two together serve as the training values for the next stage, and a machine learning model is used to predict the load of the next stage.
Specifically, obtaining $N$ phase points requires a time series $x_i$ of length $N+(m-1)\tau$.
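The rolling scheme can be sketched as a loop that retrains on the history plus all earlier predictions. The names `rolling_forecast` and `fit_seasonal_naive` are hypothetical, and the seasonal-naive predictor is only a stand-in so the example runs without any learning library; in the patent the model at this step is the Stacking model.

```python
def rolling_forecast(history, fit, horizon, steps):
    """Predict `steps` blocks of `horizon` values, appending each predicted
    block to the training data before the next block is predicted."""
    data = list(history)
    forecasts = []
    for _ in range(steps):
        predict = fit(data)       # retrain on history plus earlier predictions
        block = predict(horizon)  # predict the next `horizon` values
        forecasts.extend(block)
        data.extend(block)        # splice predictions onto the tail of the data
    return forecasts

def fit_seasonal_naive(data, period=4):
    """Stand-in model (assumption): repeat the last observed period."""
    snapshot = list(data)

    def predict(horizon):
        return [snapshot[-period + (k % period)] for k in range(horizon)]

    return predict

# Illustrative series with period 4, not data from the patent.
history = [1, 2, 3, 4, 1, 2, 3, 4]
forecast = rolling_forecast(history, fit_seasonal_naive, horizon=2, steps=3)
print(forecast)  # [1, 2, 3, 4, 1, 2]
```

Because later blocks are trained on earlier predictions, errors can accumulate as the number of steps grows, which is the trade-off for reaching a longer horizon.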
In one embodiment, the load prediction model in the multi-model fusion Stacking integrated learning mode in step S3 comprises two layers, a base learning layer and a meta learning layer. The base learning layer comprises the RF, AdaBoost, GBR, DT, and XGBoost models, and the meta learning layer uses an LSTM model.
In the implementation, the rolling load prediction model is based on an LSTM model, so the meta learning layer uses the LSTM model.
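The two-layer data flow can be sketched without any learning library. In the sketch below, two toy regressors (`fit_mean` and `fit_nearest`) stand in for the five base learners, and a nearest-neighbor regressor stands in for the LSTM meta learner. These stand-ins, the interleaved folds, and all function names are assumptions chosen to keep the example short and runnable; only the structure, base learners producing out-of-fold predictions that become the meta learner's inputs, reflects Stacking.

```python
from statistics import mean

def fit_mean(X, y):
    """Toy base learner (stand-in): always predicts the training mean."""
    mu = mean(y)
    return lambda row: mu

def fit_nearest(X, y):
    """Toy learner (stand-in): returns the target of the closest training row."""
    def predict(row):
        j = min(range(len(X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], row)))
        return y[j]
    return predict

def out_of_fold(fit, X, y, k):
    """Predict every sample with a model that did not see it during training."""
    n = len(X)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, n, k):
            preds[i] = model(X[i])
    return preds

def fit_stacking(base_fits, meta_fit, X, y, k=3):
    """Train base learners, then a meta learner on their out-of-fold outputs."""
    columns = [out_of_fold(f, X, y, k) for f in base_fits]
    meta_X = [list(row) for row in zip(*columns)]  # one meta-feature per base learner
    meta = meta_fit(meta_X, y)
    bases = [f(X, y) for f in base_fits]           # refit base learners on all data
    return lambda row: meta([b(row) for b in bases])

# Illustrative data only.
X = [[float(i)] for i in range(12)]
y = [2.0 * i for i in range(12)]
stacked = fit_stacking([fit_mean, fit_nearest], fit_nearest, X, y)
prediction = stacked([5.0])
```

For time-ordered load data, folds that respect chronology would normally be preferred over the interleaved folds used here, since interleaving lets each base model see samples from the future of the ones it predicts.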
The respective models are specifically described below.
(1) Principle of RF model operation.
① The number of training examples (samples) is denoted by $N_1$, and the number of features by $M_1$.
② The number of features $m_1$ used to determine the decision at a node of the decision tree is input, where $m_1$ should be much smaller than $M_1$.
③ $N_1$ draws are made with replacement from the $N_1$ training examples to form a training set (i.e. bootstrap sampling), and the examples that were not drawn are used for prediction to estimate the error.
④ For each node, $m_1$ features are randomly selected, the decision at that node is determined from these features, and the optimal split is calculated from these $m_1$ features.
⑤ Each tree is grown fully without pruning, although pruning may be applied after an ordinary tree classifier has been built.
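Steps ③ and ④ rely on two random draws, which can be sketched as below. The function names and the fixed seed are illustrative assumptions; tree growing itself is omitted.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement; the rest are out-of-bag."""
    drawn = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(drawn))
    return drawn, out_of_bag

def candidate_features(n_features, m_try, rng):
    """Pick the m_try features considered at one node (m_try << n_features)."""
    return sorted(rng.sample(range(n_features), m_try))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
drawn, out_of_bag = bootstrap_indices(10, rng)
features = candidate_features(n_features=8, m_try=3, rng=rng)
```

The out-of-bag indices are the samples a given tree never saw, so predicting them gives the error estimate mentioned in step ③ without a separate validation set.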
(2) AdaBoost model operation principle
① A first weak classifier is obtained by learning from $N_1$ training samples.
② The misclassified samples are combined with other new data to form a new set of $N_1$ training samples, from which a second weak classifier is learned.
③ The samples misclassified by both classifiers 1 and 2 are combined with other new samples to form another new set of $N_1$ training samples, from which a third weak classifier is learned.
④ Finally, the weak classifiers are boosted into a strong classifier; that is, the class to which a given data point is assigned is determined by the weights of the individual classifiers.
(3) GBR model operation principle
① Input: training samples $\{(x_i, y_i)\}_{i=1}^{n}$, a differentiable loss function $L(y, F(x))$, and the number of iterations $M$.
② The algorithm is initialized with a constant $F_0(x)$ that satisfies the following equation:
$$F_0(x) = \arg\min_{\rho} \sum_{i=1}^{n} L(y_i, \rho)$$
where $x_i$ and $y_i$ are the input and output values of the samples, $n$ is the number of training samples, and $\rho$ is the gradient descent step size.
③ A weak learner $h_m(x)$ is fitted to the pseudo-residuals, the terminal regions $R_{jm}$ ($j = 1, \dots, J_m$) are established, and the weight coefficient $\gamma_{jm}$ of $h_m(x)$ is calculated for each terminal region (i.e., each leaf).
④ The model is updated with learning rate $\alpha$:
$$F_m(x) = F_{m-1}(x) + \alpha\,\gamma_{jm}, \quad x \in R_{jm}$$
⑤ The updated result $F_m(x)$ is output.
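The five steps can be sketched for the special case of squared loss, where the pseudo-residuals are simply $y_i - F_{m-1}(x_i)$ and the best leaf weight is the mean residual in the leaf. The one-split stump as weak learner, the single input feature, and all names are assumptions made to keep the sketch short.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Weak learner: best single-split regression stump on a 1-D feature."""
    best = None
    for split in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < split]
        right = [r for x, r in zip(xs, residuals) if x >= split]
        g_left, g_right = mean(left), mean(right)  # leaf weights for squared loss
        err = (sum((r - g_left) ** 2 for r in left)
               + sum((r - g_right) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, g_left, g_right)
    _, split, g_left, g_right = best
    return lambda x: g_left if x < split else g_right

def gradient_boost(xs, ys, n_rounds=20, alpha=0.1):
    """Gradient boosting for squared loss with stumps as weak learners."""
    f0 = mean(ys)                  # step 2: the constant that minimises squared loss
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # step 3: pseudo-residuals
        h = fit_stump(xs, residuals)
        stumps.append(h)
        preds = [p + alpha * h(x) for p, x in zip(preds, xs)]  # step 4: update
    return lambda x: f0 + alpha * sum(h(x) for h in stumps)   # step 5: output

# Illustrative data only.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
model = gradient_boost(xs, ys, n_rounds=50, alpha=0.3)
mse_constant = mean((y - mean(ys)) ** 2 for y in ys)
mse_boosted = mean((y - model(x)) ** 2 for x, y in zip(xs, ys))
```

A smaller `alpha` needs more rounds to reach the same training error but usually generalises better, which is why the learning rate and the number of iterations are tuned together.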
(4) DT model operation principle
① Dense sampling of feature points. Each frame of the video is divided into several scales; feature points are densely sampled on the picture at each scale by grid division; feature points that lack variation and cannot be tracked are removed by calculating the eigenvalues of the pixel autocorrelation matrix and discarding the points below a certain threshold.
② Feature point trajectory tracking. Let the coordinates of a feature point densely sampled in the previous step be $P_t = (x_t, y_t)$; its position in the next frame, $P_{t+1} = (x_{t+1}, y_{t+1})$, can then be calculated. The positions of a feature point over $L$ consecutive frames form a trajectory $(P_t, P_{t+1}, \dots, P_{t+L})$, and subsequent feature extraction is performed along each trajectory.
③ Trajectory-based feature extraction. For a trajectory of length $L$, the shape can be described by $(\Delta P_t, \dots, \Delta P_{t+L-1})$, where the displacement vector is:
$$\Delta P_t = P_{t+1} - P_t = (x_{t+1} - x_t,\; y_{t+1} - y_t)$$
the feature trajectory is described as:
④ Feature encoding. A video contains a large number of trajectories, each corresponding to a set of features (trajectory, HOG, HOF, MBH), so these feature sets need to be encoded into a fixed-length feature for the final video classification. The DT algorithm encodes the features with the Bag of Features method, randomly selecting 100000 feature sets to train the codebook.
⑤ Classification with an SVM.
After the features of the video are obtained, the DT algorithm classifies them with an SVM classifier (RBF-$\chi^2$ kernel), using a one-against-rest strategy to train the multi-class classifier.
(5) XGBoost model operation principle
① XGBoost is an optimized ensemble tree model, improved and extended from the gradient boosting tree model.
The tree ensemble model is as follows:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$$
where $\hat{y}_i$ is the model's predicted value for the i-th sample; K is the number of trees; F is the function space of the trees; $x_i$ is the feature vector of the i-th data point; and $f_k$ corresponds to the structure q and leaf weights ω of the k-th independent tree.
The XGBoost model loss function L consists of two parts:
$$L = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k)$$
where part 1 is the training error between the predicted value $\hat{y}_i$ and the true target value $y_i$; part 2 is the sum of the tree complexities, a regularization term used to control the complexity of the model, i.e.
$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^2$$
where γ and λ are the penalty coefficients of the model and T is the number of leaves of the tree.
② The objective function for the t-th round can be written as:
Define:
This gives:
Taking the partial derivative with respect to ω gives:
Substituting the weight into the objective function gives:
③ The smaller the loss function, the better the model. The subtrees are partitioned with a greedy algorithm that enumerates the feasible split points: each time, a new split is added to an existing leaf and the maximum gain obtained in this way is calculated. The gain $L_{Gain}$ is calculated as follows:
where terms 1 and 2 are the gains produced by the left subtree and the right subtree after splitting, respectively, and term 3 is the gain without splitting.
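For reference, the usual second-order XGBoost derivation behind steps ② and ③ can be written as below. This is a sketch under the assumption that the method follows the standard XGBoost formulation; the symbols $g_i$, $h_i$ (first and second derivatives of the loss), $I_j$ (samples in leaf j), $G_j$, $H_j$, T (number of leaves) and the subscripts L and R (left and right child) are conventional notation and are not taken from the text.

```latex
\begin{aligned}
L^{(t)} &= \sum_{i=1}^{n} l\big(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) \\
g_i &= \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^2}, \qquad
G_j = \sum_{i \in I_j} g_i, \qquad H_j = \sum_{i \in I_j} h_i \\
L^{(t)} &\approx \sum_{j=1}^{T} \Big[ G_j\,\omega_j + \tfrac{1}{2}\,(H_j + \lambda)\,\omega_j^2 \Big] + \gamma T \\
\omega_j^{*} &= -\frac{G_j}{H_j + \lambda} \\
L^{(t)} &\approx -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T \\
L_{Gain} &= \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda}
          - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma
\end{aligned}
```

The approximation signs mark the second-order Taylor expansion with constant terms dropped. In the gain expression, the first two terms are the scores of the left and right children and the third is the score of the unsplit leaf, which matches the description of terms 1 to 3 in step ③.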
(6) LSTM model operation principle
Since the rolling load prediction method is LSTM based, the meta-learner layer uses LSTM to predict the load. LSTM is a special recurrent neural network that alleviates problems such as vanishing and exploding gradients when training on long sequences. The LSTM unit is computed as follows:
$$f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C[h_{t-1}, x_t] + b_C)$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
$$o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t)$$
where $f_t$ denotes the forget gate, $i_t$ the input gate, $o_t$ the output gate, $C_t$ the cell state, $\tilde{C}_t$ the candidate cell state, and $h_t$ the output of the current unit; $W_f, W_i, W_o, W_C$ are the weight matrices of the corresponding gates; $b_f, b_i, b_o, b_C$ are the biases of the corresponding gates; $C_{t-1}$ is the state of the memory cell at the previous time step; $h_{t-1}$ is the output of the unit at the previous time step; $x_t$ is the input of the unit; and σ is the sigmoid function.
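The LSTM unit computation can be illustrated with the following minimal plain-Python sketch of a single time step. The candidate state and the cell-state update are written in their standard form; the function names, the `params` dictionary layout and the toy sizes are assumptions made for illustration, not part of the described method.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def gate(params, name, z, activation):
    W, b = params[name]
    return [activation(a) for a in affine(W, z, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'o', 'C' to (W, b) pairs."""
    z = h_prev + x_t                           # concatenation [h_{t-1}, x_t]
    f = gate(params, "f", z, sigmoid)          # forget gate
    i = gate(params, "i", z, sigmoid)          # input gate
    o = gate(params, "o", z, sigmoid)          # output gate
    c_cand = gate(params, "C", z, math.tanh)   # candidate cell state
    c_t = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, c_cand)]
    h_t = [oj * math.tanh(cj) for oj, cj in zip(o, c_t)]
    return h_t, c_t


# Toy example: hidden size 2, input size 1, all weights and biases zero,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate state to 0.
zero_gate = ([[0.0] * 3 for _ in range(2)], [0.0] * 2)
params = {name: zero_gate for name in "fioC"}
h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -1.0], params)
```

With zero weights the forget gate halves the previous cell state, which makes the roles of $f_t$, $C_{t-1}$ and $o_t$ easy to check by hand.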
In one embodiment, the method further comprises: evaluating the prediction result of the integrated model.
Specifically, to evaluate the performance of the built integrated model, its prediction results are compared with those of 5 single models such as LSTM, BP, ELM, Lasso and RF. The prediction effect is evaluated comprehensively with both dimensional and dimensionless evaluation indexes. The dimensional indexes are the mean absolute error (MAE) and the root mean square error (RMSE). The dimensionless index is the mean absolute percentage error (MAPE). The indexes are calculated as follows:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}$$
$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%$$
where $y_i$ and $\hat{y}_i$ are the actual value and the predicted value of sample i, respectively, and N is the sample size.
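The three evaluation indexes can be computed as in the following minimal Python sketch, which uses the standard definitions of MAE, RMSE and MAPE. The function names and the sample values are illustrative assumptions; MAPE is undefined when an actual value is zero, which this sketch does not guard against.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the load."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the load."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires non-zero actual values)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical actual and predicted loads for three samples.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```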
Embodiments of the present invention are described below with reference to fig. 1 to 7, and the specific steps are as follows:
Fig. 1 is an overall flowchart of a rolling load prediction method based on a phase space reconstruction and multi-model fusion Stacking integrated learning mode.
Step 1: According to the characteristics of the time-series load, determine the parameters of the phase space reconstruction model based on chaos theory, in preparation for the later data processing.
The parameters of the phase space reconstruction model in step 1 are determined as follows:
Step 1.1: One month of historical load data at 1-hour intervals was downloaded from the Data Miner data management tool website of the PJM grid in the United States. The downloaded data are the historical loads for the 7, 11, 15, 19, 23, 27 and 31 days before July 1, 2021, and are used to predict the load after July 1, 2021.
Step 1.2: The downloaded load time series is $x_1, x_2, \ldots, x_{N-1}, x_N$, and $x_t$ ($t = 1, 2, \ldots, N-(m-1)\tau$) is transformed as follows:
$$\mathbf{x}_t = (x_t, x_{t+\tau}, x_{t+2\tau}, \ldots, x_{t+(m-1)\tau})^T$$
where τ is the delay time and m is the embedding dimension. Simulation gives a delay time of 8 and an embedding dimension of 3.
Step 1.3: According to the phase space reconstruction method, the load time series $x_1, x_2, \ldots, x_{N-1}, x_N$ is converted into a new data space with a delay of 8 and a dimension of 3.
After the phase space model has finished running, the data are divided into the 4 sub-data set files required by the prediction model: training input, training output, test input (input_test) and test output (output_test). These data are used to train the model and to predict future loads.
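The reconstruction in steps 1.2 and 1.3 can be sketched in plain Python as follows. The function names, the toy series and the `horizon` parameter are assumptions made for illustration; in particular, pairing each phase point with the value one step after its last coordinate is only one plausible way to build the input and output sets, since the text does not specify how they are formed.

```python
def phase_space_reconstruct(series, tau, m):
    """Return phase points X_t = (x_t, x_{t+tau}, ..., x_{t+(m-1)tau})."""
    n_points = len(series) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("series is too short for the chosen tau and m")
    return [[series[t + k * tau] for k in range(m)] for t in range(n_points)]


def make_supervised(series, tau, m, horizon=1):
    """Pair each phase point with the value `horizon` steps after its last coordinate."""
    span = (m - 1) * tau
    inputs, targets = [], []
    for t, point in enumerate(phase_space_reconstruct(series, tau, m)):
        target_index = t + span + horizon
        if target_index < len(series):
            inputs.append(point)
            targets.append(series[target_index])
    return inputs, targets


# Toy hourly series standing in for the downloaded load data.
load = [float(v) for v in range(30)]
points = phase_space_reconstruct(load, tau=8, m=3)
inputs, targets = make_supervised(load, tau=8, m=3)
```

With N = 30, τ = 8 and m = 3 the sketch yields N − (m − 1)τ = 14 phase points, in line with the index range given in step 1.2.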
Step 2: a method of rolling load prediction is established. The method is to determine the predicted length in advance and predict the load for a longer time by moving the training data.
The rolling load prediction method in the step2 is specifically as follows:
Fig. 2 is a schematic diagram of a rolling load prediction method.
Step 2.1: the reconstruction phase space of the original time sequence and the time sequence subjected to the prediction period displacement are considered to have a functional relation, the predicted value of 7 months and 1 day is spliced at the tail part of the original training set as a training value, so that the load of 7 months and 2 days is predicted, and the load of longer time is predicted by the same method.
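The rolling scheme of step 2.1 can be sketched as the loop below: each predicted block is appended to the training data before the next block is predicted. The names `rolling_forecast` and `drift_model` are hypothetical, and `drift_model` is only a stand-in for the actual prediction model so that the example runs on its own.

```python
def rolling_forecast(history, n_periods, period_len, fit_predict):
    """Predict n_periods consecutive blocks, feeding each block back as training data."""
    data = list(history)
    forecasts = []
    for _ in range(n_periods):
        block = fit_predict(data, period_len)  # train on data, predict the next block
        forecasts.append(block)
        data.extend(block)                     # splice predictions onto the tail
    return forecasts


def drift_model(data, period_len):
    """Stand-in model: repeat the last period, shifted by the mean change between periods."""
    last = data[-period_len:]
    previous = data[-2 * period_len:-period_len]
    drift = sum(a - b for a, b in zip(last, previous)) / period_len
    return [value + drift for value in last]


# Toy example with a period of 2 samples; for hourly data and daily blocks,
# period_len would be 24.
history = [1.0, 2.0, 3.0, 4.0]
forecasts = rolling_forecast(history, n_periods=2, period_len=2, fit_predict=drift_model)
```

Because later blocks are trained partly on earlier predictions, errors can accumulate with the number of rolled periods; the sketch makes that dependency explicit in the `data.extend(block)` line.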
Step 3: Establish a multi-model fusion Stacking integrated learning load prediction model from the RF, AdaBoost, GBR, DT, XGBoost and LSTM models. The reconstructed historical load data are imported into the integrated model to predict future loads.
Fig. 3 is a schematic diagram of a load prediction model of a multi-model fusion Stacking integrated learning mode.
The specific steps of the load prediction model of the multi-model fusion Stacking integrated learning mode in step 3 are as follows:
Step 3.1: The parameters of the RF model are: number of decision trees n_estimators=2000; out-of-bag scoring oob_score=False.
Step 3.2: The parameter of the AdaBoost model is: number of weak learners n_estimators=500.
Step 3.3: The parameters of the GBR model are: number of weak estimators in the ensemble n_estimators=2000; learning rate learning_rate=0.01.
Step 3.4: The parameter of the DT model is: maximum depth max_depth=80.
Step 3.5: The parameters of the XGBoost model are: number of weak learners n_estimators=1000; learning rate learning_rate=0.05.
Step 3.6: The parameters of the LSTM model are: number of training epochs epochs=50; batch size batch_size=1; training progress display mode verbose=2; shuffling of the training data shuffle=False.
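Step 3 does not spell out how the outputs of the base learners are passed to the meta learner. A common Stacking arrangement, assumed here, is to build the meta learner's inputs from out-of-fold predictions so that no base learner predicts a sample it was trained on. The sketch below shows only that mechanism: `mean_learner` and `persistence_learner` are trivial stand-ins for the RF, AdaBoost, GBR, DT and XGBoost models, the function names are hypothetical, and the interleaved fold split ignores time order, which a real time-series setup may need to respect.

```python
def out_of_fold_predictions(X, y, make_learner, n_folds):
    """Predict every sample with a learner trained on the other folds only."""
    n = len(X)
    preds = [None] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        predict = make_learner([X[i] for i in train], [y[i] for i in train])
        for i in range(k, n, n_folds):        # samples held out in fold k
            preds[i] = predict(X[i])
    return preds


def build_meta_features(X, y, learners, n_folds=3):
    """One column of out-of-fold predictions per base learner."""
    columns = [out_of_fold_predictions(X, y, make, n_folds) for make in learners]
    return [list(row) for row in zip(*columns)]


def mean_learner(X_train, y_train):
    """Stand-in base learner: always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean


def persistence_learner(X_train, y_train):
    """Stand-in base learner: predicts the last coordinate of the phase point."""
    return lambda x: x[-1]


X = [[float(i), float(i + 1)] for i in range(6)]
y = [float(i + 2) for i in range(6)]
meta_X = build_meta_features(X, y, [mean_learner, persistence_learner], n_folds=3)
# meta_X (with y as target) would then be the training set of the meta learner,
# which in the described method is the LSTM model.
```

The parameter names in steps 3.1 to 3.6 resemble those used by scikit-learn, XGBoost and Keras, which suggests, although the text does not state it, that the base learners could be swapped in for the stand-ins through those libraries.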
Step 4: Evaluate the prediction result of the integrated model and compare it with 5 single models such as LSTM, BP, ELM, Lasso and RF, to verify that the built integrated model has higher prediction accuracy.
Fig. 4 shows the mean absolute error of each model; Fig. 5 shows the root mean square error of each model; Fig. 6 shows the mean absolute percentage error of each model; Fig. 7 is a schematic diagram of the load prediction results of each model.
The specific steps for evaluating the prediction result of the integrated model in step 4 are as follows:
Step 4.1: The parameters of the LSTM model are: number of training epochs epochs=50; batch size batch_size=1; training progress display mode verbose=2; shuffling of the training data shuffle=False.
Step 4.2: The parameters of the BP model are: number of training iterations net.; minimum training target error net.; learning rate net.trainParam.lr=0.01.
Step 4.3: The parameter of the ELM model is: number of hidden neurons = 25.
Step 4.4: The parameter of the Lasso model is: regularization coefficient alpha=0.5.
Step 4.5: The parameters of the XGBoost model are: number of weak learners n_estimators=1000; random state random_state=0.
Step 4.6: Import the reconstructed data set into each single model, predict the load for July 2, and visualize the prediction result of each model.
Step 4.7: Comparing the output of the integrated model with the prediction results of the above 5 single models shows that the built integrated model has higher prediction accuracy than the single models.
The invention has the beneficial effects that:
On the one hand, for the processing and use of historical load data, a phase space reconstruction technique based on chaos theory is adopted, which provides a new approach to using time-series load data that lack key information such as temperature, humidity and wind speed; on the other hand, the influence of longer-term load prediction on the prediction result is considered, and a rolling load prediction method is established. For the prediction model, a multi-model fusion Stacking integrated learning load prediction model combining RF, AdaBoost, GBR, DT and XGBoost is established, which further improves the accuracy of load prediction. Finally, the prediction results of the integrated model are compared with those of single models such as LSTM, BP, ELM, Lasso and RF, and the prediction performance of the integrated model is evaluated with the mean absolute error, the root mean square error and the mean absolute percentage error as evaluation indexes. The invention achieves accurate prediction of the power load when key information such as temperature, humidity and wind speed is missing. The method provides a new idea for improving the data quality of load prediction under limited information and has broad application prospects in future industrial practice.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. The rolling load prediction method based on the phase space reconstruction and multi-model fusion Stacking integrated learning mode is characterized by comprising the following steps of:
S1: acquiring historical load data, and processing the historical load data by adopting a phase space reconstruction technology, wherein the historical load data is in a time sequence form;
S2: adopting a rolling load prediction method, in which the prediction length is determined in advance and the load for a longer time is predicted by moving the training data;
S3: establishing a load prediction model of a multi-model fusion Stacking integrated learning mode, and importing the processed historical load data into the load prediction model to predict future load;
Wherein, step S2 includes:
the functional relationship between the reconstructed phase space of the original time sequence and the time sequence subjected to the predicted periodic displacement is obtained as follows:
Wherein N represents the number of phase points obtained after phase space reconstruction, N+ (m-1) τ represents the length of the time sequence x i, As the phase point after phase space reconstruction, ψ represents the transition function of phase space reconstruction;
splicing the predicted value of the previous stage onto the tail of the historical data, using them together as the training values of the next stage, and predicting the load of the next stage with a machine learning model;
in step S3, the load prediction model based on the multi-model fusion Stacking integrated learning mode comprises two layers: a base learning layer and a meta learning layer, wherein the base learning layer comprises: random forest, adaptive boosting, gradient boosting regression, decision tree and extreme gradient boosting, and the meta learning layer adopts an LSTM model.
2. The rolling load prediction method according to claim 1, wherein the processing of the historical load data using the phase space reconstruction technique in step S1 includes:
S1.1: a chaotic time series of historical load data is obtained as $x_1, x_2, \ldots, x_{N-1}, x_N$, and $x_t$ ($t = 1, 2, \ldots, N-(m-1)\tau$) is transformed as follows:
$$\mathbf{x}_t = (x_t, x_{t+\tau}, x_{t+2\tau}, \ldots, x_{t+(m-1)\tau})^T$$
where τ is the delay time and m is the embedding dimension;
S1.2: the chaotic time series $x_1, x_2, \ldots, x_{N-1}, x_N$ is converted into a new data space with delay time τ and dimension m by the phase space reconstruction method, namely:
$$\begin{bmatrix} x_1 & x_2 & \cdots & x_{N-(m-1)\tau} \\ x_{1+\tau} & x_{2+\tau} & \cdots & x_{N-(m-2)\tau} \\ \vdots & \vdots & & \vdots \\ x_{1+(m-1)\tau} & x_{2+(m-1)\tau} & \cdots & x_N \end{bmatrix}$$
wherein each column represents a vector or phase point; for N data points, given a delay time τ and an embedding dimension m, N−(m−1)τ vectors or phase points are reconstructed;
S1.3: determining a delay time;
S1.4: the embedding dimension is determined.
3. The rolling load prediction method according to claim 2, wherein step S1.3 selects a mutual information method to determine the delay time, wherein the functional relationship between the mutual information of the chaotic time series $x_1, x_2, \ldots, x_{N-1}, x_N$ and the delay time is shown in the following formula, and the first minimum point of the mutual information function is taken as the delay time,
where $P(x_t)$ is the probability that $x_t$ occurs in the time series $x_1, x_2, \ldots, x_{N-1}, x_N$, $P(x_{t+\tau})$ is the probability that $x_{t+\tau}$ occurs in the time series $x_1, x_2, \ldots, x_{N-1}, x_N$, and $P(x_t, x_{t+\tau})$ is the joint probability that $x_t$ and $x_{t+\tau}$ occur simultaneously in $x_1, x_2, \ldots, x_{N-1}, x_N$ and $x_{1+\tau}, x_{2+\tau}, \ldots, x_{N-1+\tau}, x_{N+\tau}$, respectively.
4. The rolling load prediction method according to claim 2, wherein step S1.4 is to determine the embedding dimension according to the degree of change in the distance between the phase point in the chaotic time series and the nearest neighbor point after the phase space reconstruction, specifically:
S1.4.1: after the phase space reconstruction of the chaotic time series, the distance $R_m(t)$ between a phase point $X(t)$ in the space and its nearest neighbor $X_j(t)$ is obtained, wherein $X(t) = (x_t, x_{t+\tau}, x_{t+2\tau}, \ldots, x_{t+(m-1)\tau})$ and $R_m(t) = \lVert X(t) - X_j(t) \rVert$;
S1.4.2: the distance $R_{m+1}$ between the phase point $X(t)$ in the space and its nearest neighbor is obtained when the dimension increases to m+1,
S1.4.3: judging whether the ratio of R m+1 to R m (t) is larger than a threshold value, and if so, taking the value corresponding to m as an embedding dimension.
5. The rolling load prediction method according to claim 1, characterized in that the method further comprises: evaluating the prediction result of the integrated model.
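The delay-time and embedding-dimension criteria named in claims 3 and 4 can be illustrated with the following minimal Python sketch. It assumes the usual mutual information $I(\tau) = \sum P(x_t, x_{t+\tau}) \log_2 \frac{P(x_t, x_{t+\tau})}{P(x_t)\,P(x_{t+\tau})}$, estimated from a histogram; the base-2 logarithm, the number of bins, the function names and the sample series are all assumptions made for illustration, not details taken from the claims.

```python
import math
from collections import Counter


def mutual_information(series, tau, n_bins=8):
    """Histogram estimate of I(tau) between x_t and x_{t+tau}, in bits (tau >= 1)."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0
    bins = [min(int((v - lo) / width), n_bins - 1) for v in series]
    pairs = list(zip(bins[:-tau], bins[tau:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum((c / n) * math.log2(c * n / (left[a] * right[b]))
               for (a, b), c in joint.items())


def first_local_minimum(values):
    """Return the 1-based position of the first interior local minimum, or None."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k + 1
    return None


def choose_delay(series, max_tau=50, n_bins=8):
    """Delay time taken as the first minimum of the mutual information curve."""
    curve = [mutual_information(series, tau, n_bins) for tau in range(1, max_tau + 1)]
    return first_local_minimum(curve)  # position k in the curve corresponds to tau = k


def neighbor_distance_ratio(series, tau, m, t):
    """Ratio R_{m+1}(t) / R_m(t) for phase point t and its nearest neighbor in dimension m."""
    n = len(series) - m * tau  # points that exist in both dimension m and m + 1

    def point(i, dim):
        return [series[i + k * tau] for k in range(dim)]

    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    j = min((i for i in range(n) if i != t),
            key=lambda i: dist(point(i, m), point(t, m)))
    r_m = dist(point(t, m), point(j, m))
    r_m1 = dist(point(t, m + 1), point(j, m + 1))
    return r_m1 / r_m if r_m > 0 else float("inf")


mi_period = mutual_information([0.0, 1.0] * 50, tau=2)
ratio = neighbor_distance_ratio([0.0, 1.0, 0.1, 3.0, 2.0, 2.0], tau=1, m=1, t=0)
```

A large ratio means that two points which looked close in dimension m move far apart in dimension m+1; comparing this ratio with a threshold is the test described in step S1.4.3.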
CN202210270326.0A 2022-03-18 2022-03-18 Rolling load prediction method based on phase space reconstruction and multi-model fusion Stacking integrated learning mode Active CN115034426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210270326.0A CN115034426B (en) 2022-03-18 2022-03-18 Rolling load prediction method based on phase space reconstruction and multi-model fusion Stacking integrated learning mode

Publications (2)

Publication Number Publication Date
CN115034426A CN115034426A (en) 2022-09-09
CN115034426B true CN115034426B (en) 2024-07-02

Family

ID=83119961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210270326.0A Active CN115034426B (en) 2022-03-18 2022-03-18 Rolling load prediction method based on phase space reconstruction and multi-model fusion Stacking integrated learning mode

Country Status (1)

Country Link
CN (1) CN115034426B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115409292A (en) * 2022-10-31 2022-11-29 广东电网有限责任公司佛山供电局 Short-term load prediction method for power system and related device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232593A (en) * 2020-11-04 2021-01-15 武汉理工大学 Power load prediction method based on phase space reconstruction and data driving
CN112766585A (en) * 2021-01-25 2021-05-07 三峡大学 Electric power short-term rolling load prediction method, system and terminal based on soft ensemble learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414031B (en) * 2019-05-07 2021-10-22 深圳大学 Method and device for predicting time sequence based on volterra series model, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant