CN113962454A - LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization - Google Patents

LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization

Info

Publication number
CN113962454A
CN113962454A CN202111213171.9A CN202111213171A CN113962454A CN 113962454 A CN113962454 A CN 113962454A CN 202111213171 A CN202111213171 A CN 202111213171A CN 113962454 A CN113962454 A CN 113962454A
Authority
CN
China
Prior art keywords
lstm
prediction
model
particle
energy consumption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111213171.9A
Other languages
Chinese (zh)
Inventor
谌东海
王宁
刘杰
王伟
刘畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changjiang Institute of Survey Planning Design and Research Co Ltd
Original Assignee
Changjiang Institute of Survey Planning Design and Research Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changjiang Institute of Survey Planning Design and Research Co Ltd filed Critical Changjiang Institute of Survey Planning Design and Research Co Ltd
Priority to CN202111213171.9A priority Critical patent/CN113962454A/en
Publication of CN113962454A publication Critical patent/CN113962454A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/086Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Human Resources & Organizations (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Evolutionary Biology (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Physiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization. The method comprises the following steps. Step one: performing correlation analysis on the time and feature dimensions of the original data set using the mutual information (MI) method, and selecting the top N-dimensional features most effective for the energy consumption prediction target value. Step two: performing secondary feature selection on the N-dimensional features to obtain N'-dimensional features after PMI feature selection. Step three: performing model training and prediction on the data after PMI dual feature selection using an LSTM model to obtain an initial prediction sequence y(t). Step four: optimizing the hyperparameters units, dropout and batchsize of the LSTM model using the PSO algorithm, thereby improving the prediction precision of the LSTM model and finally obtaining the PMI-LSTM-PSO model. The method has the advantages of high prediction precision, high algorithm efficiency and stable prediction performance.

Description

LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization
Technical Field
The invention relates to the technical field of building energy consumption prediction, in particular to an LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization.
Background
With ever more complex technology products in wide use, worldwide demand for electric power is steadily increasing, and the power grid must be controlled to make power development sustainable. In the artificial intelligence era, the power Internet of Things is gradually entering daily life, the development of the smart grid also requires adaptive testing capability, and smart meters have emerged accordingly. The continuous expansion of smart meter infrastructure worldwide also lays a foundation for introducing active electric energy systems into the smart grid. Since the 'strong smart grid' plan was introduced in 2009, power grid companies in China have been deploying smart meters, power distribution automation, embedded intelligence and other technologies on a large scale.
For residential and enterprise buildings, energy consumption prediction is used to improve energy use efficiency and reduce energy consumption, and therefore has great practical significance. Commercial and residential buildings account for 30% to 40% of the total energy consumption of intelligent buildings. Current trends indicate that this percentage may increase in the near future and that global energy consumption and penetration are increasing. Short-term energy consumption prediction is crucial, yet it is a challenging problem because of the complexity and various uncertainties of building infrastructure behavior and because of the shortcomings of the traditional power grid: low efficiency, serious waste of electric energy, weak information interaction capability and a low degree of automation.
In view of this, researchers have developed many prediction methods to improve grid quality and optimize energy usage. In related research, time series models such as ARIMA are often used as reference models for checking whether a newly proposed method predicts better. Researchers now often combine historical data with machine learning and deep learning algorithms, such as artificial neural networks (ANNs), support vector machines (SVMs), adaptive neuro-fuzzy inference systems (ANFIS) and extreme learning machines (ELMs), for prediction. Convolutional neural networks, BP neural networks and similar models have been studied in the field of power consumption, but as prediction methods they are still at an early stage.
In data preprocessing, the accuracy of the model is largely determined by the quality of feature selection on the original data. The prediction model improves if the number of input features can be reduced by selecting the most effective and useful inputs. Feature selection methods include correlation analysis and numerical sensitivity analysis, but these are linear input selection methods, whereas energy consumption data are nonlinear. The mutual information feature selection method is therefore more effective, and it computes the correlation between input and output data efficiently. Feature variable selection based on mutual information is a novel variable selection method in which mutual information is quantified and the correlation between different related variables is calculated.
1) MI mutual information algorithm
Mutual information (MI) represents the interdependence between two variables X and Y.
The mutual information I(X;Y) between X and Y is defined as:
$$I(X;Y)=\int_{Y}\int_{X} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy \qquad (1)$$
where p(x,y) is the joint probability density function, and p(x) and p(y) are the marginal probability density functions of x and y, respectively. MI measures how much information the occurrence of one event contributes about the occurrence of another. The MI mutual information method computes the mutual information between every feature and the target feature, ranks the features, and selects the N' features with the highest correlation, thereby achieving feature selection.
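The ranking described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a simple histogram estimate of the densities, and the function names `mutual_information` and `top_n_by_mi`, the bin count and the synthetic data are all hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X;Y) in nats (assumed estimator, not from the source)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    mask = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def top_n_by_mi(features, target, n, bins=16):
    """Score every feature column against the target and keep the n best."""
    scores = np.array([mutual_information(features[:, j], target, bins)
                       for j in range(features.shape[1])])
    order = np.argsort(scores)[::-1][:n]      # indices sorted by descending MI
    return order, scores

# Synthetic data (illustrative only): two dependent columns, one independent.
rng = np.random.default_rng(0)
target = rng.normal(size=2000)
features = np.column_stack([
    target + 0.1 * rng.normal(size=2000),        # linear dependence
    target ** 2 + 0.1 * rng.normal(size=2000),   # nonlinear dependence
    rng.normal(size=2000),                       # independent noise
])
selected, scores = top_n_by_mi(features, target, n=2)
```

The second synthetic column depends on the target only nonlinearly, which is the case the text argues a linear correlation measure would miss.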
2) Pearson correlation coefficient
$$r=\frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \qquad (2)$$
where $\bar{X}$ and $\bar{Y}$ are the mean values of X and Y, respectively. If r ≥ 0.5, the correlation between X and Y is strong; otherwise it is weak. A secondary feature selection by the Pearson correlation coefficient can further reduce the features.
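A minimal sketch of the secondary selection, assuming features are stored as columns of a NumPy array. The helper names `pearson_r` and `select_by_pearson` and the toy data are hypothetical; the threshold r ≥ 0.5 is the one stated in the text and is applied to r itself, not to its absolute value.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient, written out as in formula (2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def select_by_pearson(features, target, threshold=0.5):
    """Keep the feature columns whose r with the target is >= threshold."""
    r = np.array([pearson_r(features[:, j], target)
                  for j in range(features.shape[1])])
    keep = np.flatnonzero(r >= threshold)
    return keep, r

# Toy data (illustrative only).
target = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
features = np.column_stack([
    2 * target + 1,                  # perfectly correlated, r = 1
    [2.0, 1.0, 4.0, 3.0, 5.0],       # r = 0.8
    target[::-1],                    # perfectly anti-correlated, r = -1
    [1.0, -1.0, 0.0, -1.0, 1.0],     # uncorrelated, r = 0
])
keep, r = select_by_pearson(features, target)
```

Note that with this threshold a strongly negative correlation (third column) is discarded as well, which follows the rule as written.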
3) LSTM model
LSTM is a deep learning model that can efficiently process long time series, learn from data automatically and mine deeper features. However, as with other neural network models, some hyperparameters of the LSTM model are often set according to researchers' experience, which lacks scientific rigor. PSO is simple to implement, converges quickly and has few parameters to tune. Genetic algorithms, ant colony algorithms and the like do not have a comparable guiding mechanism.
The long short-term memory network (LSTM) was proposed by Hochreiter to solve the vanishing gradient and exploding gradient problems of back-propagation through time (BPTT). With continuous improvement, it gradually developed into the widely used LSTM network architecture. Internally, an LSTM cell consists of 3 distinct gate structures and 1 state module for storing memory. The structure of the LSTM cell is shown in Fig. 1, where $C_t$ is the state information stored in the current LSTM cell, $h_t$ is the output of the hidden layer of the cell, $f_t$ is the forget gate, $i_t$ is the input gate, $\tilde{C}_t$ is the candidate information at the current time, $o_t$ is the output gate, $\otimes$ denotes element-wise matrix multiplication, and $\oplus$ denotes matrix addition.
Forget gate: controls the degree to which the previous cell state $C_{t-1}$ is forgotten:
$$f_t=\sigma(W_f\cdot[h_{t-1},x_t]+b_f) \qquad (3)$$
Input gate: controls which information is added to the cell:
$$i_t=\sigma(W_i\cdot[h_{t-1},x_t]+b_i) \qquad (4)$$
$$\tilde{C}_t=\tanh(W_C\cdot[h_{t-1},x_t]+b_C) \qquad (5)$$
Cell state update: selectively records new information into $C_t$ according to $f_t$ and $i_t$:
$$C_t=f_t\otimes C_{t-1}\oplus i_t\otimes\tilde{C}_t \qquad (6)$$
Output gate: activates $C_t$ and controls the degree to which $C_t$ is filtered:
$$o_t=\sigma(W_o\cdot[h_{t-1},x_t]+b_o) \qquad (7)$$
$$h_t=o_t\otimes\tanh(C_t) \qquad (8)$$
$W_f$, $W_i$, $W_C$, $W_o$ are the weight matrices of the respective modules, $b_f$, $b_i$, $b_C$, $b_o$ are the bias terms, σ is the sigmoid activation function, and tanh is the hyperbolic tangent activation function, defined as
$$\sigma(x)=1/(1+e^{-x}) \qquad (9)$$
$$\tanh(x)=(e^{x}-e^{-x})/(e^{x}+e^{-x}) \qquad (10)$$
In the output layer, $h_t$ passes through a fully connected (dense) layer to give the final predicted value $y_t$, as in formula (11), where $W_y$ and $b_y$ are the weight matrix and the bias term, respectively.
$$y_t=\sigma(W_y\cdot h_t+b_y) \qquad (11)$$
The LSTM controls the transfer of historical information through a gate function, and has certain time sequence processing and prediction capabilities.
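To make the gate equations concrete, the following is a minimal NumPy sketch of one forward step of an LSTM cell following formulas (3) to (8). It is an illustration only: the dictionary layout of the weights, the key name "c" for the candidate-state weights, the layer sizes and the random inputs are assumptions, and no training is performed.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation, formula (9)."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step; W and b are dicts keyed by 'f', 'i', 'c', 'o' (assumed layout)."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate, formula (3)
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate, formula (4)
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate state, formula (5)
    c_t = f_t * c_prev + i_t * c_tilde         # cell state update, formula (6)
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate, formula (7)
    h_t = o_t * np.tanh(c_t)                   # hidden output, formula (8)
    return h_t, c_t

# Illustrative sizes and random weights (not from the source).
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}

# Run the cell over a 24-step input sequence.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(24, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), every component of the hidden output stays strictly between -1 and 1.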
4) PSO particle swarm optimization algorithm
The basic idea of particle swarm optimization is as follows: a flock of birds flies randomly within an area searching for food, and each bird knows only its own distance to the food and the positions of the other birds. When a bird flies from its current position to another, it relies on two pieces of information: the area around the bird currently closest to the food, and its own flying experience.
PSO is initialized with a population of random particles (random solutions) and then finds the optimal solution by iteration. In each iteration, a particle updates itself by tracking two "extrema": the local optimal solution pbest and the global optimal solution gbest. After finding these two optimal values, the particle updates its velocity and position by the following formulas.
$$v_i=v_i+c_1\times rand()\times(pbest_i-x_i)+c_2\times rand()\times(gbest_i-x_i) \qquad (12)$$
$$x_i=x_i+v_i$$
where i = 1, 2, …, N, and N is the total number of particles in the swarm;
$v_i$: current velocity of the i-th particle;
rand(): a random number in (0, 1);
$x_i$: current position of the i-th particle;
$c_1$ and $c_2$: learning factors;
$pbest_i$ and $gbest_i$: the local optimal position and the global optimal position of the current particle swarm, respectively.
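The update rule of formula (12) can be sketched as a single vectorized step. This is an illustration under stated assumptions: the function name `pso_step` is hypothetical, the learning factors c1 = c2 = 2.0 are placeholder values (the text gives none), pbest is read in the usual way as one best position per particle, and gbest is one position shared by the swarm.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, c1=2.0, c2=2.0, rng=None):
    """Apply formula (12) and the position update once to every particle."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)                  # rand() for the pbest term
    r2 = rng.random(x.shape)                  # rand() for the gbest term
    v_new = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x_new = x + v_new
    return x_new, v_new

# Five particles in two dimensions, all starting at the origin with zero
# velocity, with both best positions at (1, 1).
rng = np.random.default_rng(0)
x = np.zeros((5, 2))
v = np.zeros((5, 2))
pbest = np.ones((5, 2))
gbest = np.ones(2)
x1, v1 = pso_step(x, v, pbest, gbest, rng=rng)
```

A particle that already sits on both best positions with zero velocity does not move, since both difference terms vanish.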
However, the existing MI mutual information algorithm, LSTM model and PSO particle swarm optimization algorithm have low precision on energy consumption prediction and unstable prediction performance, and do not meet the requirement of building energy consumption prediction. Therefore, it is necessary to develop an energy consumption prediction method applied to buildings, which has high prediction accuracy and stable prediction performance.
Disclosure of Invention
The invention aims to provide an LSTM energy consumption prediction method based on multi-dimensional feature selection and particle swarm optimization, which is an energy consumption prediction method applied to buildings, and has high prediction precision and stable prediction performance.
To achieve this purpose, the technical scheme of the invention is an energy consumption prediction method based on MI-LSTM-PSO which, as shown in Fig. 2, comprises the following steps:
Step one: performing correlation analysis on the time and feature dimensions of the original data set using the MI mutual information method, and selecting the top N'-dimensional features most effective for the energy consumption prediction target value, thereby eliminating redundant data and improving the efficiency of the model algorithm;
Step two: calculating the Pearson correlation coefficient between each of the top N'-dimensional features selected by the MI mutual information method and the predicted sequence, and keeping the features whose Pearson correlation coefficient is greater than or equal to 0.5;
Step three: performing model training and prediction on the feature data after PMI dual feature selection using an LSTM model to obtain an initial prediction sequence y(t);
Step four: optimizing the hyperparameters units, dropout and batchsize of the LSTM model using the particle swarm optimization (PSO) algorithm, thereby improving the prediction precision of the LSTM model and finally obtaining the MI-LSTM-PSO model.
In the above technical solution, in the first step and the second step, N' is 60, that is, the first 60-dimensional feature most effective for the energy consumption prediction target value is selected.
In the above technical solution, step one specifically comprises the following steps:
S11, using a sliding window, forming the M = 20-dimensional feature data of the previous 24 hours into 24M (i.e. 480)-dimensional feature components, wherein the original data sequences comprise: the photovoltaic power generation of 2 areas, the energy consumption of 17 areas and different facilities, and the total input electric quantity of the system power grid (the sequences may differ according to the scene data set);
S12, performing feature selection on the above 24M (480)-dimensional feature components using the MI mutual information method:
$$I(X;Y)=\int_{Y}\int_{X} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy \qquad (1)$$
where p(x,y) is the joint probability density function of X and Y, and p(x) and p(y) are the marginal density functions; if X and Y are completely unrelated, p(x,y) equals p(x)p(y) and the mutual information is 0; the larger I(X;Y) is, the stronger the correlation between the two variables;
S13, determining the optimal parameter N of the MI feature selection dimension through experimental optimization; if N is too large, the model training data set contains too much redundant information and noise, which degrades prediction performance, and if N is too small, the model training data set contains too little information, which also degrades the prediction result; generally, the optimal N lies between 3M and 6M, and among feature dimensions with good prediction performance the smaller N is chosen;
S14, ranking by the mutual information between each feature sequence x(t) and the target sequence Y, combining the time and feature dimension data, and selecting the top 60-dimensional features most effective for the energy consumption prediction target value as the training data set of the subsequent model.
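The sliding-window construction of S11 can be sketched as follows, assuming the raw data are an array with one row per hour and one column per feature. The function name `make_windows`, the column ordering inside each window (oldest hour first) and the toy data are assumptions made for illustration.

```python
import numpy as np

def make_windows(data, target_col, window=24):
    """Flatten each run of `window` consecutive rows into one sample.

    Returns X with shape (n_samples, window * n_features) and y holding the
    target column at the hour immediately after each window.
    """
    n_steps = data.shape[0]
    X = np.stack([data[t - window:t].ravel() for t in range(window, n_steps)])
    y = data[window:, target_col]
    return X, y

# Toy data: 30 hourly rows of 20 features (values are just a running index).
data = np.arange(30 * 20, dtype=float).reshape(30, 20)
X, y = make_windows(data, target_col=0)
# Each sample has 24 * 20 = 480 feature components, matching 24M with M = 20.
```

The resulting 480 columns are the candidates that the MI ranking of S12 to S14 would then reduce to 60.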
In the above technical solution, step two specifically comprises the following steps:
S21, calculating the Pearson correlation coefficient between each of the above 60-dimensional feature components and the target sequence Y (i.e. Gi):
$$r=\frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \qquad (2)$$
where $\bar{X}$ and $\bar{Y}$ are the mean values of X and Y, respectively; if r ≥ 0.5, the correlation between X and Y is strong, otherwise it is weak;
S22, since a Pearson correlation coefficient below 0.5 indicates weak correlation, selecting the 37-dimensional feature data whose Pearson correlation coefficient is greater than or equal to 0.5.
In the above technical solution, the LSTM network comprises three gate structures and a state module for storing memory, as shown in Fig. 1, and step three specifically comprises the following steps:
S31, let $C_t$ be the state information stored in the current LSTM cell, $x_t$ the input of the input layer, $h_t$ the output of the hidden layer of the cell, $f_t$ the forget gate, $i_t$ the input gate, $\tilde{C}_t$ the candidate information at the current time, and $o_t$ the output gate; "*" denotes element-wise matrix multiplication, "+" denotes addition, and σ is the sigmoid function;
S32, forget gate: controls the degree to which the previous cell state $C_{t-1}$ is forgotten, expressed as follows:
$$f_t=\sigma(W_f*[h_{t-1},x_t]+b_f) \qquad (3)$$
S33, input gate: controls which information is added to the cell, expressed as follows:
$$i_t=\sigma(W_i*[h_{t-1},x_t]+b_i) \qquad (4)$$
S34, state information stored in the cell: selectively records new information into $C_t$ according to $f_t$ and $i_t$, expressed as follows:
$$\tilde{C}_t=\tanh(W_C*[h_{t-1},x_t]+b_C) \qquad (5)$$
$$C_t=f_t*C_{t-1}+i_t*\tilde{C}_t \qquad (6)$$
S35, output gate: activates $C_t$ and controls the degree to which $C_t$ is filtered, expressed as follows:
$$o_t=\sigma(W_o*[h_{t-1},x_t]+b_o) \qquad (7)$$
$$h_t=o_t*\tanh(C_t) \qquad (8)$$
where $h_t$ is the output of the hidden layer of the cell; $h_{t-1}$ is the output of the hidden layer of the previous cell; $W_f$, $W_i$, $W_C$, $W_o$ are the weight matrices corresponding to $f_t$, $i_t$, $\tilde{C}_t$, $o_t$, respectively; $b_f$, $b_i$, $b_C$, $b_o$ are the bias terms corresponding to $f_t$, $i_t$, $\tilde{C}_t$, $o_t$, respectively; σ is the sigmoid activation function and tanh is the hyperbolic tangent activation function, defined as follows:
$$\sigma(x)=1/(1+e^{-x}) \qquad (9)$$
$$\tanh(x)=(e^{x}-e^{-x})/(e^{x}+e^{-x}) \qquad (10)$$
S36, in the output layer, $h_t$ passes through a fully connected layer to give the final predicted value $y_t$:
$$y_t=\sigma(W_y*h_t+b_y) \qquad (11)$$
In the above formula, $W_y$ and $b_y$ are the weight matrix and the bias term, respectively.
In the above technical solution, step four specifically comprises the following steps:
S41, initializing the parameters to be optimized and setting their ranges: units ∈ [20, 300], dropout ∈ [0, 1], batchsize ∈ [20, 300];
S42, randomly initializing a particle swarm (20 particles) within the initial ranges, calculating the fitness value (mean absolute error, MAE) of each particle from the fitness function (the LSTM model fitting result), and determining the optimal position of the swarm in this iteration (pbest) and the historical optimal position of the swarm (gbest) from the current MAE of each particle;
S43, updating the position and velocity of each current particle according to the position and velocity of the optimal particle, fitting the updated particles through the LSTM model, calculating the MAE of each particle, and updating pbest and gbest according to the MAE;
$$v_i=v_i+c_1\times rand()\times(pbest_i-x_i)+c_2\times rand()\times(gbest_i-x_i) \qquad (12)$$
$$x_i=x_i+v_i$$
in formula (12): i = 1, 2, …, N, where N is the total number of particles in the swarm;
$v_i$: the current velocity of the i-th particle;
rand(): a random number in (0, 1);
$x_i$: the current position of the i-th particle;
$c_1$ and $c_2$: learning factors;
$pbest_i$ and $gbest_i$: the local optimal position and the global optimal position of the current particle swarm, respectively;
S44, after the updated particles are trained through the LSTM model, calculating the fitness value of each particle, and updating the optimal position of the swarm in this iteration and the historical optimal position of the swarm according to the fitness values;
S45, when the fitness value of the optimal particle no longer changes or the number of iterations reaches the upper limit, the algorithm is considered to have converged; if it has not converged, the flow returns to S43 to update the particles;
S46, substituting the obtained optimal particle parameters units, dropout and batchsize into the LSTM model, and performing model prediction on the data from step one to obtain the final prediction result.
In the foregoing formulas, "*" denotes multiplication.
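The search loop of S41 to S46 can be sketched as below. This is a hedged illustration, not the patent's implementation: training an LSTM per particle is replaced by a HYPOTHETICAL stand-in `surrogate_mae`, and the names `decode` and `pso_search`, the learning factors, the velocity clamp, the iteration limit and the patience value are all assumptions. Only the search ranges and the swarm size of 20 come from the text. pbest is kept per particle, the usual PSO reading.

```python
import numpy as np

# Search ranges from S41, in the order (units, dropout, batchsize).
LOWER = np.array([20.0, 0.0, 20.0])
UPPER = np.array([300.0, 1.0, 300.0])

def decode(position):
    """Map a continuous particle position to usable hyperparameters."""
    units, dropout, batchsize = np.clip(position, LOWER, UPPER)
    return int(round(units)), float(dropout), int(round(batchsize))

def surrogate_mae(position):
    """HYPOTHETICAL stand-in for 'train the LSTM and return its MAE'."""
    units, dropout, batchsize = decode(position)
    return (5.0 + ((units - 128) / 100) ** 2 + (dropout - 0.2) ** 2
            + ((batchsize - 64) / 100) ** 2)

def pso_search(fitness, n_particles=20, max_iter=30, patience=5,
               c1=2.0, c2=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(LOWER, UPPER, size=(n_particles, 3))   # S42: random swarm
    v = np.zeros_like(x)
    v_max = 0.2 * (UPPER - LOWER)   # assumption: velocity clamp, not in the source
    pbest, pbest_val = x.copy(), np.array([fitness(p) for p in x])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    initial_val, stall = gbest_val, 0
    for _ in range(max_iter):                               # S45: iteration cap
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = np.clip(v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x),
                    -v_max, v_max)                          # S43: formula (12)
        x = np.clip(x + v, LOWER, UPPER)
        vals = np.array([fitness(p) for p in x])            # S44: re-evaluate
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val, stall = pbest[g].copy(), float(pbest_val[g]), 0
        else:
            stall += 1
        if stall >= patience:                               # S45: no more change
            break
    return decode(gbest), gbest_val, initial_val            # S46: best parameters

best_params, best_mae, initial_mae = pso_search(surrogate_mae)
```

In a real run, `surrogate_mae` would be replaced by a function that builds, trains and validates an LSTM with the decoded hyperparameters, which makes every fitness evaluation expensive; that cost is why a small swarm and an early-stopping rule matter here.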
The invention has the following advantages:
(1) the invention is an energy consumption prediction method applied to buildings, with high prediction precision and stable prediction performance;
(2) MI feature selection removes 87.5% of the features as redundant, which markedly improves the efficiency of the model algorithm;
(3) the PSO algorithm is used to optimize the hyperparameters units, dropout and batchsize of the LSTM model, which improves the prediction precision of the LSTM model and gives a good model fit;
(4) the predicted values of the PMI-PSO-LSTM model lie essentially within the confidence interval of the true values, the predicted trend is close to the true values, and the prediction precision is high;
(5) the MAE and SMAPE of the PMI-PSO-LSTM combined model are better than those of all the other models, so the combined model is more robust and its prediction performance more stable.
Drawings
Fig. 1 is a schematic view of the internal structure of a conventional LSTM.
FIG. 2 is a schematic structural diagram of the PMI-PSO-LSTM model of the present invention.
FIG. 3 is a graph comparing the predicted results of the basic model according to the embodiment of the present invention.
FIG. 4 is a scatter plot comparing the prediction results of the base model in accordance with the present invention.
FIG. 5 is a comparison graph of the combined model prediction results according to the embodiment of the present invention.
FIG. 6 is a comparison scatter plot of combined model prediction results according to an embodiment of the present invention.
FIG. 7 is a comparison chart of evaluation indexes of the model according to the embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings; they are merely exemplary and are not intended to limit the invention. The advantages of the invention will become clear and readily understood from the description.
Examples
The invention is described in detail by taking the prediction of the electricity consumption of a certain building as an example, which also serves as guidance for applying the invention to the energy consumption prediction of other buildings.
This embodiment takes the historical electricity consumption of a building as a time series and makes short-term single-step predictions of the electricity consumption 1 h ahead.
In this embodiment, the prediction of the electricity consumption of the building includes the following:
1. experimental data set and MI feature selection
The data set used in this embodiment is the electricity consumption of a building from October 15, 2019 to June 4, 2019, with 20 features in total. These features are described in Table 1, where column 5 gives the Pearson correlation coefficient between the current feature and the Gi feature.
Table 1 data set description
Figure BDA0003308548610000101
In this embodiment, the data of the previous 24 hours are used to predict the value of Gi in the next hour, so a sliding window forms the 24 hours of data for the 20 features into 480 feature components. The MI mutual information method then selects, from these 480 components, the 60 with the largest MI values.
The selection results are shown in Table 2.
TABLE 2 characteristics of MI selection
Figure BDA0003308548610000111
A selected feature such as Gi(t-1) denotes the electricity input from the public power grid of the industrial plant in the hour before the current time. The MI value is the mutual information between the current feature component X and the Gi component at the current time, i.e. I(X; Gi(t)). As Table 2 shows, most features from the previous four hours have large mutual information with the Gi feature at the current time, and the features Gi, Ao, Co and A2 from 24 hours earlier also have relatively large mutual information with it. MI selection keeps 60 of the 480 feature components and thus removes 87.5% of them as redundant, which markedly improves the efficiency of the model algorithm.
The data set used in this example has 20-dimensional features, and the data of the previous 24 hours are used to predict the data of the 25th hour.
Experiments on this 20-dimensional feature data set show that:
1) selecting the top 60-dimensional features gives almost the same prediction result as selecting the top 100-dimensional features;
2) when more features are selected (more than 100 dimensions), the prediction result deteriorates;
3) when fewer features are selected (fewer than 60 dimensions), the data set contains too little information, which also degrades the prediction result.
Therefore, this embodiment uses the MI mutual information method to select the 60 feature components with the largest MI values among the 480 formed by the sliding window.
2. Evaluation index
Four evaluation indexes are used to evaluate model quality.
Root mean square error (RMSE): the smaller the value, the better the model fit.
$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i-y_i)^2} \qquad (13)$$
Mean absolute error (MAE): the smaller the value, the better the model fit.
$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i-y_i\right| \qquad (14)$$
Symmetric mean absolute percentage error (SMAPE): the smaller the value, the better the model fit.
$$SMAPE=\frac{100\%}{n}\sum_{i=1}^{n}\frac{\left|\hat{y}_i-y_i\right|}{\left(\left|\hat{y}_i\right|+\left|y_i\right|\right)/2} \qquad (15)$$
Coefficient of determination (R2): the larger the value, the better the model fit.
$$R^2=1-\frac{\sum_{i=1}^{n}(\hat{y}_i-y_i)^2}{\sum_{i=1}^{n}(\bar{y}-y_i)^2} \qquad (16)$$
In formulas (13), (14), (15) and (16), $\hat{y}_i$ is the predicted value, $y_i$ is the true value, $\bar{y}$ is the mean of the true values, and n is the number of data points.
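The four indexes can be computed as in the sketch below. The function names and the sample values are illustrative, and the SMAPE form used here (mean of the two magnitudes in the denominator, result in percent) is one common definition and is an assumption, since the source formula is not legible.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_pred - y_true)))

def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant, see the note above)."""
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    return float(100.0 * np.mean(np.abs(y_pred - y_true) / denom))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true.mean() - y_true) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only: every prediction is off by exactly 10.
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 310.0, 390.0])
scores = {
    "RMSE": rmse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
    "SMAPE": smape(y_true, y_pred),
    "R2": r2(y_true, y_pred),
}
```

RMSE, MAE and SMAPE are errors (lower is better), while R2 is a goodness-of-fit score (higher is better), so the four cannot be ranked in the same direction.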
3. Model parameter setting
To verify the prediction effect of the proposed MI + PSO-LSTM combined model, this example compares the 6 experimental models (M1-M6), in two groups, listed in Table 3; the main parameters of the models are shown in Tables 4 and 5.
TABLE 3 Experimental reference models
No.   Model          Description
M1    ARIMA          Autoregressive integrated moving average model
M2    KNR            K-nearest neighbor (regression) model
M3    LSTM           LSTM model
M4    MI-LSTM        Mutual information method + LSTM model
M5    PMI-LSTM       Mutual information method + LSTM model
M6    PMI-LSTM-PSO   Mutual information method + PSO-optimized LSTM model
TABLE 4 Main parameters of the comparison models (1)
Figure BDA0003308548610000131
TABLE 5 Main parameters of the comparison models (2)
Figure BDA0003308548610000132
4. Analysis of model experimental data
4.1 Analysis of basic model experimental results
In this embodiment, the basic models M1-M3 of Table 3 are used to carry out a single-step prediction comparison experiment on the total electric quantity Gi input from the public power grid, using features 1-20.
The experimental comparison results (Table 6) show that the LSTM model gives the best prediction on all four evaluation indexes, namely the coefficient of determination, the root mean square error, the mean absolute error and the symmetric mean absolute percentage error.
TABLE 6 Comparison of basic model experiments
| Model | R2 | RMSE | MAE | SMAPE |
| --- | --- | --- | --- | --- |
| ARIMA | 0.872609 | 12.1688 | 7.496174 | 8.548175 |
| KNR | 0.849556 | 13.21612 | 8.155453 | 9.543262 |
| LSTM | 0.889503 | 11.211024 | 6.622012 | 7.594866 |
The comparison between the true values and the 1 h power consumption predicted by the ARIMA, K-nearest-neighbor and LSTM models is shown in FIG. 3 and FIG. 4. It can be seen from FIG. 3 and FIG. 4 that the predicted trend of the LSTM model is closest to the true values, and only the LSTM model stays within the confidence interval of the true values. The curves predicted by the ARIMA and K-nearest-neighbor models fall outside the confidence interval of the true values and exhibit prediction lag. In summary, the LSTM model predicts better than the ARIMA and K-nearest-neighbor regression models, so LSTM was chosen as the experimental base model.
4.2 Analysis of the LSTM combined model experimental results
In this embodiment, the combined models M3-M6 shown in Table 3 are used to carry out 20 groups of single-step prediction comparison experiments on the total electric quantity Gi input from the public power grid, using features 1-20.
The comparison between the true values and the 1 h electricity consumption Gi predicted by the four models is shown in FIG. 5 and FIG. 6. It can be seen from FIG. 5 and FIG. 6 that the predicted values of the four models are substantially within the confidence interval of the true values, and the predicted trend of the PMI-PSO-LSTM model is closest to the true values. As can be seen from FIG. 7, the evaluation indexes of the PMI-PSO-LSTM model are all optimal (in FIG. 7, M3, M4, M5 and M6 are respectively the combined models M3-M6 of Table 3 in this embodiment).
Table 7 shows the averages of the 20 groups of experimental results for the four combined models; the first four columns are the four evaluation indexes of the prediction model, and the fifth column is the training time of the prediction model. As can be seen from Table 7, the MI + PSO-LSTM model does not improve significantly on R2, but on MAE and SMAPE its performance improves by about 20%, 10% and 5% compared with the LSTM, MI-LSTM and PMI-LSTM models, respectively. Compared with the LSTM model, the performance of MI-LSTM is not significantly improved, but after features are selected through MI, the dimension of the input data is reduced by 87.5% (from 480 to 60), and the model training time is reduced by about 63%. Compared with the MI-LSTM model, the performance of PMI-LSTM is hardly improved, but after the secondary feature selection the dimension of the input data is reduced by about 40%, so that the model training time is reduced by about 20%.
TABLE 7 Comparison of evaluation indexes of the combined models
| Model | R2 | RMSE | MAE | SMAPE | t (s) |
| --- | --- | --- | --- | --- | --- |
| LSTM | 0.88724 | 11.18282 | 7.19766 | 8.56986 | 159 |
| MI-LSTM | 0.89722 | 10.67590 | 6.66639 | 7.82360 | 59 |
| PMI-LSTM | 0.92301 | 10.73256 | 6.42070 | 7.49299 | 46 |
| MI-PSO-LSTM | 0.90482 | 10.27717 | 6.12843 | 6.87869 | 44 |
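The percentage figures quoted for Table 7 can be rechecked directly from the tabulated values. The helper `pct_reduction` is an illustrative name; the inputs are the 480 and 60 feature dimensions mentioned in the description and the training times, MAE and SMAPE values from the table.

```python
def pct_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Input dimension after MI selection: 480 components reduced to 60.
print(round(pct_reduction(480, 60), 1))          # 87.5
# Training time, LSTM (159 s) versus MI-LSTM (59 s).
print(round(pct_reduction(159, 59), 1))          # 62.9
# Training time, MI-LSTM (59 s) versus PMI-LSTM (46 s).
print(round(pct_reduction(59, 46), 1))           # 22.0
# MAE and SMAPE, LSTM versus MI-PSO-LSTM.
print(round(pct_reduction(7.19766, 6.12843), 1)) # 14.9
print(round(pct_reduction(8.56986, 6.87869), 1)) # 19.7
```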
FIG. 7 is a box plot of the four evaluation indexes over the 20 experiments of M3-M6. The '+' symbols outside the boxes in FIG. 7 are outliers (negligible). As can be seen from FIG. 7, the four evaluation indexes of the MI-PSO-LSTM model are clearly better than those of the other three models: its MAE and SMAPE are better than all results of the other models, and its R2 and RMSE are better than about 95% of the results of the other models. The four evaluation indexes of MI-LSTM partially overlap with those of LSTM, but the overall trend of MI-LSTM is better than that of the LSTM model. FIG. 7 also shows that the box of the MI-PSO-LSTM model (the difference between the upper and lower quartiles) is the smallest compared with the LSTM, MI-LSTM and PMI-LSTM models, indicating that the MI-PSO-LSTM model is more stable than the other models.
In summary, the invention provides a short-term energy consumption combined prediction model based on PMI, PSO and LSTM. First, in the data preprocessing stage, the mutual information method and the Pearson coefficient are used to perform dual feature selection on the original data and delete redundant features. Then PSO is used to match and optimize the network architecture of the LSTM, so that the topology of the LSTM is best adapted to the current input data. Finally, the data after feature selection are input into the optimized LSTM to predict the energy consumption data in the short term. To verify the effect of the MI-PSO-LSTM model on short-term energy consumption prediction, a multi-dimensional single-step prediction comparison experiment was carried out on the energy consumption time series data set of a certain building. Taken together, the above experimental results show that the four evaluation indexes of the MI-PSO-LSTM combined model are all optimal, that is, the MI-PSO-LSTM model has higher prediction accuracy and robustness and more stable prediction performance. The MI-PSO-LSTM combined model can provide a useful research idea for predictive analysis of time series using deep learning. However, the MI-PSO-LSTM combined model still has considerable room for optimization, for example in noise filtering of time series and in dynamic intelligent feature selection, through which the prediction accuracy of the model can be further improved.
Other parts not described belong to the prior art.

Claims (5)

1. An LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization, characterized by comprising the following steps:
step one: performing correlation analysis on the time and feature dimensions of the original data set by using the MI mutual information method, and selecting the top N-dimensional features most effective for the energy consumption prediction target value;
step two: performing secondary feature selection on the N-dimensional features selected in step one by using the Pearson correlation coefficient, to obtain the N'-dimensional features after PMI feature selection;
step three: carrying out model training and prediction on the N'-dimensional feature data after PMI feature selection by using an LSTM model, to obtain an initial prediction sequence y(t);
step four: optimizing the hyperparameters units, dropout and batchsize of the LSTM model by using the particle swarm optimization (PSO) algorithm, thereby improving the prediction accuracy of the LSTM model and finally obtaining the PMI-LSTM-PSO model.
2. The LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization according to claim 1, wherein step one specifically comprises the following steps:
S11, forming the M-dimensional feature data of the previous 24 hours into 24M-dimensional feature components by using a sliding window, wherein the original data sequence includes: the photovoltaic power generation of 2 areas, the energy consumption of 17 different facilities of the areas, and the total electric quantity input by the system power grid;
S12, performing feature selection on the 24M-dimensional feature components by using the MI mutual information method;
$$I(X;Y)=\int_{Y}\int_{X}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy \quad (1)$$
in formula (1): p(x, y) is the joint probability density function of X and Y, and p(x) and p(y) are the marginal density functions; if X and Y are completely unrelated, p(x, y) equals p(x)p(y) and the mutual information equals 0; the larger I(X; Y) is, the stronger the correlation between the two variables;
S13, determining the optimal parameter N of the MI feature selection dimension through experimental optimization; if the value of N is too large, the model training data set will contain too much redundant information and noise, which deteriorates the prediction performance, while if the value of N is too small, the model training data set will contain too little information, which also deteriorates the prediction result; generally, the optimal N value lies between 3M and 6M, and among feature dimensions with comparably good prediction performance the smaller N value is selected;
S14, based on the mutual information ranking of the feature sequences x(t) against the target sequence Y, integrating the time and feature dimension data, and selecting the top N-dimensional features most effective for the energy consumption prediction target value as the training data set of the subsequent model.
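A histogram-based estimate is one simple way to approximate the mutual information score used for the ranking in S12 to S14. The sketch below is an assumption-laden illustration, not the claimed implementation: the equal-width binning, the default of 8 bins, and the names `bin_index`, `mutual_information` and `select_top_n` are all choices made here for demonstration.

```python
import math
from collections import Counter


def bin_index(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # a constant column falls into one bin
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats."""
    bx, by = bin_index(x, bins), bin_index(y, bins)
    n = len(x)
    pxy = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), c in pxy.items():
        # p(x,y) / (p(x) p(y)) = (c/n) / ((px/n)(py/n)) = c*n / (px*py)
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi


def select_top_n(columns, target, n, bins=8):
    """Rank feature columns by estimated MI with the target; keep the top n."""
    scores = [mutual_information(col, target, bins) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order[:n], scores
```

Here `columns` would hold the 24M window components, one list per component, and `target` the next-hour values of the predicted quantity. A constant column scores 0, consistent with the statement that unrelated variables have zero mutual information.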
3. The LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization according to claim 2, wherein step two specifically comprises the following steps:
S21, calculating the Pearson correlation coefficient between each of the N-dimensional feature components and the target sequence Y;
$$r=\frac{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(Y_i-\bar{Y}\right)^2}} \quad (2)$$
in formula (2): $\bar{X}$ and $\bar{Y}$ are respectively the average values of X and Y;
if r is greater than or equal to 0.5, the correlation between X and Y is strong; otherwise, the correlation between X and Y is weak;
S22, selecting the N'-dimensional feature data whose Pearson correlation coefficient is greater than or equal to 0.5.
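The secondary selection in S21 and S22 amounts to a threshold filter on the Pearson coefficient. A minimal sketch follows; the function names are illustrative, and returning 0 for a constant sequence (where the coefficient is undefined) is an assumption made here so that such columns are dropped.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant sequence as uncorrelated
    return cov / (sx * sy)


def pearson_filter(columns, target, threshold=0.5):
    """Indices of the columns whose coefficient with the target is >= threshold."""
    return [j for j, col in enumerate(columns)
            if pearson_r(col, target) >= threshold]
```

For example, `pearson_filter([[1, 2, 3], [3, 2, 1], [1, 1, 1]], [1, 2, 3])` keeps only the first column: the second is perfectly anti-correlated and the third is constant.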
4. The LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization according to claim 3, wherein: the LSTM network internally comprises three gate structures and a state module for storing and memorizing, and the third step specifically comprises the following steps:
S31, let $C_t$ be the state information stored by the current LSTM cell, $x_t$ the input of the input layer, $h_t$ the output of the hidden layer of the cell, $f_t$ the forget gate, $i_t$ the input gate, $\tilde{C}_t$ the candidate information at the current time, and $o_t$ the output gate; "×" denotes element-wise matrix multiplication, "+" denotes addition, and σ is the sigmoid function;
S32, forget gate: used for controlling the degree to which the previous cell state $C_{t-1}$ is forgotten, expressed as follows:
$$f_t=\sigma\left(W_f\cdot[h_{t-1},x_t]+b_f\right) \quad (3)$$
S33, input gate: used for controlling which information is added to the cell, expressed as follows:
$$i_t=\sigma\left(W_i\cdot[h_{t-1},x_t]+b_i\right) \quad (4)$$
S34, cell stored state information: used for selectively recording new information into $C_t$ according to $f_t$ and $i_t$, expressed as follows:
$$\tilde{C}_t=\tanh\left(W_C\cdot[h_{t-1},x_t]+b_C\right) \quad (5)$$
$$C_t=f_t\times C_{t-1}+i_t\times\tilde{C}_t \quad (6)$$
S35, output gate: used for activating $C_t$ and controlling the degree to which $C_t$ is filtered, expressed as follows:
$$o_t=\sigma\left(W_o\cdot[h_{t-1},x_t]+b_o\right) \quad (7)$$
$$h_t=o_t\times\tanh\left(C_t\right) \quad (8)$$
in formulas (3) to (8): $W_f$, $W_i$, $W_C$ and $W_o$ are the weight matrices corresponding to $f_t$, $i_t$, $\tilde{C}_t$ and $o_t$, respectively; $b_f$, $b_i$, $b_C$ and $b_o$ are the bias terms corresponding to $f_t$, $i_t$, $\tilde{C}_t$ and $o_t$, respectively; σ and tanh are the sigmoid and hyperbolic tangent activation functions, defined as follows:
$$\sigma(x)=\frac{1}{1+e^{-x}} \quad (9)$$
$$\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}} \quad (10)$$
S36, the output layer passes $h_t$ through a fully connected layer to obtain the final predicted value $y_t$:
$$y_t=\sigma\left(W_y\cdot h_t+b_y\right) \quad (11)$$
in formula (11): $W_y$ and $b_y$ are respectively a weight matrix and a bias term.
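One time step of the cell described in S31 to S36 can be written out with plain lists to make the data flow explicit. This is an illustrative sketch only: the parameter dictionary keys (`Wf`, `bf`, and so on) and the helper names are assumptions, and a practical model would rely on a deep learning framework rather than hand-written loops.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * a for w, a in zip(row, v)) + bk for row, bk in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following formulas (3) to (8)."""
    z = h_prev + x_t                                              # [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]         # (3) forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]         # (4) input gate
    g = [math.tanh(a) for a in affine(p["Wc"], z, p["bc"])]       # (5) candidate
    c = [fk * ck + ik * gk
         for fk, ck, ik, gk in zip(f, c_prev, i, g)]              # (6) new state
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]         # (7) output gate
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]              # (8) hidden output
    return h, c


def output_layer(h_t, Wy, by):
    """Fully connected output of formula (11)."""
    return [sigmoid(a) for a in affine(Wy, h_t, by)]
```

With all weights and biases set to zero every gate evaluates to 0.5 and the candidate to 0, so the new state is simply half of the previous state, which gives an easy hand check of the step.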
5. The LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization according to claim 4, wherein step four specifically comprises the following steps:
S41, initializing the parameters to be optimized and setting their ranges: units ∈ [20, 300], dropout ∈ [0, 1], batchsize ∈ [20, 300];
S42, randomly initializing the particle swarm within the initial range, calculating the fitness value of each particle according to the fitness function, and determining the pbest of the particle swarm of the current iteration and the gbest of the historical particle swarm according to the prediction index MAE of each current particle;
S43, updating the position and velocity of the current particle according to the position and velocity of the optimal particle, fitting the updated particle through the LSTM model, calculating the MAE of each particle, and updating pbest and gbest according to the MAE;
$$v_i=v_i+c_1\times\mathrm{rand}()\times\left(pbest_i-x_i\right)+c_2\times\mathrm{rand}()\times\left(gbest_i-x_i\right) \quad (12)$$
$$x_i=x_i+v_i$$
in formula (12): $i=1,2,\ldots,N$, where N is the total number of particles in the swarm;
$v_i$: the current velocity of the i-th particle;
rand(): a random number in (0, 1);
$x_i$: the current position of the i-th particle;
$c_1$ and $c_2$: learning factors;
$pbest_i$ and $gbest_i$: respectively the local optimal position and the global optimal position of the current particle swarm;
S44, after the updated particles are trained through the LSTM model, calculating the fitness value of each particle, and updating the optimal position of the particle swarm of the current iteration and the optimal position of the historical particle swarm according to the fitness value;
S45, when the fitness value of the optimal particle no longer changes or the number of iterations reaches the upper limit, the algorithm is considered to have converged; if it has not converged, the flow returns to S43 to update the particles;
S46, substituting the obtained optimal particle parameters units, dropout and batchsize into the LSTM model, and performing model prediction on the data of step one to obtain the final prediction result.
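The search loop of S41 to S46 can be sketched as below. Several parts are assumptions made for illustration: `surrogate_mae` is a hypothetical stand-in for "train an LSTM with these hyperparameters and return its MAE", the learning factors, swarm size and iteration cap are arbitrary, positions are clipped to the S41 ranges, and only the iteration cap of S45 is implemented as a stopping rule. The velocity update itself follows formula (12) as written, without an inertia weight.

```python
import random

# Search ranges from S41: units, dropout, batchsize.
BOUNDS = [(20, 300), (0.0, 1.0), (20, 300)]


def surrogate_mae(position):
    """Hypothetical fitness: a smooth bowl standing in for the LSTM's MAE."""
    units, dropout, batch = position
    return ((units - 128) / 280) ** 2 + (dropout - 0.2) ** 2 + ((batch - 64) / 280) ** 2


def clip(v, lo, hi):
    return max(lo, min(hi, v))


def pso(fitness, bounds, n_particles=10, n_iter=30, c1=2.0, c2=2.0, seed=0):
    rng = random.Random(seed)
    # S42: random initialization inside the ranges, then initial pbest and gbest.
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fitness(x) for x in xs]
    g = min(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):  # S45: iteration cap only
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # S43: velocity and position update of formula (12).
                vs[k][d] = (vs[k][d]
                            + c1 * rng.random() * (pbest[k][d] - xs[k][d])
                            + c2 * rng.random() * (gbest[d] - xs[k][d]))
                xs[k][d] = clip(xs[k][d] + vs[k][d], lo, hi)
            # S44: re-evaluate the particle and refresh pbest and gbest.
            fit = fitness(xs[k])
            if fit < pbest_fit[k]:
                pbest[k], pbest_fit[k] = xs[k][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = xs[k][:], fit
    return gbest, gbest_fit


best, best_fit = pso(surrogate_mae, BOUNDS)
print(best, best_fit)
```

In a real run, units and batchsize would be rounded to integers before building the model, and each fitness evaluation would involve a full training pass, which is why the swarm size and iteration cap dominate the cost of this step.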
CN202111213171.9A 2021-10-18 2021-10-18 LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization Pending CN113962454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111213171.9A CN113962454A (en) 2021-10-18 2021-10-18 LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111213171.9A CN113962454A (en) 2021-10-18 2021-10-18 LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization

Publications (1)

Publication Number Publication Date
CN113962454A true CN113962454A (en) 2022-01-21

Family

ID=79464357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111213171.9A Pending CN113962454A (en) 2021-10-18 2021-10-18 LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization

Country Status (1)

Country Link
CN (1) CN113962454A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561554A (en) * 2023-04-18 2023-08-08 南方电网电力科技股份有限公司 Feature extraction method, system, equipment and medium of boiler soot blower

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108986470A (en) * 2018-08-20 2018-12-11 华南理工大学 The Travel Time Estimation Method of particle swarm algorithm optimization LSTM neural network
CN111783953A (en) * 2020-06-30 2020-10-16 重庆大学 24-point power load value 7-day prediction method based on optimized LSTM network
CN111985706A (en) * 2020-08-15 2020-11-24 西北工业大学 Scenic spot daily passenger flow volume prediction method based on feature selection and LSTM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination