CN116663745A - LSTM drainage basin water flow prediction method based on PCA_DWT - Google Patents

LSTM drainage basin water flow prediction method based on PCA_DWT

Info

Publication number
CN116663745A
Authority
CN
China
Prior art keywords
data
lstm
input
water flow
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310727924.0A
Other languages
Chinese (zh)
Inventor
林琼斌
周宇
黄若辰
柴琴琴
林文忠
陈清楚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202310727924.0A
Publication of CN116663745A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 - Administration; Management
    • G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/2135 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G06N 3/0442 - Recurrent networks, e.g. Hopfield networks, characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/06 - Energy or water supply
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 10/00 - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A 10/40 - Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a PCA_DWT-based LSTM watershed water flow prediction method comprising the following specific steps: (1) collect historical data of a river; (2) reduce the dimensionality of the acquired data by principal component analysis; (3) decompose the dimension-reduced input sequence into an approximation sequence and detail sequences by discrete wavelet decomposition (DWT); (4) divide the sequences into m+1 different groups according to frequency; (5) normalize; (6) split the data of each group into a training set and a test set; (7) iteratively train on each group of data in parallel; (8) feed the test set into the trained neural networks; (9) use the predicted data to form new inputs and correct the weights and thresholds of the neural networks until iteration N is reached, i.e. all test sets have been iterated; (10) finally predict the water flow of the water resource over the next N hours. With this technical scheme, the water flow velocity of a given river basin over a future period can be accurately predicted.

Description

LSTM drainage basin water flow prediction method based on PCA_DWT
Technical Field
The application relates to the technical field of LSTM drainage basin water flow prediction, in particular to a PCA_DWT-based LSTM drainage basin water flow prediction method.
Background
Water resources are indispensable for human economic and social development. Throughout history, the uneven spatial and temporal distribution of water resources, the imbalance between socioeconomic development and water resources, and water pollution have caused water-resource crises in many countries and regions. Scheduling of water resources is therefore particularly important. Prediction is the basis of scheduling: by pre-scheduling water resources according to the predicted inflow, the problem of unbalanced spatio-temporal distribution can be addressed and the water-resource crises of certain countries and regions effectively relieved. Inflow prediction also has a great influence on hydropower stations: the power generation of a hydropower station over a future period can be predicted from the forecast inflow, improving the station's operating efficiency.
At present, common methods for predicting watershed inflow include regression analysis, the BP artificial neural network model, the Holt exponential smoothing method, and the quota method.
Regression analysis is a structural analysis method that, by identifying the influencing factors, examines how the predicted quantity varies as those factors change. It can turn a complex prediction target into a relatively simple one, adapts well, and accounts for multiple influencing factors; besides predicting the target, it can therefore also analyse the influencing factors themselves. Regression analysis can be classified into univariate linear regression, multiple linear regression, and nonlinear regression. The essence of linear regression is to model the change in inflow as an arithmetic (linear) progression, which may be reasonable at some stages of the water flow; the actual flow rate, however, cannot increase without constraint according to a linear rule. For short-term prediction, the data fluctuate strongly and the influencing factors are relatively complex, so their degree of influence is hard to predict accurately, making the method unsuitable for short-term prediction.
The BP artificial neural network model is a machine learning approach: one or more hidden layers are added between the input and output layers, each containing several neurons, called hidden units, which have no direct connection to the outside but influence the relationship between input and output as they change. Its basic idea is gradient descent, using gradient search to minimise the mean squared error between the network's actual and desired output values. A BP network does not need an explicit mathematical relationship between the external factors and the flow rate; it learns a rule purely from its own training and, given some external factors, produces the prediction closest to the actual value. BP networks also have some major drawbacks. First, learning is slow and prone to becoming trapped in local minima. Second, there is no theoretical guidance for choosing the number of hidden-layer neurons; the most suitable number of layers and neurons can only be found by trial and error or optimisation algorithms. The method can therefore produce large errors in flow-rate prediction, affecting the future allocation of water resources.
The Holt exponential smoothing method extends on the basis of a simple exponential smoothing model, so that the model can predict future trends. The method comprises two smoothing equations, namely a horizontal smoothing equation and a trend smoothing equation. According to the method, different weights are given to each period of data according to the time period by exponential decay of the weights, so that the defect that the moving average method gives the same weight to each period of data is overcome, the predicted value is closer to the actual value, and the fluctuation condition of the demand can be reflected better. The data demand of the exponential smoothing method is small, the future demand can be predicted by only a few data, the demand for data storage is also very small, and the systemization and the automation are easy to carry out. However, the exponential smoothing method is difficult to find the optimal exponential smoothing coefficient, has hysteresis for adjusting the data change, and cannot predict the abrupt change of the water flow velocity.
The above methods can all be used for predicting water-area flow rates, but suffer from problems such as large errors or high network-training difficulty.
Disclosure of Invention
Therefore, the application aims to provide a PCA_DWT-based LSTM watershed water flow prediction method whose final predictions reflect future changes well and have small error relative to the actual values.
In order to achieve the above purpose, the application adopts the following technical scheme: an LSTM drainage basin water flow prediction method based on PCA_DWT comprises the following specific steps:
step 1, collecting historical data of a river, wherein the historical data comprise flow speed, wind direction and temperature of each hour;
step 2, reducing the dimensionality of the acquired data by principal component analysis to obtain the several factors, and their sequences, that most strongly affect the water flow velocity;
step 3, decomposing the input sequence subjected to dimension reduction into an approximate sequence and m high-frequency sequences by adopting wavelet decomposition DWT, wherein m is the decomposition depth;
step 4, dividing the sequence into m+1 different groups according to the frequency size;
step 5, dividing the data of different groups into input and output, and normalizing;
step 6, dividing the data of different groups into a training set and a testing set respectively;
step 7, simultaneously feeding each group of data into a different long short-term memory neural network (LSTM) for iterative training, where each iteration yields the predicted flow one hour ahead, which is compared with the actual value to back-correct the weights and thresholds of the LSTM, until all training-set data have been iterated;
step 8, inputting the test set into the iterative neural network to obtain flow velocity prediction data of one hour in the future;
step 9, forming new inputs from the predicted data and correcting the weights and thresholds of the neural network until the iteration is complete, and predicting the data of the test set;
and 10, carrying out wavelet reconstruction on the output of m+1 different neural networks, carrying out inverse normalization, and finally predicting the flow velocity condition of the water resource in the future N hours.
In a preferred embodiment, the principal component analysis in step 2 proceeds as follows, selecting a cumulative contribution above 0.85:
calculate the covariance matrix:
Σ = (s_ij)
determine the eigenvalues λ_i of Σ and the corresponding orthonormal unit eigenvectors; the first m largest eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_m > 0 are the variances corresponding to the first m principal components, and the unit eigenvector a_ij corresponding to λ_i gives the coefficients of the principal components with respect to the original variables; the variance information contribution rate of a principal component reflects the amount of information it carries:
b_i = λ_i / (λ_1 + λ_2 + ... + λ_p)
selection of principal components: the number of principal components finally retained is determined by the cumulative variance information contribution rate
G(m) = (λ_1 + λ_2 + ... + λ_m) / (λ_1 + λ_2 + ... + λ_p);
calculate the principal component loadings, i.e. the loading of the original variable X_j on the principal component F_i;
In a preferred embodiment, in step 5 each group of data from step 4 is divided into input and output in the form:
V_input = {x_ij}, i = 1, 2, ..., m, j = 1, 2, ..., n
V_output = {y_1, y_2, y_3, ..., y_n}
where the data are the collected historical flow-rate data of the river basin; V_input is the input, in which i denotes the i-th influencing factor and j its j-th datum, and m is the dimension of the data after dimension reduction; V_output is the output.
In a preferred embodiment, in step 6 the inputs and outputs of the training and test sets divided in step 5 are normalized by the following formula, taking normalization of the output as an example:
v'_i = v'_min + (v_i - v_min)(v'_max - v'_min) / (v_max - v_min)
where v'_i is the normalized water flow speed; v'_min is the minimum of the mapping range; v'_max is the maximum of the mapping range; v_i is the actual water flow speed; v_min is the minimum flow speed; v_max is the maximum flow speed.
In a preferred embodiment, the input and output sets of each group of data in step 4 are divided into training and test sets at a ratio of 4:1.
In a preferred embodiment, in step 8 five different, mutually independent LSTM neural networks are required; from input to output the LSTM network is operated by three gates, namely the input gate i_t, the forget gate f_t and the output gate o_t. The mathematical model of a single LSTM neuron is as follows:
Forget gate:
f_t = σ(W_xf · x_t + W_hf · h_{t-1} + b_f)
where W_xf is the weight from the input x_t to the forget gate; W_hf is the weight from the hidden state h_{t-1} of the neuron at time t-1 to the forget gate; b_f is the threshold of the forget gate.
Input gate:
i_t = σ(W_xi · x_t + W_hi · h_{t-1} + b_i)
g_t = tanh(W_xg · x_t + W_hg · h_{t-1} + b_g)
The long-term cell state at time t:
C_t = C_{t-1} · f_t + g_t · i_t
where W_xi is the weight from the input x_t to the input gate; W_hi is the weight from h_{t-1} to the input gate; b_i is the threshold of the input gate; g_t is the current cell state; W_xg is the weight from x_t to the cell state; W_hg is the weight from h_{t-1} to the cell state.
Output gate:
o_t = σ(W_xo · x_t + W_ho · h_{t-1} + b_o)
where W_xo is the weight from the input x_t to the output gate; W_ho is the weight from h_{t-1} to the output gate; b_o is the threshold of the output gate.
Hidden-state equation of the neuron at time t:
h_t = o_t · tanh(C_t)
In the above formulas, σ(z) is the Sigmoid activation function and tanh(z) is the tanh activation function.
An LSTM neural network is formed by connecting multiple LSTM units in series and in parallel; the normalized data from step 4 are fed into the established network, and the number of hidden layers of the neural network is selected as 15. When the network input is X = {x_1, x_2, x_3, ..., x_n}, the output of the LSTM neural network is Y = {y_1, y_2, y_3, ..., y_n}.
In a preferred embodiment, in step 9, the outputs of the five different LSTM neural networks are reconstructed and combined into one signal, i.e. the normalized predicted signal.
In a preferred embodiment, in step 10, the data reconstructed in step 9 is inversely normalized and restored to the actual magnitude, and the final result is the final predicted value.
In a preferred embodiment, in step 11 the error is calculated; the evaluation indices are the mean absolute error MAE, the mean bias error MBE, the mean absolute percentage error MAPE, and the coefficient of determination R². The specific formulas are as follows:
MAE = (1/n) Σ_{t=1}^{n} |v'_t - v_t|
MBE = (1/n) Σ_{t=1}^{n} (v'_t - v_t)
MAPE = (100%/n) Σ_{t=1}^{n} |(v'_t - v_t) / v_t|
R² = 1 - Σ_{t=1}^{n} (v_t - v'_t)² / Σ_{t=1}^{n} (v_t - v̄)²
where v'_t is the model-predicted flow, v_t is the actual flow, and v̄ is the mean of the actual flow:
v̄ = (1/n) Σ_{t=1}^{n} v_t
compared with the prior art, the application has the following beneficial effects: the design is continuously updated and optimized, network parameters are updated in time, and the finally obtained predicted quantity can well reflect future changes and has small error with an actual value.
Drawings
FIG. 1 is a schematic flow chart of a preferred embodiment of the present application;
FIG. 2 is a schematic diagram of a wavelet decomposition flow chart in accordance with a preferred embodiment of the present application;
wherein cD1, cD2, cD3 and cD4 are high-frequency signals with corresponding frequencies f_1 < f_2 < f_3 < f_4, each reflecting the characteristics of the original signal at a different frequency; cA4 is the low-frequency signal at scale 4, primarily reflecting the increasing trend of the data.
Detailed Description
The application will be further described with reference to the accompanying drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application; as used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Referring to Figures 1 and 2, the application provides a time-series-based watershed water flow speed prediction method in which the design is continuously updated and optimized and the network parameters are updated in time, so that the final predictions reflect future changes well and differ little from the actual values. The method comprises the following specific steps:
Step (1): data acquisition. Historical flow-rate data of a given basin are collected hourly.
Step (2): perform principal component analysis on the data from step (1), extracting the several features with the greatest influence on the water-area flow velocity so as to reduce the training difficulty of the neural network. The principal component analysis proceeds as follows, with the cumulative contribution generally selected above 0.85:
calculate the covariance matrix:
Σ = (s_ij)
determine the eigenvalues λ_i of Σ and the corresponding orthonormal unit eigenvectors; the first m largest eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_m > 0 are the variances corresponding to the first m principal components, and the unit eigenvector a_ij corresponding to λ_i gives the coefficients of the principal components with respect to the original variables. The variance (information) contribution rate of a principal component reflects the amount of information it carries:
b_i = λ_i / (λ_1 + λ_2 + ... + λ_p)
Selection of principal components: the number of principal components finally retained is determined by the cumulative variance (information) contribution rate
G(m) = (λ_1 + λ_2 + ... + λ_m) / (λ_1 + λ_2 + ... + λ_p).
The principal component loadings are then calculated, i.e. the loading of the original variable X_j on the principal component F_i.
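The PCA step above can be sketched with NumPy. The 0.85 cumulative-contribution threshold follows the text; the function name `pca_reduce` and the toy data are illustrative assumptions:

```python
import numpy as np

def pca_reduce(data, threshold=0.85):
    """Keep the principal components whose cumulative variance
    contribution rate G(m) reaches the given threshold."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)        # covariance matrix Σ = (s_ij)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues λ_i, unit eigenvectors
    order = np.argsort(eigvals)[::-1]           # sort so λ_1 ≥ λ_2 ≥ ...
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contrib = eigvals / eigvals.sum()           # variance contribution rate b_i
    m = int(np.searchsorted(np.cumsum(contrib), threshold)) + 1
    return centered @ eigvecs[:, :m], contrib[:m]
```

For hourly records with, say, flow speed, wind direction and temperature as columns, `pca_reduce` returns the projected data together with the contribution rates of the retained components.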
Step (3): performing wavelet decomposition with the decomposition depth of 4 on the data subjected to the feature extraction in the step (2) to obtain 4 high-frequency signals and an approximate signal in total;
step (4): grouping information of different frequencies after wavelet decomposition, wherein each frequency is divided into 5 groups;
step (5): dividing and inputting each group of data in the step (4), outputting the data in the form as follows:
V_output={y 1 ,y 2 ,y 3 ,...y n }
wherein data is historical data of the collected flow rate of the river basin;
v_input is input, { x ij In }, i=1, 2,..m represents the i-th influencing factor, j represents the j-th data thereof; m is the dimension of the data after dimension reduction;
v_output is output;
step (6): normalization. Normalizing the inputs of the training set and the testing set divided in the step (5) and the outputs of the training set and the testing set by the following formulas, taking normalization of the outputs as an example.
In the formula, v' i The water flow speed is normalized;
v' min is the minimum value of the mapping range;
v' max is the maximum value of the mapping range;
v i is the actual water flow speed;
v min is the minimum flow rate;
v max is the maximum flow rate.
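The min-max normalization of step (6) and its inverse (needed later in step (10)) can be sketched as follows; the default mapping range [0, 1] is an assumption, since the text leaves v'_min and v'_max as parameters:

```python
import numpy as np

def normalize(v, v_min, v_max, map_min=0.0, map_max=1.0):
    """Map actual values v from [v_min, v_max] into [map_min, map_max]."""
    return map_min + (v - v_min) * (map_max - map_min) / (v_max - v_min)

def denormalize(v_norm, v_min, v_max, map_min=0.0, map_max=1.0):
    """Inverse normalization: restore values to the actual magnitude."""
    return v_min + (v_norm - map_min) * (v_max - v_min) / (map_max - map_min)
```

Applying `denormalize(normalize(v, ...), ...)` recovers the original values, which is exactly the round trip between steps (6) and (10).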
Step (7): partition the training and test sets. The input and output sets of each group of data in step (4) are divided into training and test sets at a ratio of 4:1.
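A minimal sketch of the 4:1 split, assuming the samples are kept in chronological order (a common choice for time-series data, though the text does not state it):

```python
import numpy as np

def split_4_to_1(inputs, outputs):
    """Chronological 4:1 split of paired inputs/outputs into
    training and test sets."""
    n_train = int(len(inputs) * 4 / 5)
    return (inputs[:n_train], outputs[:n_train],
            inputs[n_train:], outputs[n_train:])
```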
Step (8): establish the long short-term memory (LSTM) neural networks. Because the 4-level wavelet decomposition yields data at 5 different frequencies, five mutually independent LSTM neural networks are needed.
From input to output, the LSTM network requires three gate operations, namely the input gate (i_t), the forget gate (f_t) and the output gate (o_t). The mathematical model of a single LSTM neuron is as follows:
forgetting the door:
f t =σ(W xf ·x t +W hf ·h t-1 +b f )
in which W is xf For inputting x t Weight to forget gate;
W hf hidden state h for neurons at time t-1 t-1 Weight to forget gate;
b f is the threshold of forgetting the door.
An input door:
i t =σ(W xi ·x t +W hi ·h t-1 +b i )
g t =tanhW xg ·x t +W hg ·h t-1 +b g
cell state at time t (long term) equation:
C t =C t-1 ·f t +g t ·i t
in which W is xi For inputting x t Weights to input gates;
W hi hidden state h for neurons at time t-1 t-1 Weights to input gates;
b i a threshold value for an input gate; g t Is the current cell state (short time);
W xg for inputting x t Weights to cell status (short time);
W hg hidden state h for neurons at time t-1 t-1 Weights to cell status (short time); output door:
o t =σ(W xo ·x t +W ho ·h t-1 +b o )
in which W is xo For inputting x t Weights to output gates;
W ho hidden state h for neurons at time t-1 t-1 Weights to output gates;
b o is the threshold of the output gate.
Hidden state equation of neuron at time t:
h t =o t ·tanh(C t )
in the above formula, σ (z) is Sigmoid activation function. tanh (z) is the tanh activation function.
And (3) forming an LSTM neural network by connecting a plurality of LSTM units in series and in parallel, and sending the normalized data in the step (4) into the established neural network, wherein the number of hidden layers of the neural network is selected 15. When the neural network input is x= { X 1 ,x 2 ,x 3 ,…x n LSTM nerve }, thenThe output of the network is y= { Y 1 ,y 2 ,y 3 ,...y n }。
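One forward step of a single LSTM neuron, implementing the gate equations above, can be sketched in NumPy. The dictionary keys mirror the notation W_xf, W_hf, b_f, etc.; the hidden size of 15 in the test follows the text, while the input dimension is an arbitrary illustrative choice:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One forward step of a single LSTM cell.
    W maps 'xf','hf','xi','hi','xg','hg','xo','ho' to weight matrices;
    b maps 'f','i','g','o' to threshold (bias) vectors."""
    f_t = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + b['f'])  # forget gate
    i_t = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + b['i'])  # input gate
    g_t = np.tanh(W['xg'] @ x_t + W['hg'] @ h_prev + b['g'])  # candidate state
    C_t = C_prev * f_t + g_t * i_t                            # long-term state
    o_t = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + b['o'])  # output gate
    h_t = o_t * np.tanh(C_t)                                  # hidden state
    return h_t, C_t
```

Since o_t lies in (0, 1) and tanh(C_t) in (-1, 1), the hidden state h_t is always bounded in magnitude by 1, which matches the normalized data fed into the network.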
Step (9): wavelet reconstruction. The outputs of the five different LSTM neural networks are reconstructed and combined into one signal, i.e. the normalized predicted signal.
Step (10): inverse normalization. The reconstructed data from step (9) are inverse-normalized and restored to the actual magnitude; the result is the final predicted value.
Step (11): error calculation. The evaluation indices are the mean absolute error (MAE), the mean bias error (MBE), the mean absolute percentage error (MAPE), and the coefficient of determination (R²). The specific formulas are as follows:
MAE = (1/n) Σ_{t=1}^{n} |v'_t - v_t|
MBE = (1/n) Σ_{t=1}^{n} (v'_t - v_t)
MAPE = (100%/n) Σ_{t=1}^{n} |(v'_t - v_t) / v_t|
R² = 1 - Σ_{t=1}^{n} (v_t - v'_t)² / Σ_{t=1}^{n} (v_t - v̄)²
where v'_t is the model-predicted flow and v_t is the actual flow; v̄ is the mean of the actual flow:
v̄ = (1/n) Σ_{t=1}^{n} v_t

Claims (9)

1. The LSTM drainage basin water flow prediction method based on PCA_DWT is characterized by comprising the following specific steps of:
step 1, collecting historical data of a river, wherein the historical data comprise flow speed, wind direction and temperature of each hour;
step 2, reducing the dimensionality of the acquired data by principal component analysis to obtain the several factors, and their sequences, that most strongly affect the water flow velocity;
step 3, decomposing the input sequence after dimension reduction into an approximate sequence and m high-frequency sequences by adopting discrete wavelet decomposition DWT, wherein m is the decomposition depth;
step 4, dividing the sequence into m+1 different groups according to the frequency size;
step 5, dividing the data of different groups into input and output, and normalizing;
step 6, dividing the data of different groups into a training set and a testing set respectively;
step 7, simultaneously feeding each group of data into a different long short-term memory neural network (LSTM) for iterative training, where each iteration yields the predicted flow one hour ahead, which is compared with the actual value to back-correct the weights and thresholds of the LSTM, until all training-set data have been iterated;
step 8, inputting the test set into the iterative neural network to obtain flow velocity prediction data of one hour in the future;
step 9, forming new inputs from the predicted data and correcting the weights and thresholds of the neural network until the iteration is complete, and predicting the data of the test set;
and 10, carrying out wavelet reconstruction on the output of m+1 different neural networks, carrying out inverse normalization, and finally predicting the flow velocity condition of the water resource in the future N hours.
2. The LSTM basin water flow prediction method based on PCA_DWT according to claim 1, wherein the principal component analysis in step 2 proceeds as follows, selecting a cumulative contribution above 0.85:
calculate the covariance matrix:
Σ = (s_ij)
determine the eigenvalues λ_i of Σ and the corresponding orthonormal unit eigenvectors; the first m largest eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_m > 0 are the variances corresponding to the first m principal components, and the unit eigenvector a_ij corresponding to λ_i gives the coefficients of the principal components with respect to the original variables; the variance information contribution rate of a principal component reflects the amount of information it carries:
b_i = λ_i / (λ_1 + λ_2 + ... + λ_p)
selection of principal components: the number of principal components finally retained is determined by the cumulative variance information contribution rate
G(m) = (λ_1 + λ_2 + ... + λ_m) / (λ_1 + λ_2 + ... + λ_p);
calculate the principal component loadings, i.e. the loading of the original variable X_j on the principal component F_i;
3. The LSTM watershed water flow prediction method based on PCA_DWT according to claim 1, wherein in step 5 each group of data from step 4 is divided into input and output in the form:
V_input = {x_ij}, i = 1, 2, ..., m, j = 1, 2, ..., n
V_output = {y_1, y_2, y_3, ..., y_n}
where the data are the collected historical flow-rate data of the river basin; V_input is the input, in which i denotes the i-th influencing factor and j its j-th datum, and m is the dimension of the data after dimension reduction; V_output is the output.
4. The LSTM watershed water flow prediction method based on PCA_DWT according to claim 1, wherein in step 6 the inputs and outputs of the training and test sets divided in step 5 are normalized by the following formula, taking normalization of the output as an example:
v'_i = v'_min + (v_i - v_min)(v'_max - v'_min) / (v_max - v_min)
where v'_i is the normalized water flow speed; v'_min is the minimum of the mapping range; v'_max is the maximum of the mapping range; v_i is the actual water flow speed; v_min is the minimum flow speed; v_max is the maximum flow speed.
5. The LSTM drainage basin water flow prediction method based on PCA_DWT according to claim 1, wherein in step 3 discrete wavelet decomposition is carried out on the dimension-reduced data to obtain the high-frequency components and to mine deep information from the historical data.
6. The method for predicting the water flow of an LSTM basin based on PCA_DWT as claimed in claim 1, wherein in step 8, five mutually independent LSTM neural networks are required; the flow from the input to the output of an LSTM network passes through three gates, namely the input gate i_t, the forget gate f_t and the output gate o_t. The mathematical model of a single LSTM neuron is as follows:
forget gate:
f_t = σ(W_xf · x_t + W_hf · h_(t-1) + b_f)
where W_xf is the weight from the input x_t to the forget gate; W_hf is the weight from the hidden state h_(t-1) of the neuron at time t-1 to the forget gate; b_f is the threshold of the forget gate;
input gate:
i_t = σ(W_xi · x_t + W_hi · h_(t-1) + b_i)
g_t = tanh(W_xg · x_t + W_hg · h_(t-1) + b_g)
the long-term cell state at time t is updated as:
C_t = C_(t-1) · f_t + g_t · i_t
where W_xi is the weight from the input x_t to the input gate; W_hi is the weight from the hidden state h_(t-1) of the neuron at time t-1 to the input gate; b_i is the threshold of the input gate; g_t is the candidate cell state at the current time; W_xg is the weight from the input x_t to the cell state; W_hg is the weight from the hidden state h_(t-1) of the neuron at time t-1 to the cell state; b_g is the corresponding threshold;
output gate:
o_t = σ(W_xo · x_t + W_ho · h_(t-1) + b_o)
where W_xo is the weight from the input x_t to the output gate; W_ho is the weight from the hidden state h_(t-1) of the neuron at time t-1 to the output gate; b_o is the threshold of the output gate;
the hidden-state equation of the neuron at time t is:
h_t = o_t · tanh(C_t)
In the above formulas, σ(z) is the Sigmoid activation function and tanh(z) is the hyperbolic tangent activation function.
An LSTM neural network is formed by connecting a plurality of LSTM units in series and in parallel; the normalized data from step 4 are fed into the established neural network, and the number of hidden layers of the neural network is set to 15. When the input of the neural network is X = {x_1, x_2, x_3, ..., x_n}, the output of the LSTM neural network is Y = {y_1, y_2, y_3, ..., y_n}.
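The single-neuron equations of claim 6 can be collected into one NumPy time-step function; the dictionary layout of the weights W and thresholds b is an illustrative choice, not taken from the patent:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation σ(z)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step following the gate equations of claim 6.

    W maps 'xf','hf','xi','hi','xg','hg','xo','ho' to weight matrices;
    b maps 'f','i','g','o' to threshold (bias) vectors.
    """
    f_t = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + b['f'])  # forget gate
    i_t = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + b['i'])  # input gate
    g_t = np.tanh(W['xg'] @ x_t + W['hg'] @ h_prev + b['g'])  # candidate state
    C_t = C_prev * f_t + g_t * i_t                            # cell state update
    o_t = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + b['o'])  # output gate
    h_t = o_t * np.tanh(C_t)                                  # hidden state
    return h_t, C_t
```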
7. The method for predicting the water flow of an LSTM basin based on PCA_DWT as claimed in claim 6, wherein in the step 9, the outputs of five different LSTM neural networks are reconstructed and combined into one signal, namely the normalized predicted signal.
8. The LSTM drainage basin water flow prediction method based on PCA_DWT according to claim 7, wherein in step 10, the data reconstructed in step 9 are denormalized and restored to their actual magnitude; the result is the final predicted value.
9. The method for predicting water flow in an LSTM basin based on pca_dwt according to claim 8, wherein in step 11, error calculation is performed; the evaluation indices selected are the mean absolute error MAE, the mean bias error MBE, the mean absolute percentage error MAPE, and the coefficient of determination R². The specific formulas are as follows:
MAE = (1/n) · Σ|v'_t − v_t|
MBE = (1/n) · Σ(v'_t − v_t)
MAPE = (100%/n) · Σ|(v'_t − v_t) / v_t|
R² = 1 − Σ(v_t − v'_t)² / Σ(v_t − v̄)²
where v'_t is the flow predicted by the model; v_t is the actual flow; v̄ is the mean of the actual flow; and the sums run over t = 1, ..., n.
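The four evaluation indices of claim 9 can be computed as follows (an illustrative sketch using the standard definitions of MAE, MBE, MAPE, and R²):

```python
import numpy as np

def evaluate(v_pred, v_true):
    """Return MAE, MBE, MAPE (in percent), and R² for a prediction."""
    v_pred = np.asarray(v_pred, float)
    v_true = np.asarray(v_true, float)
    err = v_pred - v_true
    mae = np.mean(np.abs(err))                   # mean absolute error
    mbe = np.mean(err)                           # mean bias error
    mape = 100.0 * np.mean(np.abs(err / v_true)) # mean absolute % error
    r2 = 1.0 - np.sum(err**2) / np.sum((v_true - v_true.mean())**2)
    return mae, mbe, mape, r2
```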
CN202310727924.0A 2023-06-20 2023-06-20 LSTM drainage basin water flow prediction method based on PCA_DWT Pending CN116663745A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310727924.0A CN116663745A (en) 2023-06-20 2023-06-20 LSTM drainage basin water flow prediction method based on PCA_DWT

Publications (1)

Publication Number Publication Date
CN116663745A true CN116663745A (en) 2023-08-29

Family

ID=87726070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310727924.0A Pending CN116663745A (en) 2023-06-20 2023-06-20 LSTM drainage basin water flow prediction method based on PCA_DWT

Country Status (1)

Country Link
CN (1) CN116663745A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117977579A (en) * 2024-03-28 2024-05-03 西华大学 DWT-LSTM neural network-based adjustable load prediction method

Similar Documents

Publication Publication Date Title
CN108900346B (en) Wireless network flow prediction method based on LSTM network
CN109214575B (en) Ultrashort-term wind power prediction method based on small-wavelength short-term memory network
US11409347B2 (en) Method, system and storage medium for predicting power load probability density based on deep learning
CN109102126B (en) Theoretical line loss rate prediction model based on deep migration learning
CN110705743B (en) New energy consumption electric quantity prediction method based on long-term and short-term memory neural network
CN110309603B (en) Short-term wind speed prediction method and system based on wind speed characteristics
CN112990556A (en) User power consumption prediction method based on Prophet-LSTM model
CN108445752B (en) Random weight neural network integrated modeling method for self-adaptively selecting depth features
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN112434848B (en) Nonlinear weighted combination wind power prediction method based on deep belief network
CN111079989B (en) DWT-PCA-LSTM-based water supply amount prediction device for water supply company
CN109583588B (en) Short-term wind speed prediction method and system
Wang et al. Deep echo state network with multiple adaptive reservoirs for time series prediction
CN113705877A (en) Real-time monthly runoff forecasting method based on deep learning model
Li et al. A novel combined prediction model for monthly mean precipitation with error correction strategy
CN103605909A (en) Water quality predication method based on grey theory and support vector machine
CN113554466A (en) Short-term power consumption prediction model construction method, prediction method and device
CN104050547A (en) Non-linear optimization decision-making method of planning schemes for oilfield development
CN115099519A (en) Oil well yield prediction method based on multi-machine learning model fusion
CN114036850A (en) Runoff prediction method based on VECGM
CN116663745A (en) LSTM drainage basin water flow prediction method based on PCA_DWT
CN115310536A (en) Reservoir water level prediction early warning method based on neural network and GCN deep learning model
CN114169645A (en) Short-term load prediction method for smart power grid
CN110717581A (en) Short-term load prediction method based on temperature fuzzy processing and DBN
CN114169251A (en) Ultra-short-term wind power prediction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination