CN115329930A

CN115329930A - Flood process probability forecasting method based on mixed deep learning model

Info

Publication number: CN115329930A
Application number: CN202210880964.4A
Authority: CN
Inventors: 崔震; 郭生练; 尹家波; 周研来; 王俊
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2022-07-26
Filing date: 2022-07-26
Publication date: 2022-11-11

Abstract

The invention discloses a flood process probability forecasting method based on a hybrid deep learning model. First, basic hydrometeorological data of the study basin are collected, a conceptual hydrological model is established, and the flood process is forecast over multiple lead times. Second, the flow forecast of the conceptual model is taken as an exogenous input, a mixture density network (MDN) is nested in the output layer of a long short-term memory neural network built on an encoder-decoder structure with exogenous input (LSTM-EDE), and a hybrid LSTM-EDE-MDN probabilistic forecasting model is established. A loss function is constructed by maximum likelihood estimation and used to train the neural network parameters. Finally, the conditional distribution function and the prediction interval for each lead time are obtained, thereby quantifying forecast uncertainty. By coupling the MDN with an LSTM-EDE network that takes the conceptual model's forecast flow as exogenous input, the invention overcomes the exposure bias problem, yields probabilistic forecasts of the multi-lead-time flood process while accounting for the temporal correlation of the output variables, and improves the applicability, interpretability and reliability of the deep learning model.

Description

Flood process probability forecasting method based on mixed deep learning model

Technical Field

The invention belongs to the technical field of hydrological forecasting, and particularly relates to a flood process probability forecasting method based on a hybrid deep learning model.

Background

Hydrological forecasting is an important non-structural measure for flood control and drought relief. Improving flood forecast accuracy and extending the lead time remain key technical bottlenecks that restrict reservoir operation and management. In recent years, artificial intelligence has developed rapidly, and deep learning models capable of handling nonlinear, non-stationary time series have emerged. The long short-term memory (LSTM) neural network is one of the most representative of these models and, compared with traditional artificial neural networks, achieves better accuracy in multi-lead-time flood forecasting. Feng Jun et al. (2019) proposed a short-term flood forecasting method for small and medium-sized rivers based on an LSTM neural network; the method extracts effective features, achieves high forecast accuracy, outperforms a traditional support vector machine model, and in particular greatly improves the forecast accuracy of peak arrival time and peak discharge during the flood peak stage. However, such deep learning models lack a physical basis and have low interpretability, which hinders their use in engineering management. Zhou et al. (2022) used the flow forecast by a conceptual hydrological model as an additional input to a deep learning model, in an attempt to couple the runoff generation and concentration process into the flow forecast. The forecast flow from the process-based hydrological model also supplements the input of the LSTM model at long lead times, alleviates the overfitting problem of the LSTM model, and improves the interpretability and forecast accuracy of the deep learning model to a certain extent.

Meanwhile, as deep learning research has advanced, encoder-decoder structures capable of solving sequence-to-sequence problems have emerged. The encoding process extracts the important features of the input sequence and compresses it into a fixed-length intermediate vector, and the decoding process converts the intermediate vector into the target output sequence. An LSTM neural network coupled with an encoder-decoder structure passes the effective features extracted at one time step to the next time step during both encoding and decoding, directly produces the flood forecast over multiple lead times while preserving the temporal correlation of the output variables, and offers higher interpretability and applicability than a single-output LSTM model. Wang Fan et al. (2022) proposed a flood forecasting model and method based on an encoder-decoder deep learning framework, which takes forecast rainfall data as model input and makes full use of rainfall information, thereby improving forecast accuracy and extending the lead time.

Owing to uncertainties in model inputs (such as meteorological forcing), parameters and structure, flood forecasts are inevitably uncertain, and the deterministic point estimates of deep learning models provide only limited risk information for flood control decision making. Probabilistic forecasts convey forecast uncertainty and provide risk information to decision makers. Liu Zhangjun et al. (2017) proposed a real-time probabilistic flood forecasting method coupled with ensemble rainfall forecasts; by correcting deterministic forecast errors in real time and coupling ensemble rainfall forecast information with a Copula-based Bayesian forecasting system, the method improves flood forecast accuracy and provides decision makers with more reliable prediction intervals. Probabilistic forecasting increases the value and credibility of forecasts and is essential for flood control decision making and related work.

In summary, current deep learning research still has the following shortcomings: (1) neural networks based on the traditional encoder-decoder structure cannot learn the runoff generation and concentration process of a conceptual hydrological model and suffer from exposure bias, that is, the training process is inconsistent with the validation process, so the deep learning model has low interpretability and unstable performance, and forecast accuracy decreases; (2) most deep learning models output deterministic point estimates and cannot provide estimates of forecast uncertainty, so the value and reliability of their forecasts are low.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a flood process probability forecasting method based on a hybrid deep learning model. The method overcomes the exposure bias problem, accounts for the temporal correlation of the output variables, quantifies forecast uncertainty, and further improves applicability, interpretability and reliability.

In order to solve the technical problem, the invention adopts the following technical scheme:

A flood process probability forecasting method based on a hybrid deep learning model, characterized by comprising the following steps:

Step 1, collecting and analyzing hydrometeorological data, calculating the average runoff generation and concentration time of the basin, and setting the forecast horizon according to actual requirements;

Step 2, calibrating the parameters of a conceptual hydrological model with the data collected in step 1, and forecasting the flow process over multiple lead times with the calibrated conceptual hydrological model;

Step 3, taking the flow forecast of the conceptual model in step 2 as an exogenous input, establishing a long short-term memory neural network based on an encoder-decoder structure with exogenous input (LSTM-EDE), taking the hidden layer output of the LSTM-EDE neural network as the input of a mixture density network (MDN), and constructing an LSTM-EDE-MDN model for quantifying forecast uncertainty;

Step 4, setting the activation function and hyperparameters of the LSTM-EDE-MDN model constructed in step 3, establishing a loss function for optimizing the network parameters, and assembling the input variables and target output variables of the LSTM-EDE-MDN model;

Step 5, training the LSTM-EDE-MDN model with the input variables and target output variables assembled in step 4, then calculating the conditional probability distribution function of the target variable and obtaining the prediction interval at a given confidence level, thereby quantifying the uncertainty of the forecast process.

Further, step 1 specifically includes:

Step 1-1, collecting hydrometeorological data including but not limited to precipitation, air temperature, evaporation and flow at the basin outlet section, the time scale of the data being daily or sub-daily;

Step 1-2, dividing the data into training period, validation period and testing period data;

Step 1-3, using the measured precipitation and flow data collected in step 1-1, estimating the average runoff generation and concentration time of the basin from the correlation coefficients between precipitation and flow at different time lags, the time lag with the maximum correlation coefficient being taken as the average runoff generation and concentration time of the basin;

Step 1-4, determining the forecast horizon of the flood forecast according to actual task requirements such as flood control, the forecast horizon being less than or equal to the average runoff generation and concentration time of the basin.

Further, step 2 specifically includes:

Step 2-1, selecting a suitable conceptual hydrological model according to actual conditions;

Step 2-2, calibrating the model parameters with the SCE-UA method using the data prepared in step 1, validating the model, and testing its performance;

Step 2-3, performing flood forecasting with the conceptual hydrological model tested in step 2-2 to obtain the flow process over multiple lead times.

Further, step 3 specifically includes:

Step 3-1, coupling the LSTM neural network into an encoder-decoder structure with exogenous input (EDE) to construct the LSTM-EDE model, wherein the decoding process of the EDE structure provides an interface for receiving the flow forecast of the conceptual hydrological model;

Step 3-2, taking Y as the target output variable and the hidden layer output X of the LSTM-EDE decoding process as the input of the mixture density network (MDN), establishing the hybrid deep learning LSTM-EDE-MDN probabilistic forecasting model; the LSTM-EDE-MDN model outputs the weights w and parameters θ of a plurality of kernel functions, and the kernel functions are summed according to the weights w to form the conditional density function f(Y | θ, X) of the target variable Y:

$$f(Y \mid \theta, X) = \sum_{i=1}^{m} w_i \, \phi(Y \mid \theta_i)$$

where m is the number of kernel functions, $\phi(Y \mid \theta_i)$ is the i-th Gaussian kernel function with parameters $\theta_i$, and $w_i$ is the weight of the i-th kernel function.

Further, step 4 specifically includes:

Step 4-1, setting the activation function and hyperparameters of the LSTM-EDE-MDN model;

Step 4-2, constructing a loss function by maximum likelihood estimation, and optimizing the network parameters through the loss function;

Step 4-3, taking the measured precipitation and flow data before the forecast issue time as the input of the encoding process of the LSTM-EDE-MDN model, the number of input time steps being equal to the average runoff generation and concentration time of the basin; taking the flow forecast by the conceptual hydrological model as the input of the decoding process; the output of the LSTM-EDE-MDN model being the conditional distribution function of the target variable at each forecast time, the number of output time steps being equal to the forecast horizon; in addition, assembling from the data collected in step 1 the data sets for training, validating and testing the LSTM-EDE-MDN model, each comprising the inputs of the encoding and decoding processes and the target output variables.

Further, in step 4-2, the constructed loss function is:

$$\mathrm{Loss} = -\frac{1}{n}\sum_{j=1}^{n} \ln f(Y_j \mid \theta, X_j)$$

where n is the data length of one batch, and $Y_j$ and $X_j$ are the j-th target value and the corresponding hidden layer output in the batch;

through this loss function, the neural network quantifies the probability density of the target variable under the conditional distribution function output by the LSTM-EDE-MDN model, and the network parameters are adjusted accordingly; the mixture density function generated by the LSTM-EDE-MDN model is f(Y | θ, X), and when the LSTM-EDE-MDN model is trained, the adaptive moment estimation (Adam) algorithm searches for the parameters that maximize the log-likelihood ln f(Y | θ, X) of the target variable Y.

Further, step 5 specifically includes:

Step 5-1, using the input variables and target output variables assembled in step 4 as the data sets for training, validating and testing the LSTM-EDE-MDN model, and training a plurality of LSTM-EDE-MDN models with the training set;

Step 5-2, feeding the validation set assembled in step 4 into the LSTM-EDE-MDN models trained in step 5-1 to obtain the conditional probability distribution function of the target variable, and selecting from the plurality of LSTM-EDE-MDN models the network parameters with the best generalization performance on the validation set, with the minimum continuous ranked probability score (CRPS) as the objective;

Step 5-3, using the network parameters selected in step 5-2, testing the probabilistic forecasting performance of the LSTM-EDE-MDN model with the test set assembled in step 4, calculating the conditional probability distribution function and the deterministic forecast of the target variable, and setting a confidence level to obtain the prediction interval, thereby quantifying forecast uncertainty.

Compared with the prior art, the invention has the beneficial effects that:

1. The LSTM-EDE-MDN model constructed by the method embeds an EDE structure that can receive the flow forecast of the conceptual hydrological model, so it can learn the runoff generation and concentration process of the conceptual hydrological model and overcome the exposure bias problem, keeping the training process consistent with the validation process and improving interpretability, computational efficiency and forecast accuracy;

2. The LSTM-EDE-MDN model constructed by the method embeds an MDN, which converts deterministic point estimates into probability distribution estimates while accounting for the temporal correlation of the output variables and yields prediction intervals at a given confidence level, thereby quantifying the uncertainty of the flood process forecast, providing more risk information to decision makers, and improving the value and reliability of the forecast.

Drawings

FIG. 1 is a schematic structural diagram of an LSTM-EDE-MDN model according to an embodiment of the present invention; wherein, (a) is an encoding process, (b) is a decoding process, and (c) is a probability forecasting process;

FIG. 2 is a schematic diagram of a mixture density network in an embodiment of the present invention;

FIG. 3 shows the measured flow, the deterministic forecast, and the 95% prediction interval for an embodiment of the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the following embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.

The present invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.

The embodiment of the invention provides a flood process probability forecasting method based on a hybrid deep learning model, which comprises the following steps:

Step 1, collecting and analyzing hydrometeorological data, calculating the average runoff generation and concentration time of the basin, and setting the forecast horizon according to actual requirements; specifically:

Step 1-1, selecting a study basin and collecting hydrometeorological data, specifically measured data such as precipitation, air temperature, evaporation and flow at the basin outlet section; the time scale of the data may be daily or sub-daily;

Step 1-2, dividing the data collected in step 1-1 into training period, validation period and testing period data;

Step 1-3, estimating the average runoff generation and concentration time of the basin from the precipitation-runoff relationship: using the measured precipitation and flow data collected in step 1-1, the correlation coefficients between precipitation and flow at different time lags are computed, and the time lag with the maximum correlation coefficient is taken as the average runoff generation and concentration time of the basin;

Step 1-4, determining the forecast horizon of the flood forecast according to actual task requirements such as flood control, the forecast horizon being less than or equal to the average runoff generation and concentration time of the basin.
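The lag-correlation estimate of step 1-3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `estimate_concentration_time`, the `max_lag` search bound and the synthetic rainfall and flow series are assumptions introduced here, and the correlation at lag k is assumed to pair precipitation at time t with flow at time t + k.

```python
import numpy as np

def estimate_concentration_time(precip, flow, max_lag):
    """Return (best_lag, correlations) for time lags 0..max_lag.

    The correlation at lag k pairs precipitation at time t with flow at t + k.
    """
    precip = np.asarray(precip, dtype=float)
    flow = np.asarray(flow, dtype=float)
    corrs = []
    for lag in range(max_lag + 1):
        p = precip[: len(precip) - lag]   # rainfall up to the last usable step
        q = flow[lag:]                    # flow shifted forward by `lag` steps
        corrs.append(np.corrcoef(p, q)[0, 1])
    corrs = np.array(corrs)
    return int(np.argmax(corrs)), corrs

# Synthetic (hypothetical) data: flow responds to rainfall after 3 time steps.
rng = np.random.default_rng(0)
precip = rng.gamma(shape=0.5, scale=10.0, size=500)
flow = 0.8 * np.roll(precip, 3) + rng.normal(0.0, 0.5, size=500)

best_lag, corrs = estimate_concentration_time(precip, flow, max_lag=10)
print("lag with maximum correlation:", best_lag)
```

Under step 1-4, the forecast horizon would then be chosen no larger than the lag returned here.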

Step 2, calibrating the parameters of the conceptual hydrological model with the data collected in step 1, and forecasting the flow process over multiple lead times with the calibrated conceptual hydrological model; specifically:

Step 2-1, selecting a suitable conceptual hydrological model according to actual conditions, including but not limited to the Xinanjiang model;

Step 2-2, calibrating the parameters of the conceptual hydrological model with the SCE-UA method using the training period data prepared in step 1, validating the conceptual hydrological model with the validation period data, and testing its performance with the testing period data;

Step 2-3, performing flood forecasting with the tested conceptual hydrological model to obtain the flow process over multiple lead times.

Step 3, taking the forecast flow of the conceptual model in step 2 as an exogenous input, establishing a long short-term memory neural network based on an encoder-decoder structure with exogenous input (LSTM-EDE), taking the hidden layer output of the LSTM-EDE neural network as the input of a mixture density network (MDN), and constructing an LSTM-EDE-MDN model for quantifying forecast uncertainty; specifically:

Step 3-1, coupling the LSTM neural network into the encoder-decoder structure with exogenous input (EDE) to construct the LSTM-EDE model (see FIG. 1 (a) and (b)). In its decoding process, the EDE structure provides an interface for receiving an exogenous input sequence, through which it receives the flow forecast of the conceptual hydrological model (dotted line in FIG. 1 (b)). The model can thus learn the runoff generation and concentration process of the conceptual hydrological model and overcome the exposure bias problem of the traditional encoder-decoder structure, improving interpretability, forecast accuracy at long lead times, and computational efficiency;
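To make the data flow of step 3-1 concrete, the following sketch runs a forward pass through a single-layer encoder and decoder written in plain NumPy. It is an illustration under stated assumptions only: the weights are random rather than trained, one LSTM layer with 8 hidden units is used, and the names `init_lstm`, `lstm_step` and `lstm_ede_forward` are hypothetical. The point it shows is that the decoder input at every lead time is the conceptual model's flow forecast, never the network's own previous output, so the decoder sees the same kind of input during training and during forecasting.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(n_in, n_hidden, rng):
    """Random weights for one LSTM layer; the four gates are stacked row-wise."""
    return {
        "W": rng.normal(0.0, 0.1, size=(4 * n_hidden, n_in)),
        "U": rng.normal(0.0, 0.1, size=(4 * n_hidden, n_hidden)),
        "b": np.zeros(4 * n_hidden),
    }

def lstm_step(p, x, h, c):
    """One LSTM time step: input, forget and output gates plus a tanh candidate."""
    n = h.shape[0]
    z = p["W"] @ x + p["U"] @ h + p["b"]
    i, f, o = sigmoid(z[:n]), sigmoid(z[n:2 * n]), sigmoid(z[2 * n:3 * n])
    g = np.tanh(z[3 * n:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def lstm_ede_forward(enc, dec, observed, exogenous):
    """Encode observed (precipitation, flow) steps, then decode using the
    exogenous conceptual-model forecast as the decoder input at each lead time."""
    n_hidden = enc["U"].shape[1]
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    for x in observed:            # encoding process
        h, c = lstm_step(enc, x, h, c)
    hidden = []
    for x in exogenous:           # decoding process, no feedback of own outputs
        h, c = lstm_step(dec, x, h, c)
        hidden.append(h)
    return np.stack(hidden)       # hidden layer output X, one row per lead time

rng = np.random.default_rng(0)
enc = init_lstm(n_in=2, n_hidden=8, rng=rng)   # inputs: precipitation, flow
dec = init_lstm(n_in=1, n_hidden=8, rng=rng)   # input: conceptual-model flow
observed = rng.random((5, 2))                  # 5 past time steps (hypothetical)
exogenous = rng.random((3, 1))                 # 3 lead times (hypothetical)
X = lstm_ede_forward(enc, dec, observed, exogenous)
print(X.shape)
```

Each row of `X` is the decoder hidden state for one lead time, which is the quantity the text passes on to the MDN.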

Step 3-2, taking Y as the target output variable and the hidden layer output X of the LSTM-EDE decoding process as the input of the mixture density network (MDN, FIG. 1 (c)), establishing the hybrid deep learning LSTM-EDE-MDN probabilistic forecasting model (FIG. 1). The MDN combines the LSTM-EDE model with a mixture density function: the neural network generates the weights w and parameters θ of a plurality of kernel functions, and the kernel functions are summed according to the weights w to form the conditional density function f(Y | θ, X), where θ is the set of kernel function parameters, so as to approximate the true distribution of the target variable. Because the flood forecast sequence is a one-dimensional time series, the mixture density network adopts Gaussian kernel functions, and the network outputs are the weights w, means μ and standard deviations σ of the kernel functions. The weights w are normalized by a softmax function so that they form a valid discrete distribution, and σ is passed through an exponential function so that it is always positive. Given the hidden layer output X of the LSTM-EDE model, the probability density function f(Y | θ, X) of the target variable Y is:

$$f(Y \mid \theta, X) = \sum_{i=1}^{m} w_i \, \phi(Y \mid \theta_i)$$

where m is the number of kernel functions, $\phi$ is the Gaussian kernel function, $\theta_i = (\mu_i, \sigma_i)$ are the parameters of the i-th kernel function, and $w_i$ is the weight of the i-th kernel function.

The Gaussian kernel function is:

$$\phi(Y \mid \mu_i, \sigma_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(Y - \mu_i)^2}{2\sigma_i^2}\right)$$

The MDN output vector $Y_f$ therefore has 3m elements (m weights, m means and m standard deviations);

While accounting for the temporal correlation of the output variables, the LSTM-EDE-MDN model transforms the point estimates generated by the decoding process into estimates of probability distributions that reflect the uncertainty of the forecast process.
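The output transformation described in step 3-2 can be sketched as below, assuming the raw MDN output vector of length 3m is ordered as m weight logits, m means and m log-scale values, and assuming σ is interpreted as a standard deviation. That ordering, the function names `mdn_parameters` and `mixture_pdf`, and the numbers in the example are illustrative assumptions.

```python
import numpy as np

def mdn_parameters(raw, m):
    """Split a raw output vector of length 3m into weights, means and sigmas."""
    raw = np.asarray(raw, dtype=float)
    logits, mu, log_sigma = raw[:m], raw[m:2 * m], raw[2 * m:]
    logits = logits - logits.max()              # stabilise the softmax
    w = np.exp(logits) / np.exp(logits).sum()   # softmax: weights sum to 1
    sigma = np.exp(log_sigma)                   # exponential: sigma > 0
    return w, mu, sigma

def mixture_pdf(y, w, mu, sigma):
    """Weighted sum of Gaussian kernels evaluated at each value in y."""
    y = np.atleast_1d(np.asarray(y, dtype=float))[:, None]
    kernels = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    return kernels @ w

# Hypothetical raw network output for m = 3 kernels (3m = 9 elements).
raw = np.array([0.2, -1.0, 0.5, 100.0, 150.0, 220.0, 2.0, 2.5, 3.0])
w, mu, sigma = mdn_parameters(raw, m=3)

# Check that the resulting density integrates to one (trapezoidal rule).
grid = np.linspace(-200.0, 600.0, 20001)
dens = mixture_pdf(grid, w, mu, sigma)
area = float(np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(grid)))
print(w, sigma, area)
```

The softmax and exponential steps are what guarantee that any raw network output maps to a valid probability density.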

Step 4, setting the activation function and hyperparameters of the LSTM-EDE-MDN model constructed in step 3, establishing a loss function by maximum likelihood estimation, and assembling the input variables and target output variables of the LSTM-EDE-MDN model; specifically:

Step 4-1, setting the activation function and hyperparameters of the LSTM-EDE-MDN probabilistic forecasting model. The tanh function is selected as the activation function, and the hyperparameters include the number of LSTM hidden layers in the encoding and decoding processes, the number of neurons in each hidden layer, the number of kernel functions in the MDN, and so on. The encoding and decoding processes use n LSTM layers with m neuron nodes each, and the MDN uses k Gaussian kernel functions; FIG. 2 shows an MDN with 3 Gaussian kernel functions;

Step 4-2, constructing a loss function by maximum likelihood estimation. Unlike the loss functions of deterministic deep learning models (such as mean square error or mean absolute error), the loss function of the LSTM-EDE-MDN probabilistic forecasting model adjusts the network parameters by quantifying the probability density of the target variable under the conditional distribution function output by the network. The mixture density function generated by the LSTM-EDE-MDN model is f(Y | θ, X), and when the model is trained, the adaptive moment estimation (Adam) algorithm searches for the parameters that maximize the log-likelihood ln f(Y | θ, X) of the target variable Y. Because Adam updates the network parameters during backpropagation in the direction in which the loss function decreases, the loss function is defined as the negative log-likelihood:

$$\mathrm{Loss} = -\frac{1}{n}\sum_{j=1}^{n} \ln f(Y_j \mid \theta, X_j)$$

where n is the data length of one batch, and $Y_j$ and $X_j$ are the j-th target value and the corresponding hidden layer output in the batch.
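A minimal sketch of the maximum-likelihood loss of step 4-2 follows, assuming the loss is the batch-mean negative log-likelihood of a Gaussian mixture. The function name `mdn_nll` and the example numbers are hypothetical, and the log-sum-exp rearrangement is a numerical-stability choice made for this sketch rather than something stated in the text.

```python
import numpy as np

def mdn_nll(y, w, mu, sigma):
    """Mean negative log-likelihood of targets under per-sample Gaussian mixtures.

    y has shape (n,); w, mu and sigma have shape (n, m) for m kernels.
    """
    y = np.asarray(y, dtype=float)[:, None]
    # Log of each Gaussian kernel evaluated at the target value.
    log_kernel = (-0.5 * ((y - mu) / sigma) ** 2
                  - np.log(sigma) - 0.5 * np.log(2.0 * np.pi))
    a = np.log(w) + log_kernel
    # Log-sum-exp over kernels avoids underflow for poorly fitting components.
    a_max = a.max(axis=1, keepdims=True)
    log_f = a_max[:, 0] + np.log(np.exp(a - a_max).sum(axis=1))
    return float(-log_f.mean())

# Hypothetical batch of n = 2 targets with m = 2 kernels each.
y = np.array([100.0, 150.0])
w = np.array([[0.7, 0.3], [0.4, 0.6]])
mu = np.array([[98.0, 120.0], [140.0, 152.0]])
sigma = np.array([[5.0, 10.0], [8.0, 6.0]])

loss_good = mdn_nll(y, w, mu, sigma)          # kernels centred near the targets
loss_bad = mdn_nll(y, w, mu + 50.0, sigma)    # kernels shifted away from them
print(loss_good, loss_bad)
```

Shifting the kernels away from the observations raises the loss, which is the signal the optimizer uses to move the mixture toward the observed flows.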

Step 4-3, the measured precipitation and flow data before the forecast issue time are taken as the input variables of the encoding process of the LSTM-EDE-MDN model, and the number of input time steps equals the average runoff generation and concentration time of the basin. The flow forecast by the conceptual hydrological model is taken as the input of the decoding process, i.e., the exogenous input sequence. The output of the LSTM-EDE-MDN model is the conditional distribution function of the target variable at each forecast time, and the number of output time steps equals the forecast horizon. In addition, the data sets for training, validating and testing the LSTM-EDE-MDN model are assembled from the data collected in step 1, each comprising the inputs of the encoding and decoding processes and the target output variables.
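The sample layout of step 4-3 can be sketched as a sliding window over the time series. The function name `build_samples` and the toy series are assumptions, and for simplicity the sketch treats the conceptual model output as a single series `sim_flow` indexed by valid time, whereas an operational forecast may differ for each issue time.

```python
import numpy as np

def build_samples(precip, flow, sim_flow, n_in, n_out):
    """Assemble encoder inputs, decoder inputs and targets for each issue time t0.

    Encoder input: measured precipitation and flow for the n_in steps before t0.
    Decoder input: conceptual-model flow for the n_out steps from t0 onward.
    Target:        measured flow for the same n_out steps.
    """
    precip = np.asarray(precip, dtype=float)
    flow = np.asarray(flow, dtype=float)
    sim_flow = np.asarray(sim_flow, dtype=float)
    enc, dec, tgt = [], [], []
    for t0 in range(n_in, len(flow) - n_out + 1):
        enc.append(np.column_stack([precip[t0 - n_in:t0], flow[t0 - n_in:t0]]))
        dec.append(sim_flow[t0:t0 + n_out, None])
        tgt.append(flow[t0:t0 + n_out])
    return np.array(enc), np.array(dec), np.array(tgt)

# Toy series (hypothetical values) so the window contents are easy to read.
precip = np.arange(20.0)
flow = 2.0 * np.arange(20.0)
sim_flow = flow + 1.0

enc, dec, tgt = build_samples(precip, flow, sim_flow, n_in=4, n_out=3)
print(enc.shape, dec.shape, tgt.shape)
```

Here `n_in` plays the role of the average runoff generation and concentration time and `n_out` the forecast horizon; the resulting arrays would then be split chronologically into training, validation and testing sets.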

Step 5, training the LSTM-EDE-MDN model with the input variables and target output variables assembled in step 4, then calculating the conditional probability distribution function of the target variable and obtaining the prediction interval at a given confidence level, thereby quantifying forecast uncertainty; specifically:

Step 5-1, training a plurality of LSTM-EDE-MDN models with the training set assembled in step 4. The training process includes setting the Adam algorithm parameters, for example a learning rate of 0.001; the batch size and the number of training epochs are set to 120 and 600, respectively, so that the loss function is optimized quickly; and the dropout rate is set to 0.1 to obtain the best generalization performance.

Step 5-2, feeding the validation set data assembled in step 4 into the LSTM-EDE-MDN models trained in step 5-1 to obtain the conditional probability distribution function of the target variable, and selecting from the plurality of LSTM-EDE-MDN models the network parameters with the best generalization performance on the validation set, with the minimum continuous ranked probability score (CRPS) as the objective. The CRPS evaluates how well the conditional distribution function of the probabilistic forecast fits the actual distribution of the predictand and jointly accounts for the reliability and sharpness of the probabilistic forecast. A lower CRPS indicates better probabilistic forecasting performance. It is calculated as:

$$\mathrm{CRPS} = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{+\infty}\left[F_i(r) - I(r \ge Q_{o,i})\right]^2 \, dr$$

where n is the number of samples, $Q_{o,i}$ is the i-th target variable (measured flow), $F_i(\cdot)$ is the forecast probability distribution function for the i-th sample, $I(\cdot)$ is the indicator function, and r denotes the flow variable.
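For a Gaussian mixture forecast, the CRPS used for model selection in step 5-2 can be approximated by numerical integration, as in this sketch. It assumes the standard CRPS definition (squared difference between the forecast CDF and a step function at the observation, integrated over flow), a finite integration grid wide enough to cover the forecast distribution, and hypothetical function names and example values.

```python
import numpy as np
from math import erf

_erf = np.vectorize(erf)

def mixture_cdf(r, w, mu, sigma):
    """CDF of a Gaussian mixture evaluated at every value in the 1-D array r."""
    r = np.asarray(r, dtype=float)[:, None]
    z = (r - mu) / (sigma * np.sqrt(2.0))
    return (0.5 * (1.0 + _erf(z))) @ w

def crps_single(q_obs, w, mu, sigma, grid):
    """CRPS for one forecast distribution and one observed flow q_obs."""
    F = mixture_cdf(grid, w, mu, sigma)
    step = (grid >= q_obs).astype(float)        # indicator I(r >= q_obs)
    integrand = (F - step) ** 2
    # Trapezoidal rule over the integration grid.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid)))

def crps_mean(q_obs, w, mu, sigma, grid):
    """Average CRPS over n samples; w, mu and sigma have shape (n, m)."""
    return float(np.mean([crps_single(q, w[i], mu[i], sigma[i], grid)
                          for i, q in enumerate(q_obs)]))

# Hypothetical validation samples: n = 2 observations, m = 2 kernels each.
q_obs = np.array([100.0, 150.0])
w = np.array([[0.7, 0.3], [0.4, 0.6]])
mu = np.array([[98.0, 120.0], [140.0, 152.0]])
sigma = np.array([[5.0, 10.0], [8.0, 6.0]])
grid = np.linspace(0.0, 400.0, 4001)

crps_good = crps_mean(q_obs, w, mu, sigma, grid)
crps_bad = crps_mean(q_obs, w, mu + 50.0, sigma, grid)
print(crps_good, crps_bad)
```

Among several trained models, the one with the smallest validation CRPS would be retained, which rewards forecasts that are both centred on the observations and narrow.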

Step 5-3, using the network parameters selected in step 5-2, the probabilistic forecasting performance of the LSTM-EDE-MDN model is tested with the test set data assembled in step 4 to obtain the conditional probability distribution function and the deterministic forecast of the target variable, and a 95% confidence level is set to obtain the prediction interval, thereby quantifying forecast uncertainty. FIG. 3 compares the measured flow, the expected-value forecast flow calculated by the model of this embodiment, and the 95% prediction interval. As FIG. 3 shows, the expected-value flow forecast obtained in this embodiment fits the measured flow process well, and the 95% prediction interval covers most of the measured flow points, which indicates that the prediction interval is reasonable and reliable and quantifies the forecast uncertainty appropriately.
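One way to derive the expected-value forecast and a 95% prediction interval from a fitted mixture is sketched below. The text does not specify how the interval is constructed; this sketch assumes a central (equal-tailed) interval obtained by inverting the mixture CDF with bisection, and the function names and example parameters are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def mixture_cdf(r, w, mu, sigma):
    """CDF of a one-dimensional Gaussian mixture at the scalar r."""
    z = (r - mu) / (sigma * sqrt(2.0))
    return float(np.sum(w * 0.5 * (1.0 + np.array([erf(v) for v in z]))))

def mixture_quantile(p, w, mu, sigma, tol=1e-8):
    """Invert the mixture CDF by bisection (the CDF is monotone increasing)."""
    lo = float(np.min(mu - 10.0 * sigma))
    hi = float(np.max(mu + 10.0 * sigma))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, w, mu, sigma) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def forecast_summary(w, mu, sigma, level=0.95):
    """Expected value and central prediction interval at the given level."""
    alpha = 1.0 - level
    expected = float(np.dot(w, mu))             # mixture mean: sum of w_i * mu_i
    lower = mixture_quantile(alpha / 2.0, w, mu, sigma)
    upper = mixture_quantile(1.0 - alpha / 2.0, w, mu, sigma)
    return expected, lower, upper

# Hypothetical mixture for one lead time.
w = np.array([0.7, 0.3])
mu = np.array([98.0, 120.0])
sigma = np.array([5.0, 10.0])
expected, lower, upper = forecast_summary(w, mu, sigma)
print(expected, lower, upper)
```

Repeating this for every lead time gives the expected-value hydrograph and the interval band of the kind compared against measured flow in FIG. 3.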

In summary, the invention first collects the basic hydrometeorological data of the study basin, establishes a conceptual model, and forecasts the flood process over multiple lead times. It then couples a mixture density network (MDN) to the output layer of an LSTM-EDE neural network that takes the forecast flow of the conceptual model as exogenous input, constructing the LSTM-EDE-MDN probabilistic forecasting model. A loss function is established by maximum likelihood estimation to train the neural network parameters, and the conditional distribution function and prediction interval for each lead time are finally obtained, realizing probabilistic forecasting. By coupling a mixture density function with an LSTM-EDE neural network that takes the forecast flow of the conceptual model as exogenous input, the invention overcomes the exposure bias problem of the traditional encoder-decoder structure, learns the runoff generation and concentration process of the conceptual hydrological model, and obtains probabilistic forecasts of the multi-lead-time flood process while accounting for the temporal correlation of the output variables, thereby quantifying forecast uncertainty and improving the applicability, interpretability and credibility of the deep learning model.

While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims

1. A flood process probability forecasting method based on a hybrid deep learning model is characterized by comprising the following steps:

2. The flood process probability forecasting method based on the hybrid deep learning model according to claim 1, wherein the step 1 specifically comprises:

Step 1-2, dividing the data into training period, validation period and testing period data;

Step 1-3, using the measured precipitation and flow data collected in step 1-1, estimating the average runoff generation and concentration time of the basin from the correlation coefficients between precipitation and flow at different time lags, the time lag with the maximum correlation coefficient being taken as the average runoff generation and concentration time of the basin;

Step 1-4, determining the forecast horizon of the flood forecast according to actual task requirements such as flood control, the forecast horizon being less than or equal to the average runoff generation and concentration time of the basin.

3. The flood process probability forecasting method based on the hybrid deep learning model according to claim 1, wherein the step 2 specifically comprises:

4. The flood process probability forecasting method based on the hybrid deep learning model according to claim 1, wherein the step 3 specifically comprises:

Step 3-2, taking Y as the target output variable and the hidden layer output X of the LSTM-EDE decoding process as the input of the mixture density network (MDN), establishing the hybrid deep learning LSTM-EDE-MDN probabilistic forecasting model; the LSTM-EDE-MDN model outputs the weights w and parameters θ of a plurality of kernel functions, and the kernel functions are combined according to the weights w into the conditional density function f(Y | θ, X) of the target variable Y:

$$f(Y \mid \theta, X) = \sum_{i=1}^{m} w_i \, \phi(Y \mid \theta_i)$$

where m is the number of kernel functions, $\phi(Y \mid \theta_i)$ is the i-th Gaussian kernel function with parameters $\theta_i$, and $w_i$ is the weight of the i-th kernel function.

5. The flood process probability forecasting method based on the hybrid deep learning model according to claim 1, wherein the step 4 specifically comprises:

Step 4-3, taking the measured precipitation and flow data before the forecast issue time as the input of the encoding process of the LSTM-EDE-MDN model, the number of input time steps being equal to the average runoff generation and concentration time of the basin; taking the flow forecast by the conceptual hydrological model as the input of the decoding process; the output of the LSTM-EDE-MDN model being the conditional distribution function of the target variable at each forecast time, the number of output time steps being equal to the forecast horizon; in addition, assembling from the data collected in step 1 the data sets for training, validating and testing the LSTM-EDE-MDN model, each comprising the input variables of the encoding and decoding processes and the target output variables.

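6. The flood process probability forecasting method based on the hybrid deep learning model according to claim 1, wherein in the step 4-2, the constructed loss function is:

$$\mathrm{Loss} = -\frac{1}{n}\sum_{j=1}^{n} \ln f(Y_j \mid \theta, X_j)$$

where n is the data length of one batch, and $Y_j$ and $X_j$ are the j-th target value and the corresponding hidden layer output in the batch;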

7. The flood process probability forecasting method based on the hybrid deep learning model according to claim 1, wherein the step 5 specifically comprises:

Step 5-1, using the input variables and target output variables assembled in step 4 as the data sets for training, validating and testing the LSTM-EDE-MDN model, and training a plurality of LSTM-EDE-MDN models with the training set assembled in step 4;