CN115099497A

CN115099497A - CNN-LSTM-based real-time flood forecasting intelligent method

Info

Publication number: CN115099497A
Application number: CN202210741890.6A
Authority: CN
Inventors: 任明磊; 徐炜; 魏国振; 王刚; 赵丽平; 王凯; 顾李华; 喻海军; 胡友兵; 杨雨霞
Original assignee: Huaihe River Water Resources Commission Hydrology Bureau (information Center); China Institute of Water Resources and Hydropower Research; Chongqing Jiaotong University
Current assignee: Huaihe River Water Resources Commission Hydrology Bureau (information Center); China Institute of Water Resources and Hydropower Research; Chongqing Jiaotong University
Priority date: 2022-06-28
Filing date: 2022-06-28
Publication date: 2022-09-23
Anticipated expiration: 2042-06-28
Also published as: CN115099497B

Abstract

The invention discloses a CNN-LSTM-based real-time flood forecasting intelligent method. The invention constructs a deep learning network (CNN-LSTM) for hydrological process prediction based on a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. In the CNN-LSTM, the CNN identifies and extracts spatial precipitation data and the LSTM learns the time series relationship between precipitation and flow. Together, the two networks give the CNN-LSTM the ability to identify both spatial and temporal information. The CNN-LSTM has a powerful ability to learn the nonlinear and complex processes of hydrological modeling.

Description

CNN-LSTM-based real-time flood forecasting intelligent method

Technical Field

The invention belongs to the technical field of hydrologic forecasting, and particularly relates to a CNN-LSTM-based real-time flood forecasting intelligent method.

Background

Basin runoff is highly nonlinear, and the runoff process reflects the spatial and temporal distribution of influencing factors such as basin topography, vegetation and rainfall. As the learning ability of deep neural networks continues to improve, applying them to learn the spatial and temporal distribution of the factors that influence basin runoff is one of the important future research directions in hydrology. Deep learning is currently the most popular research direction in machine learning. Deep neural networks have shown great potential in applications such as object recognition and speech recognition. Among them, the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) are the two mainstream deep neural network models.

Improvements in CNN learning efficiency have promoted the development of machine learning, and the CNN has become one of the core network models in deep learning. The prototype of the CNN is the convolutional neural layer, which is based on the receptive field theory of biological neuroscience and gives the CNN a strong ability to perceive external information. On the basis of the convolutional neural layer, a convolutional neural network was constructed by coupling it with the back-propagation algorithm and was successfully applied to a handwritten character recognition system. The classical convolutional neural network LeNet-5, proposed subsequently, greatly improved the recognition accuracy for handwritten characters. Although the CNN achieved good results in simple handwritten digit recognition, it learned poorly on large-scale, complex pattern data. In 2012, the AlexNet structure of Alex Krizhevsky used the ReLU function as the activation function and addressed overfitting in the fully connected layers through the Dropout technique, which further improved the recognition performance and learning speed of the network; with this deeper CNN, Krizhevsky et al. achieved the best classification results. CNNs then attracted more and more research, which brought them into a new era of development. In 2014, Szegedy et al. established a CNN with more than 20 layers, called GoogLeNet. It has about 12 times fewer parameters than AlexNet and a higher image classification capability. Also in 2014, Simonyan et al. improved model performance by increasing the depth of the convolutional layers of the CNN structure, giving the VGG model. Their results show that model performance improves effectively when the number of weight layers reaches 16 to 19. AlexNet, GoogLeNet and VGG laid the foundation of the classical CNN, and many excellent neural networks have been developed on this basis.

A Recurrent Neural Network (RNN) is a time-recursive network with a strong ability to learn time series data. Because of vanishing or exploding gradients, however, the RNN gradually loses the ability to learn long-range information. To solve these problems, Hochreiter & Schmidhuber (1997) constructed the Long Short-Term Memory (LSTM) network by improving the RNN model. The LSTM is a special kind of RNN that performs better than an ordinary RNN when learning long sequences. On the basis of the standard LSTM, Gers & Schmidhuber (2000) proposed a popular LSTM variant that adds "peephole connections" so that the gate layers also receive the cell state as input. Another variant, the Gated Recurrent Unit (GRU), was proposed subsequently; it combines the forget gate and the input gate into a single update gate and also merges the cell state and the hidden state. The resulting model is simpler than the standard LSTM and is a very popular variant. As a nonlinear model, the LSTM is well suited to building larger deep neural networks.

In recent years, with the continuous improvement of deep learning technology, the CNN has become highly efficient at identifying images, while the LSTM has a very strong ability to learn time series data. In natural language processing, CNN and LSTM are currently used in combination for sentiment analysis in movie recommendation: the LSTM can acquire a representation of the entire sentence and capture long-term dependencies in the feature sequence. In other fields, CNN and LSTM are combined for gesture recognition, and comparisons have shown that CNN-LSTM performs better than CNN or LSTM alone. CNN-LSTM is also used for identification and description in computer vision.

The CNN and the LSTM thus have very strong nonlinear spatio-temporal identification capability, and building neural networks that learn the spatio-temporal variation of hydrometeorological elements is an important technical direction for future hydrological models. At present, however, no coupled CNN-LSTM research result has been applied to hydrological forecasting or flood forecasting.

Among hydrological models, traditional statistical models and lumped hydrological models, such as the seasonal autoregressive model and the Xinanjiang model, can reflect the temporal variation of runoff but not the spatial information of the basin. Distributed hydrological models are based on gridded data and can account for the spatio-temporal variation of hydrometeorological elements, but they suffer from a heavy computational load and difficult parameter calibration. Although hydrological models have been developed over a long period, their representation of the physical mechanism of hydrological response is still not sufficiently accurate. Building a hydrological deep learning model from CNN and LSTM networks is therefore a direction worth further research and development.

Disclosure of Invention

The main aim of the invention is to establish, using CNN and LSTM, a deep neural network model that can learn the runoff generation and concentration processes of a basin and the highly nonlinear spatio-temporal variation of hydrological elements, and thereby to provide a CNN-LSTM-based real-time flood forecasting intelligent method.

The CNN-LSTM deep learning model is constructed by combining the CNN and the LSTM. Firstly, CNN is adopted to read and identify spatial distribution information influencing the runoff process, and then LSTM is adopted to learn the time variation characteristics of the runoff process. The purpose of the invention is realized by the following technical scheme.

A CNN-LSTM-based real-time flood forecasting intelligent method adopts a CNN-LSTM model to carry out flood forecasting, wherein the CNN in the CNN-LSTM model is used for identifying and extracting spatial precipitation data, the LSTM is used for learning the time series relation between precipitation and flow, meteorological data are input and the predicted runoff is output, and the CNN-LSTM model is constructed by the following steps:

step 1: randomly generating meteorological data for many years by adopting a K-NN algorithm;

step 2: constructing a basin runoff simulation model based on a SWAT model, inputting meteorological data simulated by the K-NN in the step 1 into the SWAT model, and generating corresponding runoff data;

step 3: training the LSTM network with the runoff data generated in step 2 as the preliminary data set;

step 4: connecting the LSTM network with the CNN network to construct a CNN + LSTM neural network: the CNN + LSTM neural network comprises a CNN convolutional layer, a CNN pooling layer, a CNN and LSTM connection layer, an LSTM cell layer and a fully connected layer B;

the CNN convolutional layer extracts the spatial features of the basin meteorological information by reading a meteorological data grid matrix: the basin meteorological spatial information is input into the CNN model as a feature map matrix, and filters are used to identify the input matrix; each filter is convolved with the region of the input matrix that it covers, and the convolved feature matrix is output;

the pooling layer downsamples the convolved matrix, reducing its dimensionality and extracting its principal components so as to reduce the optimization difficulty and the number of parameters;

the CNN and LSTM connection layer comprises a flatten layer and a fully connected layer A; the flatten layer expands the data output by the pooling layer into a one-dimensional matrix, and the fully connected layer A applies a linear transformation to output a matrix $Y_{out}$ whose length equals the number of LSTM cells, which serves as the input to the LSTM network;

the LSTM cell layer learns the temporal variation of the input data matrix $Y_{out}$ and establishes a nonlinear relation between the meteorological data and the hydrological data; after the recurrent learning of the LSTM cell layer, a one-dimensional matrix LSTM_Output is output to the fully connected layer B for learning;

the fully connected layer B finally outputs the predicted runoff, the fully connected layer B being a neural layer whose output is transformed by an activation function.
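The flow of data through the layers listed in step 4 can be followed by tracking array shapes only. The sketch below is illustrative: the function name `trace_shapes`, its parameters, stride-1 "same" convolution, and a pooling stride equal to the pooling window (with partial edge windows kept) are all assumptions, not details taken from the method.

```python
import math

def trace_shapes(height, width, n_filters, pool_size, n_cells):
    """List the (assumed) output shape of each layer for one time step."""
    shapes = [("input grid", (height, width))]
    # "same" convolution keeps height and width; one map per filter.
    shapes.append(("convolution", (n_filters, height, width)))
    # Pooling shrinks each map; ceil keeps partial windows at the edges.
    pooled_h = math.ceil(height / pool_size)
    pooled_w = math.ceil(width / pool_size)
    shapes.append(("pooling", (n_filters, pooled_h, pooled_w)))
    # Flatten expands the pooled maps into a one-dimensional matrix.
    shapes.append(("flatten", (n_filters * pooled_h * pooled_w,)))
    # Fully connected layer A outputs one value per LSTM cell.
    shapes.append(("fully connected layer A", (n_cells,)))
    shapes.append(("LSTM cell layer", (n_cells,)))
    # Fully connected layer B outputs the predicted runoff.
    shapes.append(("fully connected layer B", (1,)))
    return shapes

for name, shape in trace_shapes(8, 8, n_filters=1, pool_size=2, n_cells=25):
    print(f"{name}: {shape}")
```

With an 8 × 8 grid, one filter, 2 × 2 pooling and 25 cells, the flatten layer hands 16 values to fully connected layer A.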

Further, the step 1 specifically operates as follows:

1) taking the measured historical 5-day mean values over M years as samples, M being the number of years of data;

2) randomly selecting one of the M January 1 samples as the initial value for the first simulated day;

3) taking the n days before and after day t as the L potential neighbouring days, where t = 1, 2, ..., 365, n = 14 days and L = 2 × 7 × M;

4) calculating, by the Euclidean distance $D_i$, the similarity between the weather of the current day and that of each of the L potential neighbouring days, then sorting the L distances in ascending order and randomly selecting K of them as the potential nearest neighbouring days, where K is 6 to 14;

5) establishing the cumulative probability density function $P_j$ for the K potential nearest neighbour samples, so that a smaller distance corresponds to more similar meteorological conditions;

6) generating a random number uniformly distributed in the range 0 to 1, using it to draw one day from the K potential nearest neighbouring days as the nearest neighbour of the current simulated day, and taking the day following that nearest neighbour as the simulated weather for day t + 1;

7) repeating steps 4) to 6) until the required number of simulated days is reached.

Further, the construction of the basin runoff simulation model based on the SWAT model in the step 2 comprises the following steps:

(1) the data used are DEM data, land use data, soil data and multi-year meteorological data, which are preprocessed to generate the data required by the model;

(2) converting the DEM, land use and soil data into projected coordinates in ArcGIS, generating a number of sub-basins according to the set minimum sub-basin area, and inputting the land use and soil data to construct hydrological response units;

(3) calibrating the model against the runoff measured at the hydrological stations with the SUFI-2 algorithm in SWAT-CUP, and selecting the parameter set with the highest NSE as the SWAT parameters;

(4) generating at least 100 years of daily runoff data with the calibrated SWAT model, using the K-NN simulated meteorological data as input; discarding the first 10 years of simulated runoff as the model warm-up period, together with the meteorological data of the same period; retaining the remaining simulated runoff and the corresponding meteorological data; and selecting the outlet runoff of the corresponding sub-basin as the LSTM model experimental data.

The invention has the beneficial effects that:

(1) The runoff generation and concentration mechanisms of basin hydrology are still insufficiently understood, and traditional hydrological models remain limited in runoff simulation and forecasting. The CNN-LSTM model is an intelligent model with a stronger learning ability; it can make up for the shortcomings of specialised hydrological models, can be trained on observed rainfall and runoff processes, and is easy to build.

(2) The basin flood process is driven by rainfall and is highly nonlinear and variable in space and time. The CNN-LSTM can capture the spatio-temporal variation of rainfall within the basin and has a very strong nonlinear learning capacity, so it adapts better to flood forecasting.

Drawings

FIG. 1 is a diagram of a CNN-LSTM model architecture;

FIG. 2 shows the structure of the cell layer of the LSTM;

FIG. 3 illustrates an LSTM memory cell structure;

FIG. 4 shows the training error under different network structures and parameters;

FIG. 5 is a comparison of simulated and measured runoff in the muddy river basin;

FIG. 6 shows the runoff process of the muddy river basin during the CNN-LSTM network training and verification stages;

FIG. 7 shows the training error for different T values in the muddy river basin.

Detailed Description

A CNN-LSTM-based real-time flood forecasting intelligent method adopts a CNN-LSTM model to carry out flood forecasting, wherein the CNN in the CNN-LSTM model is used for identifying and extracting spatial precipitation data, the LSTM is used for learning the time series relation between precipitation and flow, meteorological data are input and the predicted runoff is output, and the CNN-LSTM model is constructed by the following steps:

Step 1: multi-year meteorological data are randomly generated with a K-NN algorithm and used to pre-train the network model, which compensates for the scarcity of data in the hydrological field. The invention uses the K-NN algorithm to simulate meteorological data samples for the meteorological stations, as follows:

1) The measured historical 5-day mean values over M years are taken as samples (M is the number of years of measured data).

2) One of the M January 1 samples is selected at random as the initial value for the first simulated day.

3) With day t (t = 1, 2, ..., 365) as the centre, the n days before and after it are taken as the potential neighbouring days (L). In this embodiment n = 14 d, so each time step has L = 2 × 7 × M potential neighbouring days.

4) The Euclidean distance $D_i$ is used to measure the similarity between the weather of the current day and that of each of the L potential neighbouring days. The L distances are then sorted in ascending order and the first K are selected as the potential nearest neighbouring days; K = 6 in this study.

In the formula: i ∈ [1, 15 × M − 1]; the station term denotes the value of the j-th variable at station h on day t; m is the number of stations; $S_t$ is the covariance matrix; and the two mean terms are the mean weather of the current day and the mean weather of the L potential neighbouring days, respectively.

5) The cumulative probability density function $P_j$ is established for the K potential nearest neighbour samples, so that a smaller distance corresponds to more similar meteorological conditions.

6) A random number uniformly distributed in the range 0 to 1 is generated and used to draw one day from the K potential nearest neighbouring days as the nearest neighbour of the current simulated day; the day following that nearest neighbour is taken as the simulated weather for day t + 1.
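Steps 3) to 6) can be sketched as a single resampling function. This is a minimal illustration under stated assumptions, not the method's exact formulation: the name `knn_next_day`, the `history[year][day]` layout, the plain Euclidean distance on one weather vector per day (no covariance scaling), and the rank-based weight 1/j used to build the cumulative probabilities are all assumed here, since the distance and probability formulas are not reproduced in the text.

```python
import math
import random

def knn_next_day(history, current, t, half_window=7, k=6, rng=random):
    """Pick the simulated weather for day t + 1 by K-NN resampling.

    history[y][d] is the weather vector of day d in year y;
    current is the weather vector of the simulated day t.
    """
    n_days = len(history[0])
    # Step 3: candidate days within +/- half_window of t, in every year.
    candidates = []
    for y, year in enumerate(history):
        for offset in range(-half_window, half_window + 1):
            d = (t + offset) % n_days
            if d + 1 < n_days:  # assumption: the successor lies in the same year
                candidates.append((math.dist(current, year[d]), y, d))
    # Step 4: sort by distance and keep the K closest days.
    candidates.sort()
    nearest = candidates[:k]
    # Step 5: rank-based weights 1/j (assumed), accumulated to a CDF.
    weights = [1.0 / j for j in range(1, len(nearest) + 1)]
    total = sum(weights)
    # Step 6: draw u ~ U(0, 1) and pick the first neighbour whose CDF covers u.
    u = rng.random()
    cumulative = 0.0
    chosen = nearest[-1]
    for weight, candidate in zip(weights, nearest):
        cumulative += weight / total
        if u <= cumulative:
            chosen = candidate
            break
    _, y, d = chosen
    return history[y][d + 1]  # the day after the chosen neighbour
```

Calling the function repeatedly, feeding each result back in as `current` with `t` advanced by one, gives the loop of step 7). Closer neighbours receive larger weights, so they are drawn more often.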

Step 2: a basin runoff simulation model is built on the SWAT model, and the meteorological data simulated by K-NN in step 1 are input into the SWAT model to generate the corresponding runoff data, which serve as the preliminary data set for LSTM network training. (K-NN generates meteorological data, which are input into SWAT to generate runoff data; the enlarged data set is then used for LSTM network training, compensating for the scarcity of hydrological data.)

Constructing a SWAT model:

(1) The data used are DEM data with a resolution of 1000 × 1000 m, land use data with a resolution of 1000 × 1000 m, soil data with a resolution of 1000 × 1000 m, and daily meteorological data for 1991-2016. The data are preprocessed to generate the data required by the model.

(2) The DEM, land use and soil data are converted into projected coordinates in ArcGIS, 25 sub-basins are generated according to the set minimum sub-basin area, and the land use and soil data are input to construct hydrological response units.

(3) The model is calibrated against the runoff measured at the hydrological stations with the SUFI-2 algorithm in SWAT-CUP, and the parameter set with the highest NSE is selected as the SWAT parameters.

(4) With the calibrated SWAT model and the K-NN simulated meteorological data as input, 140 years of daily runoff data are generated. The first 10 years of simulated runoff are discarded as the model warm-up period, together with the meteorological data of the same period, and the remaining 130 years of simulated runoff and corresponding meteorological data are retained. The outlet runoff of the corresponding sub-basin is selected as the LSTM model experimental data.
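The warm-up handling in item (4) amounts to dropping the same leading period from both series so that they stay aligned. A minimal sketch, assuming daily series stored as Python lists and 365-day years (leap days ignored); the function name `drop_warmup` is hypothetical:

```python
def drop_warmup(weather, runoff, warmup_years=10, days_per_year=365):
    """Discard the warm-up period from paired daily weather and runoff series."""
    if len(weather) != len(runoff):
        raise ValueError("weather and runoff must cover the same days")
    start = warmup_years * days_per_year
    # Slice both series at the same index so each runoff value
    # keeps its matching meteorological record.
    return weather[start:], runoff[start:]

weather = list(range(140 * 365))           # placeholder daily weather records
runoff = [0.1 * w for w in weather]        # placeholder simulated runoff
kept_weather, kept_runoff = drop_warmup(weather, runoff)
print(len(kept_runoff) // 365)             # 130 years remain
```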

Step 3: the LSTM network is trained with the runoff data generated in step 2 as the preliminary data set.

On the basis of the CNN and LSTM network structures, the two networks are restructured and combined into the CNN + LSTM neural network. The CNN + LSTM model is used for basin runoff simulation and forecasting, so that it can identify and learn the spatio-temporal variation of meteorological elements such as basin rainfall and temperature. The structure of the CNN + LSTM neural network is shown in FIG. 1.

CNN convolutional layer:

The meteorological data input to the CNN model form a grid matrix that reflects the spatial distribution of meteorological information within the basin. The main function of the convolutional layer is to extract the spatial features of the meteorological information by reading this matrix. First, the meteorological spatial distribution (MSD) is input into the CNN model as a feature map matrix ($H_i \times W_i$), as shown in equation (3). Then $D_c$ filters are used to identify the input matrix; the weight matrix and bias of each filter are shown in equation (4). Each filter scans the feature map matrix starting from the upper left corner, moving left to right and top to bottom, until it reaches the lower right corner. Equation (5) is used to convolve each of the $D_c$ filters with the region of the input matrix that it covers. Finally, $D_c$ convolved feature maps, $F_{conv}$, are output. Because this embodiment convolves in "same" mode, $F_{in}$ and $F_{conv}$ have the same height and width.

In the formulas, $F_{in}$ is the input feature map; $F_{conv}$ is the feature map output by the convolution; $H_i$ and $W_i$ are the height and width of $F_{in}$; $H_c$ and $W_c$ are the height and width of $F_{conv}$; $D_c$ is the number of convolution kernels; $f_{conv}$ is the convolution output, i.e. the value computed at height h and width w for the d-th convolution kernel; $W_{filter(d)}$ is the weight of the d-th convolution kernel; $b_{filter(d)}$ is the bias of the d-th convolution kernel; and n is the convolution stride.
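The scan described above can be sketched in plain Python. This is an illustration under assumptions: stride 1, zero padding to obtain the "same" output size, square odd-sized kernels, and the hypothetical function name `conv2d_same`; as is usual in CNNs, the kernel is applied without flipping.

```python
def conv2d_same(f_in, kernels, biases):
    """Convolve an H x W grid with each kernel; every output map is H x W."""
    height, width = len(f_in), len(f_in[0])
    feature_maps = []
    for kernel, bias in zip(kernels, biases):
        size = len(kernel)
        pad = size // 2  # zero padding that keeps height and width unchanged
        out = [[0.0] * width for _ in range(height)]
        # Scan from the upper left corner, left to right, top to bottom.
        for h in range(height):
            for w in range(width):
                acc = bias
                for i in range(size):
                    for j in range(size):
                        r, c = h + i - pad, w + j - pad
                        if 0 <= r < height and 0 <= c < width:
                            acc += kernel[i][j] * f_in[r][c]
                out[h][w] = acc
        feature_maps.append(out)
    return feature_maps

rain = [[1, 2], [3, 4]]                     # toy rainfall grid
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
maps = conv2d_same(rain, [identity, ones], [0.0, 0.0])
print(maps[0])  # the identity kernel reproduces the input grid
```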

Pooling layer (Pooling):

Because the input matrix is large, the convolved feature map matrix is downsampled to reduce the optimization difficulty and the number of parameters; this step is pooling, and its process is given by equation (6). Pooling reduces the size of the matrix and extracts its principal components. This embodiment uses the maximum value method (max pooling), as shown in equation (7).

In the formulas, $n_p$ is the pooling stride; $f_{pool}$ is the output of the pooling calculation; $F_{pool}$ is the pooling layer feature matrix.
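Max pooling can be sketched as follows, assuming non-overlapping windows (stride equal to the window size) and partial windows kept at the edges; the function name `max_pool` is hypothetical.

```python
def max_pool(feature_map, size=2):
    """Downsample a 2-D feature map by taking the maximum of each window."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c]
                for r in range(top, min(top + size, height))
                for c in range(left, min(left + size, width))
            )
            for left in range(0, width, size)
        ]
        for top in range(0, height, size)
    ]

grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(max_pool(grid))  # each 2 x 2 block is reduced to its largest value
```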

Connection layer of CNN and LSTM:

In FIG. 1, the flatten layer expands the feature maps ($F_{pool}$) output by the pooling layer into a one-dimensional matrix ($L_{in} = H_p \times W_p \times \ldots$), and the linear transformation of fully connected layer A then outputs a matrix $Y_{out}$ whose length equals the number of LSTM cells. The number of LSTM cells is K ($L_{in}$, K). Fully connected layer A has no activation function, as shown in equation (8). The matrix $Y_{out}$ is the input to the LSTM network. The LSTM cell layer then learns the temporal variation of the input data and establishes a nonlinear relation between the meteorological data and the hydrological data.

$Y_{out}(L_{out}) = X_{in}(L_{in}) \times W_{fully}(L_{in}, L_{out}) + b_{fully}$ (8)

In the formula, $Y_{out}$ is the output matrix; $L_{out}$ is the length of the output, equal to the number of LSTM cells; $L_{in}$ is the length of the flattened input; $X_{in}$ is the input matrix; $W_{fully}$ and $b_{fully}$ are the weight and bias matrices of the layer.
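Equation (8) is a plain linear map applied to the flattened pooling output. A minimal sketch with Python lists; the function names `flatten` and `linear` and the toy weights are illustrative assumptions:

```python
def flatten(feature_maps):
    """Expand a list of 2-D feature maps into one flat list of length L_in."""
    return [value for fmap in feature_maps for row in fmap for value in row]

def linear(x_in, w_fully, b_fully):
    """Equation (8): Y_out = X_in x W_fully + b_fully, with no activation.

    w_fully has L_in rows and L_out columns; b_fully has L_out entries.
    """
    return [
        sum(x * w_row[k] for x, w_row in zip(x_in, w_fully)) + b_fully[k]
        for k in range(len(b_fully))
    ]

x_in = flatten([[[1, 2]]])                   # L_in = 2
w_fully = [[1, 0, 1], [0, 1, 1]]             # shape (L_in, L_out) = (2, 3)
y_out = linear(x_in, w_fully, [0.5, 0.5, 0.5])
print(y_out)                                 # one value per LSTM cell
```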

Construction of LSTM cell layer:

LSTM is a special type of recurrent neural network made up of multiple layers of memory cells, each layer containing multiple cell units, as shown in FIG. 2. Information is passed between the cells of each layer and between layers, and this large-scale memory and learning across cells gives the LSTM its strong memory and learning capacity.

The memory of each cell is controlled by three "gates", which effectively control how much feature information passes through and thereby eliminate exploding and vanishing gradients. The internal structure of the LSTM memory cell is shown in FIG. 3. The inputs at time t are the hidden layer state variable $c_{t-1}$ and the memory cell state variable $h_{t-1}$ of the previous time t−1, and the input variable $x_t$ of the current time. The outputs are the hidden layer state variable $c_t$ and the output variable $h_t$ of the current time.

The input data pass through the memory cell as follows:

Forget gate $f_t$: determines how much information the memory cell discards from the previous state;

$f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$ (9)

In the formula, $f_t$ is the forget gate output matrix at time t; $h_{t-1}$ is the memory cell state variable at time t−1; $x_t$ is the input variable at time t; σ denotes a neural network built with the Sigmoid activation function; $W_f$ and $b_f$ are the weights and bias of that network.

Input gate $i_t$: determines how far the memory cell state is updated, i.e. how much of the current information is added to the memory;

$i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$ (10)

In the formula, $i_t$ is the input gate output matrix at time t; tanh denotes a network built with the tanh activation function; $W_i$ and $b_i$ are the weights and bias of the σ network; $W_c$ and $b_c$ are the weights and bias of the tanh network;

representing a hidden layer state variable information matrix;

Output gate $o_t$: determines how much of the current information is output.

$o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$ (12)

$h_t = o_t \times \tanh(c_t)$ (14)
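One pass through the memory cell can be sketched with plain lists. Equations (9), (10), (12) and (14) are taken from the text; the candidate state built by the tanh network with $W_c$ and $b_c$, and the update of $c_t$ from $c_{t-1}$, are not reproduced in the text and are assumed here to follow the standard LSTM form. The function names and the parameter dictionary `p` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, vector):
    """Return weights @ vector + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One memory-cell step; p maps names such as "W_f" and "b_f" to lists."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(p["W_f"], p["b_f"], z)]      # eq. (9)
    i_t = [sigmoid(v) for v in affine(p["W_i"], p["b_i"], z)]      # eq. (10)
    # Assumed standard candidate state from the tanh network.
    c_hat = [math.tanh(v) for v in affine(p["W_c"], p["b_c"], z)]
    o_t = [sigmoid(v) for v in affine(p["W_o"], p["b_o"], z)]      # eq. (12)
    # Assumed standard update: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]             # eq. (14)
    return h_t, c_t
```

With all weights and biases at zero, every gate outputs 0.5, so the cell simply halves its previous state; this makes a convenient sanity check before training.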

In the formulas: $W_o$ and $b_o$ are the weights and bias of the output gate σ network; $c_{t-1}$ is a hidden layer state variable information matrix.

Fully connected layer:

After the recurrent learning of the LSTM cells in this embodiment, a one-dimensional matrix LSTM_Output is output to fully connected layer B1; after learning in fully connected layers B1 and B2, the predicted runoff is finally output. Fully connected layer B is a neural layer whose output is transformed by an activation function, as shown in equation (15).

In the formula, $X_{in}$ is the input matrix of the network layer; $L_{in}$ and $L_{out}$ are the lengths of the input and output matrices, respectively; $W_{fully}$ is the weight matrix of the network layer; $b_{fully}$ is the bias of the network layer; $f_{ReLU}$ denotes transformation of the network layer output by the ReLU activation function;

and the result is the network layer output after transformation by the activation function.
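From the symbols defined above and the linear form of equation (8), the activated layer can plausibly be written as follows. This is a reconstruction under assumptions, since the formula itself is not reproduced in the text; in particular the hat on the output symbol is notation chosen here.

```latex
\hat{Y}_{out}(L_{out}) = f_{ReLU}\bigl( X_{in}(L_{in}) \times W_{fully}(L_{in}, L_{out}) + b_{fully} \bigr),
\qquad f_{ReLU}(x) = \max(0, x)
```

The only difference from fully connected layer A is the ReLU wrapper, which sets negative pre-activations to zero.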

Loss function

The CNN + LSTM network learns from one batch of training samples at a time. The loss function evaluates the error between the network output and the target values of the training samples at each learning step, as shown in formula (16); based on the loss value, the network weights are updated by a gradient descent algorithm.

In the formula, T is the batch size of one learning step, i.e. the time length of the input data;

the output term is the output value of the neural network layer; $z_t$ is the target value, here the measured runoff supplied to the network; $\omega_t$ is the error calculation function for period t; Loss is the error between the neural network output and the measured value.
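The batch loss and the weight update can be sketched as below. The exact form of formula (16) is not reproduced in the text, so this sketch assumes the loss is the mean of a per-step error over the batch, with the squared error as the default error function; `batch_loss`, `gradient_step` and the learning rate are hypothetical names and values.

```python
def batch_loss(outputs, targets, error_fn=lambda y, z: (y - z) ** 2):
    """Mean of the per-step errors over one batch of length T (assumed form)."""
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    return sum(error_fn(y, z) for y, z in zip(outputs, targets)) / len(outputs)

def gradient_step(weight, gradient, learning_rate=0.01):
    """Plain gradient descent: move the weight against its loss gradient."""
    return weight - learning_rate * gradient

simulated = [1.0, 2.0]   # network outputs for one batch
measured = [1.0, 4.0]    # measured runoff targets
print(batch_loss(simulated, measured))
```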

Model performance evaluation indices

The simulation performance of the LSTM network is evaluated with the Nash-Sutcliffe efficiency coefficient (NSE) and the coefficient of determination $R^2$, both of which are commonly used to verify the quality of hydrological model simulations. They are calculated as follows:

In the formula for NSE: T is the batch size of one learning step, i.e. the time length of the input data; the two series terms are the measured and simulated values at time t; and the mean term is the average of the observations. The closer the NSE is to 1, the higher the model's credibility.

In the formula for $R^2$: $Q_{m,i}$ is the measured data; $Q_{s,i}$ is the simulated data; and the two mean terms are the means of the measured and simulated data, respectively. $R^2$ lies in the range [0, 1]; the closer it is to 1, the better the simulation.
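Both indices can be computed directly from paired series. The sketch below uses the usual definition of the Nash-Sutcliffe efficiency and assumes that $R^2$ is the squared correlation between measured and simulated data, which matches the description (it uses both means and lies in [0, 1]); the function names are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread

def r_squared(observed, simulated):
    """Squared correlation between measured and simulated series (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)

measured = [1.0, 2.0, 3.0]
print(nse(measured, [2.0, 2.0, 2.0]))        # predicting the mean gives NSE = 0
print(r_squared(measured, [2.0, 4.0, 6.0]))  # perfectly correlated, but biased
```

The second call illustrates why the two indices are reported together: a simulation that is perfectly correlated with the measurements but biased still scores $R^2$ = 1, while its NSE is penalised.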

Example 1

This example takes the muddy river basin as the study case. The basic characteristics of the basin are shown in Table 1.

Muddy river basin: the river originates at the southern foot of the Changbai Mountains in northeast China and flows from northeast to southwest. The basin lies on the northern edge of the northeast China rainstorm centre and has a temperate continental monsoon climate. The basin is mountainous, with steep slopes, good vegetation cover and little human activity. Runoff from June to September accounts for about 70% of the annual runoff.

TABLE 1

In this embodiment, the runoff model built with CNN + LSTM is compared against the SWAT distributed hydrological model and the LSTM. The influence of the CNN + LSTM structure and parameters on learning efficiency is analysed first, and the simulation and prediction performance of CNN + LSTM is then compared on that basis.

(a) CNN convolutional and pooling layers

The convolutional and pooling layers of the CNN are the channels through which the CNN + LSTM network reads and identifies the input information. To compare the influence of the number of convolutional and pooling layers on learning efficiency, this embodiment analyses 4 CNN convolutional layer structure scenarios, as shown in Table 2. For each of the 4 network structures, the network is trained on the hydrometeorological process of the muddy river basin from 1970 to 1999.

The results in Table 2 show that adding convolutional and pooling layers gradually compresses the volume of data output by the CNN. FIG. 4(a) shows the effect of the number of convolutional and pooling layers on the learning efficiency of the CNN + LSTM network under the different scenarios; the result indicates that adding convolutional and pooling layers does not significantly improve learning efficiency. This also shows that a1 and a3 both effectively extract the meteorological information in the input matrix.

FIG. 4(b) shows how learning efficiency varies with the number of filters. Learning efficiency decreases steadily as the number of filters increases. The main reason is that the input matrix is an interpolation of meteorological data and carries no other implicit meaning. For example, when the input is a rainfall interpolation matrix, the first CNN filter extracts the rainfall and the second extracts implicit information from the matrix. That implicit information alters the meaning of the rainfall information and attaches unknown meanings to it. Given meteorological data mixed with such implicit information, the LSTM network cannot efficiently capture characteristics such as the periodicity of the data. FIG. 4(b) therefore shows that learning efficiency is highest with a single filter.

TABLE 2

(b) LSTM cell layers

In the LSTM cell layers, the number of layers and the number of cells (n) in each layer have a large influence on the network learning efficiency. Fig. 4(c) and (d) show how the loss function value changes during network training under different numbers of cell layers and different numbers of cells (n), respectively; these curves reflect the change in network learning efficiency.

Fig. 4(c) shows the learning efficiency of the CNN + LSTM network for different numbers of cell layers. The results show that the learning efficiency is better when the number of layers is set to 2. Fig. 4(d) shows the learning efficiency of the CNN + LSTM network for different numbers of cells n in each cell layer. The results show that, as n increases, the loss function curves for n = 15, 20 and 25 lie closer together, i.e. the improvement in network learning efficiency becomes insignificant. Therefore, for the muddy river basin, the number of cell layers is set to 2 and each layer contains 25 cells.
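To give a sense of how the model size grows with the number of layers and cells, the trainable parameters of a stacked LSTM can be counted as below. This is a rough sketch under stated assumptions: it uses the common textbook formulation with four gates and one bias vector per gate (some frameworks store two bias vectors), and it assumes the input to the first layer has the same length as the number of cells, which matches the description of the matrix passed on by fully connected layer A.

```python
def lstm_layer_params(input_size, hidden_size):
    """Trainable parameters of one LSTM layer.

    Four gates (input, forget, cell candidate, output), each with input
    weights, recurrent weights and a single bias vector (assumed form).
    """
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def stacked_lstm_params(input_size, hidden_size, n_layers):
    """Total parameters of n_layers LSTM layers stacked on top of each other."""
    total = 0
    size_in = input_size
    for _ in range(n_layers):
        total += lstm_layer_params(size_in, hidden_size)
        # Every layer after the first receives the previous layer's output.
        size_in = hidden_size
    return total


# Cell counts compared in Fig. 4(d), with the 2-layer setting chosen above.
for n in (15, 20, 25):
    print(f"n = {n}: {stacked_lstm_params(n, n, 2)} parameters")
```

The count grows roughly with the square of n, while the loss curves for n = 15, 20 and 25 are reported to converge, which is consistent with diminishing returns from adding cells.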

(c) Fully connected layer B

In the CNN + LSTM network, fully connected layer B establishes a nonlinear conversion bridge between the output of the LSTM cells and the target label. To compare the influence of the number of fully connected layers on the learning capability of the CNN + LSTM network, the present embodiment sets 4 fully connected layer structure scenarios, as shown in Table 3. In the CNN + LSTM network, each layer has 25 LSTM cells, and the output of the last LSTM cell layer is a one-dimensional matrix of length 25. The output layer of the embodiment has only 1 value, namely the hydrops station runoff process.

Scenario b1 converts the LSTM cell output matrix directly to the output layer. In scenarios b2, b3, and b4, fully connected layers with different input and output structures are inserted between the LSTM cell layer and the output layer. The CNN + LSTM networks of the 4 scenarios are trained separately, and the resulting loss function curves are shown in Fig. 4(e). The results show that the nonlinear fitting capability improves steadily as the number of fully connected layers increases.

TABLE 3

(d) Learning sample length (batch size T)

In the CNN + LSTM network, the parameter with the greatest influence on network learning is the learning sample length (T), i.e., the batch size of the training samples. In this embodiment, the network is trained with different numbers of input samples, namely T = 30, 60, 120, 180, and 240. FIG. 7 shows how the loss function value changes in each case. The results show that the network loss function value becomes more stable as T increases, which indicates that a larger training data volume helps the network learn the time series characteristics of the data.

Runoff simulation models of the watershed are established using SWAT, LSTM and CNN + LSTM, respectively. The performance evaluation indices for the model calibration (training) period (1970-1999) and the test period (2000-2010) are shown in Table 4. In the training stage, LSTM performs best, with an NSE of 99.21%, which indicates very strong nonlinear fitting capability. CNN + LSTM performs worst in this stage, with an NSE of only 75.24% and a REV of 6.5%. In the verification stage, the prediction performance of LSTM is close to that of the SWAT model. However, the NSE of LSTM drops from 99.21% in the training phase to 65.95%. This shows that the strong training-phase performance of LSTM arises not because it actually learned the rainfall-runoff relationship of the basin, but because it established a spurious correlation.
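For reference, the NSE index used in this comparison can be computed as in the following minimal sketch. It uses the standard Nash-Sutcliffe efficiency definition; the function name and the sample runoff values are hypothetical and are not data from Table 4.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2).

    A value of 1 means a perfect fit; 0 means the model is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - residual / variance


# Hypothetical runoff values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]
print(f"NSE = {100 * nse(observed, simulated):.2f}%")
```

Comparing the index on the training period against the test period, as done above, is what exposes a model that fits the training data well without generalizing.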

FIG. 5 is a scatter plot of simulated runoff versus actual runoff. It shows that the accuracy of the forecast phase substantially maintains the forecast capacity of the training phase. In particular, for the normal-flow and dry periods (left and middle plots), the training-period R² values are 80.96 and 99.31 respectively, and the verification-period R² values are both over 80 (80.87 and 85.14, respectively), indicating that a high level of prediction accuracy is maintained. Flood process forecasting has always been both a bottleneck and a key breakthrough point for hydrological model technology. In the flood period (right plot), the R² values of the CNN + LSTM network for the training and verification periods are 75.41 and 67.82 respectively, which shows that the accuracy in the flood period declines to some extent but the forecasting capability is retained. As can be seen from FIGS. 6(a) and 6(b), the high-flow flood process simulated by the CNN + LSTM network (light blue process line) almost coincides with the measured process (black process line), which shows that the simulated and measured runoff processes agree closely and demonstrates the strong forecasting capability of the CNN + LSTM network.

TABLE 4

Finally, it should be noted that the above is intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiment, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solution of the present invention without departing from its spirit and scope.

Claims

1. A CNN-LSTM-based real-time flood forecasting intelligent method is characterized in that a CNN-LSTM model is adopted for flood forecasting, CNN in the CNN-LSTM model is used for identifying and extracting spatial precipitation data, LSTM is used for learning a time sequence relation between precipitation and flow,

the construction of the CNN-LSTM model comprises the following steps:

step 2: constructing a basin runoff simulation model based on a SWAT model, inputting the meteorological data simulated by the K-NN in step 1 into the SWAT model, and generating corresponding runoff data;

the pooling layer performs downsampling on the feature matrix after convolution, and performs dimensionality reduction and principal component extraction to reduce optimization difficulty and parameter quantity;

the connection layer of the CNN and the LSTM comprises a flatten layer and a fully connected layer A; the flatten layer expands the data output by the pooling layer into a one-dimensional matrix, and the linear conversion of the fully connected layer A is then used to output a matrix Y_out whose length equals the number of LSTM cells, as input information for the LSTM network;

the LSTM cell layer learns the time variation process of the input data matrix Y_out and establishes a nonlinear relation between meteorological data and hydrological data; after cyclic learning by the LSTM cell layer, a one-dimensional matrix LSTM Output is output to a fully connected layer B for learning;

2. The intelligent method for real-time flood forecasting according to claim 1, wherein the step 1 specifically operates as follows:

3) taking day t as the center, selecting the n days before and after it as the potential adjacent days L; wherein t = 1, 2, …, 365; n is 14 days, and L is 2 × 7 × M;

6) generating a random number uniformly distributed in the range 0-1, then randomly drawing one day from the K potential nearest adjacent days as the nearest adjacent day of the current simulation day, and taking the day after that nearest adjacent day as the meteorological simulation value of day t + 1;

7) repeating steps 4) to 6) until the required number of simulated days is reached.

3. The intelligent method for real-time flood forecasting according to claim 1, wherein the construction of the watershed runoff simulation model based on the SWAT model in the step 2 comprises the following steps:

(1) data acquisition and preprocessing: DEM data, land use data, soil data and multi-year measured meteorological data are preprocessed to generate the data required by the model;

(3) calibrating the model using the measured runoff of the hydrological station and the SUFI-2 algorithm in SWAT-CUP, and selecting the parameter set with the highest NSE as the SWAT parameters;

4. The intelligent method for real-time flood forecasting according to claim 1, wherein the number of the filters in step 4 is set to one.

5. The intelligent method for real-time flood forecasting according to claim 1, wherein in step 4, the number of cell layers in the LSTM cell layer is set to 2.

6. The intelligent method for real-time flood forecasting according to claim 1, wherein the number of layers of the fully connected layer B in step 4 is not less than 3.