CN112633584B - River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model - Google Patents

Info

Publication number
CN112633584B
Authority
CN
China
Prior art keywords
water quality
data
river
pollution accident
downstream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011600080.6A
Other languages
Chinese (zh)
Other versions
CN112633584A (en)
Inventor
***
王永桂
张雅新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Qilian Ecological Technology Co ltd
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN202011600080.6A priority Critical patent/CN112633584B/en
Publication of CN112633584A publication Critical patent/CN112633584A/en
Application granted granted Critical
Publication of CN112633584B publication Critical patent/CN112633584B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Economics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a water quality prediction method for sudden river water pollution accidents based on an improved seq2seq architecture and a long short-term memory (LSTM) network. The method comprises: extracting temporal features of water quality data and hydrological data through the encoder of a seq2seq model architecture, and extracting spatial features relating the river reach where the pollution accident occurs to the downstream river reach through a multilayer perceptron artificial neural network; splicing the temporal and spatial features to update the hidden state vector, with the decoder outputting continuous water quality prediction sequence data for the downstream river reach, thereby constructing and training an ensemble-seq2seq model; and predicting continuous water quality sequence data of the downstream river reach with the ensemble-seq2seq model. The invention addresses the limitations that traditional prediction models cannot consider temporal and spatial features at the same time and can only predict a single time point, and it solves the problem of predicting downstream water quality after a sudden river water pollution accident. Compared with traditional prediction models, it improves overall water quality prediction performance, offers better approximation accuracy and generalization ability, and makes it easier for downstream management departments to take corresponding response measures.

Description

River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model
Technical Field
The invention relates to the field of water quality prediction, in particular to a river sudden water pollution accident water quality prediction method based on an improved LSTM-seq2seq model.
Background
At present, common water quality prediction methods include mechanism models such as EFDC and SWAT, and machine learning methods such as time series prediction, grey theory and artificial neural networks. Mechanism models are an important means of simulating and predicting emergency accidents. Although mature one-dimensional, two-dimensional and even three-dimensional hydrodynamic water quality models are available, their input conditions and parameters are difficult to obtain and parameter calibration is complex. As a result, simulation and prediction cannot be carried out quickly and the accuracy of the results is low, which remain important bottlenecks limiting the application of emergency accident prediction models to actual water environment risk emergency response. Machine learning methods are also commonly used for water quality prediction. For example, patent CN108053054A smooths the data with ARIMA and uses a genetic algorithm to optimize a wavelet neural network model for predicting river water quality. Patent CN108334977B establishes a deep network with a generative adversarial network layer and a BP neural network layer, extracts deep features of the data source to form initialization data, and then performs optimization analysis with the BP neural network to obtain a water quality prediction result. Patent CN110852515A uses an ED-LSTM model to predict future values of water quality indexes. However, these methods either do not combine the temporal and spatial characteristics of the water quality data, or predict the water quality only at a single time point and cannot generate a prediction sequence over continuous time.
In addition, these methods do not predict a given water quality index at a downstream river section after a sudden river water pollution accident.
Disclosure of Invention
In view of the above, the invention provides a simulation and prediction technique for sudden river water pollution accidents. It fuses temporal and spatial features and uses a seq2seq model to output a continuous prediction sequence, so that continuous prediction sequence data of a given water quality index downstream of a water pollution accident can be obtained. This solves the problem that traditional prediction models cannot consider temporal and spatial features at the same time and can only predict a single time point, improves overall water quality prediction performance, offers better approximation accuracy and generalization ability than traditional prediction models, and makes it easier for downstream management departments to take corresponding response measures. The method enables convenient, fast, accurate and long-term simulation and prediction of each water pollution index of the downstream river after a sudden river water pollution accident occurs.
The invention provides a water quality prediction method for sudden river water pollution accidents based on an improved LSTM-seq2seq model. The method couples the LSTM-seq2seq model with a fully connected neural network, fuses temporal and spatial features and outputs a continuous prediction sequence, and specifically comprises the following steps:
S1, acquiring hydrological and water quality data of the river reach where a historical sudden river water pollution accident occurred, water quality data of a downstream river reach, and data on the distance between the downstream river reach and the river reach where the pollution accident occurred;
S2, preprocessing the data acquired in S1 to form a training set and a verification set;
S3, extracting, through the encoder of a seq2seq model, the temporal features of the hydrological and water quality data of the river reach where the historical sudden river water pollution accident occurred and of the water quality data of the downstream river reach, and extracting, through a multilayer perceptron artificial neural network, the spatial features of the relative upstream and downstream positions from the data on the distance to the river reach where the pollution accident occurred;
S4, splicing the temporal features and the spatial features to update the hidden state vector used for predicting the downstream river reach; obtaining continuous-time predicted water quality sequence data of the downstream river reach from the output of the LSTM layers of the decoder of the seq2seq model, thereby constructing an ensemble-seq2seq model; training the model with the training set, and tuning the parameters of the model with the verification set to obtain a trained ensemble-seq2seq model;
S5, inputting the hydrological and water quality data of the river reach where the current pollution accident occurs, together with the distance between the target prediction river reach and the river reach where the pollution accident occurs, into the trained ensemble-seq2seq network model, and outputting continuous sequence prediction data of a given water quality index of the downstream river reach.
Further, in step S2, the preprocessing specifically includes:
S11: filling missing values in the hydrological and water quality data, the water quality data of the downstream river reach, and the data on the distance to the river reach where the pollution accident occurred by Lagrange interpolation to obtain filled complete data;
S12: dividing the filled complete data into a training set and a verification set at a ratio of 8:2;
S13: applying z-score standardization to the input data of the training set and the verification set to obtain a standardized training set and a standardized verification set.
Further, in the hydrological and water quality data of the river reach where the historical sudden river water pollution accident occurred in step S1, the water quality data includes dissolved oxygen, permanganate index, five-day biochemical oxygen demand, ammonia nitrogen, total phosphorus, heavy metal ion concentration, pH and water temperature at the current river reach section;
the hydrological data includes flow and water level;
the water quality data of the downstream river reach specifically refers to the water quality data of any section downstream of the river reach where the sudden water pollution accident occurred, and includes dissolved oxygen, permanganate index, five-day biochemical oxygen demand, ammonia nitrogen, total phosphorus, heavy metal ion concentration, pH, water temperature and the like;
the distance between the downstream river reach section and the section of the river reach where the pollution accident occurred is specifically the river length between the two river reaches.
Furthermore, the hydrological and water quality data of the river reach where the historical sudden river water pollution accident occurred and the water quality data of the downstream river reach are time series data.
Further, the encoder of the seq2seq model in step S3 and the decoder of the seq2seq model in step S4 are each composed of LSTM layers, and the activation function of the LSTM is set to the ReLU function.
Further, in the seq2seq model in step S3, the spatial features learned by the multilayer perceptron artificial neural network from the data on the distance to the river reach where the pollution accident occurred are added to the hidden state matrix.
The continuous water quality prediction data output by the ensemble-seq2seq model in step S4 is specifically continuous prediction data for a selected water quality index.
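As an illustration of the ReLU setting described above, the following minimal Python sketch steps a single-unit LSTM cell. It assumes, in the manner of the Keras `activation` argument, that ReLU replaces tanh for the candidate and output activations while the gates keep the sigmoid; the patent text does not spell this out. The function name `lstm_step`, the parameter layout and the numeric weights are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def lstm_step(x, h_prev, c_prev, params, activation=relu):
    """One time step of a scalar LSTM cell (one input feature, one unit).

    params maps a gate name to a (w_x, w_h, bias) tuple. The layout is
    hypothetical and chosen only to keep the example small.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))       # input gate
    f = sigmoid(pre("f"))       # forget gate
    o = sigmoid(pre("o"))       # output gate
    g = activation(pre("g"))    # candidate state: ReLU instead of tanh
    c = f * c_prev + i * g      # new cell state
    h = o * activation(c)       # new hidden state: ReLU instead of tanh
    return h, c


# Illustrative weights only; a trained model would learn these.
params = {name: (0.5, 0.1, 0.0) for name in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [1.0, 2.0, 3.0]:
    h, c = lstm_step(x, h, c, params)
```

Because ReLU is unbounded above, the cell state is no longer confined to the range that tanh would impose, which is one reason input standardization matters in such a setup.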
The beneficial effects of the invention are as follows: temporal and spatial features are fused and a seq2seq model is used to output a continuous prediction sequence, so that future continuous water quality sequence data downstream of the river reach where the water pollution accident occurred can be obtained. This solves the problem that traditional prediction models can only predict a single time point. Compared with traditional prediction models, the method has better approximation accuracy and generalization ability, improves overall water quality prediction performance, and makes it easier for downstream management departments to take corresponding response measures.
Drawings
FIG. 1 is a flow chart of a water quality prediction method for a river sudden water pollution accident based on an improved LSTM-seq2seq model;
FIG. 2 shows a structure of an ensemble-seq2seq model in the water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
Referring to FIG. 1, which is a flow chart of a water quality prediction method for a river sudden water pollution accident based on an improved LSTM-seq2seq model according to an embodiment of the present invention, the method includes two stages:
1) In the model training stage, data from historical sudden river water pollution accidents is preprocessed, and a model is built and trained. The model training comprises the following steps:
S1, acquiring hydrological and water quality data of the river reach where a historical sudden river water pollution accident occurred, water quality data of a downstream river reach, and data on the distance between the downstream river reach and the river reach where the pollution accident occurred;
First, the hydrological and water quality data of the river reach where a historical sudden river water pollution accident occurred, the water quality data of a downstream river reach, and the data on the distance between the two river reaches are acquired. The water quality data of the river reach where the accident occurred includes, but is not limited to, dissolved oxygen, permanganate index, five-day biochemical oxygen demand, ammonia nitrogen, total phosphorus, heavy metal ion concentration, pH and water temperature at the current river reach section; the hydrological data includes, but is not limited to, flow and water level. The water quality data of the downstream river reach specifically refers to the water quality data of a section downstream of the river reach where the sudden water pollution accident occurred, and includes, but is not limited to, dissolved oxygen, permanganate index, five-day biochemical oxygen demand, ammonia nitrogen, total phosphorus, heavy metal ion concentration, pH and water temperature. The distance between the downstream river reach section and the section of the river reach where the pollution accident occurred is specifically the river length between the two river reaches. Optionally, for an area with sufficient geographic data, geographic attributes such as the slope of the river reach where the pollution accident occurred and the slope of the downstream river reach can also be acquired as input data to obtain higher prediction accuracy.
S2, preprocessing the data acquired in S1 to form a training set and a verification set;
The preprocessing specifically comprises the following steps:
S11: filling missing values in the hydrological and water quality data, the water quality data of the downstream river reach, and the data on the distance to the river reach where the pollution accident occurred by Lagrange interpolation to obtain filled complete data;
S12: dividing the filled complete data into a training set and a verification set at a ratio of 8:2;
S13: applying z-score standardization to the input data of the training set and the verification set to obtain a standardized training set and a standardized verification set.
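Step S11 can be illustrated with a small pure-Python sketch of Lagrange interpolation. The helper names `lagrange_eval` and `fill_missing`, the use of `None` to mark missing values, and the choice of fitting the polynomial through up to `k` known neighbours on each side of a gap are assumptions made for illustration; the text only states that Lagrange interpolation is used.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for m, xm in enumerate(xs):
            if m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total


def fill_missing(series, k=2):
    """Fill None entries from up to k known neighbours on each side.

    Only originally observed values are used as support points, so one
    filled gap never influences another.
    """
    filled = list(series)
    for idx, value in enumerate(series):
        if value is not None:
            continue
        left = [i for i in range(idx - 1, -1, -1) if series[i] is not None][:k]
        right = [i for i in range(idx + 1, len(series)) if series[i] is not None][:k]
        support = sorted(left + right)
        if not support:
            raise ValueError("series has no known values")
        filled[idx] = lagrange_eval(support, [series[i] for i in support], idx)
    return filled


# Hypothetical hourly readings with one missing value.
series = [1.0, 4.0, None, 16.0, 25.0]
filled = fill_missing(series)
```

Restricting the polynomial to a few neighbours keeps its degree low; a high-degree polynomial through every sample of a long record tends to oscillate, so some form of local window is a common precaution.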
Example 1:
Lagrange interpolation is applied to the acquired data, and missing values are filled using the lagrange function in the SciPy library for Python. The water quality data and hydrological data of the river reach where the sudden river water pollution accident occurred then form an input data matrix of shape (batch_size, w, features_num), used as input data X, where batch_size is the training batch size, w is the time length of the input hydrological and water quality data, and features_num is the number of hydrological and water quality parameters. The distance between the downstream river reach section and the section of the river reach where the pollution accident occurred is used as input data D, with vector shape (1). Optionally, geographic attributes such as the slopes of the river reach where the pollution accident occurred and of the downstream river reach are used as input data C, with vector shape (attribute_num), where attribute_num is the number of geographic attributes. The data of a given water quality index of the downstream river reach is used as the output vector y, with shape (batch_size, n, 1), where batch_size is the training batch size, the same as that of X, and n is the time length of the target predicted water quality data downstream. Input data X, D and, optionally, C are paired with the corresponding output data y to form a data set, which is divided into a training set and a verification set at a ratio of 8:2. The input data of the training set and the verification set is then standardized by z-score using the mean and standard deviation of the training set, which is expressed by the formula:
$$x_i' = \frac{x_i - \bar{x}}{\mathrm{std}} \qquad (1)$$
where $x_i$ is the original data, $\bar{x}$ is the mean of the original data of the training set, and std is the standard deviation of the original data of the training set.
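The 8:2 split and the z-score standardization described above can be sketched as follows. The sketch makes several assumptions that the text leaves open: samples are kept in their original order when split, statistics are computed per feature over all training windows and time steps, and the population standard deviation is used. The function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def split_8_2(samples):
    """Split a list of samples into the first 80% and the last 20%."""
    cut = int(len(samples) * 0.8)
    return samples[:cut], samples[cut:]


def fit_zscore(train_X):
    """Return one (mean, std) pair per feature, from the training set only."""
    n_features = len(train_X[0][0])
    stats = []
    for f in range(n_features):
        values = [step[f] for window in train_X for step in window]
        stats.append((mean(values), pstdev(values)))
    return stats


def apply_zscore(X, stats):
    """Standardize every value with the training-set statistics."""
    return [
        [
            [(v - mu) / sd if sd > 0 else 0.0 for v, (mu, sd) in zip(step, stats)]
            for step in window
        ]
        for window in X
    ]


# Toy data: 10 windows, w = 2 time steps, features_num = 2.
X = [[[float(i), 10.0 * i], [float(i + 1), 10.0 * (i + 1)]] for i in range(10)]
X_train, X_val = split_8_2(X)
stats = fit_zscore(X_train)
train_std = apply_zscore(X_train, stats)
val_std = apply_zscore(X_val, stats)   # verification set reuses training stats
```

Reusing the training-set mean and standard deviation for the verification set, as the text specifies, keeps information from the verification data out of the fitted preprocessing.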
S3, extracting, through the encoder of a seq2seq model, the temporal features of the hydrological and water quality data of the river reach where the historical sudden river water pollution accident occurred and of the water quality data of the downstream river reach, and extracting, through a multilayer perceptron artificial neural network, the spatial features of the relative upstream and downstream positions from the data on the distance to the river reach where the pollution accident occurred;
S4, splicing the temporal features and the spatial features to update the hidden state vector used for predicting the downstream river reach; obtaining continuous-time predicted water quality sequence data of the downstream river reach from the output of the LSTM layers of the decoder of the seq2seq model, thereby constructing an ensemble-seq2seq model; training the model with the training set, and tuning the parameters of the model with the verification set to obtain a trained ensemble-seq2seq model;
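The splicing in step S4 can be pictured with a framework-free Python sketch. It assumes that "splicing" means concatenating the encoder's final hidden state with the spatial feature vector produced by a small multilayer perceptron from the distance D, and that the decoder's hidden size matches the concatenated length; both points are interpretations, not statements from the text. All weights, sizes and function names are hypothetical.

```python
def relu_vec(v):
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def spatial_features(distance, W1, b1, W2, b2):
    """Two-layer perceptron mapping the scalar distance to a feature vector."""
    hidden = relu_vec(dense([distance], W1, b1))
    return dense(hidden, W2, b2)


def fuse(temporal_state, spatial):
    """Concatenate temporal and spatial features into one state vector."""
    return list(temporal_state) + list(spatial)


# Illustrative numbers only.
h_enc = [0.2, -0.1, 0.7]            # final encoder hidden state (temporal)
W1, b1 = [[0.5], [-0.3]], [0.0, 0.1]
W2, b2 = [[1.0, 0.0], [0.5, 0.5]], [0.0, 0.0]
s = spatial_features(2.0, W1, b1, W2, b2)
decoder_state = fuse(h_enc, s)      # initial hidden state for the decoder
```

An alternative reading is to project the concatenated vector back to the encoder's hidden size before handing it to the decoder; the sketch leaves that out to keep the fusion step itself visible.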
Example 2:
An ensemble-seq2seq model as shown in FIG. 2 is built using TensorFlow, Keras or PyTorch. Both the encoder units and the decoder units of the seq2seq model are LSTM units, and the spatial features learned by the multilayer perceptron artificial neural network are added to the hidden state matrix of the seq2seq model. The input of the ensemble-seq2seq model consists of X, D and, optionally, C: the input of the encoder is X, and the input of the multilayer perceptron artificial neural network is D and, optionally, C. The output consists of y. The loss function is given by formula (2); the model evaluation indexes are the Nash-Sutcliffe efficiency coefficient (NSE) and the coefficient of determination (CC), whose calculation formulas are formula (3) and formula (4), respectively.
[Formula (2): loss function (equation image not reproduced)]
[Formula (3): Nash-Sutcliffe efficiency coefficient, NSE (equation image not reproduced)]
[Formula (4): coefficient of determination, CC (equation image not reproduced)]
where $Y_m$ is the true value, $Y_p$ is the predicted value, $\bar{Y}_m$ is the mean of the true values, and $\bar{Y}_p$ is the mean of the predicted values. The closer the NSE and CC values are to 1, the more accurate the model's predictions. To prevent overfitting, dropout layers with a dropout rate of 0.5 are used between the LSTM layers. An early stopping strategy is adopted during training, and a learning rate decay strategy reduces the learning rate during training to help the model converge. Model training uses the Adam optimization algorithm. The model is trained with the training set and its parameters are tuned with the verification set; Keras Tuner is used to adjust the parameters automatically to obtain the optimized model parameters, finally yielding the trained model.
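The two evaluation indexes can be computed as in the sketch below. NSE follows its standard definition. For CC, the sketch assumes the Pearson correlation coefficient between true and predicted values, since the definitions above involve the means of both series; its square is the other common reading, and the original formula images are not available to settle this. The function names and sample values are hypothetical.

```python
import math


def nse(y_true, y_pred):
    """Nash-Sutcliffe efficiency: 1 - sum((Ym - Yp)^2) / sum((Ym - mean(Ym))^2)."""
    mean_true = sum(y_true) / len(y_true)
    residual = sum((m - p) ** 2 for m, p in zip(y_true, y_pred))
    spread = sum((m - mean_true) ** 2 for m in y_true)
    return 1.0 - residual / spread


def cc(y_true, y_pred):
    """Pearson correlation between true and predicted values (assumed form of CC)."""
    mean_true = sum(y_true) / len(y_true)
    mean_pred = sum(y_pred) / len(y_pred)
    cov = sum((m - mean_true) * (p - mean_pred) for m, p in zip(y_true, y_pred))
    var_true = sum((m - mean_true) ** 2 for m in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov / math.sqrt(var_true * var_pred)


# Hypothetical downstream concentrations and model predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score_nse = nse(y_true, y_pred)
score_cc = cc(y_true, y_pred)
```

The two indexes are complementary: a prediction that is a scaled or shifted copy of the truth still gives a correlation of 1, while NSE penalizes the bias, and a model that always outputs the mean of the true values gets an NSE of 0.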
2) In the water quality prediction stage, the model trained in the model training stage is used to predict the change of a water quality index at a given location downstream of a specific sudden river water pollution accident.
S5, inputting the hydrological and water quality data of the river reach where the current pollution accident occurs, together with the distance between the target prediction river reach and the river reach where the pollution accident occurs, into the trained ensemble-seq2seq network model, and outputting continuous sequence prediction data of a given water quality index of the downstream river reach.
When a sudden water pollution accident occurs in a river, hydrological and water quality sequence data at the accident site, or at a location downstream of it, is measured by field staff or an automatic monitoring station. The river length between the river reach where the accident occurred and the target prediction river reach is obtained using ArcGIS or a lookup table. These data are input into the trained ensemble-seq2seq network model, which outputs continuous water quality prediction data for the downstream river reach.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.

Claims (7)

1. A river sudden water pollution accident water quality prediction method based on an improved LSTM-seq2seq model, characterized in that an LSTM-seq2seq model is coupled with a fully connected neural network, temporal and spatial features are fused, and a continuous prediction sequence is output, the method specifically comprising the following steps:
S1, acquiring hydrological and water quality data of the river reach where a historical sudden river water pollution accident occurred, water quality data of a downstream river reach, and data on the distance between the downstream river reach and the river reach where the pollution accident occurred;
S2, preprocessing the data acquired in S1 to form a training set and a verification set;
S3, extracting, through the encoder of a seq2seq model, the temporal features of the hydrological and water quality data of the river reach where the historical sudden river water pollution accident occurred and of the water quality data of the downstream river reach, and extracting, through a multilayer perceptron artificial neural network, the spatial features of the relative upstream and downstream positions from the data on the distance to the river reach where the pollution accident occurred;
S4, splicing the temporal features and the spatial features to update the hidden state vector used for predicting the downstream river reach; obtaining continuous-time predicted water quality sequence data of the downstream river reach from the output of the LSTM layers of the decoder of the seq2seq model, thereby constructing an ensemble-seq2seq model;
the specific process for constructing the ensemble-seq2seq model is as follows:
building an ensemble-seq2seq model using TensorFlow, Keras or PyTorch, wherein the encoder units and the decoder units of the seq2seq model are both LSTM units, and adding the spatial features learned by the multilayer perceptron artificial neural network to the hidden state matrix of the seq2seq model; the input of the ensemble-seq2seq model consists of input data X, D and C, the input of the encoder is X, the input of the multilayer perceptron artificial neural network is D and C, and the output consists of y; the loss function is given by formula (1), the model evaluation indexes are the Nash-Sutcliffe efficiency coefficient (NSE) and the coefficient of determination (CC), and their calculation formulas are formula (2) and formula (3), respectively;
[Formula (1): loss function (equation image not reproduced)]
[Formula (2): Nash-Sutcliffe efficiency coefficient, NSE (equation image not reproduced)]
[Formula (3): coefficient of determination, CC (equation image not reproduced)]
where $Y_m$ is the true value, $Y_p$ is the predicted value, $\bar{Y}_m$ is the mean of the true values, and $\bar{Y}_p$ is the mean of the predicted values; the closer the NSE and CC values are to 1, the more accurate the model's predictions; wherein X is an input data matrix consisting of the water quality data and hydrological data of the river reach where the sudden river water pollution accident occurred; D is the distance between the downstream river reach section and the section of the river reach where the pollution accident occurred; C is the geographic slope attribute of the river reach where the pollution accident occurred and of the downstream river reach; y is the data of any water quality index of the downstream river reach;
training the model with the training set, and tuning the parameters of the model with the verification set to obtain a trained ensemble-seq2seq model;
S5, inputting the hydrological and water quality data of the river reach where the current pollution accident occurs, together with the distance between the target prediction river reach and the river reach where the pollution accident occurs, into the trained ensemble-seq2seq network model, and outputting continuous sequence prediction data of a given water quality index of the downstream river reach.
2. The water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model as claimed in claim 1, wherein: in step S2, the preprocessing specifically includes:
S11: filling missing values in the hydrological water quality data, the water quality data of the downstream river reach, and the data on the distance to the river reach where the pollution accident occurs, by Lagrange interpolation, to obtain filled complete data;
S12: dividing the filled complete data into a training set and a verification set at a ratio of 8:2;
S13: applying z-score standardization to the input data of the training set and the verification set to obtain a standardized training set and a standardized verification set.
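The preprocessing steps of claim 2 can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: missing values are marked with `None`, each gap is filled from up to `window` known points on either side, the 8:2 split is taken in chronological order, and the z-score statistics are fitted on the training set and reused for the verification set. None of these details, nor the function names, are specified by the claim.

```python
from math import fsum, sqrt


def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for k, xk in enumerate(xs):
            if k != j:
                term *= (x - xk) / (xj - xk)
        total += term
    return total


def fill_missing(series, window=2):
    """Fill None entries using nearby known points (window is an assumption)."""
    known = [(i, v) for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values to interpolate from")
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            before = [(j, y) for j, y in known if j < i][-window:]
            after = [(j, y) for j, y in known if j > i][:window]
            points = before + after
            filled[i] = lagrange_eval([j for j, _ in points],
                                      [y for _, y in points], i)
    return filled


def split_8_2(rows):
    """Chronological 8:2 split (the ordering is an assumption)."""
    n_train = (len(rows) * 8) // 10
    return rows[:n_train], rows[n_train:]


def zscore_fit(values):
    """Return the mean and population standard deviation of values."""
    mean = fsum(values) / len(values)
    std = sqrt(fsum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        raise ValueError("cannot standardize a constant series")
    return mean, std


def zscore_apply(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


# Illustrative series only (not data from the patent).
raw = [1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
complete = fill_missing(raw)
train, verification = split_8_2(complete)
mean, std = zscore_fit(train)
train_z = zscore_apply(train, mean, std)
verification_z = zscore_apply(verification, mean, std)
```

Fitting the statistics on the training set alone keeps information from the verification set out of the standardization; a different convention may have been intended.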
3. The water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model as claimed in claim 1, wherein: the hydrological water quality data of the river reach of the historical river sudden water pollution accident in step S1 comprise the dissolved oxygen, permanganate index, five-day biochemical oxygen demand, ammonia nitrogen, total phosphorus, heavy metal ion concentration, pH and water temperature at the section of the river reach where the accident occurs;
the hydrological data includes flow and water level;
the water quality data of the downstream river reach specifically refers to the water quality data of any section downstream of the river reach where the sudden water pollution accident occurs, and comprises dissolved oxygen, permanganate index, five-day biochemical oxygen demand, ammonia nitrogen, total phosphorus, heavy metal ion concentration, pH and water temperature;
the distance between the section of the downstream river reach and the section of the river reach where the pollution accident occurs is specifically the length of the river between the two river reaches.
4. The water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model as claimed in claim 1, wherein: the hydrological water quality data of the river reach of the historical river sudden water pollution accident and the water quality data of the downstream river reach are both time-series data covering a period of time.
5. The water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model as claimed in claim 1, wherein: the encoder of the seq2seq model of step S3 and the decoder of the seq2seq model of step S4 are both composed of LSTM layers, and the activation function of the LSTM is set to the ReLU function.
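To make the role of the ReLU activation in claim 5 concrete, the following didactic Python sketch performs a single LSTM time step in which ReLU replaces the usual tanh for the candidate state and for the output, while the gates keep the sigmoid. This mirrors how the `activation` argument of the Keras `LSTM` layer is applied, but it is an assumption that the patent intends exactly this; the parameter layout (`params` keyed by gate name) and the function names are hypothetical. A real implementation would use a framework LSTM layer rather than hand-written loops.

```python
from math import exp


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


def relu(x):
    return max(0.0, x)


def _affine(W, U, b, x, h):
    """Compute W @ x + U @ h + b with plain lists."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def lstm_step_relu(x, h_prev, c_prev, params):
    """One LSTM step; params maps 'i', 'f', 'o', 'g' to (W, U, b)."""
    i = [sigmoid(v) for v in _affine(*params["i"], x, h_prev)]  # input gate
    f = [sigmoid(v) for v in _affine(*params["f"], x, h_prev)]  # forget gate
    o = [sigmoid(v) for v in _affine(*params["o"], x, h_prev)]  # output gate
    g = [relu(v) for v in _affine(*params["g"], x, h_prev)]     # candidate (ReLU)
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * relu(ck) for ok, ck in zip(o, c)]                 # output (ReLU)
    return h, c


# Tiny example: one input feature, one hidden unit, hand-picked weights.
zero = ([[0.0]], [[0.0]], [0.0])
params = {"i": zero, "f": zero, "o": zero, "g": ([[1.0]], [[0.0]], [0.0])}
h, c = lstm_step_relu([3.0], [0.0], [0.0], params)
print(h, c)
```

Because ReLU is unbounded, the cell state is no longer confined to a fixed range as it is with tanh, which is one reason input standardization matters for this configuration.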
6. The water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model as claimed in claim 1, characterized in that: in the seq2seq model described in step S3, the spatial features learned by the multilayer perceptron artificial neural network from the data on the distance to the river reach where the pollution accident occurs are added into the hidden state matrix.
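One possible reading of claim 6 is sketched below in plain Python: a small multilayer perceptron maps the spatial inputs (distance D and slope attribute C) to a vector of the same size as the encoder's final hidden state, and that vector is added element-wise to the hidden state before it is handed to the decoder. The claim does not say whether "added" means element-wise addition or concatenation, nor how many layers the perceptron has, so the single ReLU hidden layer, the element-wise sum, the weights and all names here are assumptions made for illustration.

```python
def mlp_spatial_features(d, c, w1, b1, w2, b2):
    """Map (D, C) to a feature vector with one ReLU hidden layer (assumed)."""
    x = [d, c]
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return [
        sum(w * hi for w, hi in zip(row, hidden)) + b
        for row, b in zip(w2, b2)
    ]


def fuse_hidden_state(encoder_h, spatial):
    """Add spatial features to the encoder hidden state element-wise (assumed)."""
    if len(encoder_h) != len(spatial):
        raise ValueError("spatial features must match the hidden state size")
    return [h + s for h, s in zip(encoder_h, spatial)]


# Hand-picked illustrative weights: 2 inputs -> 3 hidden units -> 2 outputs.
w1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 1.0, 1.0], [0.5, 0.0, 0.0]]
b2 = [0.0, 0.1]

spatial = mlp_spatial_features(2.0, 0.5, w1, b1, w2, b2)
decoder_init = fuse_hidden_state([0.1, -0.2], spatial)
print(spatial, decoder_init)
```

In a trained model the perceptron weights would be learned jointly with the encoder and decoder, so the fused state carries both the temporal summary of X and the location-dependent information from D and C.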
7. The water quality prediction method for the river sudden water pollution accident based on the improved LSTM-seq2seq model as claimed in claim 1, wherein: the water quality continuous prediction data output by the ensemble-seq2seq model in step S4 is specifically continuous prediction data for a selected water quality index.
CN202011600080.6A 2020-12-29 2020-12-29 River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model Active CN112633584B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011600080.6A CN112633584B (en) 2020-12-29 2020-12-29 River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011600080.6A CN112633584B (en) 2020-12-29 2020-12-29 River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model

Publications (2)

Publication Number Publication Date
CN112633584A CN112633584A (en) 2021-04-09
CN112633584B true CN112633584B (en) 2022-06-21

Family

ID=75286480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011600080.6A Active CN112633584B (en) 2020-12-29 2020-12-29 River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model

Country Status (1)

Country Link
CN (1) CN112633584B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182709B (en) * 2020-09-28 2024-01-16 中国水利水电科学研究院 Method for rapidly predicting water drainage temperature of large reservoir stoplog gate layered water taking facility
CN113379029B (en) * 2021-04-22 2022-08-30 中国地质大学(武汉) Water quality prediction method of deep learning model based on physical law and process drive
CN114031147B (en) * 2021-11-02 2022-06-14 航天环保(北京)有限公司 Method and system for improving water quality by utilizing wave cracking nano material
CN114187783B (en) * 2021-12-06 2023-10-31 中国民航大学 Method for analyzing and predicting potential conflict in airport flight area

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107153874A (en) * 2017-04-11 2017-09-12 中国农业大学 Water quality prediction method and system
CN109508811A (en) * 2018-09-30 2019-03-22 中冶华天工程技术有限公司 Parameter prediction method is discharged based on principal component analysis and the sewage treatment of shot and long term memory network
CN111160628A (en) * 2019-12-13 2020-05-15 重庆邮电大学 Air pollutant concentration prediction method based on CNN and double-attention seq2seq

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10410113B2 (en) * 2016-01-14 2019-09-10 Preferred Networks, Inc. Time series data adaptation and sensor fusion systems, methods, and apparatus
US11556789B2 (en) * 2019-06-24 2023-01-17 Tata Consultancy Services Limited Time series prediction with confidence estimates using sparse recurrent mixture density networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
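A Rainfall‐Runoff Model With LSTM‐Based Sequence‐to‐Sequence Learning;Zhongrun Xiang et al.;《Water Resources Research》;20200331;Vol. 56 (No. 1);pp. 1-17 *
Port import and export cargo volume prediction based on the Seq2Seq model;Wang Tao et al.;《计算机***应用》 (Computer *** Applications);20200228;Vol. 29 (No. 3);pp. 132-139 *
Research on recommendation applications based on the Seq2seq model;Chen Junhang et al.;《Computer Science》;20190630;Vol. 46 (No. 6A);pp. 493-496 *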

Also Published As

Publication number Publication date
CN112633584A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
CN112633584B (en) River sudden water pollution accident water quality prediction method based on improved LSTM-seq2seq model
CN109887282B (en) Road network traffic flow prediction method based on hierarchical timing diagram convolutional network
CN112418547B (en) Bus stop passenger flow prediction method based on GCN-LSTM combination model
CN105488528B (en) Neural network image classification method based on improving expert inquiry method
CN102183621B (en) Aquaculture dissolved oxygen concentration online forecasting method and system
CN109242140A (en) A kind of traffic flow forecasting method based on LSTM_Attention network
CN109635763B (en) Crowd density estimation method
CN110232461A (en) More interconnection vector machine water quality prediction methods based on quantum genetic algorithm optimization
CN111105065A (en) Rural water supply system and method based on machine learning
CN112766603A (en) Traffic flow prediction method, system, computer device and storage medium
CN116992779A (en) Simulation method and system of photovoltaic energy storage system based on digital twin model
CN115359366A (en) Remote sensing image target detection method based on parameter optimization
CN113435124A (en) Water quality space-time correlation prediction method based on long-time and short-time memory and radial basis function neural network
CN110289987B (en) Multi-agent system network anti-attack capability assessment method based on characterization learning
CN115730744A (en) Water consumption prediction method and system based on user mode and deep learning combined model
CN116485839A (en) Visual tracking method based on attention self-adaptive selection of transducer
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
CN110852808A (en) Asynchronous adaptive value evaluation method of electronic product based on deep neural network
CN112508734B (en) Method and device for predicting power generation capacity of power enterprise based on convolutional neural network
CN109697531A (en) A kind of logistics park-hinterland Forecast of Logistics Demand method
CN117150680A (en) Airfoil profile optimization design method based on deep learning and reinforcement learning
CN116911178A (en) Method and system for predicting capacity of small and medium-sized reservoirs based on weather forecast
CN116106909A (en) Radar echo extrapolation method, system and storage medium
CN114841461B (en) Air quality integrated prediction method based on time sequence missing perception and multi-source factor fusion
CN107844834A (en) A kind of nonlinear system modeling method based on mixed model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210409

Assignee: Wuhan Qilian Ecological Technology Co.,Ltd.

Assignor: CHINA University OF GEOSCIENCES (WUHAN CITY)

Contract record no.: X2022420000070

Denomination of invention: Water quality prediction method for sudden water pollution accidents in rivers based on improved LSTM-seq2seq model

Granted publication date: 20220621

License type: Common License

Record date: 20220805

TR01 Transfer of patent right

Effective date of registration: 20240314

Address after: Room 1315, Luojia Creative Park Phase I College Student Entrepreneurship Base, No. 33 Luoyu Road, Hongshan District, Wuhan City, Hubei Province, 430000

Patentee after: Wuhan Qilian Ecological Technology Co.,Ltd.

Country or region after: China

Address before: 430000 Lu Mill Road, Hongshan District, Wuhan, Hubei Province, No. 388

Patentee before: CHINA University OF GEOSCIENCES (WUHAN CITY)

Country or region before: China