CN115965149A - Water quality index prediction method based on LSTM algorithm model - Google Patents

Publication number
CN115965149A
Authority
CN
China
Prior art keywords
data
water quality
model
lstm
algorithm model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310012069.5A
Other languages
Chinese (zh)
Inventor
刘小梅
孙艳
赵洁
成志轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing North Control Yuehui Environmental Technology Co ltd
Original Assignee
Beijing North Control Yuehui Environmental Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing North Control Yuehui Environmental Technology Co ltd
Priority to CN202310012069.5A
Publication of CN115965149A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00: Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152: Water filtration

Landscapes

  • Investigating Or Analyzing Materials By The Use Of Electric Means (AREA)

Abstract

The invention provides a water quality index prediction method based on an LSTM algorithm model and relates to the technical field of sewage treatment. The method comprises the following steps: S1: acquiring historical water quality index data of a point to be detected and preprocessing the data; S2: constructing an LSTM algorithm model based on the preprocessed data; S3: predicting the water quality index with the constructed LSTM algorithm model. Building on existing online monitoring data, the invention uses the LSTM algorithm model to predict the future trend of the water quality index and thereby provides decision support for the integrated operation of plant, network and river.

Description

Water quality index prediction method based on LSTM algorithm model
Technical Field
The invention relates to the technical field of sewage treatment, in particular to a water quality index prediction method based on an LSTM algorithm model.
Background
With the continuous development of computing, AI algorithms and big data technologies, artificial intelligence algorithms are finding more and more application scenarios in water affairs and environmental protection. The integrated operation of plant, network and river requires extensive data support. Continuous online monitoring equipment can fully present the current situation and historical data, but it cannot determine future trends. Water quality prediction uses actual historical data and a mathematical water quality model to calculate and infer the future trend of water quality at a given monitoring point in a water body. In regional water environment planning and management, water quality prediction is an effective measure for preventing water pollution in advance and occupies an important position in water environment protection work.
The LSTM algorithm model is a variant of the recurrent neural network (RNN) that can effectively address the exploding or vanishing gradient problem of the simple recurrent neural network and can alleviate the long-term dependence problem of RNNs. LSTM generally performs better than plain recurrent neural networks and hidden Markov models, and, as a nonlinear model, it can be used as a complex nonlinear unit to construct larger deep neural networks. The invention aims to provide a water quality index prediction method based on an LSTM algorithm model in order to provide decision support for the integrated operation of plant, network and river.
Disclosure of Invention
The invention aims to provide a water quality index prediction method based on an LSTM algorithm model, so as to solve the technical problem in the prior art that the future trend of a water quality index cannot be predicted in the integrated operation of plant, network and river. The technical effects produced by the preferred technical schemes among those provided by the invention are described in detail below.
To achieve this purpose, the invention provides the following technical scheme:
The invention provides a water quality index prediction method based on an LSTM algorithm model, characterized by comprising the following steps:
S1: acquiring historical water quality index data of a point to be detected, and preprocessing the data;
S2: constructing an LSTM algorithm model based on the preprocessed data;
S3: predicting the water quality index with the constructed LSTM algorithm model.
According to a preferred embodiment, the step of acquiring historical water quality index data of a point to be detected and preprocessing the data comprises:
S11: first, analyzing the current state of the data, and giving a basic description of the missing values and the total time span of the data;
S12: abnormal value processing: first, computing descriptive statistics on the attribute values to check for unreasonable data and to check whether the data obey a normal distribution; when a sample lies more than 3 standard deviations from the mean, treating the sample as an abnormal value and deleting it from the data set;
S13: time interval processing: sorting the records by detection time, calculating and recording the time interval between each record and the previous record, counting the number of records for every time interval present in the data and displaying the counts to the user; after the user selects the required time interval, keeping only the records corresponding to the selected time interval and deleting the records corresponding to other time intervals, so that the time interval between records is consistent and the continuity of the data is maintained;
S14: missing value processing: indexing the records in the data set that contain missing values, and filling each indexed missing value with the data from the previous time point.
According to a preferred embodiment, the step of constructing the LSTM algorithm model based on the preprocessed data comprises:
Model design: according to the usage scenario of water quality prediction, an LSTM model is selected for calculation; the water quality indexes of the different data sets, namely the water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen corresponding to the input time steps, are encoded as the input of the LSTM recurrent neural network, and the LSTM model outputs the water quality indexes of N future time steps.
According to a preferred embodiment, the step of constructing the LSTM algorithm model based on the preprocessed data further comprises algorithm development, which comprises the following steps:
S21: dividing the data set into a training set, a verification set and a test set in the proportions 80%, 10% and 10%;
S22: processing the training set and the test set into input and output variables, and shaping the input and output variables into 3 dimensions, namely sample size, step length and features, wherein the sample size is the number of training samples, the step length is the lag period M, and the features are the input water quality indexes;
S23: defining and training the LSTM model: the middle layer of the LSTM network is defined with 128 neurons, the LSTM layer being a fully connected layer with 128 nodes; finally, the output layer directly predicts a vector containing the 9 water quality indexes through a linear transformation;
S24: iterating the model so as to minimize the squared-error loss, and verifying the prediction effect of the model with the Nash-Sutcliffe efficiency (NSE) coefficient.
According to a preferred embodiment, the step of constructing the LSTM algorithm model based on the preprocessed data further comprises: setting model parameters, including the lag period M, the predicted future time period N, the number of iterations and the batch size.
According to a preferred embodiment, the method further comprises analyzing the model validation results.
According to a preferred embodiment, the step of analyzing the model verification result comprises:
calculating NSE coefficients of water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen indexes;
drawing a model fitting effect graph, and assessing the simulation effect visually.
Based on the technical scheme, the water quality index prediction method based on the LSTM algorithm model at least has the following technical effects:
The water quality index prediction method based on the LSTM algorithm model acquires historical water quality index data of a point to be detected and preprocesses the data, constructs an LSTM algorithm model based on the preprocessed data, and predicts the water quality index with the constructed LSTM algorithm model. Building on existing online monitoring data, the invention uses the LSTM algorithm model to predict the future trend of the water quality index and thereby provides decision support for the integrated operation of plant, network and river.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 shows the construction process of the LSTM algorithm model in the water quality index prediction method based on the LSTM algorithm model;
FIG. 2 shows the model fitting effect graph in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
The technical solution of the present invention is explained in detail below.
The invention provides a water quality index prediction method based on an LSTM algorithm model, which specifically comprises the following steps:
S1: Acquiring historical water quality index data of a point to be detected, and preprocessing the data.
The specific steps are as follows:
S11: First, the current state of the data is analyzed, and the missing values and the total time span of the data are described.
In this embodiment, the water quality index data of the newly added water quality monitoring station 02 (station_id = 06) is selected as a case; the proportion of missing values is shown in Table 1:
TABLE 1
[Table 1 appears as an image in the original publication: Figure BDA0004038055750000041]
After data screening, the most frequent time interval in the data of the newly added water quality monitoring station 02 is 60 minutes, with 6143 records, as shown in Table 2 below.
TABLE 2 Data screening results
Data source                Newly added water quality monitoring station 02 (station_id = 06)
Monitoring stations        1
Records after screening    6143
S12: Abnormal value processing: first, descriptive statistics are computed on the attribute values to check which values are unreasonable; then the data are checked for normality. Under a normal distribution, samples lying more than 3 standard deviations from the mean are assumed by default not to occur, so a sample that is more than 3 standard deviations away from the mean is treated as an abnormal value and deleted from the data set.
S13: Time interval processing: the records are sorted by detection time, and the time interval (15 min, 30 min, etc.) between each record and the previous record is calculated and recorded. The number of records for every time interval present in the data is counted and displayed to the user. After the user selects the required time interval, only the records corresponding to the selected time interval are kept and the records corresponding to other time intervals are deleted, so that the time interval between records is consistent and the continuity of the data is maintained.
S14: Missing value processing: the records in the data set that contain missing values are indexed, and each indexed missing value is filled with the data from the previous time point.
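Steps S12 to S14 can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `interval_counts` and `preprocess`, the column names, and the choice to apply the 3-standard-deviation rule per column are all hypothetical.

```python
import numpy as np
import pandas as pd


def interval_counts(df: pd.DataFrame, time_col: str) -> pd.Series:
    """Count how many records follow their predecessor at each time gap (S13)."""
    return df.sort_values(time_col)[time_col].diff().value_counts()


def preprocess(df: pd.DataFrame, time_col: str, value_cols: list[str],
               interval: pd.Timedelta) -> pd.DataFrame:
    """Hypothetical sketch of S12 (3-sigma filter), S13 (interval filter), S14 (fill)."""
    df = df.sort_values(time_col).reset_index(drop=True)

    # S12: drop rows where any index lies more than 3 standard deviations
    # from its column mean. Comparisons with NaN are False, so rows with
    # missing values are kept here and handled in S14.
    z = (df[value_cols] - df[value_cols].mean()) / df[value_cols].std()
    df = df[~(z.abs() > 3).any(axis=1)].reset_index(drop=True)

    # S13: keep only records whose gap to the previous record equals the
    # interval chosen by the user (the first record has no gap and is dropped).
    gaps = df[time_col].diff()
    df = df[gaps == interval].reset_index(drop=True).copy()

    # S14: fill each missing value with the value at the previous time point.
    # A missing value in the very first row has no predecessor and stays NaN.
    df[value_cols] = df[value_cols].ffill()
    return df
```

In this sketch the user would inspect `interval_counts(df, "time")` to pick the dominant interval (60 minutes in the embodiment) and then pass it to `preprocess` as `pd.Timedelta(minutes=60)`.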
The data processing results are shown in table 3.
TABLE 3 Data preprocessing results
Data source                Newly added water quality monitoring station 02
Monitoring stations        1
Records after screening    4005
S2: Constructing an LSTM algorithm model based on the preprocessed data.
This step comprises: (1) Model design.
According to the usage scenario of water quality prediction, an LSTM model is selected for calculation. The water quality indexes of the different data sets, such as the water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen corresponding to the input time steps (time_step), are encoded and fed in time order as the input of the LSTM recurrent neural network, and the LSTM model outputs the water quality indexes (water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen) of N future time steps.
The LSTM model extracts feature vectors from the input through its sigmoid and tanh activations and the weight parameter matrices W. Linear regression on the extracted feature vectors then calculates the water quality indexes of the N future time steps, namely water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen. The parameters are then updated by back propagation until the model loss converges and the prediction accuracy is optimal, with a Nash efficiency coefficient NSE greater than 0.6, so that the deviation between the predictions on the verification set and the true values is minimized.
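The forward computation described above can be sketched in NumPy as follows. This is an illustrative sketch of a standard LSTM cell followed by a linear output layer, not the patent's code: the names `init_params` and `lstm_forecast`, the gate ordering, the random initialization, and the flattening of the N-step output into a single vector are assumptions. Training by back propagation is omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_features, hidden, n_out, seed=0):
    """Randomly initialise weights (hypothetical scheme, for illustration only)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, (4 * hidden, n_features))  # input weights
    U = rng.normal(0.0, 0.1, (4 * hidden, hidden))      # recurrent weights
    b = np.zeros(4 * hidden)
    W_out = rng.normal(0.0, 0.1, (n_out, hidden))       # linear output layer
    b_out = np.zeros(n_out)
    return W, U, b, W_out, b_out


def lstm_forecast(x, params, n_future, n_targets):
    """x: (M, n_features) lagged inputs -> (n_future, n_targets) predictions."""
    W, U, b, W_out, b_out = params
    hidden = U.shape[1]
    h = np.zeros(hidden)  # hidden state
    c = np.zeros(hidden)  # cell state
    for x_t in x:
        gates = W @ x_t + U @ h + b
        i = sigmoid(gates[:hidden])                 # input gate
        f = sigmoid(gates[hidden:2 * hidden])       # forget gate
        o = sigmoid(gates[2 * hidden:3 * hidden])   # output gate
        g = np.tanh(gates[3 * hidden:])             # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
    # Linear transformation of the last hidden state to all future values.
    return (W_out @ h + b_out).reshape(n_future, n_targets)
```

With 9 input indexes, 128 hidden units, M = 3 and N = 2, `lstm_forecast(x, init_params(9, 128, 2 * 9), 2, 9)` maps a `(3, 9)` window to a `(2, 9)` forecast.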
The step of constructing the LSTM algorithm model based on the preprocessed data further comprises: (2) Algorithm development, which specifically comprises the following steps:
S21: The detecting station position, water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and time interval are used as model input. The data set of the newly added monitoring station 02 is used, with 80% as the training data set, 10% as the verification data set and 10% as the test data set. The time interval is chosen to be 60 minutes.
S22: The training set and the test set are processed into input and output variables, and the input and output variables are shaped into 3 dimensions, namely sample size, step length and features (the sample size is the number of training samples, the step length is the lag period M, and the features are the input water quality indexes).
S23: The LSTM model is defined and trained.
The middle layer of the LSTM network is defined with 128 neurons, the LSTM layer being a fully connected layer with 128 nodes; finally, the output layer directly predicts a vector containing the 9 water quality indexes through a linear transformation.
S24: The model is iterated so as to minimize the squared-error loss, and the prediction effect of the model is verified with the Nash-Sutcliffe efficiency (NSE) coefficient.
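The data handling in S21 and S22 can be sketched as follows. The text does not say whether the 80/10/10 split is chronological or random; this sketch assumes a chronological split, which is a common choice for time series. The function names `split_80_10_10` and `make_windows` are hypothetical.

```python
import numpy as np


def split_80_10_10(data):
    """Chronological split into training, verification and test parts (assumed order)."""
    n = len(data)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]


def make_windows(data, lag, horizon):
    """Turn a (T, F) series into X of shape (samples, lag, F) and y of shape (samples, horizon, F)."""
    n = len(data) - lag - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for the requested lag and horizon")
    # Each sample pairs `lag` consecutive records with the `horizon` records after them.
    X = np.stack([data[i:i + lag] for i in range(n)])
    y = np.stack([data[i + lag:i + lag + horizon] for i in range(n)])
    return X, y
```

For example, a training part of 80 records with 9 indexes, `lag=3` and `horizon=2` yields `X` of shape `(76, 3, 9)` and `y` of shape `(76, 2, 9)`, matching the sample size, step length and feature dimensions named in S22.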
(3) Setting model parameters.
The lag period M means that M historical records at the same time interval are used as input; here M = 3.
The predicted future period N means that the model outputs N water quality index records at the same time interval in the future; here N = 2.
The number of training epochs is the number of iterations of the model; 100 is selected here. This value is set on the front end when model training is selected; in general, the more iterations, the better the model effect.
The batch size (batch_size) is the number of samples selected for one training pass; 64 is used here, meaning that 64 records are fed in each batch during the training stage. The larger the value, the wider the perception range of the model, but a larger value also affects the model training speed.
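To make the interplay of these parameters concrete, the following sketch collects them in a small configuration object and computes how many training windows and batches they imply. The class `TrainConfig` and its method names are hypothetical, and the sliding-window count assumes one window per possible start position.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    lag: int = 3          # lag period M
    horizon: int = 2      # predicted future period N
    epochs: int = 100     # number of training iterations over the data
    batch_size: int = 64  # samples per batch

    def n_windows(self, n_records: int) -> int:
        """Number of (input, output) windows that fit into n_records."""
        return max(n_records - self.lag - self.horizon + 1, 0)

    def batches_per_epoch(self, n_records: int) -> int:
        """Batches needed to pass over all windows once."""
        return math.ceil(self.n_windows(n_records) / self.batch_size)
```

As a worked example under the assumption that the training set is 80% of the 4005 preprocessed records (3204 records), `TrainConfig().n_windows(3204)` gives 3200 windows and `TrainConfig().batches_per_epoch(3204)` gives 50 batches per epoch.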
(4) Model training.
According to the actual situation of the case, the input water quality indexes, the training data set, the verification data set, the test data set, the time interval and so on are defined, and the algorithm model is trained and tested by checking the simulation results and adjusting the parameters.
(5) Analyzing the model verification result.
The NSE coefficients of indexes such as water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen are calculated.
The NSE coefficient of each index is shown in Table 4.
Index                  NSE coefficient
pH                     -0.7437
Total nitrogen         -1.3325
Total phosphorus       -1.7130
Ammonia nitrogen       0.17379
Water temperature      0.34024
Dissolved oxygen       0.0013
Conductivity           -0.151
Permanganate index     -0.16368
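The NSE coefficient can be computed with the standard Nash-Sutcliffe definition, NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2). The sketch below uses that definition; the function name `nse` is hypothetical and the patent does not show its own implementation.

```python
import numpy as np


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency of predictions against observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Spread of the observations around their own mean.
    denom = np.sum((observed - observed.mean()) ** 2)
    if denom == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - np.sum((observed - predicted) ** 2) / denom
```

Under this definition a value of 1 indicates a perfect fit, a value of 0 means the predictions are as accurate as always predicting the mean of the observations, and a negative value means they are less accurate than that mean.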
(6) Drawing a model fitting effect graph, as shown in FIG. 2. In FIG. 2, the blue (black) curve represents the true values and the orange curve represents the predicted values; judged visually, the simulation effect of the LSTM is good.
S3: Predicting the water quality index with the constructed LSTM algorithm model.
The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto. Any changes or substitutions that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (7)

1. A water quality index prediction method based on an LSTM algorithm model, characterized by comprising the following steps:
S1: acquiring historical water quality index data of a point to be detected, and preprocessing the data;
S2: constructing an LSTM algorithm model based on the preprocessed data;
S3: predicting the water quality index with the constructed LSTM algorithm model.
2. The LSTM algorithm model-based water quality index prediction method of claim 1, wherein the step of acquiring historical water quality index data of a point to be detected and preprocessing the data comprises:
S11: first, analyzing the current state of the data, and giving a basic description of the missing values and the total time span of the data;
S12: abnormal value processing: first, computing descriptive statistics on the attribute values to check for unreasonable data and to check whether the data obey a normal distribution; when a sample lies more than 3 standard deviations from the mean, treating the sample as an abnormal value and deleting it from the data set;
S13: time interval processing: sorting the records by detection time, calculating and recording the time interval between each record and the previous record, counting the number of records for every time interval present in the data and displaying the counts to the user; after the user selects the required time interval, keeping only the records corresponding to the selected time interval and deleting the records corresponding to other time intervals, so that the time interval between records is consistent and the continuity of the data is maintained;
S14: missing value processing: indexing the records in the data set that contain missing values, and filling each indexed missing value with the data from the previous time point.
3. The LSTM algorithm model-based water quality index prediction method of claim 1, wherein the step of constructing the LSTM algorithm model based on the preprocessed data comprises:
model design: according to the usage scenario of water quality prediction, an LSTM model is selected for calculation; the water quality indexes of the different data sets, namely the water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen corresponding to the input time steps, are encoded as the input of the LSTM recurrent neural network, and the LSTM model outputs the water quality indexes of N future time steps.
4. The LSTM algorithm model-based water quality index prediction method of claim 3, wherein the step of constructing the LSTM algorithm model based on the preprocessed data further comprises algorithm development, which comprises the following steps:
S21: dividing the data set into a training set, a verification set and a test set in the proportions 80%, 10% and 10%;
S22: processing the training set and the test set into input and output variables, and shaping the input and output variables into 3 dimensions, namely sample size, step length and features, wherein the sample size is the number of training samples, the step length is the lag period M, and the features are the input water quality indexes;
S23: defining and training the LSTM model: the middle layer of the LSTM network is defined with 128 neurons, the LSTM layer being a fully connected layer with 128 nodes; finally, the output layer directly predicts a vector containing the 9 water quality indexes through a linear transformation;
S24: iterating the model so as to minimize the squared-error loss, and verifying the prediction effect of the model with the Nash-Sutcliffe efficiency (NSE) coefficient.
5. The LSTM algorithm model-based water quality index prediction method of claim 4, wherein the step of constructing the LSTM algorithm model based on the preprocessed data further comprises: setting model parameters, including the lag period M, the predicted future time period N, the number of iterations and the batch size.
6. The LSTM algorithm model-based water quality indicator prediction method of claim 1 further comprising analyzing the model validation results.
7. The LSTM algorithm model-based water quality indicator prediction method of claim 6, wherein the step of analyzing the model validation result comprises:
calculating NSE coefficients of water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen indexes;
drawing a model fitting effect graph, and assessing the simulation effect visually.
CN202310012069.5A 2023-01-05 2023-01-05 Water quality index prediction method based on LSTM algorithm model Pending CN115965149A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310012069.5A CN115965149A (en) 2023-01-05 2023-01-05 Water quality index prediction method based on LSTM algorithm model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310012069.5A CN115965149A (en) 2023-01-05 2023-01-05 Water quality index prediction method based on LSTM algorithm model

Publications (1)

Publication Number Publication Date
CN115965149A true CN115965149A (en) 2023-04-14

Family

ID=87354664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310012069.5A Pending CN115965149A (en) 2023-01-05 2023-01-05 Water quality index prediction method based on LSTM algorithm model

Country Status (1)

Country Link
CN (1) CN115965149A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118016202A (en) * 2024-04-10 2024-05-10 华能山东发电有限公司白杨河发电厂 Chemical equipment operation analysis method and system based on steam-water quality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination