CN115965149A - Water quality index prediction method based on LSTM algorithm model - Google Patents
- Publication number
- CN115965149A
- Authority
- CN
- China
- Prior art keywords
- data
- water quality
- model
- lstm
- algorithm model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/152—Water filtration
Landscapes
- Investigating Or Analyzing Materials By The Use Of Electric Means (AREA)
Abstract
The invention provides a water quality index prediction method based on an LSTM algorithm model, which relates to the technical field of sewage treatment and comprises the following steps: S1: acquiring historical water quality index data of a point to be detected, and preprocessing the data; S2: constructing an LSTM algorithm model based on the preprocessed data; S3: predicting the water quality index with the constructed LSTM algorithm model. Building on existing online monitoring data, the invention predicts the water quality index with the LSTM algorithm model, so that the future trend of the water quality index can be predicted and decision support can be provided for the integrated operation of a plant, a network and a river.
Description
Technical Field
The invention relates to the technical field of sewage treatment, in particular to a water quality index prediction method based on an LSTM algorithm model.
Background
With the continuing development of computers, AI algorithms and big data technology, artificial intelligence algorithms are finding more and more application scenarios in water affairs and environmental protection. The integrated operation of a plant, a network and a river requires a large amount of data support. Continuous online monitoring equipment can fully present current and historical data, but it cannot judge future trends. Water quality prediction uses actual historical data and a mathematical water quality model to calculate and infer the future trend of the water quality at a monitoring point of a water body. In regional water environment planning and management, water quality prediction is an effective measure for preventing water pollution in advance and occupies an important position in water environment protection work.
The LSTM algorithm model is a variant of the recurrent neural network (RNN). It can effectively alleviate the gradient explosion or vanishing problem of the simple recurrent neural network, and thereby improves on the long-term dependence problem of the RNN. LSTM generally performs better than conventional recurrent neural networks and hidden Markov models and, as a nonlinear model, can be used as a complex nonlinear unit to construct larger deep neural networks. The invention aims to provide a water quality index prediction method based on an LSTM algorithm model, which is used for providing decision support for the integrated operation of a plant, a network and a river.
Disclosure of Invention
The invention aims to provide a water quality index prediction method based on an LSTM algorithm model, so as to solve the technical problem in the prior art that the future trend of a water quality index cannot be predicted in the integrated operation of a plant, a network and a river. The technical effects of the preferred technical schemes among those provided by the invention are described in detail below.
To achieve this purpose, the invention provides the following technical scheme:
The invention provides a water quality index prediction method based on an LSTM algorithm model, comprising the following steps:
S1: acquiring historical water quality index data of a point to be detected, and preprocessing the data;
S2: constructing an LSTM algorithm model based on the preprocessed data;
S3: predicting the water quality index with the constructed LSTM algorithm model.
According to a preferred embodiment, the step of acquiring historical water quality index data of a point to be detected and preprocessing the data comprises:
S11: analyzing the current state of the data, and giving a basic description of the missing values and the total time span of the data;
S12: outlier processing: first performing descriptive statistics on the attribute values to identify unreasonable data and to check whether the data follow a normal distribution; when a sample lies more than 3 standard deviations from the mean, the sample is determined to be an outlier and is deleted from the data set;
S13: time interval processing: sorting the records by detection time, calculating and recording the time interval between each record and the previous record, counting the number of records for every time interval found in the data and displaying the counts to the user; after the user selects the required time interval, keeping only the records that correspond to the selected interval and deleting the records that correspond to other intervals, so that the time interval between records is consistent and the continuity of the data is maintained;
S14: missing value processing: locating the records in the data set that contain missing values, and filling each located missing value with the value from the previous time point.
According to a preferred embodiment, the step of constructing the LSTM algorithm model based on the preprocessed data comprises:
Model design: according to the usage scenario of the water quality prediction, an LSTM model is selected for calculation. The water quality indices of the different data sets (water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen) are encoded according to the input time steps and serve as input to the LSTM recurrent neural network, and the LSTM model outputs the water quality indices for N future time steps.
According to a preferred embodiment, the step of constructing the LSTM algorithm model based on the preprocessed data further comprises algorithm development, which comprises the following steps:
S21: dividing the data set into a training set, a validation set and a test set in the proportions 80%, 10% and 10%;
S22: processing the training set and the test set into input and output variables, and shaping the input and output variables into 3 dimensions: sample size, step length and features, where the sample size is the number of training samples, the step length is the lag period M, and the features are the input water quality indices;
S23: defining and training the LSTM model: the intermediate layer of the LSTM network is defined with 128 neurons, the LSTM layer being a fully connected layer with 128 nodes; finally, the output layer directly predicts a vector containing the 9 water quality indices through a linear transformation;
S24: the model is trained iteratively by minimizing the squared-error loss, and its predictive performance is verified using the Nash-Sutcliffe efficiency (NSE) coefficient.
According to a preferred embodiment, the step of constructing the LSTM algorithm model based on the preprocessed data further comprises: setting model parameters, including the lag period M, the predicted future period N, the number of iterations and the batch size.
According to a preferred embodiment, the method further comprises analyzing the model validation results.
According to a preferred embodiment, the step of analyzing the model verification result comprises:
calculating the NSE coefficients of the water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen indices;
drawing a model fitting effect graph, and assessing the simulation effect visually from the graph.
Based on this technical scheme, the water quality index prediction method based on the LSTM algorithm model has at least the following technical effects:
The method acquires historical water quality index data of a point to be detected and preprocesses the data, constructs an LSTM algorithm model based on the preprocessed data, and predicts the water quality index with the constructed LSTM algorithm model. Building on existing online monitoring data, the invention predicts the water quality index with the LSTM algorithm model, so that the future trend of the water quality index can be predicted and decision support can be provided for the integrated operation of a plant, a network and a river.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 shows the construction process of the LSTM algorithm model in the water quality index prediction method based on the LSTM algorithm model;
FIG. 2 is a graph of the model fitting effect in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
The technical solution of the present invention is explained in detail below.
The invention provides a water quality index prediction method based on an LSTM algorithm model, which specifically comprises the following steps:
S1: Acquire historical water quality index data of a point to be detected, and preprocess the data.
The specific steps are as follows:
S11: First analyze the current state of the data, giving a basic description of the missing values and the total time span of the data.
In this embodiment, the water quality index data of newly added water quality monitoring station 02 (station_id = 06) is selected as the case, and the proportion of missing values is shown in Table 1:
TABLE 1
After data screening, the most frequent time interval in the data of newly added water quality monitoring station 02 is 60 minutes, with 6143 records, as shown in Table 2 below.
TABLE 2 Data screening results
Item | Value |
---|---|
Data source | Newly added water quality monitoring station 02 (station_id = 06) |
Number of monitoring stations | 1 |
Records after screening | 6143 |
S12: Outlier processing: first, descriptive statistics are computed for the attribute values to see which values are unreasonable. Then the data are checked for normality. If the data follow a normal distribution, samples more than 3 standard deviations from the mean are expected to be extremely rare, so a sample that lies more than 3 standard deviations from the mean is determined to be an outlier and is deleted from the data set.
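The 3-standard-deviation rule in S12 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation; the function name `remove_outliers_3sigma`, the record layout (a list of dicts) and the sample pH readings are assumptions made for the example.

```python
from statistics import mean, pstdev

def remove_outliers_3sigma(records, key, k=3.0):
    """Drop records whose `key` value lies more than k standard deviations from the mean."""
    values = [r[key] for r in records if r[key] is not None]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(records)  # all values identical: nothing can be an outlier
    kept = []
    for r in records:
        v = r[key]
        # Missing values are kept here; they are handled separately in S14.
        if v is None or abs(v - mu) <= k * sigma:
            kept.append(r)
    return kept

# Hypothetical pH readings: 29 plausible values and one sensor spike.
readings = [{"ph": 7.0 + 0.01 * (i % 5)} for i in range(29)] + [{"ph": 14.0}]
cleaned = remove_outliers_3sigma(readings, "ph")
```

Note that the mean and standard deviation are computed with the outliers still present, so a single extreme value in a very small sample may inflate the standard deviation enough to escape detection.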
S13: Time interval processing: the records are sorted by detection time, and the time interval (e.g., 15 min, 30 min) between each record and the previous record is calculated and recorded. The number of records for every time interval found in the data is counted and displayed to the user. After the user selects the required time interval, only the records corresponding to the selected interval are kept and the records corresponding to other intervals are deleted, so that the time interval between records is consistent and the continuity of the data is maintained.
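One literal reading of S13 is sketched below: sort by time, compute each record's gap to its predecessor, count the gaps, then keep only records whose gap matches the chosen interval. The helper names `interval_counts` and `keep_interval`, the `"time"` field and the sample timestamps are hypothetical. Under this reading the first record, which has no predecessor, is dropped.

```python
from collections import Counter
from datetime import datetime, timedelta

def interval_counts(records):
    """Sort by detection time and compute each record's gap (minutes) to the previous one."""
    ordered = sorted(records, key=lambda r: r["time"])
    gaps = [None]  # the first record has no predecessor
    for prev, cur in zip(ordered, ordered[1:]):
        gaps.append(int((cur["time"] - prev["time"]).total_seconds() // 60))
    counts = Counter(g for g in gaps if g is not None)
    return ordered, gaps, counts

def keep_interval(ordered, gaps, minutes):
    """Keep only records whose gap to the previous record equals the chosen interval."""
    return [r for r, g in zip(ordered, gaps) if g == minutes]

# Hypothetical, unsorted sample: offsets in minutes from an arbitrary start time.
t0 = datetime(2023, 1, 1)
offsets = [120, 0, 60, 255, 135, 195]
records = [{"time": t0 + timedelta(minutes=m)} for m in offsets]

ordered, gaps, counts = interval_counts(records)  # counts would be shown to the user
hourly = keep_interval(ordered, gaps, 60)         # user selects the 60-minute interval
```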
S14: Missing value processing: the records in the data set that contain missing values are located, and each located missing value is filled with the value from the previous time point.
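Filling from the previous time point is commonly called forward filling. The sketch below shows the idea on a list of dicts sorted by time; the function name `forward_fill`, the field names `"ph"` and `"do"`, and the sample values are assumptions for illustration only.

```python
def forward_fill(rows, fields):
    """Fill each missing (None) field with the value from the previous time point."""
    filled = []
    last_seen = {}
    for row in rows:  # rows are assumed to be sorted by detection time
        new_row = dict(row)  # copy so the input is not modified
        for f in fields:
            if new_row.get(f) is None:
                # Leading gaps stay None: there is no earlier value to copy.
                new_row[f] = last_seen.get(f)
            else:
                last_seen[f] = new_row[f]
        filled.append(new_row)
    return filled

rows = [
    {"ph": 7.1, "do": 8.0},
    {"ph": None, "do": 7.9},
    {"ph": None, "do": None},
    {"ph": 7.3, "do": 7.7},
]
filled = forward_fill(rows, ["ph", "do"])
```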
The data processing results are shown in table 3.
TABLE 3 Data preprocessing results
Item | Value |
---|---|
Data source | Newly added water quality monitoring station 02 |
Number of monitoring stations | 1 |
Records after screening | 4005 |
S2: Construct an LSTM algorithm model based on the preprocessed data.
This comprises the following steps: (1) Model design.
According to the usage scenario of the water quality prediction, an LSTM model is selected for calculation. The water quality indices of the different data sets (water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen) are encoded according to the input time steps (time_step) and fed, in time order, to the LSTM recurrent neural network. The LSTM model outputs the water quality indices (water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen) for N future time steps.
The LSTM model uses sigmoid and tanh activations to extract feature vectors from the input while updating the weight parameter matrix W. Linear regression on the extracted feature vectors then yields the water quality indices for the N future time steps, namely water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen. The parameters are updated by back propagation until the model loss converges and the prediction accuracy is at its best, with a Nash efficiency coefficient NSE greater than 0.6, so that the deviation between the predictions on the validation set and the true values is minimized.
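To make the role of sigmoid and tanh concrete, the sketch below runs the forward pass of a standard LSTM cell over M = 3 time steps of 9 features and applies a linear output layer to the final hidden state. It uses the textbook LSTM equations, which the source does not spell out, so treat it as an assumption about the architecture. The weights are random and untrained, the hidden size is reduced from 128 to 4 to keep the example small, and all function names are hypothetical. Training by back propagation is not shown.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(x, h, c, params):
    """One LSTM time step. params maps gate name -> (W, b); W acts on [h, x]."""
    z = h + x  # concatenate previous hidden state and current input
    gate = {}
    for name, (W, b) in params.items():
        pre = [p + bi for p, bi in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid  # candidate uses tanh, gates use sigmoid
        gate[name] = [act(p) for p in pre]
    # f: forget gate, i: input gate, g: candidate values, o: output gate
    c_new = [f * ci + i * g for f, ci, i, g in zip(gate["f"], c, gate["i"], gate["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gate["o"], c_new)]
    return h_new, c_new

def init_params(n_in, n_hidden, rng):
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {name: (mat(n_hidden, n_hidden + n_in), [0.0] * n_hidden) for name in "figo"}

rng = random.Random(0)
n_features, n_hidden, M = 9, 4, 3  # hidden size reduced from 128 for the demo
params = init_params(n_features, n_hidden, rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_features)]

# Hypothetical input window: M time steps, each with 9 scaled index values.
window = [[rng.random() for _ in range(n_features)] for _ in range(M)]
h, c = [0.0] * n_hidden, [0.0] * n_hidden
for x in window:
    h, c = lstm_step(x, h, c, params)
prediction = matvec(W_out, h)  # linear output layer (bias omitted)
```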
The step of constructing the LSTM algorithm model based on the preprocessed data further comprises: (2) Algorithm development, specifically:
S21: The monitoring station location, water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and time interval are used as model inputs. The data set of newly added monitoring station 02 is used, with 80% as the training data set, 10% as the validation data set and 10% as the test data set. The time interval is chosen to be 60 minutes.
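The 80/10/10 division can be sketched as below. The source does not say whether the split is chronological or random; this example assumes a chronological split, which is a common choice for time series because it keeps future records out of the training set. The function name and the use of integer percentages are illustrative choices.

```python
def chronological_split(records, train_pct=80, val_pct=10):
    """Split time-ordered records into train/validation/test without shuffling."""
    n = len(records)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train = records[:n_train]
    val = records[n_train:n_train + n_val]
    test = records[n_train + n_val:]  # the remainder, roughly 10%
    return train, val, test

records = list(range(4005))  # stand-in for the 4005 preprocessed records
train_set, val_set, test_set = chronological_split(records)
```

With 4005 records this yields 3204 training, 400 validation and 401 test records.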
S22: The training set and the test set are processed into input and output variables, and the input and output variables are shaped into 3 dimensions: sample size, step length and features (the sample size is the number of training samples, the step length is the lag period M, and the features are the input water quality indices).
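The 3-dimensional layout in S22 is usually produced with a sliding window. The sketch below assumes a stride of 1 and that targets cover all features for the next N steps; the function name `make_windows` and the synthetic series are hypothetical.

```python
def make_windows(series, lag_m, horizon_n):
    """Turn a [time][feature] series into supervised (inputs, targets) samples.

    inputs  has shape (samples, lag_m, features)
    targets has shape (samples, horizon_n, features)
    """
    inputs, targets = [], []
    last_start = len(series) - lag_m - horizon_n
    for start in range(last_start + 1):
        inputs.append(series[start:start + lag_m])
        targets.append(series[start + lag_m:start + lag_m + horizon_n])
    return inputs, targets

# Hypothetical series: 10 time steps, 9 water quality indices per step.
series = [[float(t)] * 9 for t in range(10)]
X, y = make_windows(series, lag_m=3, horizon_n=2)
```

Here `X` holds 6 samples of 3 steps by 9 features, and each entry of `y` holds the 2 steps that immediately follow the corresponding input window.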
S23: Define and train the LSTM model.
The intermediate layer of the LSTM network is defined with 128 neurons, the LSTM layer being a fully connected layer with 128 nodes. Finally, the output layer directly predicts a vector containing the 9 water quality indices through a linear transformation.
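As a rough sense of model size, the trainable parameters of such a network can be counted by hand. The arithmetic below assumes a single standard LSTM layer with 9 input features, 128 hidden units and one bias vector per gate, followed by a linear layer from 128 to 9 outputs. These are assumptions; the source does not state the exact layer configuration, and frameworks that keep two bias vectors per gate will report a slightly larger number.

```python
def lstm_param_count(n_features, n_hidden):
    """Weights and biases of one LSTM layer: 4 gates, each acting on [h, x]."""
    per_gate = n_hidden * (n_hidden + n_features) + n_hidden
    return 4 * per_gate

def linear_param_count(n_in, n_out):
    """Weights and biases of a fully connected output layer."""
    return n_in * n_out + n_out

lstm_params = lstm_param_count(n_features=9, n_hidden=128)  # 70656
head_params = linear_param_count(n_in=128, n_out=9)         # 1161
total_params = lstm_params + head_params                    # 71817
```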
S24: The model is trained iteratively by minimizing the squared-error loss, and its predictive performance is verified using the Nash-Sutcliffe efficiency (NSE) coefficient.
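Both quantities in S24 are simple to compute. The sketch below uses the standard definition of the Nash-Sutcliffe efficiency: one minus the ratio of the sum of squared prediction errors to the sum of squared deviations of the observations from their mean. An NSE of 1 indicates a perfect fit, 0 means the model is no better than always predicting the observed mean, and negative values mean it is worse than that. The function names and sample values are illustrative.

```python
def mse(observed, predicted):
    """Mean squared error, the loss minimized during training."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

observed = [1.0, 2.0, 3.0, 4.0]
perfect = nse(observed, observed)                # 1.0
mean_only = nse(observed, [2.5, 2.5, 2.5, 2.5])  # 0.0
reversed_fit = nse(observed, [4.0, 3.0, 2.0, 1.0])  # -3.0
```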
(3) Setting model parameters.
The parameters are as follows. The lag period M means that M historical records at the same time interval are used as input; here M = 3.
The predicted future period N means that the model outputs N future records of water quality index data at the same time interval; here N = 2.
The number of training rounds is the number of iterations of the model; 100 is selected here. It is set at the front end when model training is selected. In general, the more iterations, the better the model fits.
The batch size (batch_size) is the number of samples selected for one training step; a value of 64 means that 64 records are fed in each batch during the training stage. The larger the value, the wider the perception range of the model, but a larger value also affects the training speed.
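The four parameters can be gathered in one configuration object, and their interaction can be checked with a little arithmetic. The sketch below assumes the 4005 preprocessed records from Table 3, an 80% training share, and stride-1 windows cut from the training split; the class name `TrainingConfig` and the helper `batches_per_epoch` are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    lag_m: int = 3        # M: number of past records fed to the model
    horizon_n: int = 2    # N: number of future records predicted
    epochs: int = 100     # training rounds
    batch_size: int = 64  # samples per parameter update

def batches_per_epoch(n_train_records, cfg):
    """Mini-batches per epoch when stride-1 windows are cut from the training split."""
    n_windows = n_train_records - cfg.lag_m - cfg.horizon_n + 1
    return math.ceil(n_windows / cfg.batch_size)

cfg = TrainingConfig()
n_train = 4005 * 80 // 100               # 3204 training records
steps = batches_per_epoch(n_train, cfg)  # 3200 windows / 64 per batch = 50
total_updates = steps * cfg.epochs       # 5000 parameter updates in total
```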
(4) Model training.
According to the actual situation of the case, the input water quality indices, the training data set, the validation data set, the test data set, the time interval and so on are defined, and the algorithm model is trained and tested by examining the simulation results and adjusting the parameters.
(5) Analysis of the model validation results.
The NSE coefficients of the indices such as water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen are calculated.
The NSE coefficient of each index is shown in Table 4.
Index | pH | Total nitrogen | Total phosphorus | Ammonia nitrogen | Water temperature | Dissolved oxygen | Conductivity | Permanganate index (GMI) |
---|---|---|---|---|---|---|---|---|
NSE coefficient | -0.7437 | -1.3325 | -1.7130 | 0.17379 | 0.34024 | 0.0013 | -0.151 | -0.16368 |
(6) Drawing the model fitting effect graph, as shown in FIG. 2. In FIG. 2, the blue (black) curve represents the true values and the orange curve represents the predicted values. According to the source, the simulation effect of the LSTM is good judging from the visualization.
S3: Predict the water quality index with the constructed LSTM algorithm model.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
Claims (7)
1. A water quality index prediction method based on an LSTM algorithm model, characterized by comprising the following steps:
S1: acquiring historical water quality index data of a point to be detected, and preprocessing the data;
S2: constructing an LSTM algorithm model based on the preprocessed data;
S3: predicting the water quality index with the constructed LSTM algorithm model.
2. The LSTM algorithm model-based water quality index prediction method of claim 1, wherein the step of acquiring historical water quality index data of a point to be detected and preprocessing the data comprises:
S11: analyzing the current state of the data, and giving a basic description of the missing values and the total time span of the data;
S12: outlier processing: first performing descriptive statistics on the attribute values to identify unreasonable data and to check whether the data follow a normal distribution; when a sample lies more than 3 standard deviations from the mean, the sample is determined to be an outlier and is deleted from the data set;
S13: time interval processing: sorting the records by detection time, calculating and recording the time interval between each record and the previous record, counting the number of records for every time interval found in the data and displaying the counts to the user; after the user selects the required time interval, keeping only the records that correspond to the selected interval and deleting the records that correspond to other intervals, so that the time interval between records is consistent and the continuity of the data is maintained;
S14: missing value processing: locating the records in the data set that contain missing values, and filling each located missing value with the value from the previous time point.
3. The LSTM algorithm model-based water quality index prediction method of claim 1, wherein the step of constructing the LSTM algorithm model based on the preprocessed data comprises:
model design: according to the usage scenario of the water quality prediction, an LSTM model is selected for calculation; the water quality indices of the different data sets (water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen) are encoded according to the input time steps and serve as input to the LSTM recurrent neural network, and the LSTM model outputs the water quality indices for N future time steps.
4. The LSTM algorithm model-based water quality index prediction method of claim 3, wherein the step of constructing the LSTM algorithm model based on the preprocessed data further comprises algorithm development, which comprises the following steps:
S21: dividing the data set into a training set, a validation set and a test set in the proportions 80%, 10% and 10%;
S22: processing the training set and the test set into input and output variables, and shaping the input and output variables into 3 dimensions: sample size, step length and features, where the sample size is the number of training samples, the step length is the lag period M, and the features are the input water quality indices;
S23: defining and training the LSTM model: the intermediate layer of the LSTM network is defined with 128 neurons, the LSTM layer being a fully connected layer with 128 nodes; finally, the output layer directly predicts a vector containing the 9 water quality indices through a linear transformation;
S24: the model is trained iteratively by minimizing the squared-error loss, and its predictive performance is verified using the Nash-Sutcliffe efficiency (NSE) coefficient.
5. The LSTM algorithm model-based water quality index prediction method of claim 4, wherein the step of constructing the LSTM algorithm model based on the preprocessed data further comprises: setting model parameters, including the lag period M, the predicted future period N, the number of iterations and the batch size.
6. The LSTM algorithm model-based water quality index prediction method of claim 1, further comprising analyzing the model validation results.
7. The LSTM algorithm model-based water quality index prediction method of claim 6, wherein the step of analyzing the model validation results comprises:
calculating the NSE coefficients of the water temperature, pH, permanganate index, dissolved oxygen, turbidity, conductivity, ammonia nitrogen, total phosphorus and total nitrogen indices;
drawing a model fitting effect graph, and assessing the simulation effect visually from the graph.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310012069.5A CN115965149A (en) | 2023-01-05 | 2023-01-05 | Water quality index prediction method based on LSTM algorithm model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115965149A | 2023-04-14 |
Family
ID=87354664
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310012069.5A Pending CN115965149A (en) | 2023-01-05 | 2023-01-05 | Water quality index prediction method based on LSTM algorithm model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115965149A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118016202A (en) * | 2024-04-10 | 2024-05-10 | 华能山东发电有限公司白杨河发电厂 | Chemical equipment operation analysis method and system based on steam-water quality |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||