CN115470702B - Sewage treatment water quality prediction method and system based on machine learning - Google Patents

Info

Publication number
CN115470702B
Authority
CN
China
Prior art keywords
water quality
quality prediction
prediction model
water
sewage treatment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211112693.4A
Other languages
Chinese (zh)
Other versions
CN115470702A (en)
Inventor
祝新哲
刘炳佑
孙连鹏
吕慧
邓欢忠
李若泓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202211112693.4A
Publication of CN115470702A
Application granted
Publication of CN115470702B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Activated Sludge Processes (AREA)

Abstract

The invention discloses a sewage treatment water quality prediction method and system based on machine learning. The method comprises the following steps: acquiring daily historical influent data of a sewage treatment plant and constructing an influent water quality database; dividing the influent water quality data in the database into a training set and a test set; constructing a water quality prediction model from the training set based on a five-fold cross validation method; verifying the water quality prediction model with the test set to obtain an optimal water quality prediction model; and inputting the target to be measured into the optimal water quality prediction model to obtain a prediction result. The system comprises a database construction module, a data division module, a model construction module, a verification module and a prediction module. With this method, the influent water quality index can be predicted rapidly and accurately. As a sewage treatment water quality prediction method and system based on machine learning, the invention can be widely applied in the technical field of influent water quality index prediction.

Description

Sewage treatment water quality prediction method and system based on machine learning
Technical Field
The invention relates to the technical field of inlet water quality index prediction, in particular to a sewage treatment water quality prediction method and system based on machine learning.
Background
In urban sewage treatment, the influent water quality directly determines the treatment capacity required of a sewage treatment plant, and it also closely affects the treatment process and the control of effluent indexes.
At present, influent water quality indexes in the sewage treatment process are obtained by monitoring with various hardware devices. Some indexes are difficult to monitor directly online, so their data are delayed and not available in real time, which hinders the adjustment of operating parameters, such as reflux ratio and aeration rate, of the instruments and equipment in the treatment units of a sewage treatment plant.
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a sewage treatment water quality prediction method and system based on machine learning, which can rapidly and accurately predict water quality indexes.
The first technical scheme adopted by the invention is as follows: a sewage treatment water quality prediction method based on machine learning comprises the following steps:
acquiring daily historical influent data of a sewage treatment plant and constructing an influent water quality database;
dividing the influent water quality data in the influent water quality database into a training set and a test set;
constructing a water quality prediction model from the training set based on a five-fold cross validation method;
verifying the water quality prediction model with the test set to obtain an optimal water quality prediction model;
and inputting the target to be measured into the optimal water quality prediction model to obtain a prediction result.
Further, the step of acquiring daily historical inflow data of the sewage treatment plant and constructing an inflow water quality database specifically comprises the following steps:
acquiring daily historical influent data of a sewage treatment plant and extracting influent water quality indexes;
classifying the extracted daily historical influent data according to the influent water quality indexes;
calculating the interquartile range of each type of influent water quality index by using a quartile algorithm;
removing abnormal values that exceed a preset threshold in each type of influent water quality index;
and constructing the influent water quality database from the daily historical influent data after the abnormal values are removed.
Further, the influent water quality metrics include flow, chemical oxygen demand, five-day biochemical oxygen demand, total nitrogen, total phosphorus, ammonia nitrogen, pH, chromaticity, and suspended solids concentration.
Further, the step of constructing a water quality prediction model by using a training set based on the five-fold cross validation method specifically comprises the following steps:
dividing the training set into 5 disjoint parts, selecting one part as a verification set and using the other four parts as the training set;
taking the five-day biochemical oxygen demand in the training set as the dependent variable and the remaining influent water quality indexes in the training set as the independent variables, and applying a deep learning algorithm to obtain a water quality prediction model;
verifying the water quality prediction model with the verification set to obtain the empirical error of the water quality prediction model;
selecting a different part as the verification set and training again with the other four parts as the training set, repeating until five water quality prediction models and their corresponding empirical errors are obtained;
and selecting the water quality prediction model with the minimum empirical error.
Further, the method further comprises the following steps:
calculating the coefficient of determination of the water quality prediction model with the minimum empirical error, and if the coefficient of determination is smaller than or equal to a preset value, readjusting the parameters of the water quality prediction model and repeating the five-fold cross validation to obtain a water quality prediction model that meets expectations.
Further, the step of verifying the water quality prediction model by using the test set to obtain an optimal water quality prediction model specifically comprises the following steps:
taking the five-day biochemical oxygen demand in the test set as the actual value, and inputting the remaining influent water quality indexes into the water quality prediction model to obtain a predicted value of the five-day biochemical oxygen demand;
calculating the generalization error and the coefficient of determination from the actual values and the predicted values of the five-day biochemical oxygen demand;
and evaluating the water quality prediction model according to the generalization error and the coefficient of determination to obtain an optimal water quality prediction model.
Further, the method further comprises the following steps:
if the coefficient of determination is smaller than the preset value, reconstructing the water quality prediction model.
The second technical scheme adopted by the invention is as follows: a machine learning based sewage treatment water quality prediction system, comprising:
the database construction module, used for acquiring daily historical influent data of the sewage treatment plant and constructing an influent water quality database;
the data division module, used for dividing the influent water quality data in the influent water quality database into a training set and a test set;
the model construction module, used for constructing a water quality prediction model from the training set based on a five-fold cross validation method;
the verification module, used for verifying the water quality prediction model with the test set to obtain an optimal water quality prediction model;
and the prediction module, used for inputting the target to be measured into the optimal water quality prediction model to obtain a prediction result.
The beneficial effects of the method and the system are as follows. First, daily historical influent data of a sewage treatment plant are acquired, the influent water quality indexes are extracted, abnormal values are removed, and the influent water quality database is constructed from the data that remain, which safeguards the accuracy of the water quality prediction model. Second, the influent water quality data in the database are divided into a training set and a test set. Third, the water quality prediction model is constructed from the training set based on a five-fold cross validation method; this mitigates the problem of limited training data, and because every one of the 5 folds is used for training, the resulting model is more accurate and stable. Fourth, the water quality prediction model is verified with the test set to obtain the optimal water quality prediction model and to establish its generalization ability. Finally, the optimal water quality prediction model predicts the target to be measured and yields a prediction result, thereby achieving rapid and accurate prediction of water quality.
Drawings
FIG. 1 is a flow chart of steps of a machine learning-based sewage treatment water quality prediction method of the present invention;
FIG. 2 is a block diagram of a machine learning based sewage treatment water quality prediction system according to the present invention;
FIG. 3 is a violin plot of the influent water quality indexes according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a five-fold cross-validation method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the results of a water quality prediction model test according to an embodiment of the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and to specific examples. The step numbers in the following embodiments are set for convenience of illustration only, and the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted according to the understanding of those skilled in the art.
Referring to fig. 1, the invention provides a sewage treatment water quality prediction method based on machine learning, which comprises the following steps:
S1, acquiring daily historical influent data of a sewage treatment plant and constructing an influent water quality database;
S1.1, acquiring daily historical influent data of the sewage treatment plant and extracting influent water quality indexes;
Specifically, daily historical influent data of a sewage treatment plant are obtained, comprising online automatic monitoring data and manual sampling test data. The influent water quality indexes comprise flow (Q), chemical oxygen demand (COD), five-day biochemical oxygen demand (BOD5), total nitrogen (TN), total phosphorus (TP), ammonia nitrogen (NH3-N), pH, chromaticity and suspended solids (SS) concentration. A total of 1057 groups of influent water quality index data are obtained.
The flow rate refers to the inflow rate of the sewage treatment plant, namely the amount of sewage entering the sewage treatment plant in unit time, the change trend of the inflow rate can influence the treatment efficiency of the subsequent unit, and in addition, the influence of rainwater inflow and external water infiltration on the water quality and quantity is considered, so that hidden relations can exist between the flow rate and various water quality parameters.
The chemical oxygen demand is the amount of reducing substances that need to be oxidized in a water sample, measured by a chemical method. Under specified conditions, the amount of oxidant consumed in oxidizing the reducing substances in 1 liter of water sample is taken as the index and converted into the milligrams of oxygen required per liter of water sample, expressed in mg/L. It reflects the degree of pollution by reducing substances in the water.
The five-day biochemical oxygen demand refers to the amount of dissolved oxygen consumed by microorganisms in decomposing oxidizable substances, particularly organic substances, in a given volume of water over a given period of time, expressed in mg/L, as a percentage, or in ppm. It is a comprehensive index reflecting the content of organic pollutants in water: the higher the biochemical oxygen demand, the more organic pollutants the water contains and the more serious the pollution.
Total nitrogen is the total amount of inorganic and organic nitrogen in all forms in water, including inorganic nitrogen such as nitrate, nitrite and ammonium, and organic nitrogen such as protein, amino acids and organic amines. It is calculated in milligrams of nitrogen per liter of water and is often used to represent the degree to which a water body is polluted by nutrients: the higher the value, the more serious the pollution.
The total phosphorus is the sum of phosphorus existing in inorganic state and organic state in the wastewater, is one of indexes for measuring the water pollution degree, and the larger the numerical value is, the higher the water pollution degree is.
The ammonia nitrogen is nitrogen in the form of free ammonia and ionic ammonia, mainly comes from the decomposition of nitrogen-containing organic matters in domestic sewage, coking, synthesis of ammonia and other industrial waste water, is an important pollutant of water eutrophication and environmental pollution, and the higher the value, the more serious the water quality pollution.
The pH is measured as a routine daily test in the operation and management of sewage plants. It is not only a factor for monitoring sewage quality but also affects the living environment of the microorganisms in activated sludge, and large swings in sewage pH are one indicator of pollution or of other environmental influences.
The chromaticity is the color of water and refers to the degree of yellow or yellowish-brown color imparted by dissolved or colloidal substances in the water. Chromaticity is divided into apparent color and true color. Apparent color is the color of water from which suspended matter has not been removed and includes the color produced by both dissolved substances and insoluble suspended matter; true color is the color of water after suspended matter is removed and is produced only by dissolved colored substances. For clean water or water of very low turbidity, true color is close to apparent color; for strongly colored industrial wastewater and domestic sewage containing much suspended matter, the two differ considerably.
The suspended solids concentration measures the solid matter suspended in water, including water-insoluble inorganic matter, organic matter, silt, clay, microorganisms and the like; the suspended matter content of water is one of the indexes for measuring the degree of water pollution.
S1.2, classifying the extracted daily historical influent data according to the influent water quality indexes;
S1.3, calculating the interquartile range of each type of influent water quality index by using a quartile algorithm;
Specifically, the flow data are sorted in ascending order and divided into four equal parts to obtain the minimum value, the first quartile Q1, the median (second quartile Q2), the third quartile Q3 and the maximum value of the flow data; the interquartile range IQR is the difference between the third quartile and the first quartile, namely IQR = Q3 - Q1.
Similarly, the interquartile range of each of the other types of influent water quality index can be obtained.
S1.4, removing abnormal values that exceed a preset threshold in each type of influent water quality index;
Specifically, the preset threshold in this embodiment is preferably 1.5 times the interquartile range, that is, an abnormal value is a value smaller than Q1 - 1.5 × IQR or larger than Q3 + 1.5 × IQR. After the abnormal values are removed, 984 groups of data remain, and a violin plot of each type of influent water quality index is drawn, as shown in fig. 3.
S1.5, constructing the influent water quality database from the daily historical influent data after the abnormal values are removed.
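Steps S1.3 and S1.4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the embodiment: the function names, the sample readings, the choice of the "inclusive" quartile convention, and the rule that a whole data group is dropped when any one of its indexes is abnormal are all assumptions, since the text does not specify them.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the fences (Q1 - k*IQR, Q3 + k*IQR) for one water quality index."""
    # Assumption: "inclusive" quartiles; the text does not name a convention.
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outlier_rows(rows, k=1.5):
    """Drop every data group that has at least one index outside its fences.

    rows: list of dicts mapping index name -> numeric value.
    """
    names = list(rows[0].keys())
    bounds = {n: iqr_bounds([r[n] for r in rows], k) for n in names}
    return [
        r for r in rows
        if all(bounds[n][0] <= r[n] <= bounds[n][1] for n in names)
    ]


# Hypothetical readings: the 400.0 COD value lies far above the upper fence.
sample = [
    {"COD": cod, "pH": ph}
    for cod, ph in [(100.0, 7.0), (110.0, 7.1), (105.0, 7.2),
                    (95.0, 6.9), (102.0, 7.0), (400.0, 7.1)]
]
cleaned = remove_outlier_rows(sample)
```

With the hypothetical readings above, only the group containing the 400.0 COD value is discarded. A different quartile convention would shift the fences slightly and could change which borderline groups are kept.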
S2, dividing the influent water quality data in the influent water quality database into a training set and a test set;
Specifically, the influent water quality data in the database are divided into two parts at a ratio of 4:1: 80% of the data are used as the training set and 20% as the test set, giving a training set of 787 groups and a test set of 197 groups.
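The 4:1 division can be illustrated as follows. This is a sketch only: the text does not say whether the split is random or chronological, so the shuffling, the seed and the function name are assumptions. Rounding the training size down reproduces the reported 787 and 197 groups from 984.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: random split
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


records = list(range(984))  # stand-ins for the 984 cleaned data groups
train, test = split_train_test(records)
```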
S3, as shown in FIG. 4, constructing a water quality prediction model by using a training set based on a five-fold cross validation method;
Specifically, the principle of the K-fold cross validation method is as follows: the data set is divided into K parts; K-1 parts are used as training data to construct a model and determine its optimal hyperparameter values, and the remaining part is then used to verify the performance of the model with those hyperparameter values.
If the training set is relatively small, the K value is increased so that more data are used for model training in each iteration, which yields the lowest bias; at the same time, the run time of the algorithm grows, and because the training blocks are highly similar, the variance of the evaluation result is higher.
If the training set is relatively large, the K value is reduced, which lowers the computational cost of repeatedly fitting and evaluating the model on different data blocks while still giving an accurate evaluation of the model based on its average performance.
Therefore, the preferred embodiment of the present solution selects the five-fold cross-validation method.
S3.1, dividing the training set into 5 disjoint parts of approximately equal size, selecting one part as a verification set and using the other four parts as the training set;
Specifically, as shown in fig. 4, the training set is randomly divided into 5 disjoint, approximately equal parts: the first, second and third parts each contain 157 groups of sample data, and the fourth and fifth parts each contain 158 groups. The first part is selected as the verification set, and the remaining four parts form the training set.
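The partition described above can be sketched as below. The function name and seed are illustrative assumptions; assigning the two leftover samples to the last two folds is chosen here only because it reproduces the sizes given in the text (157, 157, 157, 158, 158 for 787 samples).

```python
import random


def make_folds(samples, k=5, seed=0):
    """Randomly partition samples into k disjoint folds of near-equal size."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    base, extra = divmod(len(shuffled), k)
    # The leftover `extra` samples go to the last folds, one each.
    sizes = [base + (1 if i >= k - extra else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(shuffled[start:start + size])
        start += size
    return folds


folds = make_folds(range(787))
fold_sizes = [len(f) for f in folds]
```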
S3.2, taking five-day biochemical oxygen demand in the training set as a training dependent variable, taking the rest of inflow water quality indexes in the training set as training independent variables, and adopting a deep learning algorithm to obtain a water quality prediction model;
Specifically, the five-day biochemical oxygen demand (BOD5) is first explained. Microorganisms act at an optimum temperature, and 20 °C is generally taken as the standard measurement temperature. Under BOD measurement conditions at 20 °C (sufficient oxygen, no stirring), the first-stage decomposition of organic matter is essentially complete (about 99%) only after 20 days; that is, measuring the first-stage biochemical oxygen demand takes 20 days, which is difficult to achieve in practical work. A standard time is therefore specified, generally 5 days, and the value measured in this way is called the five-day biochemical oxygen demand, denoted BOD5. BOD5 is about 70% of BOD20.
Secondly, the water quality of the influent water of the sewage treatment plant is data of mutual coupling and association among various physical, chemical and biological indexes, the relationship among the water quality index data is complex, multidimensional and nonlinear, and machine learning has the capability of mining association rules among the data, and by utilizing the characteristic, a deep learning model can be constructed by combining the historical water quality index data of the sewage treatment plant through a related algorithm of the machine learning, so that the water quality index of the difficult-to-measure influent water is rapidly predicted.
S3.3, verifying the water quality prediction model with the verification set to obtain the empirical error of the water quality prediction model;
Specifically, empirical error refers to the error of a model over a training set.
The five-day biochemical oxygen demand in the verification set is taken as the actual value, and only the remaining influent water quality indexes are input into the water quality prediction model to obtain the predicted value of the five-day biochemical oxygen demand. The empirical error of the water quality prediction model, namely the root mean square error (RMSE), is calculated from the actual and predicted values of the five-day biochemical oxygen demand as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^2}$$
In the above formula, N is the sample size of the verification set, $Y_i$ is the actual value of the five-day biochemical oxygen demand, and $\hat{Y}_i$ is the predicted value of the five-day biochemical oxygen demand.
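The root mean square error used as the empirical error can be computed as in the sketch below. The function name and the four BOD5 values are hypothetical and serve only to show the arithmetic.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical BOD5 values: the errors are -4, 3, 0 and -5,
# so the mean squared error is (16 + 9 + 0 + 25) / 4 = 12.5.
example_rmse = rmse([100.0, 120.0, 90.0, 110.0], [104.0, 117.0, 90.0, 115.0])
```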
S3.4, selecting a different part as the verification set and training again with the other four parts as the training set, repeating until five water quality prediction models and their corresponding empirical errors are obtained;
Specifically, the second part of the sample data is selected as the verification set and the other four parts as the training set, and steps S3.2 and S3.3 are repeated to obtain a second water quality prediction model and its empirical error; proceeding in the same way, five water quality prediction models and their corresponding empirical errors are obtained in total.
S3.5, selecting a water quality prediction model with the minimum empirical error.
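The loop of steps S3.1 to S3.5 can be sketched as follows. The text specifies only "a deep learning algorithm", so a one-variable least-squares line is swapped in here purely as a stand-in learner; it is not the model of the embodiment, and the synthetic data and all function names are assumptions.

```python
import math


def fit_line(xs, ys):
    """Stand-in learner: least-squares line y = a*x + b (not the patent's model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)
    )


def five_fold_select(folds):
    """Train one model per fold; return (best_model, best_error, all_errors).

    folds: list of folds, each a list of (x, y) pairs.
    """
    results = []
    for i, validation in enumerate(folds):
        # Every fold except the i-th one forms the training data.
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        a, b = fit_line([x for x, _ in training], [y for _, y in training])
        error = rmse([y for _, y in validation],
                     [a * x + b for x, _ in validation])
        results.append(((a, b), error))
    best_model, best_error = min(results, key=lambda r: r[1])
    return best_model, best_error, [e for _, e in results]


# Synthetic data: y is about 0.5 * x with a small alternating offset.
data = [(float(x), 0.5 * x + (1.0 if x % 2 else -1.0)) for x in range(1, 51)]
folds = [data[i::5] for i in range(5)]
model, error, errors = five_fold_select(folds)
```

Selecting the single fold model with the lowest validation error follows the text; averaging the five errors to compare hyperparameter settings is a common alternative, but it is not what the embodiment describes.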
Further, the coefficient of determination of the water quality prediction model with the minimum empirical error is calculated; if it is smaller than or equal to a preset value, namely R² ≤ 0.6, the parameters of the water quality prediction model are readjusted and the five-fold cross validation is repeated to obtain a water quality prediction model that meets expectations.
Wherein, the calculation formula of the coefficient of determination is as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}$$
In the above formula, N is the sample size of the verification set, $\hat{y}_i$ is the predicted value obtained by inputting the influent water quality indexes other than the five-day biochemical oxygen demand in the verification set into the water quality prediction model with the minimum empirical error, $y_i$ is the five-day biochemical oxygen demand in the verification set, and $\bar{y}$ is the average of all five-day biochemical oxygen demand values in the verification set.
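A small sketch of the coefficient of determination follows. It assumes the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares, which is consistent with the symbols listed above; the function name and sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical BOD5 values: the mean is 105, SS_tot is 500 and SS_res is 50.
example_r2 = r_squared([100.0, 120.0, 90.0, 110.0], [104.0, 117.0, 90.0, 115.0])
```

A value of 1 means the predictions match the actual values exactly, while a value at or below the 0.6 threshold in the text would trigger readjustment of the model parameters.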
S4, verifying the water quality prediction model by using the test set to obtain an optimal water quality prediction model;
s4.1, taking five-day biochemical oxygen demand in the test set as an actual value, and inputting the rest of inflow water quality indexes into a water quality prediction model to obtain a predicted value of the five-day biochemical oxygen demand;
S4.2, calculating the generalization error and the coefficient of determination from the actual values and the predicted values of the five-day biochemical oxygen demand;
The generalization error refers to the error of a model on a new sample set (the test set), and its calculation formula is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^2}$$
In the above formula, N is the sample size of the test set, $Y_i$ is the actual value of the five-day biochemical oxygen demand, and $\hat{Y}_i$ is the predicted value of the five-day biochemical oxygen demand.
The calculation formula of the coefficient of determination is as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}$$
In the above formula, N is the sample size of the test set, $\hat{y}_i$ is the predicted value of the five-day biochemical oxygen demand, $y_i$ is the actual value of the five-day biochemical oxygen demand, and $\bar{y}$ is the average of all actual values of the five-day biochemical oxygen demand.
Wherein,
$$\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$$
By the above formulas, RMSE = 19.29 and R² = 0.6421 are calculated.
And S4.3, evaluating the water quality prediction model according to the generalization error and the determined correlation coefficient to obtain an optimal water quality prediction model.
Specifically, as shown in fig. 5, it can be seen intuitively that the actual values of the five-day biochemical oxygen demand are substantially consistent with the predicted values; together with RMSE = 19.29 and R² = 0.6421 from step S4.2, the data demonstrate the generalization ability of the prediction model.
Further, if the coefficient of determination is smaller than or equal to a preset value, namely R² ≤ 0.6, the water quality prediction model is reconstructed.
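The acceptance rule of step S4.3 reduces to a single comparison, sketched below with the figures reported in step S4.2. The class and function names are hypothetical, and the sketch checks only the coefficient of determination because the text gives no numeric threshold for the generalization error.

```python
from dataclasses import dataclass


@dataclass
class EvaluationReport:
    """Test-set metrics for one water quality prediction model."""
    rmse: float
    r2: float


def accept_model(report, r2_threshold=0.6):
    """Return True to keep the model; False means it must be reconstructed."""
    return report.r2 > r2_threshold


reported = EvaluationReport(rmse=19.29, r2=0.6421)  # values from step S4.2
accepted = accept_model(reported)
```

With the reported R² of 0.6421 the model clears the 0.6 threshold and is kept as the optimal model.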
S5, inputting the target to be measured into the optimal water quality prediction model to obtain a prediction result. This achieves rapid and accurate prediction of the hard-to-measure influent water quality indexes of the sewage treatment plant, overcomes the lack of real-time data for those indexes, and provides soft measurement of the influent water quality. In addition, the predicted values obtained from the optimal water quality prediction model can be used to fill in missing historical data and to reduce the plant's online monitoring sensor costs and manual monitoring costs.
As shown in fig. 2, a machine learning-based sewage treatment water quality prediction system includes:
the database construction module, used for acquiring daily historical influent data of the sewage treatment plant and constructing an influent water quality database;
the data division module, used for dividing the influent water quality data in the influent water quality database into a training set and a test set;
the model construction module, used for constructing a water quality prediction model from the training set based on a five-fold cross validation method;
the verification module, used for verifying the water quality prediction model with the test set to obtain an optimal water quality prediction model;
and the prediction module, used for inputting the target to be measured into the optimal water quality prediction model to obtain a prediction result.
The content in the method embodiment is applicable to the system embodiment, the functions specifically realized by the system embodiment are the same as those of the method embodiment, and the achieved beneficial effects are the same as those of the method embodiment.
While the preferred embodiment of the present application has been described in detail, the application is not limited to the embodiment, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the application, and these equivalent modifications and substitutions are intended to be included in the scope of the present application as defined in the appended claims.

Claims (7)

1. The sewage treatment water quality prediction method based on machine learning is characterized by comprising the following steps of:
Acquiring daily historical water inflow data of a sewage treatment plant and constructing a water inflow quality database;
dividing the water quality data of the water in the water quality database into a training set and a testing set;
constructing a water quality prediction model by utilizing a training set based on a five-fold cross validation method;
verifying the water quality prediction model by using the test set to obtain an optimal water quality prediction model;
Inputting the target to be detected into an optimal water quality prediction model to obtain a prediction result;
wherein the step of acquiring daily historical influent data of the sewage treatment plant and constructing the influent water quality database specifically comprises the following steps:
acquiring daily historical influent data of the sewage treatment plant and extracting the influent water quality indexes;
classifying the extracted daily historical influent data according to the influent water quality index;
calculating the interquartile range of each type of influent water quality index by using a quartile algorithm;
removing abnormal values that exceed a preset threshold in each type of influent water quality index;
wherein an abnormal value is a value less than Q1 - 1.5 × IQR or greater than Q3 + 1.5 × IQR, where Q1 denotes the first quartile, Q3 denotes the third quartile, and IQR denotes the interquartile range;
and constructing the influent water quality database from the daily historical influent data after the abnormal values are removed.
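The outlier rule recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the dictionary-per-day record layout, the choice of the "inclusive" quartile convention, and the decision to drop a whole daily record when any index is abnormal are all assumptions made for illustration; the claim does not specify them.

```python
from statistics import quantiles


def iqr_bounds(values):
    """Return (Q1 - 1.5*IQR, Q3 + 1.5*IQR) for one influent index.

    Assumption: quartiles use the "inclusive" (linear interpolation)
    convention; the claim does not name a convention.
    Requires at least two values.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def remove_outlier_rows(rows, indexes):
    """Keep only daily records whose every index lies within its bounds.

    rows    : list of dicts, one per day (hypothetical layout)
    indexes : names of the influent water quality indexes to screen
    """
    bounds = {name: iqr_bounds([row[name] for row in rows]) for name in indexes}
    return [
        row
        for row in rows
        if all(bounds[n][0] <= row[n] <= bounds[n][1] for n in indexes)
    ]


# Hypothetical daily records: one COD reading is far outside the usual range.
records = [{"cod": float(v), "ph": 7.0} for v in [1, 2, 3, 4, 5, 6, 7, 8, 100]]
cleaned = remove_outlier_rows(records, ["cod", "ph"])
```

Bounds are computed once per index on the full data set, so removing one record does not shift the thresholds applied to the others.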
2. The machine learning based sewage treatment water quality prediction method according to claim 1, wherein the influent water quality index includes flow rate, chemical oxygen demand, five-day biochemical oxygen demand, total nitrogen, total phosphorus, ammonia nitrogen, pH, chromaticity and suspended solid concentration.
3. The machine learning-based sewage treatment water quality prediction method according to claim 2, wherein the step of constructing the water quality prediction model by using the training set based on the five-fold cross validation method specifically comprises the following steps:
dividing the training set into five disjoint parts, selecting one part as the verification set and using the other four parts as the training set;
taking the five-day biochemical oxygen demand in the training set as the training dependent variable, taking the remaining influent water quality indexes in the training set as the training independent variables, and applying a deep learning algorithm to obtain a water quality prediction model;
verifying the water quality prediction model with the verification set to obtain the empirical error of the water quality prediction model;
selecting a different part as the verification set and training on the other four parts in turn, to obtain five water quality prediction models and their corresponding empirical errors;
and selecting the water quality prediction model with the minimum empirical error.
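The five-fold procedure of claim 3 might be sketched as follows. This is only an illustration under stated assumptions: the claim calls for a deep learning algorithm, whereas the sketch swaps in a one-feature least-squares line (`fit_line`, a hypothetical stand-in) so that it runs with the Python standard library alone. Using mean squared error as the empirical error, and shuffling before splitting, are also assumptions.

```python
import random


def fit_line(X, y):
    """Stand-in model (NOT the deep learning model of the claim):
    least-squares line on the first feature of each row."""
    xs = [row[0] for row in X]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(y) / len(y)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda row: slope * row[0] + intercept


def five_fold_select(X, y, fit, seed=0):
    """Train five models, each validated on a different fifth of the data,
    and return (lowest error, corresponding model, all five errors)."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    folds = [order[i::5] for i in range(5)]  # five disjoint parts

    best_error, best_model, errors = None, None, []
    for k, val_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        # Empirical error: mean squared error on the held-out part (assumption).
        error = sum((model(X[i]) - y[i]) ** 2 for i in val_idx) / len(val_idx)
        errors.append(error)
        if best_error is None or error < best_error:
            best_error, best_model = error, model
    return best_error, best_model, errors


# Hypothetical data: one independent variable, BOD5-like target y = 2x + 1.
X = [[float(i)] for i in range(20)]
y = [2.0 * row[0] + 1.0 for row in X]
best_error, best_model, fold_errors = five_fold_select(X, y, fit_line)
```

Passing `fit` as a parameter keeps the fold logic independent of the model, so a neural network trainer with the same call shape could replace `fit_line`.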
4. The machine learning-based sewage treatment water quality prediction method according to claim 3, further comprising:
calculating the coefficient of determination of the water quality prediction model with the minimum empirical error, and, if the coefficient of determination is smaller than or equal to a preset value, readjusting the parameters of the water quality prediction model and repeating the five-fold cross-validation until a water quality prediction model that meets expectations is obtained.
5. The machine learning-based sewage treatment water quality prediction method according to claim 1, wherein the step of verifying the water quality prediction model with the test set to obtain an optimal water quality prediction model comprises the following steps:
taking the five-day biochemical oxygen demand in the test set as the actual value, and inputting the remaining influent water quality indexes into the water quality prediction model to obtain a predicted value of the five-day biochemical oxygen demand;
calculating the generalization error and the coefficient of determination from the actual and predicted values of the five-day biochemical oxygen demand;
and evaluating the water quality prediction model according to the generalization error and the coefficient of determination to obtain the optimal water quality prediction model.
6. The machine learning based sewage treatment water quality prediction method according to claim 5, further comprising:
if the coefficient of determination is smaller than the preset value, reconstructing the water quality prediction model.
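The test-set evaluation of claims 5 and 6 can be sketched in Python as below. The claims do not define the generalization error or give the preset value, so the sketch assumes root mean squared error and a hypothetical threshold of 0.8; the function names are likewise illustrative.

```python
import math


def evaluate(actual, predicted):
    """Return (generalization error, coefficient of determination R^2).

    Assumption: generalization error is taken as the root mean squared
    error on the test set. `actual` must not be constant, otherwise the
    total sum of squares is zero and R^2 is undefined.
    """
    n = len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, r2


def needs_rebuild(actual, predicted, r2_threshold=0.8):
    """True when R^2 is smaller than the preset value, as in claim 6.
    The default threshold 0.8 is a hypothetical placeholder."""
    _, r2 = evaluate(actual, predicted)
    return r2 < r2_threshold


# Hypothetical BOD5 test values and two sets of predictions.
bod5_actual = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]
```

A model that always predicts the mean of the actual values scores R^2 = 0, which is why the coefficient of determination is a useful complement to the raw error.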
7. A machine learning-based sewage treatment water quality prediction system for performing the machine learning-based sewage treatment water quality prediction method according to claim 1, comprising:
the database construction module, used for acquiring daily historical influent data of the sewage treatment plant and constructing an influent water quality database;
the data dividing module, used for dividing the influent water quality data in the influent water quality database into a training set and a test set;
the model construction module, used for constructing a water quality prediction model from the training set based on a five-fold cross-validation method;
the verification module, used for verifying the water quality prediction model with the test set to obtain an optimal water quality prediction model;
and the prediction module, used for inputting the target to be predicted into the optimal water quality prediction model to obtain a prediction result.
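As an illustration of the data dividing module in claim 7 (and the matching step of claim 1), a minimal Python sketch of a random train/test split follows. The claims do not state the split ratio or whether the split is random, so the 80/20 ratio, the shuffling, and the function name are assumptions.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=0):
    """Shuffle the records and return (training set, test set).

    test_fraction=0.2 is a hypothetical default; the claims give no ratio.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)  # copy so the database itself is not reordered
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical database of ten daily records, represented by their indices.
database = list(range(10))
training_set, test_set = split_train_test(database)
```

Fixing the seed makes the division reproducible, which matters when the five-fold cross-validation is repeated after parameter adjustments.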
CN202211112693.4A 2022-09-14 2022-09-14 Sewage treatment water quality prediction method and system based on machine learning Active CN115470702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211112693.4A CN115470702B (en) 2022-09-14 2022-09-14 Sewage treatment water quality prediction method and system based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211112693.4A CN115470702B (en) 2022-09-14 2022-09-14 Sewage treatment water quality prediction method and system based on machine learning

Publications (2)

Publication Number Publication Date
CN115470702A CN115470702A (en) 2022-12-13
CN115470702B true CN115470702B (en) 2024-06-11

Family

ID=84333391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211112693.4A Active CN115470702B (en) 2022-09-14 2022-09-14 Sewage treatment water quality prediction method and system based on machine learning

Country Status (1)

Country Link
CN (1) CN115470702B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115952685B (en) * 2023-02-02 2023-09-29 淮阴工学院 Sewage treatment process soft measurement modeling method based on integrated deep learning
CN116433041B (en) * 2023-02-17 2024-04-05 广州珠科院工程勘察设计有限公司 Integrated treatment method and system for small-basin water ecology
CN116090678B (en) * 2023-04-11 2023-06-02 北京埃睿迪硬科技有限公司 Data processing method, device and equipment
CN117059201B (en) * 2023-07-26 2024-06-11 佛山市南舟智能科技有限公司 Method, device, equipment and storage medium for predicting chemical oxygen demand of sewage
CN117174198B (en) * 2023-11-02 2024-01-26 山东鸿远新材料科技股份有限公司 Automatic detection cleaning method and system based on zirconium oxychloride production

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598902A (en) * 2019-08-02 2019-12-20 浙江工业大学 Water quality prediction method based on combination of support vector machine and KNN
CN111291937A (en) * 2020-02-25 2020-06-16 合肥学院 Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
CN111639111A (en) * 2020-06-09 2020-09-08 天津大学 Water transfer engineering-oriented multi-source monitoring data deep mining and intelligent analysis method
CN111768813A (en) * 2020-07-07 2020-10-13 扬州大学 Method for predicting organic PDMS membrane-water distribution coefficient based on SW-SVM algorithm quantitative structure-activity relationship model
CN112132333A (en) * 2020-09-16 2020-12-25 安徽泽众安全科技有限公司 Short-term water quality and water quantity prediction method and system based on deep learning
CN114242156A (en) * 2021-12-17 2022-03-25 厦门大学 Real-time prediction method and system for relative abundance of pathogenic vibrios on marine micro-plastic
CN114894725A (en) * 2022-03-21 2022-08-12 重庆邮电大学 Water quality multi-parameter spectral data Stacking fusion model and water quality multi-parameter measuring method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9235813B1 (en) * 2013-06-29 2016-01-12 Emc Corporation General framework for cross-validation of machine learning algorithms using SQL on distributed systems

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598902A (en) * 2019-08-02 2019-12-20 浙江工业大学 Water quality prediction method based on combination of support vector machine and KNN
CN111291937A (en) * 2020-02-25 2020-06-16 合肥学院 Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
CN111639111A (en) * 2020-06-09 2020-09-08 天津大学 Water transfer engineering-oriented multi-source monitoring data deep mining and intelligent analysis method
CN111768813A (en) * 2020-07-07 2020-10-13 扬州大学 Method for predicting organic PDMS membrane-water distribution coefficient based on SW-SVM algorithm quantitative structure-activity relationship model
CN112132333A (en) * 2020-09-16 2020-12-25 安徽泽众安全科技有限公司 Short-term water quality and water quantity prediction method and system based on deep learning
CN114242156A (en) * 2021-12-17 2022-03-25 厦门大学 Real-time prediction method and system for relative abundance of pathogenic vibrios on marine micro-plastic
CN114894725A (en) * 2022-03-21 2022-08-12 重庆邮电大学 Water quality multi-parameter spectral data Stacking fusion model and water quality multi-parameter measuring method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
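Application of a precise aeration flow control *** in a sewage treatment plant; Deng Huanzhong et al.; Water & Wastewater Engineering (给水排水); 2019-12-31; pp. 51-54 *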

Also Published As

Publication number Publication date
CN115470702A (en) 2022-12-13

Similar Documents

Publication Publication Date Title
CN115470702B (en) Sewage treatment water quality prediction method and system based on machine learning
CN108088974B (en) Soft measurement method for effluent nitrate nitrogen in anaerobic simultaneous denitrification methanogenesis process
CN110186505B (en) Method for predicting standard reaching condition of rural domestic sewage treatment facility effluent based on support vector machine
CN107402586A (en) Dissolved Oxygen concentration Control method and system based on deep neural network
CN103632032A (en) Effluent index online soft measurement prediction method in urban sewage treatment process
CN107247888B (en) Method for soft measurement of total phosphorus TP (thermal transfer profile) in sewage treatment effluent based on storage pool network
CN112417765B (en) Sewage treatment process fault detection method based on improved teacher-student network model
CN111977710A (en) Industrial wastewater treatment system and method based on artificial intelligence
CN113325702B (en) Aeration control method and device
CN103810309A (en) Soft measurement modeling method of A2O municipal sewage treatment process based on constraint theory
CN112989704A (en) DE algorithm-based IRFM-CMNN effluent BOD concentration prediction method
CN110642393B (en) Aeration control system based on neural network model
CN114564699B (en) Continuous online monitoring method and system for total phosphorus and total nitrogen
CN115754207A (en) Simulation method and system for biological sewage treatment process
KR101016394B1 (en) Real-time wastewater composition analyzer using a rapid microbial respiration detector, ss and ec combined sensing system and its measuring method
CN112573641B (en) Sewage treatment capacity determining method and device
CN107665288A (en) A kind of water quality hard measurement Forecasting Methodology of COD
CN116679026B (en) Self-adaptive unbiased finite impulse response filtering sewage dissolved oxygen concentration estimation method
CN117776336A (en) Water pretreatment method and anaerobic ammonia oxidation water treatment process
CN115403226B (en) Factory network joint debugging control method, system and device for carbon source in balance system
Fiorentino et al. Optimization of wastewater treatment plants monitoring in flow variation conditions due to rain events.
CN117808216B (en) Energy saving and emission reduction effect evaluation method for sewage treatment
CN117388457B (en) Method for improving prediction accuracy of effluent of sewage plant by coupling hydraulic retention time
Haimi Data-derived soft sensors in biological wastewater treatment-With application of multivariate statistical methods
Sweeney et al. Modeling, instrumentation, automation, and optimization of water resource recovery facilities (2019) DIRECT

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant