CN113962819A - Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine - Google Patents
- Publication number
- CN113962819A (application number CN202111170371.0A)
- Authority
- CN
- China
- Prior art keywords
- dissolved oxygen
- prediction
- data
- value
- ielm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/02—Agriculture; Fishing; Forestry; Mining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Abstract
The invention discloses a method for predicting dissolved oxygen in industrial aquaculture based on an extreme learning machine, and belongs to the technical field of aquaculture. The method comprises the following steps: data preprocessing, factor screening, construction of the IELM network model, testing of the prediction method, and output of the prediction result. The invention corrects missing data using a data preprocessing method; screens the index factors using the Pearson correlation coefficient method, takes the 8 indexes most strongly correlated with the dissolved oxygen concentration as the inputs of the prediction method, and divides the preprocessed data set into a training set and a test set; then optimizes the initial weights and thresholds of the extreme learning machine using an artificial bee colony algorithm to obtain the optimal parameter values and construct the IELM network model; finally, the dissolved oxygen values predicted by the IELM on the test set are compared with the predictions of the traditional ELM model. The IELM prediction method performs better and predicts the trend of dissolved oxygen in industrial aquaculture more accurately.
Description
Technical Field
The invention belongs to the technical field of aquaculture, relates to a method for predicting dissolved oxygen in industrial aquaculture, and particularly relates to a method for predicting dissolved oxygen in industrial aquaculture based on an extreme learning machine.
Background
Industrial aquaculture, with its industrialized and intensive culture mode, offers new hope for regions with limited natural resources and represents an industry trend in aquaculture. In industrial aquaculture, the balance and quality of the water body are particularly important, and accurate control and prediction of dissolved oxygen is the focus of aquaculture work. How to acquire and effectively use information on the aquaculture water environment and the meteorological environment to prevent fish from dying of hypoxia is an important problem that currently requires attention and research.
In current dissolved oxygen prediction research, traditional neural networks and support vector machines are the most widely studied prediction methods. However, traditional neural networks are not well suited to dissolved oxygen prediction with high-dimensional, small-sample data, and support vector machines suffer from high computational complexity and slow training. The extreme learning machine is an effective prediction method with fast learning capability that overcomes some shortcomings of traditional algorithms; however, the choice of its weight and threshold parameters affects the accuracy of dissolved oxygen prediction, and high-dimensional, redundant network inputs degrade its prediction performance. At present, no effective method addresses these problems of the extreme learning machine in dissolved oxygen prediction.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a method for predicting dissolved oxygen in industrial aquaculture based on an extreme learning machine, which uses the Pearson correlation coefficient method to analyze the factors influencing the change in dissolved oxygen concentration, effectively removes redundant inputs from the dissolved oxygen prediction, and achieves fast and accurate prediction of dissolved oxygen in the water body by obtaining the optimal weight and threshold parameters of the extreme learning machine.
Technical scheme: the method of the invention for predicting dissolved oxygen in industrial aquaculture based on an extreme learning machine comprises the following steps:
(1) data preprocessing;
(2) factor screening, and determining a dissolved oxygen prediction data set;
(3) constructing an IELM network model;
(4) testing the prediction method and outputting the prediction result.
Further, in step (1), the data preprocessing proceeds as follows: a dissolved oxygen sensor and a pH sensor are deployed in a test pond of an industrial aquaculture base, an automatic weather station is deployed beside the pond, and water body parameter data and meteorological data are acquired in real time through a purpose-built wireless sensor network;
First, for the small portion of data lost at non-consecutive points, linear interpolation is used to fill in the missing values, according to the following formula:
$$x_{k+i} = x_k + \frac{i\,(x_{k+j} - x_k)}{j}, \quad 0 < i < j \tag{1}$$
where $x_k$ and $x_{k+j}$ denote the known water quality monitoring data at times $k$ and $k+j$, respectively, and $x_{k+i}$ denotes the missing water quality monitoring value at time $k+i$;
Second, because the collected data have different dimensions (units), the Z-Score method is used to standardize the data set, according to the following formula:
$$Z_{mn} = \frac{X_{mn} - \bar{X}_n}{S_n} \tag{2}$$
where $m$ denotes the number of index variables, $n$ denotes the number of samples, $\bar{X}_n$ denotes the mean of $X_{mn}$, and $S_n$ denotes the standard deviation of $X_{mn}$; the standardized values $Z_{mn}$ obtained from the raw data have a mean of 0 and a variance of 1.
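The two preprocessing operations described above can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names `fill_linear` and `z_score`, the use of NumPy, and the sample readings are all assumptions made for the example, and the population standard deviation is used because the text does not say which variant applies.

```python
import numpy as np

def fill_linear(series):
    """Fill NaN gaps by linear interpolation between the nearest known samples."""
    x = np.asarray(series, dtype=float).copy()
    idx = np.arange(x.size)
    known = ~np.isnan(x)
    # For a gap between known samples k and k+j, np.interp evaluates
    # x_k + i * (x_{k+j} - x_k) / j at each missing index k+i.
    x[~known] = np.interp(idx[~known], idx[known], x[known])
    return x

def z_score(data):
    """Standardise every column (index variable) to zero mean and unit variance."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0)

# Made-up readings, for illustration only.
water_temp = [26.0, np.nan, np.nan, 27.5, 28.0]
ph_value = [7.1, 7.3, 7.2, 7.4, 7.0]

filled = fill_linear(water_temp)
standardised = z_score(np.column_stack([filled, ph_value]))
```

Note that `np.interp` holds the first or last known value constant when a gap touches either end of the series, a case the text does not address.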
Further, in step (2), the data set for dissolved oxygen prediction is determined by factor screening: the standardized data set is analyzed using the Pearson correlation coefficient method; factors with little influence on the change in dissolved oxygen concentration are removed and factors with a large influence are retained, thereby determining the data set for the prediction test;
Factor screening by the Pearson correlation coefficient method mainly comprises the following steps:
First, $m$ influence factors are defined and $n$ denotes the number of samples, so that the influence factors are represented by an $n \times m$ matrix:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{3}$$
Second, the Pearson correlation coefficient between each influence factor and the dissolved oxygen concentration is calculated according to the following formula:
$$r = \frac{\sum_{i=1}^{l}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{l}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{l}(y_i - \bar{y})^2}} \tag{4}$$
where $x_i$ and $y_i$ denote the $i$-th elements of the two vectors $x$ and $y$ being correlated; $l$ denotes the length of the variables; and $\bar{x}$ and $\bar{y}$ denote the mean values of the elements of $x$ and $y$, respectively;
Finally, once the Pearson correlation coefficients between the factors and the dissolved oxygen concentration have been obtained, factors whose correlation coefficient is less than 0.1 are removed and factors whose correlation coefficient is greater than 0.1 are retained, which completes the factor screening process.
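The screening step can be illustrated with a short Python sketch. The helper names `pearson` and `screen_factors`, the factor names, and the data are hypothetical. The sketch also assumes that the 0.1 threshold applies to the absolute value of the coefficient, so that strongly negative correlations are retained; the text does not state this explicitly.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

def screen_factors(factors, target, names, threshold=0.1):
    """Keep the factor columns whose |r| with the target exceeds the threshold."""
    factors = np.asarray(factors, dtype=float)
    scores = {name: pearson(factors[:, i], target) for i, name in enumerate(names)}
    # Assumption: the threshold is applied to the absolute coefficient.
    kept = [name for name in names if abs(scores[name]) > threshold]
    return kept, scores

# Made-up data: two factors track the target exactly (one inversely),
# the third is constructed to be uncorrelated with it.
ramp = np.arange(1.0, 9.0)
names = ["water_temp", "humidity", "wind_direction"]
factors = np.column_stack([ramp, -ramp, [1, -1, -1, 1, 1, -1, -1, 1]])
dissolved_oxygen = 2.0 * ramp + 1.0

kept, scores = screen_factors(factors, dissolved_oxygen, names)
```

Here `kept` contains the first two factor names, and the uncorrelated third column is dropped.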
Further, in step (3), the IELM network model is constructed as follows: the network model comprises an input layer, a hidden layer, and an output layer, and the weights and thresholds of the extreme learning machine are optimized using an artificial bee colony algorithm to obtain their optimal initial values. The specific procedure is:
First, let $n$ samples form the sample set $(x_i, t_i)$, $i = 1, 2, \ldots, n$, where the $m$-dimensional feature of the $i$-th sample is $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]$. If the number of hidden layer units in the ELM network is $l$, the ELM network is:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = o_i, \quad i = 1, 2, \ldots, n \tag{5}$$
where $o_i$ denotes the network output for the $i$-th sample; $w_j$ denotes the weights between the input layer units and the $j$-th hidden layer unit; $b_j$ denotes the bias of the $j$-th hidden layer unit; $\beta_j$ denotes the output weight between the $j$-th hidden layer unit and the output layer unit; and $g(x)$ is the activation function of the network, for which the Sigmoid function $g(x) = 1/(1 + e^{-x})$ is selected. Setting the output value of the ELM network equal to the desired value, equation (5) becomes:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = t_i, \quad i = 1, 2, \ldots, n \tag{6}$$
Equation (6) can be written compactly in matrix form:
$$H\beta = T \tag{7}$$
where $H$ is the output matrix of the hidden layer:
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_l \cdot x_1 + b_l) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_n + b_1) & \cdots & g(w_l \cdot x_n + b_l) \end{bmatrix} \tag{8}$$
After the weights $w$ and biases $b$ of the network have been obtained randomly, the weights $\beta$ between the hidden layer units and the output layer unit are solved by the least squares method:
$$\beta = H^{+} T \tag{9}$$
where $H^{+}$ denotes the generalized inverse of the output matrix $H$;
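As a hedged illustration of the basic ELM described above, the following Python sketch draws the input weights and biases at random and solves the output weights with the Moore-Penrose pseudoinverse. The function names, the uniform range [-1, 1] for the random parameters, the hidden layer size, and the synthetic data are assumptions for the example only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_output(X, W, b):
    """H[i, j] = g(w_j . x_i + b_j); X is (n, m), W is (m, l), b is (l,)."""
    return sigmoid(X @ W + b)

def train_elm(X, T, n_hidden, rng):
    """Random input weights and biases, least-squares output weights."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    beta = np.linalg.pinv(hidden_output(X, W, b)) @ T  # beta = H^+ T
    return W, b, beta

def predict_elm(X, W, b, beta):
    return hidden_output(X, W, b) @ beta

# Synthetic data with 8 inputs, standing in for the screened factors.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 8))
T = np.sin(X.sum(axis=1))

W, b, beta = train_elm(X, T, n_hidden=20, rng=rng)
train_rmse = float(np.sqrt(np.mean((predict_elm(X, W, b, beta) - T) ** 2)))
```

Because `beta` is a least-squares solution, the training error can never exceed the error of predicting zero everywhere, but it still depends on the random draw of `W` and `b`, which is the sensitivity the artificial bee colony step is meant to address.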
Second, the population of the artificial bee colony is initialized to generate $k$ particles, i.e., $k$ feasible solutions;
Each particle has $D = l \cdot (n + 1)$ elements, where $l$ denotes the number of hidden layer units and $n$ denotes the number of input layer units, and each element takes a value in $[-1, 1]$; each particle represents one set of input weights and hidden layer unit thresholds, namely $[w_{11}, w_{12}, \ldots, w_{1l}, w_{21}, w_{22}, \ldots, w_{2l}, \ldots, w_{n1}, w_{n2}, \ldots, w_{nl}, b_1, b_2, \ldots, b_l]$, and a corresponding fitness value can be computed for each feasible solution;
$k/2$ particles are designated as employed bees, the optimal value and the corresponding employed bee are recorded, and the remaining particles serve as onlooker bees; the employed bees are updated by searching for new honey sources within their neighborhood, according to the following update formula:
$$P'_j = P_j + (P_j - P_N) \cdot (\mathrm{rand} - 0.5) \cdot 2 \tag{10}$$
where $P'_j$ denotes the updated employed bee, $P_j$ denotes the original employed bee before the update, and $P_N$ denotes a randomly selected original employed bee;
A bee moves to a new honey source when that source has better fitness, and the search proceeds iteratively on this principle; whether an onlooker bee follows the information of an employed bee is decided by roulette-wheel selection, in which a honey source is chosen with probability
$$p_i = \frac{f(\delta_i)}{\sum_{t=1}^{T} f(\delta_t)} \tag{11}$$
Depending on whether the objective function value $f_i$ is greater than 0, the fitness function $f(\delta_i)$ is expressed as:
$$f(\delta_i) = \begin{cases} \dfrac{1}{1 + f_i}, & f_i > 0 \\ 1 + |f_i|, & f_i \le 0 \end{cases} \tag{12}$$
where $\delta_i$ denotes the $i$-th honey source, $i \in \{1, 2, 3, \ldots, T\}$, $T$ denotes the number of honey sources, and $f(\delta_i)$ denotes the fitness of the honey source at position $\delta_i$; the onlooker bees select honey sources by comparing the values of $f(\delta_i)$. If the fitness of a honey source has still not improved after the maximum number of honey collection attempts, the search is considered trapped in a local optimum and the honey source is abandoned; a new employed bee is obtained according to formula (10), and the search continues for a new honey source to replace it. When the maximum number of iterations is reached, the optimal fitness value and the optimal particle found during the optimization are obtained, and the optimal result is used as the input weights and thresholds of the ELM network model;
Finally, the IELM neural network is trained: on the basis of the preprocessing and factor screening, the optimal weights and thresholds determined by the artificial bee colony optimization are applied to the ELM network; the training set is used for network training, and the predicted values and the root mean square error at the time points of the training set are calculated, the error being recorded as RMSE:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{13}$$
where $y_i$ denotes the measured value, $\hat{y}_i$ the predicted value, and $n$ the number of samples; the trained IELM network model that meets the error condition is then saved.
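The optimization loop of step (3) can be sketched as follows. This is a simplified illustration under several assumptions, not the patented implementation: the objective of each particle is taken to be the training RMSE of the ELM it encodes, the fitness transform is the one commonly used in artificial bee colony algorithms, `rand` in formula (10) is treated as a single random scalar per move, candidates are clipped to [-1, 1], and all names and parameter values (`n_sources`, `limit`, `max_iter`) are hypothetical.

```python
import numpy as np

def elm_training_rmse(particle, X, T, n_hidden):
    """Objective value: training RMSE of the ELM encoded by one particle."""
    n_inputs = X.shape[1]
    W = particle[: n_inputs * n_hidden].reshape(n_inputs, n_hidden)  # input weights
    b = particle[n_inputs * n_hidden:]                               # hidden thresholds
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                           # sigmoid hidden output
    beta = np.linalg.pinv(H) @ T                                     # least-squares output weights
    return float(np.sqrt(np.mean((H @ beta - T) ** 2)))

def fitness(f):
    """Common ABC fitness transform of an objective value (assumed here)."""
    return 1.0 / (1.0 + f) if f > 0 else 1.0 + abs(f)

def abc_optimise(X, T, n_hidden, n_sources=10, limit=5, max_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    dim = n_hidden * (X.shape[1] + 1)                  # D = l * (n + 1)
    sources = rng.uniform(-1.0, 1.0, size=(n_sources, dim))
    cost = np.array([elm_training_rmse(p, X, T, n_hidden) for p in sources])
    trials = np.zeros(n_sources, dtype=int)
    best, best_cost = sources[cost.argmin()].copy(), float(cost.min())

    def neighbour(j):
        # Formula (10): P'_j = P_j + (P_j - P_N) * (rand - 0.5) * 2
        partner = rng.choice([i for i in range(n_sources) if i != j])
        step = (sources[j] - sources[partner]) * (rng.random() - 0.5) * 2
        return np.clip(sources[j] + step, -1.0, 1.0)

    def greedy_move(j):
        candidate = neighbour(j)
        c = elm_training_rmse(candidate, X, T, n_hidden)
        if c < cost[j]:                                # keep the better honey source
            sources[j], cost[j], trials[j] = candidate, c, 0
        else:
            trials[j] += 1

    for _ in range(max_iter):
        for j in range(n_sources):                     # employed bees
            greedy_move(j)
        fit = np.array([fitness(c) for c in cost])
        for j in rng.choice(n_sources, size=n_sources, p=fit / fit.sum()):
            greedy_move(j)                             # onlookers, roulette wheel
        if cost.min() < best_cost:                     # record the best before abandoning
            best, best_cost = sources[cost.argmin()].copy(), float(cost.min())
        for j in np.flatnonzero(trials > limit):       # abandon stagnant sources
            sources[j] = neighbour(j)
            cost[j] = elm_training_rmse(sources[j], X, T, n_hidden)
            trials[j] = 0
    if cost.min() < best_cost:
        best, best_cost = sources[cost.argmin()].copy(), float(cost.min())
    return best, best_cost

# Synthetic data with 8 inputs, standing in for the screened factors.
data_rng = np.random.default_rng(1)
X_demo = data_rng.uniform(-1.0, 1.0, size=(120, 8))
T_demo = np.sin(X_demo.sum(axis=1))
best, best_cost = abc_optimise(X_demo, T_demo, n_hidden=5)
```

The returned `best` particle holds the input weights followed by the hidden thresholds, in the layout described above, and can be unpacked in the same way as inside `elm_training_rmse` to build the final network.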
Further, in step (4), testing the prediction method and outputting the prediction result means: based on the trained IELM network model, the prediction performance of the network is tested on the test set and the test set prediction results are output; the traditional ELM network model is selected as the comparison algorithm, and the prediction results of the different algorithms on the test set are output.
Furthermore, the method uses the Pearson correlation coefficient method to screen the 11 listed index factors, eliminating weakly correlated influence factors and retaining strongly correlated ones, which avoids data redundancy and improves the accuracy and efficiency of dissolved oxygen prediction.
Furthermore, the initial weight and threshold parameters of the extreme learning machine are optimized using the artificial bee colony algorithm, which avoids the problem of the extreme learning machine falling into a local optimum during optimization and improves the accuracy of dissolved oxygen prediction in aquaculture.
Beneficial effects: compared with the prior art, the invention covers 11 index parameters related to the dissolved oxygen concentration that are collected in industrial aquaculture, and corrects missing data using a data preprocessing method; it screens the index factors using the Pearson correlation coefficient method, takes the 8 indexes most strongly correlated with the dissolved oxygen concentration as the inputs of the prediction method, and divides the preprocessed data set into a training set and a test set; it then optimizes the initial weights and thresholds of the extreme learning machine using an artificial bee colony algorithm to obtain the optimal parameter values and construct the IELM network model; finally, the dissolved oxygen values predicted by the IELM on the test set are compared with the predictions of the traditional ELM model. The IELM prediction method performs better and predicts the trend of dissolved oxygen in industrial aquaculture more accurately.
Drawings
FIG. 1 is a flow chart of the operation of the present invention;
FIG. 2 is a diagram showing the dissolved oxygen prediction results of the IELM network model according to the present invention.
Detailed Description
The invention is further described below with reference to the following figures and specific examples.
As shown in FIG. 1, the method for predicting dissolved oxygen in industrial aquaculture based on an extreme learning machine comprises the following steps:
(1) data preprocessing; a dissolved oxygen sensor and a pH sensor are deployed in a test pond of an industrial aquaculture base, an automatic weather station is deployed beside the pond, and water body parameter data and meteorological data are acquired in real time through a purpose-built wireless sensor network;
First, for the small portion of data lost at non-consecutive points, linear interpolation is used to fill in the missing values, according to the following formula:
$$x_{k+i} = x_k + \frac{i\,(x_{k+j} - x_k)}{j}, \quad 0 < i < j \tag{1}$$
where $x_k$ and $x_{k+j}$ denote the known water quality monitoring data at times $k$ and $k+j$, respectively, and $x_{k+i}$ denotes the missing water quality monitoring value at time $k+i$;
Second, because the collected data have different dimensions (units), the Z-Score method is used to standardize the data set, according to the following formula:
$$Z_{mn} = \frac{X_{mn} - \bar{X}_n}{S_n} \tag{2}$$
where $m$ denotes the number of index variables, $n$ denotes the number of samples, $\bar{X}_n$ denotes the mean of $X_{mn}$, and $S_n$ denotes the standard deviation of $X_{mn}$; the standardized values $Z_{mn}$ obtained from the raw data have a mean of 0 and a variance of 1;
The dissolved oxygen concentration of the water body is influenced by a variety of water body parameters and meteorological parameters. Drawing on the farming experience of fishermen and the research experience of relevant personnel, the test uses separate sensors to collect data on both water body parameters and meteorological parameters, including dissolved oxygen, pH value, water temperature, CO2 concentration, air pressure, temperature, humidity, wind speed, wind direction, illumination, photosynthetically active radiation, and radiation illumination, thereby obtaining the initial data index system;
(2) factor screening and determination of the dissolved oxygen prediction data set; the standardized data set is analyzed using the Pearson correlation coefficient method; factors with little influence on the change in dissolved oxygen concentration are removed and factors with a large influence are retained;
Factor screening by the Pearson correlation coefficient method mainly comprises the following steps:
First, $m$ influence factors are defined and $n$ denotes the number of samples, so that the influence factors are represented by an $n \times m$ matrix:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{3}$$
Second, the Pearson correlation coefficient between each influence factor and the dissolved oxygen concentration is calculated according to the following formula:
$$r = \frac{\sum_{i=1}^{l}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{l}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{l}(y_i - \bar{y})^2}} \tag{4}$$
where $x_i$ and $y_i$ denote the $i$-th elements of the two vectors $x$ and $y$ being correlated; $l$ denotes the length of the variables; and $\bar{x}$ and $\bar{y}$ denote the mean values of the elements of $x$ and $y$, respectively;
Finally, once the Pearson correlation coefficients between the factors and the dissolved oxygen concentration have been obtained, factors whose correlation coefficient is less than 0.1 are removed and factors whose correlation coefficient is greater than 0.1 are retained, which completes the factor screening process;
In this embodiment, the Pearson correlation coefficients of the 11 factors influencing the change in the dissolved oxygen concentration of the water body are calculated; after factor screening, 8 indexes are retained as the inputs of the IELM dissolved oxygen prediction model: water temperature, pH value, humidity, temperature, illumination, wind speed, radiation illumination, and photosynthetically active radiation;
(3) construction of the IELM network model; the network model comprises an input layer, a hidden layer, and an output layer, and the weights and thresholds of the extreme learning machine are optimized using an artificial bee colony algorithm to obtain their optimal initial values. The specific procedure is:
First, let $n$ samples form the sample set $(x_i, t_i)$, $i = 1, 2, \ldots, n$, where the $m$-dimensional feature of the $i$-th sample is $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]$. If the number of hidden layer units in the ELM network is $l$, the ELM network is:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = o_i, \quad i = 1, 2, \ldots, n \tag{5}$$
where $o_i$ denotes the network output for the $i$-th sample; $w_j$ denotes the weights between the input layer units and the $j$-th hidden layer unit; $b_j$ denotes the bias of the $j$-th hidden layer unit; $\beta_j$ denotes the output weight between the $j$-th hidden layer unit and the output layer unit; and $g(x)$ is the activation function of the network, for which the Sigmoid function $g(x) = 1/(1 + e^{-x})$ is selected. Setting the output value of the ELM network equal to the desired value, equation (5) becomes:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = t_i, \quad i = 1, 2, \ldots, n \tag{6}$$
Equation (6) can be written compactly in matrix form:
$$H\beta = T \tag{7}$$
where $H$ is the output matrix of the hidden layer:
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_l \cdot x_1 + b_l) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_n + b_1) & \cdots & g(w_l \cdot x_n + b_l) \end{bmatrix} \tag{8}$$
After the weights $w$ and biases $b$ of the network have been obtained randomly, the weights $\beta$ between the hidden layer units and the output layer unit are solved by the least squares method:
$$\beta = H^{+} T \tag{9}$$
where $H^{+}$ denotes the generalized inverse of the output matrix $H$;
Second, the population of the artificial bee colony is initialized to generate $k$ particles, i.e., $k$ feasible solutions;
Each particle has $D = l \cdot (n + 1)$ elements, where $l$ denotes the number of hidden layer units and $n$ denotes the number of input layer units, and each element takes a value in $[-1, 1]$; each particle represents one set of input weights and hidden layer unit thresholds, namely $[w_{11}, w_{12}, \ldots, w_{1l}, w_{21}, w_{22}, \ldots, w_{2l}, \ldots, w_{n1}, w_{n2}, \ldots, w_{nl}, b_1, b_2, \ldots, b_l]$, and a corresponding fitness value can be computed for each feasible solution;
$k/2$ particles are designated as employed bees, the optimal value and the corresponding employed bee are recorded, and the remaining particles serve as onlooker bees; the employed bees are updated by searching for new honey sources within their neighborhood, according to the following update formula:
$$P'_j = P_j + (P_j - P_N) \cdot (\mathrm{rand} - 0.5) \cdot 2 \tag{10}$$
where $P'_j$ denotes the updated employed bee, $P_j$ denotes the original employed bee before the update, and $P_N$ denotes a randomly selected original employed bee;
A bee moves to a new honey source when that source has better fitness, and the search proceeds iteratively on this principle; whether an onlooker bee follows the information of an employed bee is decided by roulette-wheel selection, in which a honey source is chosen with probability
$$p_i = \frac{f(\delta_i)}{\sum_{t=1}^{T} f(\delta_t)} \tag{11}$$
Depending on whether the objective function value $f_i$ is greater than 0, the fitness function $f(\delta_i)$ is expressed as:
$$f(\delta_i) = \begin{cases} \dfrac{1}{1 + f_i}, & f_i > 0 \\ 1 + |f_i|, & f_i \le 0 \end{cases} \tag{12}$$
where $\delta_i$ denotes the $i$-th honey source, $i \in \{1, 2, 3, \ldots, T\}$, $T$ denotes the number of honey sources, and $f(\delta_i)$ denotes the fitness of the honey source at position $\delta_i$; the onlooker bees select honey sources by comparing the values of $f(\delta_i)$. If the fitness of a honey source has still not improved after the maximum number of honey collection attempts, the search is considered trapped in a local optimum and the honey source is abandoned; a new employed bee is obtained according to formula (10), and the search continues for a new honey source to replace it. When the maximum number of iterations is reached, the optimal fitness value and the optimal particle found during the optimization are obtained, and the optimal result is used as the input weights and thresholds of the ELM network model;
Finally, the IELM neural network is trained: on the basis of the preprocessing and factor screening, the optimal weights and thresholds determined by the artificial bee colony optimization are applied to the ELM network; the training set is used for network training, and the predicted values and the root mean square error at the time points of the training set are calculated, the error being recorded as RMSE:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{13}$$
where $y_i$ denotes the measured value, $\hat{y}_i$ the predicted value, and $n$ the number of samples; the trained IELM network model that meets the error condition is then saved;
In this embodiment, the water body dissolved oxygen data collected during the test period from July 1, 2019 to July 30, 2019 are selected for prediction. First, the collected meteorological data and water body parameter data are preprocessed using the data preprocessing method, yielding 4320 groups of data; the first 3888 groups of the data set are used as the training set and the last 432 groups as the test set. After factor screening by the Pearson correlation coefficient, the input-output structure of the IELM prediction model is constructed from the screened factors. Finally, the IELM network is trained and the training results are output;
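The chronological division of the data set described above can be sketched as follows. The function name and the zero-filled placeholder arrays are assumptions for illustration; only the counts (4320 groups in total, 3888 for training, 432 for testing, 8 input factors) come from the text.

```python
import numpy as np

def chronological_split(X, y, n_train):
    """Split time-ordered samples without shuffling: first n_train rows train, the rest test."""
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

# Placeholder arrays with the sizes stated in the embodiment.
n_samples, n_features = 4320, 8
X_all = np.zeros((n_samples, n_features))
y_all = np.zeros(n_samples)

(train_X, train_y), (test_X, test_y) = chronological_split(X_all, y_all, n_train=3888)
```

Keeping the samples in time order means the test set corresponds to the final part of the monitoring period rather than to randomly chosen samples.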
(4) testing the prediction method and outputting the prediction result: based on the trained IELM network model, the prediction performance of the network is tested on the test set and the test set prediction results are output; the traditional ELM network model is selected as the comparison algorithm, and the prediction results of the different algorithms on the test set are output;
The dissolved oxygen concentration is predicted with the IELM neural network model and with the traditional ELM neural network model, yielding a prediction result graph of the dissolved oxygen concentration for the 432 groups of test set data; in the graph, the abscissa is the serial number of the test sample and the ordinate is the dissolved oxygen concentration. Taken together, the results show that both models can predict the dissolved oxygen, but their prediction performance differs considerably: the predictions of the IELM neural network model are closer to the measured dissolved oxygen concentration. However, for both models the prediction curves fluctuate noticeably more between samples No. 140-185 and No. 275-339 of the test set than elsewhere; these intervals correspond to 0:00 to 7:00, the period of the day in which the dissolved oxygen concentration of the water is lowest and the respiration of microorganisms and plants in the water is most active;
Comparative analysis of the prediction models:
Based on the trained and tested IELM neural network prediction model, the dissolved oxygen concentration from July 28 to July 30, 2019 is predicted, and the corresponding predicted values, root mean square error (RMSE), and mean absolute error (MAE) are obtained; the prediction results of the traditional ELM neural network model are compared with those of the IELM model. Owing to limited space, only the 24 hourly dissolved oxygen prediction results for July 28 are listed, as shown in Table 1.
TABLE 1 Comparison of water dissolved oxygen predictions from the IELM and ELM prediction models
Time | Actual value | IELM predicted value | ELM predicted value |
---|---|---|---|
0:00 | 4.97 | 5.09 | 5.59 |
1:00 | 4.46 | 4.17 | 4.44 |
2:00 | 3.70 | 3.41 | 4.15 |
3:00 | 3.52 | 3.30 | 3.85 |
4:00 | 3.13 | 3.06 | 3.48 |
5:00 | 3.39 | 2.91 | 3.18 |
6:00 | 2.88 | 2.75 | 2.88 |
7:00 | 2.68 | 2.45 | 2.87 |
8:00 | 2.84 | 3.10 | 3.80 |
9:00 | 3.20 | 3.41 | 4.34 |
10:00 | 3.63 | 3.92 | 4.06 |
11:00 | 4.04 | 4.10 | 4.02 |
12:00 | 4.56 | 4.41 | 4.18 |
13:00 | 5.11 | 5.05 | 4.54 |
14:00 | 5.72 | 5.54 | 5.66 |
15:00 | 6.42 | 5.74 | 6.23 |
16:00 | 6.65 | 6.31 | 6.85 |
17:00 | 6.65 | 6.60 | 7.21 |
18:00 | 6.69 | 6.17 | 7.05 |
19:00 | 5.91 | 5.60 | 5.98 |
20:00 | 7.56 | 7.48 | 6.23 |
21:00 | 7.18 | 7.30 | 6.33 |
22:00 | 6.49 | 6.86 | 6.65 |
23:00 | 5.29 | 5.84 | 6.25 |
RMSE | / | 0.35 | 0.64 |
MAE | / | 0.25 | 0.44 |
According to the comparison, when the IELM neural network model is used to predict the dissolved oxygen concentration of the aquaculture water body, its root mean square error is 0.35, clearly lower than the 0.64 of the ELM neural network model. The corresponding MAE values of the IELM and ELM predictions for the whole of July 28 are 0.25 and 0.44, respectively. The IELM network model therefore predicts more accurately.
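The two error measures quoted above can be computed as in the following minimal sketch. It reads MAE as the mean absolute error, which is an assumption about the abbreviation, and the function names `rmse` and `mae` are illustrative. Only the first four hourly rows of Table 1 are used as sample input, so the printed values are not expected to reproduce the summary rows of the table.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error (assumed meaning of MAE) between two sequences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Rows 0:00 to 3:00 of Table 1, used purely as sample input.
actual = [4.97, 4.46, 3.70, 3.52]
ielm = [5.09, 4.17, 3.41, 3.30]
elm = [5.59, 4.44, 4.15, 3.85]

for name, pred in (("IELM", ielm), ("ELM", elm)):
    print(f"{name}: RMSE={rmse(actual, pred):.3f}  MAE={mae(actual, pred):.3f}")
```

Both measures are in the same units as the dissolved oxygen values; RMSE penalizes large individual errors more heavily than MAE does.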
The invention takes the dissolved oxygen of industrial aquaculture water as its research object and provides an extreme-learning-machine-based algorithm for predicting it. The raw data are first standardized by the data preprocessing method; the Pearson correlation coefficient method is then used to screen the factors that influence changes in dissolved oxygen, which yields the inputs and output of the dissolved oxygen prediction model; the extreme learning machine is then improved with the artificial bee colony algorithm to construct the IELM neural network model. The method avoids the computational burden caused by redundant multi-input information in dissolved oxygen prediction and addresses the tendency of the traditional ELM neural network to fall into local optima during training, thereby improving training speed and prediction accuracy relative to the traditional ELM network model. It can be used to predict dissolved oxygen in industrial aquaculture production so as to obtain scientific, reasonable and accurate predictions, safeguarding production and reducing aquaculture risk.
Furthermore, the method applies the Pearson correlation coefficient method to the 11 listed index factors, eliminating weakly correlated factors and retaining strongly correlated ones, which avoids data redundancy and improves the accuracy and efficiency of dissolved oxygen prediction.
Furthermore, the initial weights and thresholds of the extreme learning machine are optimized with the artificial bee colony algorithm, which keeps the extreme learning machine from falling into a local optimum during the search and improves the accuracy of aquaculture dissolved oxygen prediction.
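As an illustration of this kind of search, the following is a minimal sketch of an artificial bee colony loop that minimizes a generic objective over [-1, 1]^D. Several points are assumptions rather than details confirmed by the text: `rand` in formula (10) is taken to be a uniform random number in [0, 1] drawn per element; candidates are clipped to [-1, 1]; the fitness uses the standard ABC form (1/(1+f) for f >= 0, otherwise 1+|f|); and every honey source receives one employed-bee move per iteration instead of the population being split into k/2 employed bees and k/2 followers. The names `abc_minimize`, `fitness`, `limit` and `sphere` are hypothetical.

```python
import random


def fitness(cost):
    """Standard ABC fitness (assumed form): larger is better."""
    return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)


def abc_minimize(objective, dim, n_sources=10, limit=20, max_iter=200, seed=0):
    """Minimise objective over [-1, 1]^dim with a small artificial bee colony."""
    rng = random.Random(seed)
    sources = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
               for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources            # failed updates per honey source
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def neighbour(j):
        # Formula (10): P'_j = P_j + (P_j - P_N) * (rand - 0.5) * 2,
        # applied per element and clipped to [-1, 1] (both are assumptions).
        n = rng.choice([i for i in range(n_sources) if i != j])
        return [max(-1.0, min(1.0, pj + (pj - pn) * (rng.random() - 0.5) * 2))
                for pj, pn in zip(sources[j], sources[n])]

    def greedy_move(j):
        candidate = neighbour(j)
        cost = objective(candidate)
        if cost < costs[j]:             # move only to a better honey source
            sources[j], costs[j], trials[j] = candidate, cost, 0
        else:
            trials[j] += 1

    for _ in range(max_iter):
        for j in range(n_sources):      # employed-bee phase
            greedy_move(j)
        weights = [fitness(c) for c in costs]
        for _ in range(n_sources):      # follower phase: roulette selection
            greedy_move(rng.choices(range(n_sources), weights=weights)[0])
        for j in range(n_sources):
            if costs[j] < best_cost:    # remember the best source seen so far
                best, best_cost = list(sources[j]), costs[j]
            if trials[j] > limit:       # exhausted source: abandon and replace
                sources[j] = neighbour(j)
                costs[j] = objective(sources[j])
                trials[j] = 0
    for j in range(n_sources):          # final check after the last replacement
        if costs[j] < best_cost:
            best, best_cost = list(sources[j]), costs[j]
    return best, best_cost


def sphere(p):
    """Toy objective standing in for the ELM training error."""
    return sum(v * v for v in p)


best, best_cost = abc_minimize(sphere, dim=3)
print(best_cost)
```

In the setting described here, the objective would be the training error of an ELM whose input weights and hidden-layer thresholds are decoded from the candidate vector; the toy `sphere` function only stands in for it.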
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (5)
1. An extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture is characterized by comprising the following specific operation steps:
(1) data preprocessing;
(2) factor screening, and determining a dissolved oxygen prediction data set;
(3) constructing an IELM network model;
(4) testing the prediction method and outputting the prediction result.
2. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (1), the data preprocessing proceeds as follows: a dissolved oxygen sensor and a pH sensor are deployed in a test pond of an industrial aquaculture base, an automatic weather station is deployed beside the pond, and water body parameter data and weather data are acquired in real time through the constructed wireless sensor network;
firstly, for the small portion of data lost at discontinuous points, linear interpolation is used to fill in the lost data, according to the formula:
$$x_{k+i} = x_k + \frac{i\,(x_{k+j} - x_k)}{j}, \quad 0 < i < j \tag{1}$$
where $x_k$ and $x_{k+j}$ denote the monitored water quality data at the known times $k$ and $k+j$, and $x_{k+i}$ denotes the water quality monitoring value lost at time $k+i$;
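secondly, because the collected data have different dimensions (units), the Z-Score method is used to standardize the data set, according to the formula:
$$x^{*} = \frac{x - \mu}{\sigma} \tag{2}$$
where $\mu$ and $\sigma$ denote the mean and the standard deviation of the corresponding variable.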
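The two preprocessing steps of this claim can be sketched as follows. The sketch assumes that a lost reading is marked as `None`, that interpolation follows $x_{k+i} = x_k + i\,(x_{k+j} - x_k)/j$ between the nearest known neighbours, and that the Z-Score uses the population standard deviation; the function names `fill_gaps_linear` and `z_score` are hypothetical.

```python
from statistics import mean, pstdev


def fill_gaps_linear(series):
    """Fill runs of None that lie between two known values by linear interpolation.

    Leading or trailing None values have no neighbour on one side and are left as is.
    """
    filled = list(series)
    k = None                                   # index of the last known value
    for idx, value in enumerate(filled):
        if value is None:
            continue
        if k is not None and idx - k > 1:      # a gap of j - 1 lost readings
            j = idx - k
            for i in range(1, j):
                filled[k + i] = filled[k] + i * (filled[idx] - filled[k]) / j
        k = idx
    return filled


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation.

    Assumes the values are not all identical, otherwise sigma is zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


readings = [2.0, None, None, 5.0, 4.0]
complete = fill_gaps_linear(readings)
print(complete)
print(z_score(complete))
```

Standardizing each variable separately puts sensor readings and weather readings on a common scale before correlation screening.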
3. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (2), the data set for dissolved oxygen prediction is determined by factor screening: the standardized data set is analyzed using the Pearson correlation coefficient method; factors with little influence on the change in dissolved oxygen concentration are removed, and factors with a large influence are retained;
factor screening by the Pearson correlation coefficient method mainly comprises the following steps:
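firstly, with $m$ influence factors and $n$ samples, the influence factors are represented by an $n \times m$ matrix:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{3}$$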
secondly, the Pearson correlation coefficient between each influence factor and the dissolved oxygen concentration is calculated as:
$$r = \frac{\sum_{i=1}^{l}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{l}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{l}(y_i - \bar{y})^2}} \tag{4}$$
where $x_i$ and $y_i$ denote the $i$-th elements of the two vectors $x$ and $y$ being correlated; $l$ denotes the vector length; and $\bar{x}$ and $\bar{y}$ denote the mean values of the elements of $x$ and $y$, respectively;
finally, once the Pearson correlation coefficients between the factors and the dissolved oxygen concentration have been obtained, factors whose correlation coefficient is less than 0.1 are removed and factors whose correlation coefficient is greater than 0.1 are retained, completing the factor screening.
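The screening rule of this claim can be sketched as below. Comparing the magnitude of the coefficient against 0.1 is an assumption (the text speaks only of the coefficient value), and the factor names, sample values, and the function names `pearson` and `screen_factors` are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant, otherwise the denominator is zero.
    """
    l = len(x)
    x_bar = sum(x) / l
    y_bar = sum(y) / l
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    var_y = sum((yi - y_bar) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


def screen_factors(factors, dissolved_oxygen, threshold=0.1):
    """Keep factors whose |r| with dissolved oxygen exceeds the threshold."""
    kept = {}
    for name, column in factors.items():
        r = pearson(column, dissolved_oxygen)
        if abs(r) > threshold:          # magnitude test is an assumption
            kept[name] = r
    return kept


# Invented sample data: one perfectly correlated and one uncorrelated factor.
dissolved_oxygen = [1.0, 2.0, 3.0, 4.0, 5.0]
factors = {
    "water_temp": [2.0, 4.0, 6.0, 8.0, 10.0],
    "unrelated": [1.0, -1.0, 0.0, -1.0, 1.0],
}
kept = screen_factors(factors, dissolved_oxygen)
print(kept)
```

The retained columns would then form the input of the prediction model, with dissolved oxygen as its output.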
4. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (3), the IELM network model is constructed as follows: the network model comprises an input layer, a hidden layer and an output layer; the weights and thresholds of the extreme learning machine are optimized with the artificial bee colony algorithm to obtain their optimal initial values; the specific procedure is as follows:
firstly, let $n$ samples form the sample set $(x_i, t_i)$, $i = 1, 2, \ldots, n$, where the $m$-dimensional feature vector of the $i$-th sample is $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]$; if the number of hidden layer units in the ELM network is $l$, the ELM network is:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = o_i, \quad i = 1, 2, \ldots, n \tag{5}$$
where $w_j$ denotes the weights between the input layer units and the $j$-th hidden layer unit; $b_j$ denotes the bias of the $j$-th hidden layer unit; $\beta_j$ denotes the output weight between the $j$-th hidden layer unit and the output layer unit; $o_i$ denotes the network output for the $i$-th sample; and $g(x)$ is the activation function of the network, for which the Sigmoid function $g(x) = 1/(1 + e^{-x})$ is selected; setting the output of the ELM network equal to the desired value, equation (5) becomes:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = t_i, \quad i = 1, 2, \ldots, n \tag{6}$$
written in matrix form, equation (6) simplifies to:
$$H\beta = T \tag{7}$$
after the weights $w$ and biases $b$ of the network have been generated randomly, the weights $\beta$ between the hidden layer units and the output layer unit are solved by the least squares method:
$$\beta = H^{+} T \tag{9}$$
where $H^{+}$ denotes the generalized inverse of the output matrix $H$;
secondly, initializing the population in the artificial bee colony to generate k particles, namely k feasible solutions;
each particle has $D = l \cdot (n + 1)$ elements, where $l$ denotes the number of hidden layer units and $n$ the number of input layer units, and each element takes a value in $[-1, 1]$; each particle represents one set of input weights and hidden layer thresholds, namely $[w_{11}, w_{12}, \ldots, w_{1l}, w_{21}, w_{22}, \ldots, w_{2l}, \ldots, w_{n1}, w_{n2}, \ldots, w_{nl}, b_1, b_2, \ldots, b_l]$, and a corresponding fitness value can be computed for each feasible solution;
k/2 particles are designated employed bees, the best value and the corresponding employed bee are recorded, and the remaining particles serve as onlooker bees; each employed bee is updated by searching for a new honey source within its neighborhood, according to:
$$P'_j = P_j + (P_j - P_N)\cdot(\mathrm{rand} - 0.5)\cdot 2 \tag{10}$$
where $P'_j$ denotes the updated employed bee, $P_j$ the original employed bee before the update, and $P_N$ a randomly selected original employed bee;
the bees move iteratively, relocating to a new honey source whenever its fitness is better; the roulette method is used to decide whether the onlooker bees follow the information of the employed bees, each onlooker bee selecting a honey source by roulette according to its selection probability; the fitness function $f(\delta_i)$ is defined according to whether the objective function value $f_i$ is greater than 0, where $\delta_i$ denotes the $i$-th honey source, $i \in \{1, 2, 3, \ldots, T\}$, $T$ denotes the number of honey sources, and $f(\delta_i)$ denotes the fitness of the honey source at position $\delta_i$; the onlooker bees choose honey sources by comparing $f(\delta_i)$; when the maximum number of honey collections has been reached without the fitness being successfully updated, a local optimum has been found: the honey source is abandoned, a new employed bee is obtained according to formula (10), and the search continues for a new honey source to replace it; when the maximum number of iterations is reached, the optimal fitness value and the optimal particle of the search are obtained, and this optimal result is used as the input weights and thresholds of the ELM network model;
finally, IELM neural network training is performed: on the basis of the preprocessing and factor screening, the optimal weights and thresholds determined by the artificial bee colony search are applied to the ELM network; a training set is selected for network training, the predicted values at the time points of the training set and the corresponding root mean square error (RMSE) are calculated, and the trained IELM network model that meets the error condition is stored.
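The training step of this claim can be sketched with NumPy as follows. The sketch decodes a particle laid out as $[w_{11}, \ldots, w_{nl}, b_1, \ldots, b_l]$ into input weights and thresholds, builds the hidden layer output matrix $H$ with a Sigmoid activation, and solves $H\beta = T$ with the Moore-Penrose pseudoinverse. The data are synthetic, the particle is drawn at random instead of coming from a bee colony search, and the names `unpack_particle`, `train_elm` and `predict_elm` are hypothetical. Here `m` is the number of inputs, which the particle description calls $n$.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def unpack_particle(particle, m, l):
    """Split a flat particle into W (m inputs x l hidden units) and b (l,)."""
    W = np.asarray(particle[: m * l]).reshape(m, l)
    b = np.asarray(particle[m * l:])
    return W, b


def hidden_output(X, W, b):
    """H[i, j] = g(w_j . x_i + b_j) for an (n, m) input matrix X."""
    return sigmoid(X @ W + b)


def train_elm(X, T, W, b):
    """Least-squares output weights: beta = pinv(H) @ T."""
    return np.linalg.pinv(hidden_output(X, W, b)) @ T


def predict_elm(X, W, b, beta):
    return hidden_output(X, W, b) @ beta


def rmse(T, Y):
    return float(np.sqrt(np.mean((T - Y) ** 2)))


rng = np.random.default_rng(0)
n, m, l = 200, 3, 20
X = rng.uniform(-1.0, 1.0, size=(n, m))      # synthetic standardized inputs
T = np.sin(X[:, [0]]) + 0.5 * X[:, [1]]      # synthetic target, shape (n, 1)

# Random particle in [-1, 1]; in the method described it would be the
# optimal particle returned by the artificial bee colony search.
particle = rng.uniform(-1.0, 1.0, size=l * (m + 1))
W, b = unpack_particle(particle, m, l)
beta = train_elm(X, T, W, b)
print("training RMSE:", rmse(T, predict_elm(X, W, b, beta)))
```

Because only the output weights are fitted, and in closed form, each candidate particle can be scored cheaply, which is what makes a population-based search over the input weights and thresholds practical.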
5. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (4), testing the prediction method and outputting the prediction result means: based on the trained IELM network model, its prediction performance is tested on the test set and the test-set predictions are output; the traditional ELM network model is selected as the comparison algorithm, and the predictions of the different algorithms on the test set are output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111170371.0A CN113962819A (en) | 2021-10-08 | 2021-10-08 | Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113962819A true CN113962819A (en) | 2022-01-21 |
Family
ID=79463692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111170371.0A Pending CN113962819A (en) | 2021-10-08 | 2021-10-08 | Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113962819A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115994327A (en) * | 2023-03-22 | 2023-04-21 | 山东能源数智云科技有限公司 | Equipment fault diagnosis method and device based on edge calculation |
CN116230087A (en) * | 2022-12-02 | 2023-06-06 | 深圳太力生物技术有限责任公司 | Method and device for optimizing culture medium components |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116230087A (en) * | 2022-12-02 | 2023-06-06 | 深圳太力生物技术有限责任公司 | Method and device for optimizing culture medium components |
CN116230087B (en) * | 2022-12-02 | 2024-05-14 | 深圳太力生物技术有限责任公司 | Method and device for optimizing culture medium components |
CN115994327A (en) * | 2023-03-22 | 2023-04-21 | 山东能源数智云科技有限公司 | Equipment fault diagnosis method and device based on edge calculation |
CN115994327B (en) * | 2023-03-22 | 2023-06-23 | 山东能源数智云科技有限公司 | Equipment fault diagnosis method and device based on edge calculation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||