CN113962819A - Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine - Google Patents

Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine

Info

Publication number
CN113962819A
Authority
CN
China
Prior art keywords
dissolved oxygen
prediction
data
value
ielm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111170371.0A
Other languages
Chinese (zh)
Inventor
施珮
唐玥
匡亮
余晓栋
孙宁
陆松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi University
Original Assignee
Wuxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi University
Priority to CN202111170371.0A
Publication of CN113962819A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Operations Research (AREA)
  • Primary Health Care (AREA)
  • Mining & Mineral Resources (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Quality & Reliability (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture, and belongs to the technical field of aquaculture. The method comprises the following steps: data preprocessing, factor screening, construction of an IELM network model, testing of the prediction method, and output of the prediction results. Missing data are corrected by a data preprocessing method; the index factors are screened with the Pearson correlation coefficient method, the 8 indexes most strongly correlated with the dissolved oxygen concentration are selected as inputs of the prediction method, and the preprocessed data set is divided into a training set and a test set; the initial weights and thresholds of the extreme learning machine are then optimized with an artificial bee colony algorithm to obtain optimal parameter values, and the IELM network model is constructed; finally, the dissolved oxygen values predicted by the IELM on the test set are compared with the predictions of the traditional ELM model. The IELM prediction method performs better and predicts the trend of dissolved oxygen in industrial aquaculture more accurately.

Description

Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine
Technical Field
The invention belongs to the technical field of aquaculture, relates to a method for predicting dissolved oxygen in industrial aquaculture, and particularly relates to a method for predicting dissolved oxygen in industrial aquaculture based on an extreme learning machine.
Background
Industrial aquaculture, with its industrialized and intensive culture mode, offers new hope for areas with limited natural resources and is a growing trend in the aquaculture industry. In industrial aquaculture, the balance and quality of the water body are particularly important, and accurate control and prediction of dissolved oxygen is the focus of aquaculture work. How to obtain and effectively use information on the aquaculture water environment and the meteorological environment to prevent fish from dying of oxygen deficiency is an important problem that currently needs attention and research.
In current dissolved oxygen prediction research, traditional neural networks and support vector machines are the most widely studied prediction methods. However, traditional neural networks are not well suited to dissolved oxygen prediction with high-dimensional, small-sample data, and support vector machines suffer from high computational complexity and slow training. The extreme learning machine is an effective prediction method that learns quickly and overcomes some shortcomings of the traditional algorithms, but its prediction accuracy for dissolved oxygen depends on the choice of weight and threshold parameters, and high-dimensional, redundant network inputs degrade its prediction performance. At present there is no effective method that solves these problems of the extreme learning machine in dissolved oxygen prediction.
Disclosure of Invention
Purpose of the invention: the invention aims to provide an extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture. The method analyzes the factors that influence changes in the dissolved oxygen concentration using the Pearson correlation coefficient method, effectively removes redundant inputs from the dissolved oxygen prediction, and achieves fast and accurate prediction of dissolved oxygen in water by obtaining the optimal weight and threshold parameters of the extreme learning machine.
Technical scheme: the extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture comprises the following specific operation steps:
(1) data preprocessing;
(2) factor screening, and determining a dissolved oxygen prediction data set;
(3) constructing an IELM network model;
(4) testing the prediction method and outputting the prediction result.
Further, in step (1), the data preprocessing procedure is as follows: a dissolved oxygen sensor and a pH sensor are deployed in a test pond of an industrial aquaculture base, an automatic weather station is deployed beside the pond, and water body parameter data and meteorological data are acquired in real time through a purpose-built wireless sensor network;
firstly, for the small portion of data with discontinuous losses, linear interpolation is used to fill in the lost data, according to the formula:
$$x_{k+i} = x_k + \frac{i\,(x_{k+j} - x_k)}{j}, \quad 0 < i < j \tag{1}$$
in the formula, $x_k$ and $x_{k+j}$ represent the monitored water quality data at the known times $k$ and $k+j$, respectively, and $x_{k+i}$ represents the water quality monitoring value lost at time $k+i$;
secondly, because the collected data have different dimensions, the Z-Score method is used to standardize the data set, according to the formula:
$$Z_{mn} = \frac{X_{mn} - \bar{X}_n}{S_n} \tag{2}$$
wherein $m$ represents the number of index variables, $n$ represents the number of samples, $\bar{X}_n$ represents the mean of $X_{mn}$, and $S_n$ represents the standard deviation of $X_{mn}$; the standardized values $Z_{mn}$ obtained from the raw data have a mean of 0 and a variance of 1.
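The two preprocessing steps above (linear interpolation of lost samples, then Z-Score standardization) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `fill_gaps` and `z_score`, the use of `None` to mark a lost sample, the choice of the population standard deviation, and the sample values are all assumptions.

```python
from statistics import mean, pstdev


def fill_gaps(series):
    """Linearly interpolate runs of None that lie between two known samples."""
    filled = list(series)
    k = None  # index of the most recent known sample
    for idx, value in enumerate(filled):
        if value is None:
            continue
        if k is not None and idx - k > 1:
            j = idx - k  # lost samples sit at k+1 .. k+j-1
            for i in range(1, j):
                filled[k + i] = filled[k] + i * (value - filled[k]) / j
        k = idx
    return filled  # leading or trailing None values are left untouched


def z_score(column):
    """Standardize one index variable to zero mean and unit variance."""
    mu = mean(column)
    sigma = pstdev(column)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in column]


# Made-up dissolved oxygen readings with two lost samples.
raw = [6.0, None, None, 7.5, 8.0]
filled = fill_gaps(raw)
standardized = z_score(filled)
```

Interpolating before standardizing matters here: `z_score` needs a complete column, so gaps must be filled first.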
Further, in step (2), the data set for dissolved oxygen prediction is determined by factor screening: the standardized data set is analyzed with the Pearson correlation coefficient method; factors with little influence on the change in dissolved oxygen concentration are removed and factors with a large influence are retained, thereby determining the data set for the prediction test;
factor screening by the Pearson correlation coefficient method mainly comprises the following steps:
first, with $m$ influence factors and $n$ samples, the influence factors are represented by an $n \times m$ matrix:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{3}$$
secondly, the Pearson correlation coefficient between each influence factor and the dissolved oxygen concentration is calculated as:
$$r = \frac{\sum_{i=1}^{l} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{l} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{l} (y_i - \bar{y})^2}} \tag{4}$$
in the formula, $x_i$ and $y_i$ represent the $i$-th elements of the two vectors $x$ and $y$ being correlated; $l$ represents the length of the vectors; $\bar{x}$ and $\bar{y}$ are the mean values of the elements of $x$ and $y$, respectively;
finally, after the Pearson correlation coefficients between the factors and the dissolved oxygen concentration are obtained, factors whose correlation coefficient is less than 0.1 are removed and factors whose correlation coefficient is greater than 0.1 are retained, which completes the factor screening.
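A small sketch of the screening rule may help. It computes the Pearson coefficient of each candidate factor against dissolved oxygen and keeps those above the 0.1 threshold. The names `pearson` and `screen_factors` and all sample values are made up for illustration, and comparing the absolute value of the coefficient against the threshold is an assumption: the text only says "less than 0.1" and "greater than 0.1".

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    l = len(x)
    mean_x = sum(x) / l
    mean_y = sum(y) / l
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def screen_factors(factors, target, threshold=0.1):
    """Return {name: r} for factors whose |r| with the target exceeds threshold."""
    kept = {}
    for name, values in factors.items():
        r = pearson(values, target)
        if abs(r) > threshold:  # assumption: magnitude, not signed value
            kept[name] = r
    return kept


# Illustrative values only; not data from the patent.
dissolved_oxygen = [3.1, 3.5, 4.2, 5.0, 6.1, 6.6]
candidates = {
    "water_temp": [24.0, 24.4, 25.1, 26.0, 27.2, 27.9],
    "wind_direction": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],
}
kept = screen_factors(candidates, dissolved_oxygen)
```

With these made-up values only `water_temp` survives the screen; the `wind_direction` column was chosen to be nearly uncorrelated with the target.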
Further, in step (3), the IELM network model is constructed as follows: the network model comprises an input layer, a hidden layer and an output layer; the weights and thresholds of the extreme learning machine are optimized with an artificial bee colony algorithm to obtain their optimal initial values; the specific procedure is as follows:
first, let $n$ samples form the sample set $(x_i, t_i)$, $i = 1, 2, \ldots, n$, where the $m$-dimensional feature vector of the $i$-th sample is $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]$; if the number of hidden layer units in the ELM network is $l$, the ELM network is:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = o_i, \quad i = 1, 2, \ldots, n \tag{5}$$
in the formula, $w_j$ represents the weights between the input layer units and the $j$-th hidden layer unit; $b_j$ represents the bias of the $j$-th hidden layer unit; $\beta_j$ represents the output weight between the $j$-th hidden layer unit and the output layer units; $o_i$ is the network output for the $i$-th sample; $g(x)$ is the activation function of the network, for which the Sigmoid function
$$g(x) = \frac{1}{1 + e^{-x}}$$
is selected; setting the output value of the ELM network equal to the desired value, equation (5) becomes:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = t_i, \quad i = 1, 2, \ldots, n \tag{6}$$
equation (6) can be written compactly in matrix form:
$$H\beta = T \tag{7}$$
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_l \cdot x_1 + b_l) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_n + b_1) & \cdots & g(w_l \cdot x_n + b_l) \end{bmatrix}_{n \times l} \tag{8}$$
after the weights $w$ and biases $b$ of the network are randomly generated, the weight $\beta$ between the hidden layer units and the output layer units is solved by the least squares method:
$$\beta = H^{+} T \tag{9}$$
in the formula, $H^{+}$ represents the generalized inverse of the hidden layer output matrix $H$;
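The basic (unoptimized) ELM described above is compact enough to sketch with NumPy. The sketch assumes the hidden layer output matrix has entries g(w_j · x_i + b_j), draws weights and biases uniformly from [-1, 1], and solves for the output weights with the Moore-Penrose pseudo-inverse. The function names, the synthetic data and the hidden layer size of 30 are illustrative assumptions, not values from the patent.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def hidden_output(X, W, b):
    """H[i, j] = g(w_j . x_i + b_j); X is (n, m), W is (m, l), b is (l,)."""
    return sigmoid(X @ W + b)


def train_elm(X, T, n_hidden, rng):
    """Random input weights and biases, least-squares output weights."""
    m = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(m, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ T  # generalized inverse of H times targets
    return W, b, beta


def predict_elm(X, W, b, beta):
    return hidden_output(X, W, b) @ beta


# Synthetic stand-in for 8 screened input factors and one target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
T = np.sin(X[:, :1]) + 0.5 * X[:, 1:2]
W, b, beta = train_elm(X, T, n_hidden=30, rng=rng)
predictions = predict_elm(X, W, b, beta)
train_rmse = float(np.sqrt(np.mean((predictions - T) ** 2)))
```

Because only `beta` is fitted, training is a single linear solve; the quality of the fit then depends on the random `W` and `b`, which is the sensitivity the bee colony optimization is meant to address.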
secondly, the population of the artificial bee colony is initialized to generate $k$ particles, namely $k$ feasible solutions;
each particle has $D = l \cdot (n + 1)$ elements, where $l$ denotes the number of hidden layer units and $n$ denotes the number of input layer units, and each element takes a value in $[-1, 1]$; each particle represents one set of input weights and hidden layer thresholds, namely $[w_{11}, w_{12}, \ldots, w_{1l}, w_{21}, w_{22}, \ldots, w_{2l}, \ldots, w_{n1}, w_{n2}, \ldots, w_{nl}, b_1, b_2, \ldots, b_l]$; a corresponding fitness value can be computed for each feasible solution;
$k/2$ particles are designated as employed bees, the optimal value and the corresponding employed bee are recorded, and the remaining particles serve as onlooker bees; each employed bee searches for a new honey source in its neighborhood and is updated according to:
$$P'_j = P_j + (P_j - P_N) \cdot (\mathrm{rand} - 0.5) \cdot 2 \tag{10}$$
in the formula, $P'_j$ represents the updated employed bee, $P_j$ represents the original employed bee before the update, and $P_N$ represents a randomly selected original employed bee;
the bees move iteratively on the principle that a bee moves to a new honey source if the fitness of that source is better; whether an onlooker bee follows the information of an employed bee is decided by the roulette method, in which a honey source is selected with probability
$$p_i = \frac{f(\delta_i)}{\sum_{t=1}^{T} f(\delta_t)}$$
depending on whether the objective function value $f_i$ is greater than 0, the fitness function $f(\delta_i)$ is expressed as:
$$f(\delta_i) = \begin{cases} \dfrac{1}{1 + f_i}, & f_i > 0 \\ 1 + |f_i|, & f_i \le 0 \end{cases}$$
in the formula, $\delta_i$ denotes the $i$-th honey source, $i \in \{1, 2, 3, \ldots, T\}$, $T$ denotes the number of honey sources, and $f(\delta_i)$ denotes the fitness of the honey source at position $\delta_i$; the onlooker bees select honey sources by comparing the values of $f(\delta_i)$; when the maximum number of honey collection attempts is reached and the fitness of a honey source has still not improved, the search is considered trapped in a local optimum: the honey source is abandoned, a new employed bee is obtained according to formula (10), and the search continues for a new honey source to replace it; when the maximum number of iterations is reached, the optimal fitness value and the optimal particle found during the optimization are obtained, and this optimal result is used as the input weights and thresholds of the ELM network model;
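The bee colony search described above can be sketched in plain Python. This is a simplified illustration under several assumptions, each of which departs from or goes beyond the text: the objective is a toy sphere function standing in for the ELM training error; `n_sources` plays the role of k/2, with one employed bee and one onlooker per source; a fresh random number is drawn per element in update (10); candidates are clipped to [-1, 1]; and an abandoned source is replaced by a random one (the common artificial bee colony choice) rather than by reapplying formula (10). All names and default parameters are hypothetical.

```python
import random


def fitness(cost):
    """Map an objective value to a fitness; larger is better."""
    return 1.0 / (1.0 + cost) if cost > 0 else 1.0 + abs(cost)


def abc_minimize(objective, dim, n_sources=10, limit=20, max_iter=100, seed=0):
    rng = random.Random(seed)

    def random_source():
        return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources  # failed attempts per honey source
    first = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = list(sources[first]), costs[first]
    history = [best_cost]

    def explore(j):
        """Apply update (10) to source j; keep the move if fitness improves."""
        nonlocal best, best_cost
        other = rng.choice([i for i in range(n_sources) if i != j])
        candidate = [
            max(-1.0, min(1.0, pj + (pj - pn) * (rng.random() - 0.5) * 2))
            for pj, pn in zip(sources[j], sources[other])
        ]
        cost = objective(candidate)
        if fitness(cost) > fitness(costs[j]):
            sources[j], costs[j], trials[j] = candidate, cost, 0
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        else:
            trials[j] += 1

    for _ in range(max_iter):
        for j in range(n_sources):  # employed bee phase
            explore(j)
        weights = [fitness(c) for c in costs]  # roulette wheel weights
        for j in rng.choices(range(n_sources), weights=weights, k=n_sources):
            explore(j)  # onlooker bee phase
        for j in range(n_sources):  # scout phase: abandon stale sources
            if trials[j] > limit:
                sources[j] = random_source()
                costs[j] = objective(sources[j])
                trials[j] = 0
                if costs[j] < best_cost:
                    best, best_cost = list(sources[j]), costs[j]
        history.append(best_cost)
    return best, best_cost, history


def sphere(v):
    """Toy objective; in the method this would be the ELM training error."""
    return sum(x * x for x in v)


best, best_cost, history = abc_minimize(sphere, dim=5)
```

To connect this to the ELM, each particle of length l·(n+1) would be unpacked into the input weight matrix and hidden biases, and the objective would return the training error of the resulting network.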
finally, the IELM neural network is trained: on the basis of preprocessing and factor screening, the optimal weights and thresholds determined by the artificial bee colony optimization are applied to the ELM network; the training set is used for network training, and the predicted values at the time points of the training set and the root mean square error, denoted RMSE, are calculated:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2}$$
where $y_i$ and $\hat{y}_i$ are the measured and predicted values of sample $i$ and $N$ is the number of samples; the trained IELM network model that meets the error condition is then saved.
Further, in step (4), testing the prediction method and outputting the prediction result means: based on the trained IELM network model, the prediction performance of the network is tested on the test set and the test set predictions are output; a traditional ELM network model is selected as the comparison algorithm, and the predictions of the different algorithms on the test set are output.
Furthermore, the method uses the Pearson correlation coefficient method to screen the 11 listed index factors, eliminating the influence factors with small correlation and retaining those with large correlation, which avoids data redundancy and improves the precision and efficiency of dissolved oxygen prediction.
Furthermore, the initial weight and threshold parameters of the extreme learning machine are optimized with the artificial bee colony algorithm, which prevents the extreme learning machine from falling into a local optimum during optimization and improves the precision of aquaculture dissolved oxygen prediction.
Advantageous effects: compared with the prior art, the method uses 11 index parameters related to the dissolved oxygen concentration that are collected in industrial aquaculture, and corrects missing data with a data preprocessing method; the index factors are screened with the Pearson correlation coefficient method, the 8 indexes most strongly correlated with the dissolved oxygen concentration are selected as inputs of the prediction method, and the preprocessed data set is divided into a training set and a test set; the initial weights and thresholds of the extreme learning machine are then optimized with an artificial bee colony algorithm to obtain optimal parameter values, and the IELM network model is constructed; finally, the dissolved oxygen values predicted by the IELM on the test set are compared with the predictions of the traditional ELM model. The IELM prediction method performs better and predicts the trend of dissolved oxygen in industrial aquaculture more accurately.
Drawings
FIG. 1 is a flow chart of the operation of the present invention;
FIG. 2 is a diagram showing the dissolved oxygen prediction results of the IELM network model according to the present invention.
Detailed Description
The invention is further described below with reference to the following figures and specific examples.
As shown in the figure, the method for predicting the dissolved oxygen in the industrial aquaculture based on the extreme learning machine comprises the following specific operation steps:
(1) data preprocessing: a dissolved oxygen sensor and a pH sensor are deployed in a test pond of an industrial aquaculture base, an automatic weather station is deployed beside the pond, and water body parameter data and meteorological data are acquired in real time through a purpose-built wireless sensor network;
firstly, for the small portion of data with discontinuous losses, linear interpolation is used to fill in the lost data, according to the formula:
$$x_{k+i} = x_k + \frac{i\,(x_{k+j} - x_k)}{j}, \quad 0 < i < j \tag{1}$$
in the formula, $x_k$ and $x_{k+j}$ represent the monitored water quality data at the known times $k$ and $k+j$, respectively, and $x_{k+i}$ represents the water quality monitoring value lost at time $k+i$;
secondly, because the collected data have different dimensions, the Z-Score method is used to standardize the data set, according to the formula:
$$Z_{mn} = \frac{X_{mn} - \bar{X}_n}{S_n} \tag{2}$$
wherein $m$ represents the number of index variables, $n$ represents the number of samples, $\bar{X}_n$ represents the mean of $X_{mn}$, and $S_n$ represents the standard deviation of $X_{mn}$; the standardized values $Z_{mn}$ obtained from the raw data have a mean of 0 and a variance of 1;
the dissolved oxygen concentration of the water body is influenced by various water body parameters and meteorological environment parameters; drawing on the culture experience of fishermen and the research experience of related personnel, different sensors are selected in the test to collect data on both water body parameters and meteorological parameters, covering dissolved oxygen, pH value, water temperature, CO2 concentration, air pressure, temperature, humidity, wind speed, wind direction, illumination, photosynthetically active radiation and radiation illumination, thereby obtaining the initial data index system;
(2) factor screening and determination of the dissolved oxygen prediction data set: the standardized data set is analyzed with the Pearson correlation coefficient method; factors with little influence on the change in dissolved oxygen concentration are removed and factors with a large influence are retained;
factor screening by the Pearson correlation coefficient method mainly comprises the following steps:
first, with $m$ influence factors and $n$ samples, the influence factors are represented by an $n \times m$ matrix:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{3}$$
secondly, the Pearson correlation coefficient between each influence factor and the dissolved oxygen concentration is calculated as:
$$r = \frac{\sum_{i=1}^{l} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{l} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{l} (y_i - \bar{y})^2}} \tag{4}$$
in the formula, $x_i$ and $y_i$ represent the $i$-th elements of the two vectors $x$ and $y$ being correlated; $l$ represents the length of the vectors; $\bar{x}$ and $\bar{y}$ are the mean values of the elements of $x$ and $y$, respectively;
finally, after the Pearson correlation coefficients between the factors and the dissolved oxygen concentration are obtained, factors whose correlation coefficient is less than 0.1 are removed and factors whose correlation coefficient is greater than 0.1 are retained, which completes the factor screening;
in this model, the Pearson correlation coefficients of the 11 factors influencing the change in the dissolved oxygen concentration of the water body are calculated; after factor screening, 8 indexes (water temperature, pH value, humidity, temperature, illumination, wind speed, radiation illumination and photosynthetically active radiation) are retained as the inputs of the IELM dissolved oxygen prediction model;
(3) constructing the IELM network model: the network model comprises an input layer, a hidden layer and an output layer; the weights and thresholds of the extreme learning machine are optimized with an artificial bee colony algorithm to obtain their optimal initial values; the specific procedure is as follows:
first, let $n$ samples form the sample set $(x_i, t_i)$, $i = 1, 2, \ldots, n$, where the $m$-dimensional feature vector of the $i$-th sample is $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]$; if the number of hidden layer units in the ELM network is $l$, the ELM network is:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = o_i, \quad i = 1, 2, \ldots, n \tag{5}$$
in the formula, $w_j$ represents the weights between the input layer units and the $j$-th hidden layer unit; $b_j$ represents the bias of the $j$-th hidden layer unit; $\beta_j$ represents the output weight between the $j$-th hidden layer unit and the output layer units; $o_i$ is the network output for the $i$-th sample; $g(x)$ is the activation function of the network, for which the Sigmoid function
$$g(x) = \frac{1}{1 + e^{-x}}$$
is selected; setting the output value of the ELM network equal to the desired value, equation (5) becomes:
$$\sum_{j=1}^{l} \beta_j\, g(w_j \cdot x_i + b_j) = t_i, \quad i = 1, 2, \ldots, n \tag{6}$$
equation (6) can be written compactly in matrix form:
$$H\beta = T \tag{7}$$
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_l \cdot x_1 + b_l) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_n + b_1) & \cdots & g(w_l \cdot x_n + b_l) \end{bmatrix}_{n \times l} \tag{8}$$
after the weights $w$ and biases $b$ of the network are randomly generated, the weight $\beta$ between the hidden layer units and the output layer units is solved by the least squares method:
$$\beta = H^{+} T \tag{9}$$
in the formula, $H^{+}$ represents the generalized inverse of the hidden layer output matrix $H$;
secondly, the population of the artificial bee colony is initialized to generate $k$ particles, namely $k$ feasible solutions;
each particle has $D = l \cdot (n + 1)$ elements, where $l$ denotes the number of hidden layer units and $n$ denotes the number of input layer units, and each element takes a value in $[-1, 1]$; each particle represents one set of input weights and hidden layer thresholds, namely $[w_{11}, w_{12}, \ldots, w_{1l}, w_{21}, w_{22}, \ldots, w_{2l}, \ldots, w_{n1}, w_{n2}, \ldots, w_{nl}, b_1, b_2, \ldots, b_l]$; a corresponding fitness value can be computed for each feasible solution;
$k/2$ particles are designated as employed bees, the optimal value and the corresponding employed bee are recorded, and the remaining particles serve as onlooker bees; each employed bee searches for a new honey source in its neighborhood and is updated according to:
$$P'_j = P_j + (P_j - P_N) \cdot (\mathrm{rand} - 0.5) \cdot 2 \tag{10}$$
in the formula, $P'_j$ represents the updated employed bee, $P_j$ represents the original employed bee before the update, and $P_N$ represents a randomly selected original employed bee;
the bees move iteratively on the principle that a bee moves to a new honey source if the fitness of that source is better; whether an onlooker bee follows the information of an employed bee is decided by the roulette method, in which a honey source is selected with probability
$$p_i = \frac{f(\delta_i)}{\sum_{t=1}^{T} f(\delta_t)}$$
depending on whether the objective function value $f_i$ is greater than 0, the fitness function $f(\delta_i)$ is expressed as:
$$f(\delta_i) = \begin{cases} \dfrac{1}{1 + f_i}, & f_i > 0 \\ 1 + |f_i|, & f_i \le 0 \end{cases}$$
in the formula, $\delta_i$ denotes the $i$-th honey source, $i \in \{1, 2, 3, \ldots, T\}$, $T$ denotes the number of honey sources, and $f(\delta_i)$ denotes the fitness of the honey source at position $\delta_i$; the onlooker bees select honey sources by comparing the values of $f(\delta_i)$; when the maximum number of honey collection attempts is reached and the fitness of a honey source has still not improved, the search is considered trapped in a local optimum: the honey source is abandoned, a new employed bee is obtained according to formula (10), and the search continues for a new honey source to replace it; when the maximum number of iterations is reached, the optimal fitness value and the optimal particle found during the optimization are obtained, and this optimal result is used as the input weights and thresholds of the ELM network model;
finally, the IELM neural network is trained: on the basis of preprocessing and factor screening, the optimal weights and thresholds determined by the artificial bee colony optimization are applied to the ELM network; the training set is used for network training, and the predicted values at the time points of the training set and the root mean square error, denoted RMSE, are calculated:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2}$$
where $y_i$ and $\hat{y}_i$ are the measured and predicted values of sample $i$ and $N$ is the number of samples; the trained IELM network model that meets the error condition is then saved;
in this model, the water body dissolved oxygen data collected during the test period from July 1, 2019 to July 30, 2019 are selected for prediction; first, the collected meteorological data and water body parameter data are preprocessed with the data preprocessing method, giving 4320 groups of data; the first 3888 groups of the data set are used as the training set and the last 432 groups as the test set; after factor screening with the Pearson correlation coefficient, the input-output structure of the IELM prediction model is constructed from the retained factors; finally, the IELM network is trained and the training results are output;
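The split described above is chronological rather than random: the model is trained on the earlier samples and tested on the most recent ones. A minimal sketch, assuming the samples are already sorted by time; the helper name `chronological_split` and the integer stand-ins for samples are hypothetical. If the 4320 groups are evenly spaced over the 30 days, they correspond to 144 samples per day, so the last 432 groups would cover the final three days.

```python
def chronological_split(samples, n_test):
    """Hold out the last n_test samples of a time-ordered sequence."""
    if not 0 < n_test < len(samples):
        raise ValueError("n_test must be between 1 and len(samples) - 1")
    return samples[:-n_test], samples[-n_test:]


# Integer indices stand in for the 4320 time-ordered groups of data.
samples = list(range(4320))
train, test = chronological_split(samples, n_test=432)
```

Keeping the order intact avoids training on samples that are later in time than the ones being predicted.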
(4) testing the prediction method and outputting the prediction result: based on the trained IELM network model, the prediction performance of the network is tested on the test set and the test set predictions are output; a traditional ELM network model is selected as the comparison algorithm, and the predictions of the different algorithms on the test set are output;
the dissolved oxygen concentration is predicted with the IELM neural network model and with the traditional ELM neural network model, giving a graph of the predicted dissolved oxygen concentration for the 432 groups of test set data; in the figure, the abscissa is the serial number of the test sample and the ordinate is the dissolved oxygen concentration; taken together, the results show that both models can predict dissolved oxygen, but their prediction performance differs considerably; the predictions of the IELM neural network model are closer to the measured dissolved oxygen concentrations; however, for both models the prediction curves fluctuate markedly more between samples No. 140-185 and No. 275-339 of the test set than elsewhere; these intervals correspond to 0:00 to 7:00, the period of the day when the dissolved oxygen concentration of the water is lowest and the respiration of microorganisms and plants in the water is frequent;
Comparative analysis of the prediction models:
based on the trained and tested IELM neural network prediction model, the dissolved oxygen concentration from July 28 to July 30, 2019 is predicted to obtain the corresponding predicted values, the root mean square error (RMSE) and the mean absolute error (MAE); the predictions of the traditional ELM neural network model are compared with those of the IELM. Owing to limited space, only the 24 hourly dissolved oxygen predictions for July 28 are listed, as shown in Table 1.
Table 1. Comparison of water body dissolved oxygen prediction results for the IELM and ELM prediction models

Time     Actual value    IELM prediction    ELM prediction
0:00     4.97            5.09               5.59
1:00     4.46            4.17               4.44
2:00     3.70            3.41               4.15
3:00     3.52            3.30               3.85
4:00     3.13            3.06               3.48
5:00     3.39            2.91               3.18
6:00     2.88            2.75               2.88
7:00     2.68            2.45               2.87
8:00     2.84            3.10               3.80
9:00     3.20            3.41               4.34
10:00    3.63            3.92               4.06
11:00    4.04            4.10               4.02
12:00    4.56            4.41               4.18
13:00    5.11            5.05               4.54
14:00    5.72            5.54               5.66
15:00    6.42            5.74               6.23
16:00    6.65            6.31               6.85
17:00    6.65            6.60               7.21
18:00    6.69            6.17               7.05
19:00    5.91            5.60               5.98
20:00    7.56            7.48               6.23
21:00    7.18            7.30               6.33
22:00    6.49            6.86               6.65
23:00    5.29            5.84               6.25
RMSE     /               0.35               0.64
MAE      /               0.25               0.44
According to the comparison, when the IELM neural network model is used to predict the dissolved oxygen concentration of the aquaculture water body, the root mean square error of the prediction is 0.35, clearly lower than the 0.64 of the ELM neural network model; likewise, the mean absolute errors of the IELM and ELM predictions for the whole day of July 28 are 0.25 and 0.44, respectively; the IELM network model therefore achieves higher prediction precision and a better prediction effect.
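The hourly values in Table 1 are enough to recompute both error metrics for July 28. The sketch below does so, assuming that MAE denotes the mean absolute error and that both metrics are taken over the 24 listed rows only. The reported RMSE values may have been computed over a different span (for example, all test samples from July 28 to 30), so the recomputed RMSE need not match them exactly. The function names are illustrative.

```python
from math import sqrt

# The 24 hourly rows of Table 1 (0:00 to 23:00 on July 28).
actual = [4.97, 4.46, 3.70, 3.52, 3.13, 3.39, 2.88, 2.68, 2.84, 3.20, 3.63, 4.04,
          4.56, 5.11, 5.72, 6.42, 6.65, 6.65, 6.69, 5.91, 7.56, 7.18, 6.49, 5.29]
ielm = [5.09, 4.17, 3.41, 3.30, 3.06, 2.91, 2.75, 2.45, 3.10, 3.41, 3.92, 4.10,
        4.41, 5.05, 5.54, 5.74, 6.31, 6.60, 6.17, 5.60, 7.48, 7.30, 6.86, 5.84]
elm = [5.59, 4.44, 4.15, 3.85, 3.48, 3.18, 2.88, 2.87, 3.80, 4.34, 4.06, 4.02,
       4.18, 4.54, 5.66, 6.23, 6.85, 7.21, 7.05, 5.98, 6.23, 6.33, 6.65, 6.25]


def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error between two equal-length sequences."""
    n = len(measured)
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / n


for name, predicted in (("IELM", ielm), ("ELM", elm)):
    print(f"{name}: RMSE={rmse(actual, predicted):.2f}  MAE={mae(actual, predicted):.2f}")
```

On these 24 rows the mean absolute error of the IELM predictions comes out close to the reported 0.25, which is consistent with reading MAE as an absolute rather than a relative error.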
The invention takes the dissolved oxygen of the water body in industrial aquaculture as its research object and provides a prediction algorithm based on an extreme learning machine to predict it. Using raw data standardized by the data preprocessing method, the many factors influencing the change in dissolved oxygen are screened with the Pearson correlation coefficient method to obtain the inputs and output of the dissolved oxygen prediction model; the extreme learning machine is then improved with an artificial bee colony algorithm to construct the IELM neural network model. The method effectively avoids the computational problems caused by redundant multi-input information in dissolved oxygen prediction and addresses the tendency of the traditional ELM neural network to fall into a local optimum during network training, thereby improving the training speed and prediction precision of the traditional ELM network model. Applied to dissolved oxygen prediction in industrial aquaculture production, the method yields scientific, reasonable and accurate prediction results, safeguards aquaculture production and reduces aquaculture risk.
Furthermore, the method applies the Pearson correlation coefficient method to screen the 11 listed index factors, eliminating weakly correlated factors and retaining strongly correlated ones, which avoids data redundancy and improves the accuracy and efficiency of dissolved oxygen prediction.
Furthermore, the initial weights and threshold parameters of the extreme learning machine are optimized with the artificial bee colony algorithm, which prevents the extreme learning machine from falling into a local optimum during the optimization process and improves the accuracy of aquaculture dissolved oxygen prediction.
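A rough sketch of an artificial bee colony search of this kind is given below. It is not the patented implementation. It assumes the neighbourhood update P' = P + (P - P_N) * (rand - 0.5) * 2, the fitness transform commonly used in ABC (1 / (1 + f) for f >= 0, otherwise 1 + |f|), roulette-wheel selection for onlooker bees, and a hypothetical `limit` parameter for abandoning a stagnant honey source. For simplicity the number of onlooker moves equals the number of honey sources, and the objective is a toy sphere function rather than the ELM training error.

```python
import random


def fitness(obj_value):
    # Assumed standard ABC transform: a smaller objective gives a larger fitness.
    if obj_value >= 0:
        return 1.0 / (1.0 + obj_value)
    return 1.0 + abs(obj_value)


def neighbour(sources, j, rng):
    # P'_j = P_j + (P_j - P_N) * (rand - 0.5) * 2, with P_N another random source.
    others = [i for i in range(len(sources)) if i != j]
    p_j, p_n = sources[j], sources[rng.choice(others)]
    cand = [a + (a - b) * (rng.random() - 0.5) * 2 for a, b in zip(p_j, p_n)]
    return [min(1.0, max(-1.0, c)) for c in cand]  # keep every element in [-1, 1]


def abc_minimise(objective, dim, n_sources=10, limit=20, iterations=100, seed=0):
    rng = random.Random(seed)
    sources = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_sources)]
    values = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_val = min(values)
    best = list(sources[values.index(best_val)])

    def try_move(j):
        # Greedy step: move to the candidate only if its fitness is better.
        nonlocal best_val, best
        cand = neighbour(sources, j, rng)
        val = objective(cand)
        if fitness(val) > fitness(values[j]):
            sources[j], values[j], trials[j] = cand, val, 0
            if val < best_val:
                best_val, best = val, list(cand)
        else:
            trials[j] += 1

    for _ in range(iterations):
        for j in range(n_sources):  # employed-bee phase
            try_move(j)
        # Onlooker phase: roulette wheel, probability proportional to fitness
        # as measured at the start of the phase.
        weights = [fitness(v) for v in values]
        for j in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_move(j)
        for j in range(n_sources):  # abandon sources that stopped improving
            if trials[j] > limit:
                sources[j] = neighbour(sources, j, rng)
                values[j] = objective(sources[j])
                trials[j] = 0
                if values[j] < best_val:
                    best_val, best = values[j], list(sources[j])
    return best, best_val


best, best_val = abc_minimise(lambda p: sum(x * x for x in p), dim=3)
print(best_val)
```

To optimize an ELM this way, the objective would instead take a particle of input weights and thresholds and return the resulting training error.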
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. An extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture, characterized by comprising the following specific operation steps:
(1) data preprocessing;
(2) factor screening to determine a dissolved oxygen prediction data set;
(3) constructing an IELM network model;
(4) testing the prediction method and outputting the prediction result.
2. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (1), the data preprocessing operation is as follows: a dissolved oxygen sensor and a pH sensor are deployed in a test pond of an industrial aquaculture base, an automatic weather station is deployed beside the pond, and water body parameter data and weather data are acquired in real time through the constructed wireless sensor network;
firstly, for the small portion of data with non-consecutive losses, linear interpolation is used to fill in the lost data, with the formula:
Figure FDA0003292914170000011
in the formula, x_k and x_{k+j} respectively represent the monitored water quality data at the known times k and k + j, and x_{k+i} represents the water quality monitoring value lost at time k + i;
secondly, because the collected data have different dimensions, the Z-Score method is used to standardize the data set, with the formula:
Figure FDA0003292914170000012
wherein m represents the number of index variables, n represents the number of samples,
Figure FDA0003292914170000013
represents the mean of X_mn, S_n represents the standard deviation of X_mn, and the standardized values Z_mn obtained from the raw data have a mean of 0 and a variance of 1.
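The two preprocessing steps of this claim can be sketched as follows. This is only an illustration. It assumes the usual linear interpolation form x_{k+i} = x_k + i * (x_{k+j} - x_k) / j and the usual Z-Score form (value minus mean, divided by standard deviation), with the population standard deviation chosen so that the standardized column has variance exactly 1. The function names are hypothetical, and missing readings are represented by `None`.

```python
import statistics


def fill_gaps_linear(series):
    """Fill runs of None that lie between two known readings.

    Leading or trailing None values have no known neighbour on one side
    and are left unchanged.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for k, nxt in zip(known, known[1:]):
        j = nxt - k
        for i in range(1, j):
            # x_{k+i} = x_k + i * (x_{k+j} - x_k) / j
            filled[k + i] = filled[k] + i * (filled[nxt] - filled[k]) / j
    return filled


def z_score(column):
    """Standardize one variable to mean 0 and (population) variance 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        raise ValueError("a constant column cannot be standardized")
    return [(x - mean) / std for x in column]


print(fill_gaps_linear([1.0, None, None, 4.0]))
print(z_score([2.0, 4.0, 6.0]))
```

Each monitored variable would be standardized separately, so that variables measured in different units contribute on a comparable scale.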
3. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (2), determining the data set for dissolved oxygen prediction by factor screening means: analyzing the standardized data set with the Pearson correlation coefficient method, removing factors that have little influence on changes in the dissolved oxygen concentration, and retaining factors that have a large influence on changes in the dissolved oxygen concentration;
factor screening by the Pearson correlation coefficient method mainly comprises the following steps:
first, defining m influence factors and letting n represent the number of samples, so that the influence factors are represented by an n × m matrix:
Figure FDA0003292914170000021
secondly, calculating the Pearson correlation coefficient value between each influence factor and the dissolved oxygen concentration, wherein the calculation formula is as follows:
Figure FDA0003292914170000022
in the formula, x_i and y_i respectively represent the i-th elements of the two vectors x and y being correlated; l represents the vector length;
Figure FDA0003292914170000026
Figure FDA0003292914170000027
respectively represent the mean values of the elements of vectors x and y;
and finally, after the Pearson correlation coefficient between each factor and the dissolved oxygen concentration has been obtained, factors whose correlation coefficient is less than 0.1 are removed and factors whose correlation coefficient is greater than 0.1 are retained, which completes the factor screening process.
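The screening rule of this claim can be illustrated with a short Python sketch. It assumes the textbook Pearson coefficient and, as an additional assumption not stated in the claim, compares the absolute value of the coefficient with the 0.1 threshold so that strongly negative correlations are also retained. The factor names and data values are made up for the example.

```python
import math


def pearson(x, y):
    """Textbook Pearson correlation coefficient of two equal-length vectors."""
    length = len(x)
    mean_x, mean_y = sum(x) / length, sum(y) / length
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / (norm_x * norm_y)


def screen_factors(factors, target, threshold=0.1):
    """Keep factors whose |r| with the target exceeds the threshold (assumed rule)."""
    scores = {name: pearson(values, target) for name, values in factors.items()}
    kept = [name for name, r in scores.items() if abs(r) > threshold]
    return kept, scores


# Hypothetical data: one factor that tracks dissolved oxygen, one that does not.
dissolved_oxygen = [5.0, 6.0, 7.0, 8.0]
factors = {
    "water_temperature": [1.0, 2.0, 3.0, 4.0],
    "unrelated_signal": [1.0, -1.0, -1.0, 1.0],
}
kept, scores = screen_factors(factors, dissolved_oxygen)
print(kept, scores)
```

The retained columns, together with the dissolved oxygen series as the output, would then form the prediction data set.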
4. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (3), constructing the IELM network model means: the network model comprises an input layer unit, a hidden layer unit and an output layer unit, and the weights and thresholds of the extreme learning machine are optimized with an artificial bee colony algorithm to obtain their optimal initial values, the specific operation process being as follows:
first, n samples constitute a sample set (x_i, t_i) (i = 1, 2, …, n), where the m-dimensional feature vector of the i-th sample is x_i = [x_i1, x_i2, …, x_im] and t_i = [t_i1, t_i2, …, t_im]; if the number of hidden layer units in the ELM network is l, the ELM network is:
Figure FDA0003292914170000023
in the formula, w_j represents the weight between the input layer units and the j-th hidden layer unit; b_j represents the bias between the input layer units and the j-th hidden layer unit; β_j represents the output weight between the j-th hidden layer unit and the output layer unit; g(x) is the activation function of the network, and the Sigmoid function
Figure FDA0003292914170000024
is selected as the activation function; letting the output value of the ELM network equal the desired value, equation (5) above can be converted into:
Figure FDA0003292914170000025
equation (6) can be written compactly in matrix form as follows:
Hβ=T (7)
Figure FDA0003292914170000031
after the weights w and the biases b in the network have been randomly generated, the weights β between the hidden layer units and the output layer unit are solved by the least squares method, with the expression:
β = H⁺Y (9)
in the formula, H⁺ represents the generalized inverse of the output matrix H, and Y is the target output matrix, denoted T in equation (7);
secondly, initializing the population of the artificial bee colony to generate k particles, namely k feasible solutions;
each particle has D = l · (n + 1) elements, where l denotes the number of hidden layer units and n denotes the number of input layer units, and the value of each element lies in [-1, 1]; each particle represents one set of input weights and hidden layer unit thresholds, namely [w_11, w_12, …, w_1l, w_21, w_22, …, w_2l, …, w_n1, w_n2, …, w_nl, b_1, b_2, …, b_l]; from each feasible solution, a corresponding fitness value can be generated;
determining k/2 particles as employed bees, recording the optimal value and the corresponding employed bee, and using the remaining particles as onlooker bees; and searching for a new honey source in the neighborhood to update the employed bees, with the update formula:
P′_j = P_j + (P_j − P_N) · (rand − 0.5) · 2 (10)
in the formula, P′_j represents the updated employed bee, P_j represents the original employed bee before the update, and P_N represents a randomly selected original employed bee;
performing iterative movement according to the principle of moving to a new honey source when its fitness is better, and using the roulette wheel method to decide whether an onlooker bee follows the information of an employed bee, the honey source being selected by roulette according to the probability
Figure FDA0003292914170000032
depending on whether the objective function f_i is greater than 0, the fitness function f(δ_i) is expressed as:
Figure FDA0003292914170000033
in the formula, δ_i denotes the i-th honey source, i ∈ {1, 2, 3, …, T}, T denotes the number of honey sources, and f(δ_i) denotes the fitness of the honey source at position δ_i; the honey source selected by an onlooker bee is determined by comparing the values of f(δ_i); when the maximum number of honey collections has been reached and the fitness has still not been successfully updated, which indicates that a local optimum has been found, the honey source is abandoned, a new employed bee is obtained according to formula (10), and the search continues for a new honey source as a replacement; when the maximum number of iterations is reached, the optimal fitness value and the optimal particle of the optimization process are obtained, and the optimal result is used as the input weights and thresholds of the ELM network model;
finally, performing IELM neural network training: on the basis of the preprocessing and factor screening, the optimal weights and thresholds determined by the artificial bee colony optimization are applied to the ELM network; a training set is selected for network training, and the predicted values at the time points of the training set are calculated together with the root mean square error, denoted RMSE,
Figure FDA0003292914170000041
and the trained IELM network model that meets the error condition is stored.
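A compact NumPy sketch of the ELM part of this claim follows. It is offered as an illustration under stated assumptions, not as the patented implementation. It assumes a single-output network with a Sigmoid hidden layer, a particle laid out as [w_11, …, w_nl, b_1, …, b_l], and output weights obtained from the pseudoinverse via `numpy.linalg.pinv`. The helper names and the synthetic data are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def unpack_particle(particle, n_inputs, n_hidden):
    """Split a flat particle into input weights W (n_inputs x n_hidden) and biases b."""
    particle = np.asarray(particle, dtype=float)
    split = n_inputs * n_hidden
    W = particle[:split].reshape(n_inputs, n_hidden)
    b = particle[split:]
    return W, b


def solve_output_weights(X, T, W, b):
    """Least-squares output weights: beta = pinv(H) @ T, with H the hidden output matrix."""
    H = sigmoid(X @ W + b)
    return np.linalg.pinv(H) @ T


def training_rmse(particle, X, T, n_hidden):
    """Training RMSE of the ELM defined by one particle; usable as a search objective."""
    W, b = unpack_particle(particle, X.shape[1], n_hidden)
    beta = solve_output_weights(X, T, W, b)
    predicted = sigmoid(X @ W + b) @ beta
    return float(np.sqrt(np.mean((T - predicted) ** 2)))


# Synthetic example data (not from the patent): 40 samples, 3 input factors.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
T = np.sin(X.sum(axis=1))
particle = rng.uniform(-1.0, 1.0, size=8 * (3 + 1))  # D = l * (n + 1) with l = 8, n = 3
print(training_rmse(particle, X, T, n_hidden=8))
```

In an IELM-style setup, a function like `training_rmse` would be the objective minimized by the bee colony search, and the best particle found would supply W and b for the final model.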
5. The extreme learning machine-based method for predicting dissolved oxygen in industrial aquaculture according to claim 1,
in step (4), testing the prediction method and outputting the prediction result means: based on the trained IELM network model, testing the prediction performance of the trained IELM network on a test set and outputting the prediction results for the test set; and selecting a traditional ELM network model as a comparison algorithm and outputting the prediction results of the different algorithms on the test set.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111170371.0A CN113962819A (en) 2021-10-08 2021-10-08 Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111170371.0A CN113962819A (en) 2021-10-08 2021-10-08 Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine

Publications (1)

Publication Number Publication Date
CN113962819A true CN113962819A (en) 2022-01-21

Family

ID=79463692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111170371.0A Pending CN113962819A (en) 2021-10-08 2021-10-08 Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine

Country Status (1)

Country Link
CN (1) CN113962819A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115994327A (en) * 2023-03-22 2023-04-21 山东能源数智云科技有限公司 Equipment fault diagnosis method and device based on edge calculation
CN116230087A (en) * 2022-12-02 2023-06-06 深圳太力生物技术有限责任公司 Method and device for optimizing culture medium components

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116230087A (en) * 2022-12-02 2023-06-06 深圳太力生物技术有限责任公司 Method and device for optimizing culture medium components
CN116230087B (en) * 2022-12-02 2024-05-14 深圳太力生物技术有限责任公司 Method and device for optimizing culture medium components
CN115994327A (en) * 2023-03-22 2023-04-21 山东能源数智云科技有限公司 Equipment fault diagnosis method and device based on edge calculation
CN115994327B (en) * 2023-03-22 2023-06-23 山东能源数智云科技有限公司 Equipment fault diagnosis method and device based on edge calculation

Similar Documents

Publication Publication Date Title
CN107480775B (en) Pond dissolved oxygen prediction method based on data restoration
CN108416366B (en) Power system short-term load prediction method based on meteorological index weighted LS-SVM
CN108665106A (en) A kind of aquaculture dissolved oxygen prediction method and device
CN112906298B (en) Blueberry yield prediction method based on machine learning
CN113962819A (en) Method for predicting dissolved oxygen in industrial aquaculture based on extreme learning machine
CN113554466B (en) Short-term electricity consumption prediction model construction method, prediction method and device
CN107169621A (en) A kind of Dissolved Oxygen in Water Forecasting Methodology and device
CN113361761A (en) Short-term wind power integration prediction method and system based on error correction
CN112527037A (en) Greenhouse environment regulation and control method and system with environment factor prediction function
CN113177673B (en) Air conditioner cold load prediction optimization method, system and equipment
CN115034126A (en) Method and system for optimizing LSTM neural network model through wolf algorithm
CN109934422A (en) Neural network wind speed prediction method based on time series data analysis
CN111292124A (en) Water demand prediction method based on optimized combined neural network
CN114897204A (en) Method and device for predicting short-term wind speed of offshore wind farm
CN114548350A (en) Power load prediction method based on goblet sea squirt group and BP neural network
Lu et al. Image classification and identification for rice leaf diseases based on improved WOACW_SimpleNet
Marinković et al. Data mining approach for predictive modeling of agricultural yield data
CN116663404A (en) Flood forecasting method and system coupling artificial intelligence and Bayesian theory
CN111126827A (en) Input-output accounting model construction method based on BP artificial neural network
CN113947332B (en) Underground engineering comprehensive security capability assessment method and system
CN114234392B (en) Air conditioner load fine prediction method based on improved PSO-LSTM
CN116167508A (en) Short-term photovoltaic output rapid prediction method and system based on meteorological factor decomposition
Qu et al. Application of Deep Neural Network on Net Photosynthesis Modeling
CN114357877A (en) Fishpond water quality evaluation prediction system and method based on fuzzy evaluation and improved support vector machine
CN110991743B (en) Wind power short-term combination prediction method based on cluster analysis and neural network optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination