CN114462208A - Effluent BOD online soft measurement method based on self-organizing RBFNN - Google Patents

Effluent BOD online soft measurement method based on self-organizing RBFNN Download PDF

Info

Publication number
CN114462208A
CN114462208A CN202210018901.8A CN202210018901A CN114462208A CN 114462208 A CN114462208 A CN 114462208A CN 202210018901 A CN202210018901 A CN 202210018901A CN 114462208 A CN114462208 A CN 114462208A
Authority
CN
China
Prior art keywords
neuron
similarity
neurons
network
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210018901.8A
Other languages
Chinese (zh)
Inventor
乔俊飞
贾丽杰
李文静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202210018901.8A priority Critical patent/CN114462208A/en
Publication of CN114462208A publication Critical patent/CN114462208A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

An effluent BOD online soft measurement method based on a self-organizing RBF neural network is directly applied to the field of sewage treatment. Aiming at the problems of long test period, large hysteresis, incapability of reflecting BOD change in a water body in time and the like of the BOD concentration of effluent in the current sewage treatment process, the invention provides a BOD prediction method of effluent based on self-organizing RBF nerves, which utilizes Gaussian membership as a similarity measurement standard, designs a self-organizing mechanism of a network online structure to obtain a compact network structure, provides an online small-batch gradient learning algorithm to perform online learning on network parameters to obtain rapid and stable convergence performance, finally realizes accurate prediction of BOD concentration and solves the problem of difficulty in measuring BOD concentration in the sewage treatment process.

Description

Effluent BOD online soft measurement method based on self-organizing RBFNN
Technical Field
The invention relates to the field of artificial intelligence, and is directly applied to the field of sewage treatment.
Background
The sewage treatment process has more and very complex reactions, which causes that important parameters in the sewage are very difficult to measure. Biochemical Oxygen Demand (BOD) represents the amount of Oxygen consumed by microorganisms in a water body to convert organic substances into inorganic substances, can directly reflect the degree of water body pollution, and is a very important water quality detection index in a sewage treatment process. The existing method for predicting the BOD concentration of effluent water comprises a dilution and inoculation method, manual timing sampling and the like, but the dilution and inoculation method has the problems of long test period, large hysteresis, incapability of reflecting the change of the BOD in the water body in time and the like. The method for detecting the BOD of the effluent in the sewage by using the instrument can quickly detect the result, but the instrument detection is expensive. Therefore, how to detect the BOD concentration of the effluent more effectively is a key problem in the sewage treatment process.
In order to solve the problem, many scholars propose a soft measurement method, which adopts the idea of indirect measurement and utilizes known variables to predict non-measurable variables at future time in real time by constructing a model. The method can solve the problems that the traditional measuring method of the effluent BOD consumes too long time, the detection price of an application instrument is high, and regular maintenance is needed. The traditional RBF neural network structure is difficult to determine, and most of the existing self-organizing RBF networks excessively depend on human factors for parameter setting and lack of adaptivity. The invention designs a self-organizing RBF neural network based on Gaussian membership and is used for BOD prediction of effluent water, so that the BOD concentration of the effluent water can be accurately predicted in real time.
Disclosure of Invention
1. The problems that the invention can solve are as follows:
the invention provides an on-line prediction method of BOD concentration of effluent water in a sewage treatment process based on a self-organizing RBF neural network. The method utilizes Gaussian membership as a similarity measurement standard, designs a network online structure self-organization mechanism, and provides an online small batch gradient learning algorithm to perform online learning on network parameters, thereby realizing online real-time prediction of the BOD concentration of effluent and solving the problem of difficulty in measuring the BOD concentration in the sewage treatment process.
2. The specific technical scheme of the invention is as follows:
a self-organizing RBFNN-based effluent BOD online soft measurement method mainly comprises the following steps:
step 1: effluent BOD data preprocessing
Through the analysis of the effluent BOD mechanism in the sewage treatment process, 10 variables of the influent pH value, the effluent pH value, the influent suspended solid concentration, the effluent suspended solid concentration, the influent BOD concentration, the influent chemical oxygen demand Concentration (COD), the effluent COD concentration, the sludge sedimentation rate (SV), the suspension concentration (MLSS) and the dissolved oxygen concentration (DO) in total are selected as the effluent variablesThe raw feature set of water BOD, i.e., the input variables, is denoted as X ═ X np1,2, …, P, N is 1,2, …, N, where P is 10; and N is the number of samples. x is the number ofnpDenoted as the p-th feature of the nth data. Selecting BOD concentration of effluent as output variable, and recording as Y ═ Y n1,2, …, N being the number of samples, ynRepresenting the nth output sample. And carrying out data normalization processing on the obtained multi-feature data set, so that data indexes are in the same order of magnitude, and eliminating errors generated on a soft measurement model due to large difference of the order of magnitude of the data.
Carrying out normalization processing on an input variable X and an output variable Y according to the following formula:
Figure BDA0003461671860000021
Figure BDA0003461671860000022
x and Y represent data after normalization processing, and the value range is [0,1 ].
Step 2: designing a self-organizing RBFNN-based effluent BOD online soft measurement model structure;
step 2.1: and designing the topology structure of the RBF neural network. The RBF is a three-layer forward neural network and mainly comprises an input layer, a hidden layer and an output layer. The input layer imports a sample into the network, containing 10 neurons. The hidden layer of the network initially comprises 2 neurons, the number of the neurons of the hidden layer of the network is represented by H, the number of the neurons of the hidden layer of the network is 2 initially, a neuron kernel function adopts a Gaussian function, and the output of the H-th neuron is as follows:
Figure BDA0003461671860000023
wherein is chAnd σhRespectively representing the center and width vectors of the h-th hidden layer neuron, c1=x1,c2=x2,σ1=1,σ2=1,xnIndicates that the nth sample n is more than or equal to 3, | | xn-chI represents xnAnd chThe Euclidean distance between the two is calculated by the formula
Figure BDA0003461671860000024
The output layer only has one neuron, the output of the neuron is the linear weighting of the hidden layer neuron output, and the expression is as follows:
Figure BDA0003461671860000031
in the formula, whIs the connection weight, w, of the h hidden layer neuron and the output neuron1=y1,w2=y2
Step 2.2: designing a self-organizing RBF neural network
Step 2.2.1: for the current sample xn(n is not less than 3), calculating sample xnAnd the similarity mu between neurons hnh,μnhIs calculated by the formula
Figure BDA0003461671860000032
Where H is the number of neurons, chIs the central vector of neuron h.
Sample xnAnd the similarity vectors between the current neurons are recorded as Un,Un=[μn1n2nh,…,μnH](1. ltoreq. H. ltoreq.H) according to UnFinding the maximum of the middle element and the sample xnThe neuron h having the greatest degree of similarity,
Figure BDA0003461671860000033
after the maximum similarity neuron h is determined, calculating a similarity threshold T of the neuron hhSimilarity threshold of neuron hValue ThDefined as the maximum value of the similarity between neuron h and all the neurons at present.
First, the similarity between neuron H and all other neurons is calculated, for example, the similarity between neuron H and neuron H '(H' ∈ H \ { H }) is calculated by using formula (7), and H '∈ H \ { H } indicates that H' is a neuron other than the H-th neuron in the H neurons.
Figure BDA0003461671860000034
Where H is the number of neurons, chIs the central vector of neuron h, ch'Is the central vector of neuron h'.
The similarity matrix of neuron h and all other neurons is recorded as Uh,Uh=[μh1h2hh',…,μhH](H'. di. epsilon. H \ H }). Similarity threshold T of neuron hhThe calculation formula is Th=max(max(Uh))。
Calculating the current root mean square error of the network according to the samples in the sliding window by using a formula (8), wherein when n is less than L, the samples in the sliding window are the first sample to the nth sample, and n samples are shared; when n is larger than or equal to L, the samples from the n-L +1 th sample to the nth sample are the samples in the sliding window, and the sliding window has L samples in total.
Figure BDA0003461671860000035
Wherein L represents the size of the sliding window, and the value range of L is [50,100 ]]。diIs the desired output of the network, yiIs the actual output and i indicates that the sample is data within a sliding window.
Designing a network addition criterion: if μnh<ThAnd E is>E0,E0The value is 0.05. Adding the current sample as a new neuron into the network, and setting parameters as shown in formula (9):
Figure BDA0003461671860000041
wherein c isH+1,σH+1,wH+1Respectively, the center vector, width and output weight of the newly added neuron H +1, dmaxIs the maximum distance between the centers. e.g. of the typen HIs x before RBFNN structure changenThe time-dependent network error, i.e., the error when the network has H neurons, is calculated using equation (10).
Figure BDA0003461671860000042
Is x at the input after neuron additionnSum of outputs of H +1 neurons of the hidden layer of the time network, θhCalculated using equation (3). Update H ═ H + 1.
Figure BDA0003461671860000043
Where y isnIs input as xnIn the case of (3), the net obtains actual outputs when there are H neurons.
Step 2.2.2: a learning algorithm-an online fixed small batch gradient algorithm of a self-organizing RBFNN-based effluent BOD online soft measurement model is designed. If the increase of the neuron is performed, learning parameters after the structure changes, and defining the loss function of the current network as follows:
Figure BDA0003461671860000044
wherein d isiIs the desired output, yiIs the actual output;
Figure BDA0003461671860000045
fixed batch size L span [50,100]And k is the number of iterations.
Training the output weight, neuron center and neuron width parameters by using the following formula:
wh(k+1)=wh(k)-ηwΦwh(k) (12)
ch(k+1)=ch(k)-ηcΦch(k) (13)
σh(k+1)=σh(k)-ησΦσh(k) (14)
Figure BDA0003461671860000046
Figure BDA0003461671860000047
Figure BDA0003461671860000051
wherein, the formula (12), the formula (13) and the formula (14) are respectively the update iteration rules of the output weight, the neuron center and the neuron width; w is ah(k)、ch(k) And σh(k) Respectively representing the weight, the central vector and the width of the h neuron in the k iteration; phiwh(k)、Φch(k) And phiσh(k) Representing weight gradient, center gradient and width gradient of the h neuron; e (k) is the error vector generated by the network at the kth iteration based on the samples in the current sliding window; x represents a sample within the current sliding window; thetah(k) Is the output vector of the h neuron at the kth iteration, η, based on the samples in the current sliding windoww、ηcAnd ησThe learning rates, η, of the output weight, center and width, respectivelyw、ηcAnd ησValue range of [0.01,0.05 ]]。
The size of a sliding window is set to be 50 during online learning of the network parameters, and training is stopped when a training error reaches an expected error value of 0.05 or the maximum training time is 200.
Step 2.2.3: if H is>Calculating the phase between the neuronsSimilarity, the obtained similarity matrix is expressed as U ═ U1,U2,…,UH]Here U1Representing a similarity matrix, U, between the first neuron and all remaining neurons1=[μ1213,…,μ1h,…,μ1H],μ1hThe similarity between the first neuron and the h-th neuron is expressed and calculated by formula (7). Each element in the matrix U represents the similarity between two neurons, assuming μp1p2Max (u)), neuron p1And neuron p2Degree of similarity mu ofp1p2For the maximum value of the similarity between all neurons, neuron p is determined according to the following formula1And neuron p2
Figure BDA0003461671860000052
Figure BDA0003461671860000053
If μp1p2>T0(T0Is the combined threshold, takes the value of 0.8), i.e. the neuron p1And p2The similarity between the two is greater than a preset merging threshold value T0Then neuron p is replaced1And p2And (6) merging. The parameters of the new neuron after merging are set as:
Figure BDA0003461671860000054
wherein c isp1,cp2p1p2,wp1,wp2Respectively, merge pre-neuron p1And p2Center, width, and output weight of; c. Cp,σp,wpRespectively the central vector, width and output weight of the new neuron p after merging;
Figure BDA0003461671860000055
respectively representing the sample in the sliding window in the neuron p1,p2And the sum of the outputs on the new neuron p after merging, update H-1.
Step 2.2.4: if neuron merging is performed, learning the parameters of the network after the structure change by using the learning algorithm designed in step 2.2.2;
step 2.2.5: every agemaxTime-step examination of the lifetime 'age' of neurons and the neuronal similarity capability 'P' of neuronsmaxIs 40. The lifetime of a neuron is defined as the number of samples that each neuron has passed since its generation, the age of each neuron from its generation is 0, and the age is increased by 1 every time a new sample is input after its passage. The similarity capability of a neuron is defined as the number of times that each neuron satisfies that the similarity between the neuron and the sample is greater than a self-similarity threshold, i.e., μnh>Th(H ═ 1,2, …, H), P plus 1. When the age reaches the maximum value of agemaxWhen P is 2 or less, it is regarded as a noise neuron, and H-1 is updated.
Step 2.2.6: if the noise neuron deletion is executed, learning by using the parameters of the learning algorithm read network designed in the step 2.2.2 after the structure changes, returning to the step 2.2.1, and learning the next sample;
step 2.2.7: and stopping when the last sample is completely learned.
And step 3: effluent BOD prediction
And taking the test sample data as the input of the self-organizing RBF neural network to obtain the output of the self-organizing RBF neural network, and performing reverse normalization on the output to obtain the BOD concentration of the effluent.
3. Compared with the prior art, the invention has the following obvious advantages:
(1) aiming at some problems of BOD of effluent in the current sewage treatment process, the invention provides a self-organized RBF neural network model to realize accurate prediction of BOD concentration of the effluent, and has the characteristics of low cost and high efficiency.
(2) The invention introduces the Gaussian membership degree into the RBF neural network structure self-organizing mechanism for the first time, designs the self-organizing RBF neural network model insensitive to the parameters and obtains a compact network structure. An online small-batch gradient learning algorithm is designed to learn the parameters of the network, and the fast and stable convergence performance is obtained.
Drawings
FIG. 1 is a diagram of the internal architecture of the neural network of the present invention;
FIG. 2 is a graph of the Root Mean Square Error (RMSE) variation for the effluent BOD concentration prediction method of the present invention;
FIG. 3 is a graph of the BOD concentration prediction of the effluent of the present invention;
FIG. 4 is a graph of the BOD concentration prediction error of the effluent of the present invention.
Detailed Description
The invention designs an effluent BOD online soft measurement method based on self-organizing RBFNN, realizes the prediction of BOD concentration at the future moment, solves the problem that the BOD concentration of the effluent is difficult to measure in real time in the sewage treatment process, and improves the monitoring level of the future water quality in the sewage treatment process.
The experimental data come from water quality analysis data of a certain sewage treatment plant in Beijing to obtain 365 groups of data, wherein the characteristics of input data are selected from the pH value of inlet water, the pH value of outlet water, the concentration of solid suspended matters of inlet water, the concentration of solid suspended matters of outlet water, the concentration of BOD of inlet water, the concentration of Chemical Oxygen Demand (COD) of inlet water, the concentration of COD of outlet water, the sludge settlement rate (SV), the concentration of suspended Matters (MLSS) and the concentration of Dissolved Oxygen (DO), and the output is the BOD concentration of outlet water. The front 265 groups are selected as training samples, the rear 100 groups are selected as testing samples, and the method mainly comprises the following steps:
step 1: effluent BOD data preprocessing
Selecting input variable, and recording as X ═ X np1,2, …, P, N is 1,2, …, N, where P is 10; and N is the number of samples. x is the number ofnpDenoted as the p-th feature of the nth data. Selecting BOD concentration of effluent as output variable, and recording as Y ═ Y n1,2, …, N being the number of samples, ynRepresenting the nth output sample. The obtained multi-feature data set is subjected to data normalization processing, so that data indexes are in the same order of magnitude, and the problem that the data indexes are different greatly and the pair is not matched with each other due to the fact that the order of magnitude of the data is large in difference is solvedThe soft measurement model produces errors.
Carrying out normalization processing on an input variable X and an output variable Y according to the following formula:
Figure BDA0003461671860000071
Figure BDA0003461671860000072
x and Y represent data after normalization processing, and the value range is [0,1 ].
Step 2: designing a self-organizing RBFNN-based effluent BOD online soft measurement model structure;
step 2.1: and designing the topology structure of the RBF neural network. The RBF is a three-layer forward neural network and mainly comprises an input layer, a hidden layer and an output layer. The input layer imports a sample into the network, containing 10 neurons. The hidden layer of the network initially comprises 2 neurons, the number of the neurons of the hidden layer of the network is represented by H, the number of the neurons of the hidden layer of the network is 2 initially, a neuron kernel function adopts a Gaussian function, and the output of the H-th neuron is as follows:
Figure BDA0003461671860000081
wherein is chAnd σhRespectively representing the center and width vectors of the h-th hidden layer neuron, c1=x1,c2=x2,σ1=1,σ2=1,xnIndicates that the nth sample n is more than or equal to 3, | | xn-chI represents xnAnd chThe Euclidean distance between the two is calculated by the formula
Figure BDA0003461671860000082
The output layer only has one neuron, the output of the neuron is the linear weighting of the hidden layer neuron output, and the expression is as follows:
Figure BDA0003461671860000083
in the formula, whIs the connection weight, w, of the h hidden layer neuron and the output neuron1=y1,w2=y2
Step 2.2: designing a self-organizing RBF neural network
Step 2.2.1: for the current sample xn(n is not less than 3), calculating sample xnAnd the similarity mu between neurons hnh,μnhIs calculated by the formula
Figure BDA0003461671860000084
Where H is the number of neurons, chIs the central vector of neuron h.
Sample xnAnd the similarity vectors between the current neurons are recorded as Un,Un=[μn1n2nh,…,μnH](1. ltoreq. H. ltoreq.H) according to UnFinding the maximum of the middle element and the sample xnThe neuron h having the greatest degree of similarity,
Figure BDA0003461671860000085
after the maximum similarity neuron h is determined, calculating a similarity threshold T of the neuron hhSimilarity threshold T of neuron hhDefined as the maximum value of the similarity between neuron h and all the neurons at present.
First, the similarity between neuron H and all other neurons is calculated, for example, the similarity between neuron H and neuron H '(H' ∈ H \ { H }) is calculated by formula (27), and H '∈ H \ { H } indicates that H' is a neuron other than the H-th neuron in H neurons.
Figure BDA0003461671860000086
Where H is the number of neurons, chIs the central vector of neuron h, ch'Is the central vector of neuron h'.
The similarity matrix of neuron h and all other neurons is recorded as Uh,Uh=[μh1h2hh',…,μhH](H'. di. epsilon. H \ H }). Similarity threshold T of neuron hhThe calculation formula is Th=max(max(Uh))。
Calculating the current root mean square error of the network according to the samples in the sliding window by using a formula (28), wherein when n is less than L, the samples in the sliding window are the first sample to the nth sample, and the total number of the samples is n; when n is larger than or equal to L, the samples from the n-L +1 th sample to the nth sample are the samples in the sliding window, and the sliding window has L samples in total.
Figure BDA0003461671860000091
Wherein L represents the size of the sliding window, and the value range of L is [50,100 ]]。diIs the desired output of the network, yiIs the actual output and i indicates that the sample is data within a sliding window.
Designing a network addition criterion: if μnh<ThAnd E is>E0,E0The value is 0.05. Adding the current sample as a new neuron into the network, and setting parameters as shown in formula (29):
Figure BDA0003461671860000092
wherein c isH+1,σH+1,wH+1Respectively, the center vector, width and output weight of the newly added neuron H +1, dmaxIs the maximum distance between the centers. e.g. of the typen HIs x before RBFNN structure changenThe error of the network, i.e., the error of the network having H neurons, is calculated by equation (30).
Figure BDA0003461671860000093
Is x at the input after neuron additionnSum of outputs of H +1 neurons of the hidden layer of the time network, θhCalculated using equation (23). Update H ═ H + 1.
Figure BDA0003461671860000094
Where y isnIs input as xnIn the case of (3), the net obtains actual outputs when there are H neurons.
Step 2.2.2: a learning algorithm-an online fixed small batch gradient algorithm of a self-organizing RBFNN-based effluent BOD online soft measurement model is designed. If the increase of the neuron is performed, learning parameters after the structure changes, and defining the loss function of the current network as follows:
Figure BDA0003461671860000101
wherein d isiIs the desired output, yiIs the actual output;
Figure BDA0003461671860000102
fixed batch size L span [50, 100%]And k is the number of iterations.
Training the output weight, neuron center and neuron width parameters by using the following formula:
wh(k+1)=wh(k)-ηwΦwh(k) (32)
ch(k+1)=ch(k)-ηcΦch(k) (33)
σh(k+1)=σh(k)-ησΦσh(k) (34)
Figure BDA0003461671860000103
Figure BDA0003461671860000104
Figure BDA0003461671860000105
wherein, the formula (32), the formula (33) and the formula (35) are respectively the update iteration rules of the output weight, the neuron center and the neuron width; w is ah(k)、ch(k) And σh(k) Respectively representing the weight, the central vector and the width of the h neuron in the k iteration; phiwh(k)、Φch(k) And phiσh(k) Representing weight gradient, center gradient and width gradient of the h neuron; e (k) is the error vector generated by the network at the kth iteration based on the samples in the current sliding window; x represents a sample within the current sliding window; thetah(k) Is the output vector of the h neuron at the kth iteration, η, based on the samples in the current sliding windoww、ηcAnd ησThe learning rates, η, of the output weight, center and width, respectivelyw、ηcAnd ησValue range of [0.01,0.05 ]]。
The size of a sliding window is set to be 50 during online learning of the network parameters, and training is stopped when a training error reaches an expected error value of 0.05 or the maximum training time is 200.
Step 2.2.3: if H is present>And 3, calculating the similarity between the neurons, and expressing the obtained similarity matrix as U ═ U1,U2,…,UH]Here U1Representing a similarity matrix, U, between the first neuron and all remaining neurons1=[μ1213,…,μ1h,…,μ1H],μ1hThe similarity between the first neuron and the h-th neuron is expressed and calculated by formula (7). Each element in the matrix U represents the similarity between two neurons, assuming μp1p2=max(max(U)),I.e. neuron p1And neuron p2Degree of similarity mu ofp1p2For the maximum value of the similarity between all neurons, neuron p is determined according to the following formula1And neuron p2
Figure BDA0003461671860000111
Figure BDA0003461671860000112
If μp1p2>T0(T0Is the combined threshold, takes the value of 0.8), i.e. the neuron p1And p2The similarity between the two is greater than a preset merging threshold value T0Then neuron p is replaced1And p2And (6) merging. The parameters of the new neuron after merging are set as:
Figure BDA0003461671860000113
wherein c isp1,cp2p1p2,wp1,wp2Respectively, merge pre-neuron p1And p2Center, width, and output weight of; c. Cp,σp,wpRespectively the central vector, width and output weight of the new neuron p after merging;
Figure BDA0003461671860000114
respectively representing the sample in the sliding window in the neuron p1,p2And the sum of the outputs on the new neuron p after merging, update H-1.
Step 2.2.4: if neuron merging is performed, learning parameters of the network after structure change by using the learning algorithm designed in the step 2.2.2;
step 2.2.5: every agemaxTime-step examination of the lifetime 'age' of neurons and the neuronal similarity capability 'P' of neuronsmaxIs 40. The lifetime of a neuron is defined as the number of samples that each neuron has passed since its generation, the age of each neuron from its generation is 0, and the age is increased by 1 every time a new sample is input after its passage. The similarity capability of a neuron is defined as the number of times that each neuron satisfies that the similarity between the neuron and the sample is greater than a self-similarity threshold, i.e., μnh>Th(H ═ 1,2, …, H), P plus 1. When the age reaches the maximum value of agemaxWhen P is 2 or less, it is regarded as a noise neuron, and H-1 is updated.
Step 2.2.6: if the noise neuron deletion is executed, learning by using the parameters of the learning algorithm read network designed in the step 2.2.2 after the structure changes, returning to the step 2.2.1, and learning the next sample;
step 2.2.7: and stopping when the last sample is completely learned.
And step 3: effluent BOD prediction
And taking the test sample data as the input of the self-organizing RBF neural network to obtain the output of the self-organizing RBF neural network, and performing reverse normalization on the output to obtain the BOD concentration of the effluent.
In this embodiment, the effluent BOD concentration prediction method trains RMSE as shown in fig. 2, where the X-axis is the number of training samples and the Y-axis is the value of the trained RMSE; the BOD concentration prediction result of the effluent is shown in figure 3, the X axis is the number of test samples, the Y axis is the BOD concentration value of the effluent, the unit is mg/L, the solid line is the expected output value of the BOD concentration of the effluent, and the dotted line is the actual output value of the BOD concentration of the effluent; the BOD concentration test error of the effluent is shown in FIG. 4, the X axis is the number of test samples, and the Y axis is the BOD concentration prediction error of the effluent, and the unit is mg/L.

Claims (1)

1. A self-organizing RBFNN-based effluent BOD online soft measurement method is characterized by comprising the following steps:
step 1: effluent BOD data preprocessing
Through the BOD mechanism analysis of effluent in the sewage treatment process, the pH value of the inlet water, the pH value of the outlet water, the concentration of solid suspended matter in the inlet water, the concentration of solid suspended matter in the outlet water, the BOD concentration of the inlet water and the pH value of the inlet water are selectedThe total 10 variables of chemical oxygen demand Concentration (COD) of water, COD concentration of effluent water, sedimentation rate of Sludge (SV), suspended matter concentration (MLSS) and dissolved oxygen concentration (DO) in R are taken as the original feature set of BOD of effluent water, namely input variable, which is recorded as X ═ { X ═ X { (np1,2, …, P, N is 1,2, …, N, where P is 10; n is the number of samples; x is the number ofnpA p-th feature represented as an nth data; selecting BOD concentration of effluent as output variable, and recording as Y ═ Yn1,2, …, N being the number of samples, ynRepresents the nth output sample; carrying out data normalization processing on the obtained multi-feature data set to enable data indexes to be in the same order of magnitude, and eliminating errors generated on a soft measurement model due to large difference of the order of magnitude of the data;
carrying out normalization processing on an input variable X and an output variable Y according to the following formula:
Figure FDA0003461671850000011
Figure FDA0003461671850000012
x and Y represent data after normalization processing, and the value range is [0,1 ];
step 2: designing a self-organizing RBFNN-based effluent BOD online soft measurement model structure;
step 2.1: designing a topological structure of the RBF neural network; the RBF is a three-layer forward neural network and mainly comprises an input layer, a hidden layer and an output layer; the input layer leads the sample into a network and comprises 10 neurons; the hidden layer of the network initially comprises 2 neurons, the number of the neurons of the hidden layer of the network is represented by H, the number of the neurons of the hidden layer of the network is 2 initially, a neuron kernel function adopts a Gaussian function, and the output of the H-th neuron is as follows:
Figure FDA0003461671850000013
wherein is chAnd σhRespectively representing the center and width vectors of the h-th hidden layer neuron, c1=x1,c2=x2,σ1=1,σ2=1,xnIndicates that the nth sample n is more than or equal to 3, | | xn-chI represents xnAnd chThe Euclidean distance between the two is calculated by the formula
Figure FDA0003461671850000014
The output layer only has one neuron, the output of the neuron is the linear weighting of the hidden layer neuron output, and the expression is as follows:
Figure FDA0003461671850000021
in the formula, whIs the connection weight, w, of the h hidden layer neuron and the output neuron1=y1,w2=y2
Step 2.2: designing a self-organizing RBF neural network
Step 2.2.1: for the current sample xn(n is more than or equal to 3), calculating a sample xnAnd the similarity mu between neurons hnh,μnhIs calculated by the formula
Figure FDA0003461671850000022
Where H is the number of neurons, chIs the central vector of neuron h;
sample xnAnd the similarity vectors between the current neurons are recorded as Un,Un=[μn1n2nh,…,μnH](1. ltoreq. H. ltoreq.H) according to UnFinding the maximum of the middle element and the sample xnThe neuron h having the greatest degree of similarity,
Figure FDA0003461671850000023
after the maximum similarity neuron h is determined, calculating a similarity threshold T of the neuron hhSimilarity threshold T of neuron hhDefining the maximum value of the similarity between the neuron h and all the current neurons;
firstly, calculating the similarity of the neuron H and all other neurons, for example, the similarity of the neuron H and a neuron H '(H' is belonged to H \ H }) is calculated by using a formula (7), and H 'is belonged to H \ H } to show that H' is a neuron except the H-th neuron in the H neurons;
Figure FDA0003461671850000024
where H is the number of neurons, chIs the central vector of neuron h, ch'Is the central vector of neuron h';
the similarity matrix of neuron h and all other neurons is recorded as Uh,Uh=[μh1h2hh',…,μhH](H' is belonged to H \ H }); similarity threshold T of neuron hhThe calculation formula is Th=max(max(Uh));
Calculating the current root mean square error of the network according to the samples in the sliding window by using a formula (8), wherein when n is less than L, the samples in the sliding window are the first sample to the nth sample, and n samples are shared; when n is larger than or equal to L, the samples from the n-L +1 th sample to the nth sample are in the sliding window, and the sliding window has L samples;
Figure FDA0003461671850000031
wherein L represents the size of the sliding window, and the value range of L is [50,100 ]];diIs the desired output of the network, yiIs the actual output, i denotes that the sample is data within a sliding window;
designing a network addition criterion: if μnh<ThAnd E is>E0,E0The value is 0.05; adding the current sample as a new neuron into the network, and setting parameters as shown in formula (9):
Figure FDA0003461671850000032
wherein c isH+1,σH+1,wH+1Respectively, the center vector, width and output weight of the newly added neuron H +1, dmaxIs the maximum distance between centers;
Figure FDA0003461671850000033
is x before RBFNN structure changenThe error of the network, namely the error of the network with H neurons, is calculated by using a formula (10);
Figure FDA0003461671850000034
is x at the input after neuron additionnSum of outputs of H +1 neurons of the hidden layer of the time network, θhCalculating by using a formula (3); updating H-H + 1;
Figure FDA0003461671850000035
where y isnIs input as xnUnder the condition of (1), the network obtains actual output when H neurons exist;
step 2.2.2: designing a learning algorithm-an online fixed small batch gradient algorithm of a self-organizing RBFNN-based effluent BOD online soft measurement model; if the increase of the neuron is performed, learning parameters after the structure changes, and defining the loss function of the current network as follows:
Figure FDA0003461671850000036
wherein d isiIs the desired output, yiIs the actual output;
Figure FDA0003461671850000037
fixed batch size L span [50,100]K is the number of iterations;
training the output weight, neuron center and neuron width parameters by using the following formula:
wh(k+1)=wh(k)-ηwΦwh(k) (12)
ch(k+1)=ch(k)-ηcΦch(k) (13)
σh(k+1)=σh(k)-ησΦσh(k) (14)
Figure FDA0003461671850000041
Figure FDA0003461671850000042
Figure FDA0003461671850000043
wherein, the formula (12), the formula (13) and the formula (14) are respectively the update iteration rules of the output weight, the neuron center and the neuron width; w is ah(k)、ch(k) And σh(k) Respectively representing the weight, the central vector and the width of the h neuron in the k iteration; phiwh(k)、Φch(k) And phiσh(k) Representing weight gradient, center gradient and width gradient of the h neuron; e (k) is the error vector generated by the network at the kth iteration based on the samples in the current sliding window; x represents a sample within the current sliding window; thetah(k) Is that the h neuron is at the k iteration based on the samples in the current sliding windowOutput vector of ηw、ηcAnd ησThe learning rates, η, of the output weight, center and width, respectivelyw、ηcAnd ησValue range of [0.01,0.05 ]];
Setting the size of the current fixed batch to be 50 during online learning of network parameters, and stopping training when a training error reaches an expected error value of 0.05 or the maximum training frequency is 200;
step 2.2.3: if H is present>And 3, calculating the similarity between the neurons, and expressing the obtained similarity matrix as U ═ U1,U2,…,UH]Here U1Representing a similarity matrix, U, between the first neuron and all remaining neurons1=[μ1213,…,μ1h,…,μ1H],μ1hRepresenting the similarity between the first neuron and the h neuron, and calculating by using a formula (7); each element in the matrix U represents the similarity between two neurons, assuming μp1p2Max (u)), neuron p1And neuron p2Degree of similarity mu ofp1p2For the maximum value of the similarity between all neurons, neuron p is determined according to the following formula1And neuron p2
Figure FDA0003461671850000044
Figure FDA0003461671850000045
If μp1p2>T0,T0Is a combined threshold, with a value of 0.8, i.e. neuron p1And p2The similarity between the two is greater than a preset merging threshold value T0Then neuron p is replaced1And p2Merging; the parameters of the new neuron after merging are set as:
Figure FDA0003461671850000046
wherein c isp1,cp2p1p2,wp1,wp2Respectively, merge pre-neuron p1And p2Center, width, and output weight of; c. Cp,σp,wpRespectively the central vector, width and output weight of the new neuron p after merging;
Figure FDA0003461671850000051
respectively representing the sample in the sliding window in the neuron p1,p2And the sum of the outputs on the new neuron p after merging, update H-1;
step 2.2.4: if neuron merging is performed, learning the parameters of the network after the structure change by using the learning algorithm designed in step 2.2.2;
step 2.2.5: every agemaxTime-step examination of the lifetime 'age' of neurons and the neuronal similarity capability 'P' of neuronsmaxIs 40; the life of the neuron is defined as the number of samples of each neuron which passes through from the generation, the age of each neuron which passes through from the generation is 0, and the age is added with 1 after the input of a new sample every time; the similarity capability of a neuron is defined as the number of times that each neuron satisfies that the similarity between the neuron and the sample is greater than a self-similarity threshold, i.e., μnh>Th(H ═ 1,2, …, H), P plus 1; when the age reaches the maximum value of agemaxWhen P is 2 or less, considered as a noise neuron, update H ═ H-1;
step 2.2.6: if the noise neuron deletion is executed, learning by using the parameters of the learning algorithm read network designed in the step 2.2.2 after the structure changes, returning to the step 2.2.1, and learning the next sample;
step 2.2.7: stopping when the last sample is completely learned;
and 3, step 3: effluent BOD prediction
And taking the test sample data as the input of the self-organizing RBF neural network to obtain the output of the self-organizing RBF neural network, and performing reverse normalization on the output to obtain the BOD concentration of the effluent.
CN202210018901.8A 2022-01-10 2022-01-10 Effluent BOD online soft measurement method based on self-organizing RBFNN Pending CN114462208A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210018901.8A CN114462208A (en) 2022-01-10 2022-01-10 Effluent BOD online soft measurement method based on self-organizing RBFNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210018901.8A CN114462208A (en) 2022-01-10 2022-01-10 Effluent BOD online soft measurement method based on self-organizing RBFNN

Publications (1)

Publication Number Publication Date
CN114462208A true CN114462208A (en) 2022-05-10

Family

ID=81408740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210018901.8A Pending CN114462208A (en) 2022-01-10 2022-01-10 Effluent BOD online soft measurement method based on self-organizing RBFNN

Country Status (1)

Country Link
CN (1) CN114462208A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116029589A (en) * 2022-12-14 2023-04-28 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116029589A (en) * 2022-12-14 2023-04-28 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF
CN116029589B (en) * 2022-12-14 2023-08-22 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF

Similar Documents

Publication Publication Date Title
CN108898215B (en) Intelligent sludge bulking identification method based on two-type fuzzy neural network
CN111291937A (en) Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
CN109060001B (en) Multi-working-condition process soft measurement modeling method based on feature transfer learning
CN107358021B (en) DO prediction model establishment method based on BP neural network optimization
US11346831B2 (en) Intelligent detection method for biochemical oxygen demand based on a self-organizing recurrent RBF neural network
Qiao et al. A self-organizing deep belief network for nonlinear system modeling
CN112884056A (en) Optimized LSTM neural network-based sewage quality prediction method
CN110824915B (en) GA-DBN network-based intelligent monitoring method and system for wastewater treatment
CN102854296A (en) Sewage-disposal soft measurement method on basis of integrated neural network
CN114037163A (en) Sewage treatment effluent quality early warning method based on dynamic weight PSO (particle swarm optimization) optimization BP (Back propagation) neural network
CN112949894A (en) Effluent BOD prediction method based on simplified long-term and short-term memory neural network
CN109978024B (en) Effluent BOD prediction method based on interconnected modular neural network
CN114462208A (en) Effluent BOD online soft measurement method based on self-organizing RBFNN
CN111125907A (en) Sewage treatment ammonia nitrogen soft measurement method based on hybrid intelligent model
CN109408896B (en) Multi-element intelligent real-time monitoring method for anaerobic sewage treatment gas production
CN113743008A (en) Fuel cell health prediction method and system
CN110542748B (en) Knowledge-based robust effluent ammonia nitrogen soft measurement method
Miao et al. A hybrid neural network and genetic algorithm model for predicting dissolved oxygen in an aquaculture pond
CN111863153A (en) Method for predicting total amount of suspended solids in wastewater based on data mining
CN114861543A (en) Data-driven intelligent evaluation method for biodegradability of petrochemical sewage
CN116306803A (en) Method for predicting BOD concentration of outlet water of ILSTM (biological information collection flow) neural network based on WSFA-AFE
Kang et al. Research on forecasting method for effluent ammonia nitrogen concentration based on GRA-TCN
Wang A neural network algorithm based assessment for marine ecological environment
CN113222324A (en) Sewage quality monitoring method based on PLS-PSO-RBF neural network model
Meng et al. A Self-Organizing Modular Neural Network for Nonlinear System Modeling

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination