CN114462208A - Effluent BOD online soft measurement method based on self-organizing RBFNN - Google Patents


Info

Publication number
CN114462208A
CN114462208A (application CN202210018901.8A)
Authority
CN
China
Prior art keywords
neuron, similarity, neurons, network, sample
Prior art date
Legal status: Pending (assumed; not a legal conclusion)
Application number
CN202210018901.8A
Other languages
Chinese (zh)
Inventor
乔俊飞
贾丽杰
李文静
Current Assignee (listing may be inaccurate)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202210018901.8A priority Critical patent/CN114462208A/en
Publication of CN114462208A publication Critical patent/CN114462208A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00: Computer-aided design [CAD]
    • G06F 30/20: Design optimisation, verification or simulation
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology


Abstract

An effluent BOD online soft-measurement method based on a self-organizing RBF neural network, applied directly in the field of sewage treatment. Aiming at the long test period, large hysteresis and inability to reflect BOD changes in the water body in time that afflict effluent BOD concentration measurement in the current sewage treatment process, the invention provides an effluent BOD prediction method based on a self-organizing RBF neural network. The method uses Gaussian membership as the similarity measure and designs an online structural self-organization mechanism to obtain a compact network structure; an online mini-batch gradient learning algorithm learns the network parameters online with fast and stable convergence. The method finally realizes accurate prediction of BOD concentration and solves the difficulty of measuring BOD concentration in the sewage treatment process.

Description

Effluent BOD online soft measurement method based on self-organizing RBFNN
Technical Field
The invention relates to the field of artificial intelligence, and is directly applied to the field of sewage treatment.
Background
The sewage treatment process involves numerous and highly complex reactions, which makes important parameters in the sewage very difficult to measure. Biochemical oxygen demand (BOD) represents the amount of oxygen consumed by microorganisms in a water body when converting organic substances into inorganic substances; it directly reflects the degree of water pollution and is a very important water-quality index in the sewage treatment process. Existing methods for determining effluent BOD concentration include the dilution-and-inoculation method and manual timed sampling, but the dilution-and-inoculation method suffers from a long test period and large hysteresis, and cannot reflect changes of BOD in the water body in time. Instrument-based detection of effluent BOD is fast, but the instruments are expensive. Therefore, how to detect the effluent BOD concentration more effectively is a key problem in the sewage treatment process.
To solve this problem, many scholars have proposed soft-measurement methods, which adopt the idea of indirect measurement: a model is constructed so that known variables predict non-measurable variables at future moments in real time. Such methods avoid the long measurement time of the traditional effluent BOD method and the high price and regular maintenance of instrument-based detection. However, the structure of a traditional RBF neural network is difficult to determine, and most existing self-organizing RBF networks rely excessively on manually set parameters and lack adaptivity. The invention designs a self-organizing RBF neural network based on Gaussian membership and applies it to effluent BOD prediction, so that the effluent BOD concentration can be accurately predicted in real time.
Disclosure of Invention
1. The problems solved by the invention are as follows:
the invention provides an on-line prediction method of BOD concentration of effluent water in a sewage treatment process based on a self-organizing RBF neural network. The method utilizes Gaussian membership as a similarity measurement standard, designs a network online structure self-organization mechanism, and provides an online small batch gradient learning algorithm to perform online learning on network parameters, thereby realizing online real-time prediction of the BOD concentration of effluent and solving the problem of difficulty in measuring the BOD concentration in the sewage treatment process.
2. The specific technical scheme of the invention is as follows:
a self-organizing RBFNN-based effluent BOD online soft measurement method mainly comprises the following steps:
step 1: effluent BOD data preprocessing
Through analysis of the effluent BOD mechanism in the sewage treatment process, 10 variables in total are selected as the raw feature set of effluent BOD, i.e. the input variables: influent pH, effluent pH, influent suspended-solids concentration, effluent suspended-solids concentration, influent BOD concentration, influent chemical oxygen demand (COD) concentration, effluent COD concentration, sludge settling velocity (SV), suspended matter concentration (MLSS) and dissolved oxygen concentration (DO). The input is recorded as X = {x_np}, p = 1, 2, …, P, n = 1, 2, …, N, where P = 10 and N is the number of samples; x_np denotes the p-th feature of the n-th sample. The effluent BOD concentration is selected as the output variable, recorded as Y = {y_n}, n = 1, 2, …, N, where y_n is the n-th output sample. The obtained multi-feature data set is normalized so that all data indexes are of the same order of magnitude, eliminating errors introduced into the soft-measurement model by large differences in data magnitude.
The input variable X and the output variable Y are normalized according to the following formulas:

$$x_{np} = \frac{x_{np} - \min_{n} x_{np}}{\max_{n} x_{np} - \min_{n} x_{np}} \quad (1)$$

$$y_{n} = \frac{y_{n} - \min_{n} y_{n}}{\max_{n} y_{n} - \min_{n} y_{n}} \quad (2)$$

X and Y then denote the normalized data, whose values lie in [0, 1].
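The normalization of step 1 can be sketched as follows; a minimal illustration in which the toy feature values and the helper name `minmax_normalize` are invented for the example, not taken from the patent:

```python
import numpy as np

def minmax_normalize(data):
    """Column-wise min-max normalization to [0, 1], as in formulas (1)-(2)."""
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    return (data - lo) / (hi - lo), lo, hi

# toy data: 4 samples of 2 of the 10 features (values are illustrative only)
X = np.array([[6.8, 120.0],
              [7.2, 180.0],
              [7.0, 150.0],
              [7.4, 240.0]])
Xn, lo, hi = minmax_normalize(X)  # per-feature minima map to 0, maxima to 1
```

Keeping `lo` and `hi` is what later allows the inverse normalization of step 3.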
Step 2: designing a self-organizing RBFNN-based effluent BOD online soft measurement model structure;
step 2.1: design the topology of the RBF neural network. The RBF network is a three-layer feedforward neural network consisting of an input layer, a hidden layer and an output layer. The input layer feeds each sample into the network and contains 10 neurons. The hidden layer initially contains H = 2 neurons; the neuron kernel function is a Gaussian function, and the output of the h-th neuron is:
$$\theta_h = \exp\!\left(-\frac{\|x_n - c_h\|^2}{2\sigma_h^2}\right) \quad (3)$$

where c_h and σ_h are the center vector and width of the h-th hidden-layer neuron, c_1 = x_1, c_2 = x_2, σ_1 = 1, σ_2 = 1, x_n denotes the n-th sample (n ≥ 3), and ||x_n - c_h|| is the Euclidean distance between x_n and c_h, computed as

$$\|x_n - c_h\| = \sqrt{\sum_{p=1}^{P}\big(x_{np} - c_{hp}\big)^2}$$
The output layer has a single neuron whose output is a linear weighting of the hidden-layer neuron outputs:

$$y_n = \sum_{h=1}^{H} w_h \theta_h \quad (4)$$

where w_h is the connection weight between the h-th hidden-layer neuron and the output neuron, w_1 = y_1, w_2 = y_2.
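The forward pass of this three-layer network can be sketched as follows; a minimal illustration in which the array shapes and function name are assumptions for the example, not the patent's notation:

```python
import numpy as np

def rbf_forward(x, centers, widths, weights):
    """Forward pass of the three-layer RBF network, formulas (3)-(4).

    x:       (P,) input sample
    centers: (H, P) hidden-neuron centers c_h
    widths:  (H,) hidden-neuron widths sigma_h
    weights: (H,) output weights w_h
    """
    dist2 = np.sum((centers - x) ** 2, axis=1)       # ||x - c_h||^2 per neuron
    theta = np.exp(-dist2 / (2.0 * widths ** 2))     # Gaussian kernel outputs
    return float(weights @ theta), theta             # linear output layer

# two hidden neurons initialised from the first two samples, as in step 2.1
centers = np.array([[0.2, 0.4], [0.6, 0.8]])
widths = np.ones(2)
weights = np.array([0.3, 0.5])
y, theta = rbf_forward(np.array([0.2, 0.4]), centers, widths, weights)
```

A sample that coincides with a center produces a kernel output of exactly 1 for that neuron.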
Step 2.2: designing a self-organizing RBF neural network
Step 2.2.1: for the current sample xn(n is not less than 3), calculating sample xnAnd the similarity mu between neurons hnh,μnhIs calculated by the formula
$$\mu_{nh} = \exp\!\left(-\frac{\|x_n - c_h\|^2}{2\sigma_h^2}\right), \quad h = 1, 2, \ldots, H \quad (5)$$

where H is the number of neurons and c_h is the center vector of neuron h.
The similarity vector between sample x_n and the current neurons is recorded as U_n = [μ_n1, μ_n2, …, μ_nH]. The maximum element of U_n identifies the neuron h with the greatest similarity to sample x_n:

$$h = \arg\max_{1 \le j \le H} \mu_{nj} \quad (6)$$
After the maximum-similarity neuron h is determined, its similarity threshold T_h is calculated; T_h is defined as the maximum similarity between neuron h and all other current neurons.
First, the similarity between neuron h and every other neuron is calculated: the similarity between neuron h and neuron h' (h' ∈ H\{h}, i.e. h' is any of the H neurons other than the h-th) is computed by formula (7):

$$\mu_{hh'} = \exp\!\left(-\frac{\|c_h - c_{h'}\|^2}{2\sigma_h^2}\right) \quad (7)$$

where H is the number of neurons, and c_h and c_{h'} are the center vectors of neurons h and h'. The similarity vector of neuron h with all other neurons is recorded as U_h = [μ_h1, μ_h2, …, μ_hh', …, μ_hH] (h' ∈ H\{h}), and the similarity threshold of neuron h is T_h = max(U_h).
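The similarity computation and per-neuron threshold T_h can be sketched as follows; the exact Gaussian-membership expression is an assumption based on the Gaussian kernel of formula (3), and the helper names are invented for the example:

```python
import numpy as np

def gaussian_similarity(a, b, sigma):
    """Gaussian membership used as the similarity measure."""
    return float(np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2)))

def most_similar_neuron(x, centers, widths):
    """Index h of the neuron most similar to sample x, its similarity mu,
    and its threshold T_h (largest similarity to any other neuron)."""
    sims = np.array([gaussian_similarity(x, c, s)
                     for c, s in zip(centers, widths)])
    h = int(np.argmax(sims))
    others = [gaussian_similarity(centers[h], centers[j], widths[h])
              for j in range(len(centers)) if j != h]
    return h, float(sims[h]), max(others)

centers = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.1]])
widths = np.ones(3)
h, mu, Th = most_similar_neuron(np.array([0.05, 0.05]), centers, widths)
```

Here the first and third neurons are close together, so the winner's threshold T_h is high, which is exactly the situation the growth criterion uses to avoid adding redundant neurons.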
Using formula (8), calculate the current root-mean-square error of the network over the samples in the sliding window: when n < L, the window holds the first through n-th samples (n samples in total); when n ≥ L, it holds the (n - L + 1)-th through n-th samples (L samples in total).

$$E = \sqrt{\frac{1}{L}\sum_{i=n-L+1}^{n}\big(d_i - y_i\big)^2} \quad (8)$$

where L is the sliding-window size, with L ∈ [50, 100]; d_i is the desired network output, y_i is the actual output, and i indexes the samples within the window.
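The sliding-window RMSE of formula (8) can be sketched as follows (the function name is invented for the example):

```python
import numpy as np

def window_rmse(desired, actual, n, L=50):
    """RMSE over the sliding window: the last min(n, L) samples up to sample n."""
    start = max(0, n - L)
    d = np.asarray(desired[start:n])
    y = np.asarray(actual[start:n])
    return float(np.sqrt(np.mean((d - y) ** 2)))

d = [1.0] * 100
y = [0.9] * 100
e_full = window_rmse(d, y, n=100, L=50)   # full window of 50 samples
e_part = window_rmse(d, y, n=10, L=50)    # partial window while n < L
```

A constant error of 0.1 gives an RMSE of 0.1 regardless of whether the window is full.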
Designing the network growth criterion: if μ_nh < T_h and E > E_0 (E_0 is taken as 0.05), the current sample is added to the network as a new neuron, with parameters set as in formula (9):

$$c_{H+1} = x_n, \qquad \sigma_{H+1} = \frac{d_{\max}}{\sqrt{2(H+1)}}, \qquad w_{H+1} = \frac{e_n^H}{\sum_{h=1}^{H+1}\theta_h} \quad (9)$$

where c_{H+1}, σ_{H+1} and w_{H+1} are the center vector, width and output weight of the newly added neuron H + 1; d_max is the maximum distance between centers; Σ_{h=1}^{H+1} θ_h is the sum of the outputs of the H + 1 hidden-layer neurons when the input is x_n, with θ_h computed by formula (3); and e_n^H is the network error at x_n before the RBFNN structure change, i.e. the error when the network has H neurons, computed by formula (10):

$$e_n^H = d_n - y_n \quad (10)$$

where y_n is the actual output obtained by the network with H neurons when the input is x_n. Then update H = H + 1.
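A sketch of the growth step follows. The width and output-weight settings here mirror the description of formula (9) in terms of d_max and the pre-change error e_n^H, but since the original formula images are not reproduced, they should be read as assumptions rather than the patent's exact expressions:

```python
import numpy as np

def add_neuron(x, d, centers, widths, weights):
    """Grow the network by one neuron centered on the current sample.

    Assumed settings: sigma from the maximum inter-center distance d_max,
    and a weight chosen so the new neuron absorbs the current error e = d - y.
    """
    dist2 = np.sum((centers - x) ** 2, axis=1)
    theta = np.exp(-dist2 / (2.0 * widths ** 2))
    e = d - float(weights @ theta)                     # e_n^H, formula (10)
    d_max = max(np.linalg.norm(a - b) for a in centers for b in centers)
    H = len(centers) + 1                               # neuron count after growth
    sigma_new = d_max / np.sqrt(2.0 * H)
    theta_new = 1.0                                    # kernel at its own center
    w_new = e / (theta.sum() + theta_new)
    centers = np.vstack([centers, x])
    widths = np.append(widths, sigma_new)
    weights = np.append(weights, w_new)
    return centers, widths, weights

centers = np.array([[0.0, 0.0], [1.0, 1.0]])
widths = np.ones(2)
weights = np.array([0.2, 0.4])
c2, s2, w2 = add_neuron(np.array([0.5, 0.5]), 0.9, centers, widths, weights)
```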
Step 2.2.2: a learning algorithm-an online fixed small batch gradient algorithm of a self-organizing RBFNN-based effluent BOD online soft measurement model is designed. If the increase of the neuron is performed, learning parameters after the structure changes, and defining the loss function of the current network as follows:
$$E(k) = \frac{1}{2L}\sum_{i=n-L+1}^{n}\big(d_i - y_i\big)^2 \quad (11)$$

where d_i is the desired output and y_i is the actual output; the error vector over the window is

$$e(k) = \big[d_{n-L+1} - y_{n-L+1}, \ldots, d_n - y_n\big]^{\mathrm{T}}$$

The fixed batch size L lies in [50, 100], and k is the iteration number.
Training the output weight, neuron center and neuron width parameters uses the following formulas:

$$w_h(k+1) = w_h(k) - \eta_w \Phi_{wh}(k) \quad (12)$$

$$c_h(k+1) = c_h(k) - \eta_c \Phi_{ch}(k) \quad (13)$$

$$\sigma_h(k+1) = \sigma_h(k) - \eta_\sigma \Phi_{\sigma h}(k) \quad (14)$$

$$\Phi_{wh}(k) = -\frac{1}{L}\, e(k)^{\mathrm{T}}\,\theta_h(k) \quad (15)$$

$$\Phi_{ch}(k) = -\frac{w_h(k)}{L\,\sigma_h^2(k)} \sum_{i} e_i(k)\,\theta_h(x_i)\,\big(x_i - c_h(k)\big) \quad (16)$$

$$\Phi_{\sigma h}(k) = -\frac{w_h(k)}{L\,\sigma_h^3(k)} \sum_{i} e_i(k)\,\theta_h(x_i)\,\|x_i - c_h(k)\|^2 \quad (17)$$

where formulas (12), (13) and (14) are the update iteration rules for the output weight, neuron center and neuron width, respectively; w_h(k), c_h(k) and σ_h(k) are the weight, center vector and width of the h-th neuron at the k-th iteration; Φ_wh(k), Φ_ch(k) and Φ_σh(k) are the weight, center and width gradients of the h-th neuron; e(k) is the error vector produced by the network at the k-th iteration on the samples in the current sliding window; x_i denotes a sample within the current window; θ_h(k) is the output vector of the h-th neuron at the k-th iteration over the window samples; and η_w, η_c and η_σ are the learning rates of the output weight, center and width, each in the range [0.01, 0.05].
During online learning of the network parameters, the sliding-window size is set to 50; training stops when the training error reaches the expected error value of 0.05 or the maximum number of training iterations, 200, is reached.
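One iteration of the mini-batch gradient learning over the window samples might look like the sketch below. The gradient expressions are the standard derivatives of a half-mean-squared window loss and are stated as assumptions where the patent's formula images are not reproduced; all names and toy data are invented for the example:

```python
import numpy as np

def rbf_predict(X, centers, widths, weights):
    """Batch forward pass: Gaussian hidden layer, linear output."""
    dist2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-dist2 / (2.0 * widths ** 2)) @ weights

def batch_gradient_step(X, d, centers, widths, weights,
                        eta_w=0.02, eta_c=0.02, eta_s=0.02):
    """One update of w_h, c_h, sigma_h by gradient descent on the
    half-mean-squared window loss (learning rates within [0.01, 0.05])."""
    L = len(X)
    diff = X[:, None, :] - centers[None, :, :]       # (L, H, P)
    dist2 = (diff ** 2).sum(-1)                      # (L, H)
    theta = np.exp(-dist2 / (2.0 * widths ** 2))     # hidden outputs
    e = d - theta @ weights                          # window error vector
    coef = e[:, None] * theta * weights[None, :]     # shared factor (L, H)
    grad_w = -(e @ theta) / L
    grad_c = -(coef[:, :, None] * diff).sum(0) / (L * widths[:, None] ** 2)
    grad_s = -(coef * dist2).sum(0) / (L * widths ** 3)
    return (weights - eta_w * grad_w,
            centers - eta_c * grad_c,
            widths - eta_s * grad_s)

rng = np.random.default_rng(0)
Xw = rng.random((50, 2))                 # 50 window samples, 2 features
d = Xw.sum(axis=1)                       # toy target
centers, widths, weights = Xw[:3].copy(), np.ones(3), np.zeros(3)
before = float(np.sqrt(np.mean((d - rbf_predict(Xw, centers, widths, weights)) ** 2)))
for _ in range(20):
    weights, centers, widths = batch_gradient_step(Xw, d, centers, widths, weights)
after = float(np.sqrt(np.mean((d - rbf_predict(Xw, centers, widths, weights)) ** 2)))
```

With a small learning rate the window RMSE decreases over the iterations, which is the "fast and stable convergence" behaviour the method aims for.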
Step 2.2.3: if H is>Calculating the phase between the neuronsSimilarity, the obtained similarity matrix is expressed as U ═ U1,U2,…,UH]Here U1Representing a similarity matrix, U, between the first neuron and all remaining neurons1=[μ1213,…,μ1h,…,μ1H],μ1hThe similarity between the first neuron and the h-th neuron is expressed and calculated by formula (7). Each element in the matrix U represents the similarity between two neurons, assuming μp1p2Max (u)), neuron p1And neuron p2Degree of similarity mu ofp1p2For the maximum value of the similarity between all neurons, neuron p is determined according to the following formula1And neuron p2
Figure BDA0003461671860000052
Figure BDA0003461671860000053
If μp1p2>T0(T0Is the combined threshold, takes the value of 0.8), i.e. the neuron p1And p2The similarity between the two is greater than a preset merging threshold value T0Then neuron p is replaced1And p2And (6) merging. The parameters of the new neuron after merging are set as:
Figure BDA0003461671860000054
wherein c isp1,cp2p1p2,wp1,wp2Respectively, merge pre-neuron p1And p2Center, width, and output weight of; c. Cp,σp,wpRespectively the central vector, width and output weight of the new neuron p after merging;
Figure BDA0003461671860000055
respectively representing the sample in the sliding window in the neuron p1,p2And the sum of the outputs on the new neuron p after merging, update H-1.
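A sketch of the merging step follows. The merged center and width are taken as pair averages and the merged weight preserves the pair's summed response over the window; since the original formula images are not reproduced, these settings are assumptions consistent with the text's description, and all names are invented for the example:

```python
import numpy as np

def merge_neurons(Xw, p1, p2, centers, widths, weights):
    """Merge the two most similar neurons p1, p2 into one neuron p."""
    def response(c, s):
        # sum of the neuron's Gaussian outputs over the window samples
        return np.exp(-((Xw - c) ** 2).sum(-1) / (2.0 * s ** 2)).sum()
    c_new = (centers[p1] + centers[p2]) / 2.0
    s_new = (widths[p1] + widths[p2]) / 2.0
    w_new = (weights[p1] * response(centers[p1], widths[p1])
             + weights[p2] * response(centers[p2], widths[p2])) \
            / response(c_new, s_new)
    keep = [i for i in range(len(centers)) if i not in (p1, p2)]
    centers = np.vstack([centers[keep], c_new])
    widths = np.append(widths[keep], s_new)
    weights = np.append(weights[keep], w_new)
    return centers, widths, weights

Xw = np.random.default_rng(1).random((10, 2))       # window samples
centers = np.array([[0.2, 0.2], [0.25, 0.2], [0.9, 0.9]])
c2, s2, w2 = merge_neurons(Xw, 0, 1, centers, np.ones(3),
                           np.array([0.3, 0.4, 0.1]))
```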
Step 2.2.4: if neuron merging is performed, learning the parameters of the network after the structure change by using the learning algorithm designed in step 2.2.2;
step 2.2.5: every age_max time steps (age_max is taken as 40), examine the lifetime 'age' and the similarity capability 'P' of each neuron. The lifetime of a neuron is defined as the number of samples it has experienced since its creation: each neuron's age is 0 when it is generated, and increases by 1 for every new sample input thereafter. The similarity capability of a neuron counts how many times the similarity between the neuron and a sample exceeds the neuron's own similarity threshold, i.e. P is incremented by 1 whenever μ_nh > T_h (h = 1, 2, …, H). When a neuron's age reaches age_max and its P is 2 or less, the neuron is regarded as a noise neuron and is deleted; update H = H - 1.
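The age/similarity-capability pruning rule can be sketched as follows (the function name and bookkeeping arrays are invented for the example):

```python
import numpy as np

def prune_noise_neurons(age, P, centers, widths, weights,
                        age_max=40, P_min=2):
    """Delete neurons whose age reached age_max while P is at most P_min."""
    keep = [h for h in range(len(centers))
            if not (age[h] >= age_max and P[h] <= P_min)]
    return (centers[keep], widths[keep], weights[keep],
            [age[h] for h in keep], [P[h] for h in keep])

centers = np.arange(6.0).reshape(3, 2)
c2, s2, w2, age2, P2 = prune_noise_neurons(
    age=[40, 10, 41], P=[1, 0, 30],
    centers=centers, widths=np.ones(3), weights=np.zeros(3))
```

Only the first neuron is old and rarely activated, so only it is removed; the young neuron is spared even though its P is still low.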
Step 2.2.6: if the noise neuron deletion is executed, learning by using the parameters of the learning algorithm read network designed in the step 2.2.2 after the structure changes, returning to the step 2.2.1, and learning the next sample;
step 2.2.7: and stopping when the last sample is completely learned.
Step 3: effluent BOD prediction
The test sample data are taken as the input of the self-organizing RBF neural network; the network output is obtained and inverse-normalized to give the effluent BOD concentration.
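The inverse normalization of step 3 simply undoes formula (2); the min/max values below are illustrative, not from the patent's data:

```python
import numpy as np

def denormalize(y_norm, y_min, y_max):
    """Invert the min-max normalization to recover effluent BOD in mg/L."""
    return y_norm * (y_max - y_min) + y_min

# illustrative bounds: training-set effluent BOD between 10 and 30 mg/L
bod = denormalize(np.array([0.0, 0.5, 1.0]), y_min=10.0, y_max=30.0)
```

The bounds must come from the training data used in step 1, otherwise the predicted concentrations are rescaled incorrectly.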
3. Compared with the prior art, the invention has the following obvious advantages:
(1) Aiming at the problems of effluent BOD measurement in the current sewage treatment process, the invention proposes a self-organizing RBF neural network model that realizes accurate prediction of the effluent BOD concentration, with the characteristics of low cost and high efficiency.
(2) The invention introduces Gaussian membership into the structural self-organization mechanism of the RBF neural network for the first time, designs a self-organizing RBF neural network model that is insensitive to parameter settings, and obtains a compact network structure. An online mini-batch gradient learning algorithm is designed to learn the network parameters, achieving fast and stable convergence.
Drawings
FIG. 1 is a diagram of the internal architecture of the neural network of the present invention;
FIG. 2 is a graph of the Root Mean Square Error (RMSE) variation for the effluent BOD concentration prediction method of the present invention;
FIG. 3 is a graph of the BOD concentration prediction of the effluent of the present invention;
FIG. 4 is a graph of the BOD concentration prediction error of the effluent of the present invention.
Detailed Description
The invention designs an effluent BOD online soft-measurement method based on a self-organizing RBFNN, realizes prediction of the BOD concentration at future moments, solves the problem that the effluent BOD concentration is difficult to measure in real time during sewage treatment, and improves the monitoring of future water quality in the sewage treatment process.
The experimental data come from the water-quality analysis records of a sewage treatment plant in Beijing, comprising 365 groups of data. The input features are the influent pH, effluent pH, influent suspended-solids concentration, effluent suspended-solids concentration, influent BOD concentration, influent chemical oxygen demand (COD) concentration, effluent COD concentration, sludge settling velocity (SV), suspended matter concentration (MLSS) and dissolved oxygen concentration (DO); the output is the effluent BOD concentration. The first 265 groups are selected as training samples and the last 100 groups as test samples.
Steps 1 to 3 are then carried out exactly as described above.
In this embodiment, the training RMSE of the effluent BOD concentration prediction method is shown in FIG. 2, where the X axis is the number of training samples and the Y axis is the training RMSE; the effluent BOD concentration prediction result is shown in FIG. 3, where the X axis is the number of test samples and the Y axis is the effluent BOD concentration in mg/L, the solid line being the desired effluent BOD concentration and the dotted line the actual network output; the effluent BOD concentration prediction error is shown in FIG. 4, where the X axis is the number of test samples and the Y axis is the effluent BOD concentration prediction error in mg/L.

Claims (1)

1. A self-organizing RBFNN-based effluent BOD online soft measurement method is characterized by comprising the following steps:
step 1: effluent BOD data preprocessing
Through mechanism analysis of effluent BOD in the sewage treatment process, 10 variables in total are selected as the raw feature set of effluent BOD, i.e. the input variables: influent pH, effluent pH, influent suspended-solids concentration, effluent suspended-solids concentration, influent BOD concentration, influent chemical oxygen demand (COD) concentration, effluent COD concentration, sludge settling velocity (SV), suspended matter concentration (MLSS) and dissolved oxygen concentration (DO); the input is recorded as X = {x_np}, p = 1, 2, …, P, n = 1, 2, …, N, where P = 10 and N is the number of samples, and x_np denotes the p-th feature of the n-th sample; the effluent BOD concentration is selected as the output variable, recorded as Y = {y_n}, n = 1, 2, …, N, where y_n is the n-th output sample; the obtained multi-feature data set is normalized so that all data indexes are of the same order of magnitude, eliminating errors introduced into the soft-measurement model by large differences in data magnitude;
carrying out normalization processing on an input variable X and an output variable Y according to the following formula:
x_np = (x_np − min_p) / (max_p − min_p)   (1)

y_n = (y_n − y_min) / (y_max − y_min)   (2)

where min_p and max_p are the minimum and maximum of the p-th feature over all samples, and y_min and y_max are the minimum and maximum of the output variable;
X and Y represent the data after normalization processing, with value range [0,1];
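For illustration, the min-max normalization of formulas (1)-(2) can be sketched as follows (a minimal sketch; the function name and sample values are illustrative, not part of the claim):

```python
import numpy as np

def min_max_normalize(data):
    # Feature-wise min-max scaling to [0, 1]: x' = (x - min) / (max - min)
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    return (data - lo) / (hi - lo)

# Two features over three samples (illustrative values only)
X = np.array([[2.0, 10.0],
              [4.0, 30.0],
              [6.0, 20.0]])
Xn = min_max_normalize(X)
```

Each column is scaled independently, so features with very different magnitudes end up on the same [0, 1] scale.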
step 2: designing a self-organizing RBFNN-based effluent BOD online soft measurement model structure;
step 2.1: design the topological structure of the RBF neural network; the RBF network is a three-layer feed-forward neural network consisting of an input layer, a hidden layer and an output layer; the input layer feeds samples into the network and contains 10 neurons; the hidden layer initially contains 2 neurons, and the number of hidden-layer neurons is denoted by H; the neuron kernel function is a Gaussian function, and the output of the h-th hidden neuron is:
θ_h = exp(−‖x_n − c_h‖² / (2σ_h²))   (3)
wherein c_h and σ_h respectively represent the center vector and width of the h-th hidden-layer neuron, c_1 = x_1, c_2 = x_2, σ_1 = 1, σ_2 = 1; x_n denotes the n-th sample (n ≥ 3); ‖x_n − c_h‖ represents the Euclidean distance between x_n and c_h, calculated as

‖x_n − c_h‖ = ( Σ_{p=1}^{P} (x_np − c_hp)² )^{1/2}   (4)
The output layer has a single neuron whose output is the linear weighting of the hidden-layer neuron outputs:

y_n = Σ_{h=1}^{H} w_h θ_h   (5)
in the formula, w_h is the connection weight between the h-th hidden-layer neuron and the output neuron, w_1 = y_1, w_2 = y_2;
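The forward pass of formulas (3)-(5) can be sketched as follows (initial parameters follow the text, c_1 = x_1, c_2 = x_2, σ = 1; the sample values are hypothetical):

```python
import numpy as np

def rbf_forward(x, centers, widths, weights):
    # Hidden-layer Gaussian outputs, formula (3):
    # theta_h = exp(-||x - c_h||^2 / (2 * sigma_h^2))
    d2 = np.sum((centers - x) ** 2, axis=1)      # squared Euclidean distances, formula (4)
    theta = np.exp(-d2 / (2.0 * widths ** 2))
    # Output neuron: linear weighting of hidden outputs, formula (5)
    return float(weights @ theta), theta

centers = np.array([[0.0, 0.0], [1.0, 1.0]])    # c_1 = x_1, c_2 = x_2
widths = np.array([1.0, 1.0])                   # sigma_1 = sigma_2 = 1
weights = np.array([0.5, 0.5])                  # w_1 = y_1, w_2 = y_2
y, theta = rbf_forward(np.array([0.0, 0.0]), centers, widths, weights)
```

A sample lying exactly on a center activates that neuron fully (θ = 1) and more distant centers contribute exponentially less.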
Step 2.2: designing a self-organizing RBF neural network
Step 2.2.1: for the current sample x_n (n ≥ 3), calculate the similarity μ_nh between sample x_n and each neuron h:

μ_nh = exp(−‖x_n − c_h‖² / (2σ_h²)),  h = 1,2,…,H   (6)

where H is the number of neurons and c_h is the center vector of neuron h;
the similarity vector between sample x_n and the current neurons is recorded as U_n = [μ_n1, μ_n2, …, μ_nH] (1 ≤ h ≤ H); according to the maximum element of U_n, the neuron h with the greatest similarity to sample x_n is found:

h = arg max_{1≤h≤H} (μ_nh)
after the maximum-similarity neuron h is determined, its similarity threshold T_h is calculated; T_h is defined as the maximum similarity between neuron h and all other current neurons; first the similarity between neuron h and every other neuron is calculated: the similarity between neuron h and neuron h' (h' ∈ H\{h}, i.e. h' is any of the H neurons other than the h-th) is calculated using formula (7);
μ_hh' = exp(−‖c_h − c_h'‖² / (2σ_h²)),  h' ∈ H\{h}   (7)

where H is the number of neurons, c_h is the center vector of neuron h and c_h' is the center vector of neuron h';
the similarity vector of neuron h with all other neurons is recorded as U_h = [μ_h1, μ_h2, …, μ_hH] (h' ∈ H\{h}); the similarity threshold of neuron h is T_h = max(U_h);
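Assuming the Gaussian form for the similarities of formulas (6)-(7) (one plausible reading; the patent's exact expressions are not reproduced here), the selection of the most similar neuron and its threshold can be sketched as:

```python
import numpy as np

def gaussian_similarity(a, b, sigma):
    # Assumed Gaussian similarity between two points (sketch of formulas (6)-(7))
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2))

def max_similarity_and_threshold(x, centers, widths):
    # U_n: sample-to-neuron similarities; h_star is the most similar neuron
    U_n = np.array([gaussian_similarity(x, c, s) for c, s in zip(centers, widths)])
    h_star = int(np.argmax(U_n))
    # Threshold T_h: largest similarity between h_star and any other neuron
    others = [gaussian_similarity(centers[h_star], centers[h], widths[h])
              for h in range(len(centers)) if h != h_star]
    return h_star, U_n[h_star], max(others)

centers = np.array([[0.0], [1.0], [5.0]])
widths = np.array([1.0, 1.0, 1.0])
h, mu, T = max_similarity_and_threshold(np.array([0.9]), centers, widths)
```

Here a sample near the second center selects that neuron, and its threshold comes from the closest of the remaining centers.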
Calculate the current root mean square error of the network from the samples in the sliding window using formula (8): when n < L, the sliding window holds the first to the n-th sample (n samples in total); when n ≥ L, it holds the (n−L+1)-th to the n-th sample (L samples in total);
E = ( (1/L) Σ_{i=n−L+1}^{n} (d_i − y_i)² )^{1/2}   (8)

(when n < L, the sum runs over the first n samples and L is replaced by n) where L is the size of the sliding window, with value range [50,100]; d_i is the desired output of the network, y_i is the actual output, and i indexes the samples within the sliding window;
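The sliding-window RMSE of formula (8) can be sketched as follows (0-based Python indexing; names and data are illustrative):

```python
import numpy as np

def window_rmse(desired, actual, n, L):
    # Formula (8): RMSE over the sliding window ending at sample n;
    # the window holds the first n samples when n < L, else the last L samples.
    start = 0 if n < L else n - L
    d = np.asarray(desired[start:n])
    y = np.asarray(actual[start:n])
    return float(np.sqrt(np.mean((d - y) ** 2)))

E = window_rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0], n=4, L=2)
```

With L = 2 only the last two samples enter the error, so older samples no longer influence the structure-adjustment decisions.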
designing a network addition criterion: if μnh<ThAnd E is>E0,E0The value is 0.05; adding the current sample as a new neuron into the network, and setting parameters as shown in formula (9):
Figure FDA0003461671850000032
wherein c isH+1,σH+1,wH+1Respectively, the center vector, width and output weight of the newly added neuron H +1, dmaxIs the maximum distance between centers;
Figure FDA0003461671850000033
is x before RBFNN structure changenThe error of the network, namely the error of the network with H neurons, is calculated by using a formula (10);
Figure FDA0003461671850000034
is x at the input after neuron additionnSum of outputs of H +1 neurons of the hidden layer of the time network, θhCalculating by using a formula (3); updating H-H + 1;
Figure FDA0003461671850000035
where y isnIs input as xnUnder the condition of (1), the network obtains actual output when H neurons exist;
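The neuron-addition step can be sketched as follows (a sketch under assumptions: the width formula d_max/√(2(H+1)) and the error-compensating weight are illustrative choices, since the patent's exact expressions are given only as image formulas):

```python
import numpy as np

def add_neuron(x_n, e_H, centers, widths, weights):
    # Grow the hidden layer: new centre = current sample; width derived from
    # d_max (assumed d_max / sqrt(2(H+1))); new weight compensates the
    # pre-change network error e_H, spread over the H+1 hidden outputs.
    centers = np.vstack([centers, x_n])
    H1 = len(centers)
    # d_max: maximum pairwise distance between centres
    d_max = max(np.linalg.norm(centers[i] - centers[j])
                for i in range(H1) for j in range(i + 1, H1))
    widths = np.append(widths, d_max / np.sqrt(2.0 * H1))
    # Sum of the H+1 hidden outputs at x_n (Gaussian kernel, formula (3))
    d2 = np.sum((centers - x_n) ** 2, axis=1)
    theta = np.exp(-d2 / (2.0 * widths ** 2))
    weights = np.append(weights, e_H / np.sum(theta))
    return centers, widths, weights

c, s, w = add_neuron(np.array([2.0]), 0.3,
                     np.array([[0.0], [1.0]]), np.array([1.0, 1.0]),
                     np.array([0.2, 0.4]))
```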
step 2.2.2: design the learning algorithm of the self-organizing-RBFNN-based effluent BOD online soft measurement model, an online fixed small-batch gradient algorithm; if neuron addition was performed, the parameters are learned after the structure change; the loss function of the current network is defined as:

E(k) = (1/2) Σ_i (d_i − y_i)²   (11)

where d_i is the desired output and y_i = Σ_{h=1}^{H} w_h θ_h(x_i) is the actual output; the sum runs over the samples of the current fixed batch, whose size L has value range [50,100]; k is the number of iterations;
training the output weight, neuron center and neuron width parameters by using the following formula:
w_h(k+1) = w_h(k) − η_w Φ_wh(k)   (12)

c_h(k+1) = c_h(k) − η_c Φ_ch(k)   (13)

σ_h(k+1) = σ_h(k) − η_σ Φ_σh(k)   (14)
Φ_wh(k) = −Σ_i e_i(k) θ_h(x_i)   (15)

Φ_ch(k) = −Σ_i e_i(k) w_h(k) θ_h(x_i) (x_i − c_h(k)) / σ_h²(k)   (16)

Φ_σh(k) = −Σ_i e_i(k) w_h(k) θ_h(x_i) ‖x_i − c_h(k)‖² / σ_h³(k)   (17)

where formula (12), formula (13) and formula (14) are respectively the update iteration rules of the output weight, neuron center and neuron width; w_h(k), c_h(k) and σ_h(k) respectively represent the weight, center vector and width of the h-th neuron at the k-th iteration; Φ_wh(k), Φ_ch(k) and Φ_σh(k) represent the weight gradient, center gradient and width gradient of the h-th neuron; e_i(k) is the error of the network at the k-th iteration on the i-th sample of the current sliding window; x_i denotes a sample within the current sliding window; θ_h(x_i) is the output of the h-th neuron for that sample at the k-th iteration, calculated by formula (3); η_w, η_c and η_σ are the learning rates of the output weight, center and width respectively, with value range [0.01, 0.05];
During online learning of the network parameters, the current fixed batch size is set to 50; training stops when the training error reaches the expected error value 0.05 or the maximum number of training iterations, 200, is reached;
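One iteration of the fixed-batch parameter updates of formulas (12)-(14) can be sketched as follows (a sketch, with gradients derived from a squared-error loss over the batch; not the patent's exact algorithm):

```python
import numpy as np

def gradient_step(X, d, centers, widths, weights, eta=0.02):
    # One fixed-batch gradient iteration: descend the squared-error loss
    # with respect to weights w_h, centers c_h and widths sigma_h.
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)  # (L, H)
    theta = np.exp(-d2 / (2.0 * widths ** 2))                        # hidden outputs
    e = d - theta @ weights                                          # per-sample errors
    grad_w = -theta.T @ e                                            # dJ/dw_h
    diff = X[:, None, :] - centers[None, :, :]                       # (L, H, P)
    common = (e[:, None] * theta) * weights[None, :]                 # e_i * theta_ih * w_h
    grad_c = -np.einsum('lh,lhp->hp', common, diff) / widths[:, None] ** 2
    grad_s = -np.sum(common * d2, axis=0) / widths ** 3
    return (weights - eta * grad_w, centers - eta * grad_c, widths - eta * grad_s)

X = np.array([[0.0], [1.0]])
d = np.array([0.2, 0.8])
centers = np.array([[0.0], [1.0]])
widths = np.array([1.0, 1.0])
weights = np.array([0.0, 0.0])
w2, c2, s2 = gradient_step(X, d, centers, widths, weights)
```

Starting from zero weights, the center and width gradients vanish (they are proportional to w_h), so the first step only moves the output weights toward the targets.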
step 2.2.3: if H > 3, calculate the similarity between neurons; the resulting similarity matrix is expressed as U = [U_1, U_2, …, U_H], where U_1 = [μ_12, μ_13, …, μ_1h, …, μ_1H] represents the similarity vector between the first neuron and all remaining neurons, and μ_1h, the similarity between the first neuron and the h-th neuron, is calculated by formula (7); each element in U represents the similarity between two neurons; assuming μ_{p1p2} = max(U), i.e. the similarity between neuron p_1 and neuron p_2 is the maximum similarity among all neurons, p_1 and p_2 are determined according to the following formulas:

p_1 = arg max_h ( max_{h'} μ_hh' )   (18)

p_2 = arg max_{h'} ( μ_{p1 h'} )   (19)
If μ_{p1p2} > T_0, where T_0 is the merging threshold with value 0.8, i.e. the similarity between neurons p_1 and p_2 exceeds the preset merging threshold T_0, neurons p_1 and p_2 are merged; the parameters of the new neuron after merging are set as:

c_p = (c_{p1} + c_{p2}) / 2,  σ_p = (σ_{p1} + σ_{p2}) / 2,  w_p = (w_{p1} Θ_{p1} + w_{p2} Θ_{p2}) / Θ_p   (20)

where c_{p1}, c_{p2}, σ_{p1}, σ_{p2}, w_{p1}, w_{p2} are respectively the centers, widths and output weights of neurons p_1 and p_2 before merging; c_p, σ_p and w_p are respectively the center vector, width and output weight of the new neuron p after merging; Θ_{p1}, Θ_{p2} and Θ_p respectively represent the sums of the outputs of the samples in the sliding window on neuron p_1, neuron p_2 and the new neuron p after merging; update H = H−1;
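The merging step can be sketched as follows (a sketch under assumptions: averaged center and width, and a weight chosen from the window-output sums Θ so that the merged neuron roughly preserves the pair's contribution; the patent's exact merge formulas are given only as image formulas):

```python
import numpy as np

def merge_neurons(p1, p2, Xw, centers, widths, weights):
    # Merge the two most-similar neurons p1, p2 into one neuron p.
    theta = lambda c, s: np.exp(-np.sum((Xw - c) ** 2, axis=1) / (2.0 * s ** 2))
    c_p = (centers[p1] + centers[p2]) / 2.0
    s_p = (widths[p1] + widths[p2]) / 2.0
    # Window-output sums Theta_p1, Theta_p2, Theta_p
    T1 = theta(centers[p1], widths[p1]).sum()
    T2 = theta(centers[p2], widths[p2]).sum()
    w_p = (weights[p1] * T1 + weights[p2] * T2) / theta(c_p, s_p).sum()
    keep = [h for h in range(len(centers)) if h not in (p1, p2)]
    return (np.vstack([centers[keep], c_p]),
            np.append(widths[keep], s_p),
            np.append(weights[keep], w_p))

Xw = np.array([[0.0], [1.0]])            # samples in the sliding window
c, s, w = merge_neurons(0, 1, Xw,
                        np.array([[0.0], [0.0], [3.0]]),
                        np.array([1.0, 1.0, 1.0]),
                        np.array([0.3, 0.5, 0.1]))
```

For two identical neurons, the window-output sums cancel and the merged weight is simply w_p1 + w_p2, so the network output on the window is unchanged.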
step 2.2.4: if neuron merging was performed, learn the parameters of the network after the structure change using the learning algorithm designed in step 2.2.2;
step 2.2.5: every age_max time steps, examine the lifetime 'age' and the similarity capability 'P' of each neuron; age_max is 40; the lifetime of a neuron is defined as the number of samples it has seen since its creation: age is 0 when the neuron is created and increases by 1 with each new input sample; the similarity capability of a neuron is defined as the number of times the similarity between the neuron and a sample exceeds the neuron's own similarity threshold, i.e. P increases by 1 whenever μ_nh > T_h (h = 1,2,…,H); when age reaches the maximum value age_max and P ≤ 2, the neuron is considered a noise neuron and is deleted, updating H = H−1;
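The noise-neuron pruning rule above can be sketched as (function and argument names are illustrative):

```python
def prune_noise_neurons(ages, P_counts, age_max=40, P_min=2):
    # Keep a neuron unless it has reached age_max while its similarity
    # capability P stayed at or below P_min (such neurons are treated as noise).
    return [h for h in range(len(ages))
            if not (ages[h] >= age_max and P_counts[h] <= P_min)]

kept = prune_noise_neurons([40, 40, 12], [1, 7, 0])
```

The first neuron is old and rarely matched any sample, so it is pruned; the young third neuron is given more time regardless of its count.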
step 2.2.6: if noise-neuron deletion was performed, learn the parameters of the network after the structure change using the learning algorithm designed in step 2.2.2; then return to step 2.2.1 to learn the next sample;
step 2.2.7: stop when the last sample has been learned;
and 3, step 3: effluent BOD prediction
And taking the test sample data as the input of the self-organizing RBF neural network to obtain the output of the self-organizing RBF neural network, and performing reverse normalization on the output to obtain the BOD concentration of the effluent.
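The final denormalization simply inverts the min-max scaling of step 1 (the BOD range used here is hypothetical):

```python
import numpy as np

def denormalize(y_norm, y_min, y_max):
    # Invert the min-max normalization of step 1 to recover mg/L values.
    return y_norm * (y_max - y_min) + y_min

# Hypothetical effluent BOD range of the training data, in mg/L
bod = denormalize(np.array([0.0, 0.5, 1.0]), y_min=2.0, y_max=12.0)
```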
CN202210018901.8A 2022-01-10 2022-01-10 Effluent BOD online soft measurement method based on self-organizing RBFNN Pending CN114462208A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210018901.8A CN114462208A (en) 2022-01-10 2022-01-10 Effluent BOD online soft measurement method based on self-organizing RBFNN


Publications (1)

Publication Number Publication Date
CN114462208A true CN114462208A (en) 2022-05-10

Family

ID=81408740


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116029589A (en) * 2022-12-14 2023-04-28 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF
CN116029589B (en) * 2022-12-14 2023-08-22 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF

Similar Documents

Publication Publication Date Title
CN108898215B (en) Intelligent sludge bulking identification method based on two-type fuzzy neural network
CN111291937A (en) Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
CN109060001B (en) Multi-working-condition process soft measurement modeling method based on feature transfer learning
CN107358021B (en) DO prediction model establishment method based on BP neural network optimization
US11346831B2 (en) Intelligent detection method for biochemical oxygen demand based on a self-organizing recurrent RBF neural network
Qiao et al. A self-organizing deep belief network for nonlinear system modeling
CN112884056A (en) Optimized LSTM neural network-based sewage quality prediction method
CN110824915B (en) GA-DBN network-based intelligent monitoring method and system for wastewater treatment
CN102854296A (en) Sewage-disposal soft measurement method on basis of integrated neural network
CN114037163A (en) Sewage treatment effluent quality early warning method based on dynamic weight PSO (particle swarm optimization) optimization BP (Back propagation) neural network
CN112949894A (en) Effluent BOD prediction method based on simplified long-term and short-term memory neural network
CN109978024B (en) Effluent BOD prediction method based on interconnected modular neural network
CN114462208A (en) Effluent BOD online soft measurement method based on self-organizing RBFNN
CN111125907A (en) Sewage treatment ammonia nitrogen soft measurement method based on hybrid intelligent model
CN109408896B (en) Multi-element intelligent real-time monitoring method for anaerobic sewage treatment gas production
CN113743008A (en) Fuel cell health prediction method and system
CN110542748B (en) Knowledge-based robust effluent ammonia nitrogen soft measurement method
Miao et al. A hybrid neural network and genetic algorithm model for predicting dissolved oxygen in an aquaculture pond
CN111863153A (en) Method for predicting total amount of suspended solids in wastewater based on data mining
CN114861543A (en) Data-driven intelligent evaluation method for biodegradability of petrochemical sewage
CN116306803A (en) Method for predicting BOD concentration of outlet water of ILSTM (biological information collection flow) neural network based on WSFA-AFE
Kang et al. Research on forecasting method for effluent ammonia nitrogen concentration based on GRA-TCN
Wang A neural network algorithm based assessment for marine ecological environment
CN113222324A (en) Sewage quality monitoring method based on PLS-PSO-RBF neural network model
Meng et al. A Self-Organizing Modular Neural Network for Nonlinear System Modeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination