CN108664706B - Semi-supervised Bayesian Gaussian mixture model-based online estimation method for oxygen content of one-stage furnace in ammonia synthesis process - Google Patents

Publication number
CN108664706B
Authority
CN
China
Prior art keywords
oxygen content
model
distribution
parameters
posterior distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810338582.2A
Other languages
Chinese (zh)
Other versions
CN108664706A (en)
Inventor
邵伟明
宋执环
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201810338582.2A
Publication of CN108664706A
Application granted
Publication of CN108664706B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention discloses a method for online estimation of the oxygen content of the first-stage furnace in an ammonia synthesis process, based on a semi-supervised Bayesian Gaussian mixture model. A new fully Bayesian model structure is first designed, which makes it possible to treat all model parameters probabilistically and to perform semi-supervised regression learning. Then, under a variational inference framework, the information in labeled and unlabeled samples is mined simultaneously, and the learning steps for the model parameters are established. The method provides estimates of the oxygen content of the first-stage furnace online and in real time. Applying the method effectively reduces the influence of overfitting, which improves the estimation accuracy and provides technical support for reducing production cost, enhancing the stability of process operation, process monitoring, and decision making.

Description

Semi-supervised Bayesian Gaussian mixture model-based online estimation method for oxygen content of one-stage furnace in ammonia synthesis process
Technical Field
The invention belongs to the field of chemical process soft measurement modeling and application, and particularly relates to an online estimation method for the oxygen content of a first-stage furnace in an ammonia synthesis process based on a semi-supervised Bayesian Gaussian mixture model.
Background
Ammonia is a very important basic chemical product whose output ranks highest among chemical products. Industrially, it is used in large quantities to produce urea, soda ash, ammonium nitrogen fertilizers, and nitric acid, and to prepare organic synthetic products such as fibers, plastics, and dyes. The raw materials for ammonia synthesis are nitrogen, which is available in large quantities from air, and hydrogen, which must be produced by dedicated hydrogen production equipment. In most ammonia synthesis processes, the primary reformer (abbreviated as the first-stage furnace) is the main equipment for producing hydrogen, in which the following chemical reactions take place (with nickel as the catalyst):
Figure BDA0001629961810000011
Figure BDA0001629961810000012
Figure BDA0001629961810000013
These reactions are endothermic, so heat must be supplied to the first-stage furnace. The reaction temperature is therefore an important factor in keeping the hydrogen production reaction stable. The conventional way to heat a first-stage furnace is to burn fuel gas and recovered flue gas in the radiant section. To maintain the reaction temperature set by the process, the oxygen content in the first-stage furnace must be controlled within a specified range. The oxygen content (in mole percent, mol%) can be measured by a mass analyzer. However, the mass analyzer is expensive, has a long measurement period, and is prone to failure. If the oxygen content measurement is lost, the closed-loop controller cannot work, which may cause a series of adverse consequences, such as environmental pollution and higher costs due to increased rejection rates and energy consumption, and may even create safety hazards.
A data-driven soft measurement model of the oxygen content can estimate the oxygen content online in real time and thus compensate for the shortcomings of the mass analyzer. In the offline stage, a mathematical model is established from the dependence between the oxygen content and process variables that are easy to measure (such as temperature, pressure, flow, and liquid level, also called auxiliary variables); the oxygen content is then estimated online with this model. The approach has no measurement lag, is low in cost, is widely applicable, and is easy to maintain. However, because the combustion process of the first-stage furnace is very complex, the operating conditions switch frequently, and the production data are uncertain, multimodal, and strongly nonlinear, traditional soft measurement models (such as principal component analysis, partial least squares, neural network, and support vector machine models) struggle to reach satisfactory estimation accuracy. In addition, because the measurement period of the mass analyzer is long, the number of labeled samples (i.e., samples with known oxygen content) is small, so traditional supervised modeling methods have difficulty obtaining accurate model parameters because of overfitting or underfitting. A poorly trained soft measurement model of the oxygen content cannot provide satisfactory estimation accuracy, and manual parameter tuning is time-consuming, laborious, and difficult.
Therefore, it is both necessary and urgent to develop a soft measurement model of the oxygen content that can simultaneously handle the complex uncertainty, strong nonlinearity, multimodality, and scarcity of labeled samples in the first-stage furnace process. Such a model helps improve the estimation accuracy of the oxygen content and thereby assists ammonia synthesis enterprises in achieving safe production, energy saving, environmental protection, cost reduction, and efficiency improvement.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention provides a method for online estimation of the oxygen content of the first-stage furnace in an ammonia synthesis process based on a semi-supervised Bayesian Gaussian mixture model. A probabilistic mathematical model between the oxygen content and the auxiliary variables is established in the form of a Bayesian Gaussian mixture model, and the contributions of the mixture components are assigned adaptively as the operating conditions switch, which effectively handles uncertainty, nonlinearity, and multimodality. Through semi-supervised learning, labeled samples and unlabeled samples (i.e., samples whose oxygen content is unknown and for which only the auxiliary variables are known) are used simultaneously, which addresses the low estimation accuracy caused by the scarcity of labeled samples. The specific technical scheme is as follows:
A method for online estimation of the oxygen content of the first-stage furnace in an ammonia synthesis process based on a semi-supervised Bayesian Gaussian mixture model, characterized by comprising the following steps:
(1) selecting an auxiliary variable associated with the oxygen content y of the primary furnace
Figure BDA0001629961810000021
Wherein d represents the number of auxiliary variables;
(2) collecting labeled sample sets containing both auxiliary variables and oxygen content
Figure BDA0001629961810000022
and an unlabeled sample set containing only auxiliary variables
Figure BDA0001629961810000023
where $n_l$ and $n_u$ denote the numbers of labeled and unlabeled samples, respectively;
(3) applying dimensionless scaling to $(X_l, Y_l)$ and $X_u$, so that the sample variances of the auxiliary-variable samples and of the oxygen content samples become unit variance;
(4) given the truncation level $M$ of the Dirichlet process, initializing, for the model parameters
Figure BDA0001629961810000024
the conjugate prior distribution parameters $a_0, b_0, c_0, d_0, e_0, f_0, \beta_0, v_0, m_0, W_0$ and the posterior distribution parameters $a, b, h_k, l_k, c_k, d_k, e_k, f_k, \beta_k, v_k, m_k, W_k, \omega_k, \Omega_k$, where
Figure BDA0001629961810000025
Figure BDA0001629961810000026
Figure BDA0001629961810000027
and
$\alpha$ denotes the concentration parameter of the Dirichlet process;
$\chi_k$ denotes the parameter of the $k$-th mixing coefficient;
$\mu_k$ and $\Lambda_k$ denote the mean vector and the precision matrix, respectively, of the distribution of the auxiliary variable $x$ in the $k$-th mixture component;
Figure BDA0001629961810000028
denotes the linear regression coefficient between the auxiliary variable $x$ and the oxygen content $y$ in the $k$-th mixture component;
$\tau_k$ denotes the precision matrix parameter of
Figure BDA0001629961810000031
and $\eta_k$ denotes the precision parameter of the measurement noise in the $k$-th mixture component.
The meanings of the conjugate prior and posterior distribution parameters are as follows:
$(a_0, b_0)$ and $(a, b)$ denote the prior and posterior distribution parameters of $\alpha$, respectively;
$(h_k, l_k)$ denote the posterior distribution parameters of $\chi_k$;
$(m_0, \beta_0, W_0, v_0)$ and $(m_k, \beta_k, W_k, v_k)$ denote the prior and posterior distribution parameters of $(\mu_k, \Lambda_k)$, respectively;
$(c_0, d_0)$ and $(c_k, d_k)$ denote the prior and posterior distribution parameters of $\tau_k$, respectively;
$(e_0, f_0)$ and $(e_k, f_k)$ denote the prior and posterior distribution parameters of $\eta_k$, respectively;
$\omega_k$ and $\Omega_k$ denote the posterior distribution parameters of
Figure BDA0001629961810000032
(5) constructing the likelihood function of the labeled samples $(X_l, Y_l)$, the unlabeled samples $X_u$, and their corresponding hidden variables
Figure BDA0001629961810000033
where $z_i = (z_{i1}, \dots, z_{iM})^T$ and $z_j = (z_{j1}, \dots, z_{jM})^T$ denote the binary hidden variables corresponding to the $i$-th labeled sample $(x_i, y_i)$ and the $j$-th unlabeled sample $x_j$, respectively, and satisfy
Figure BDA0001629961810000034
(6) inputting the training sample set processed in step (3), the initial model parameters from step (4), and the likelihood function constructed in step (5) into the semi-supervised Bayesian Gaussian mixture model, and learning through variational inference the optimal posterior distributions of the model parameters, $q(\alpha)$ and
Figure BDA0001629961810000035
where $q(\cdot)$ denotes the optimal posterior distribution of the corresponding variable;
(7) collecting unknown samples containing only auxiliary variables, scaling the auxiliary variables as in step (3), and estimating the oxygen content using the optimal posterior distributions of the model parameters obtained in step (6).
Further, the likelihood function constructed in step (5) for the labeled samples $(X_l, Y_l)$, the unlabeled samples $X_u$, and their corresponding hidden variables $Z_l$, $Z_u$ is:
Figure BDA0001629961810000036
Figure BDA0001629961810000037
Figure BDA0001629961810000038
Figure BDA0001629961810000039
Figure BDA0001629961810000041
where $\chi = (\chi_1, \dots, \chi_M)$, $\mu = (\mu_1, \dots, \mu_M)$, $\Lambda = (\Lambda_1, \dots, \Lambda_M)$,
Figure BDA0001629961810000042
$\eta = (\eta_1, \dots, \eta_M)$,
Figure BDA0001629961810000043
denotes the Gaussian probability density function with mean $\mu_k$ and covariance matrix
Figure BDA0001629961810000044
Figure BDA0001629961810000045
Further, the parameters $a, b, h_k, l_k, c_k, d_k, e_k, f_k, \beta_k, v_k, m_k, W_k, \omega_k$, and $\Omega_k$ of the optimal posterior distributions of the model parameters in step (6) have the following form:
$a = a_0 + M - 1$
Figure BDA0001629961810000046
Figure BDA0001629961810000047
Figure BDA0001629961810000048
Figure BDA0001629961810000049
Figure BDA00016299618100000410
Figure BDA00016299618100000411
Figure BDA00016299618100000412
Figure BDA00016299618100000413
Figure BDA00016299618100000414
$c_k = c_0 + (d+1)/2$
Figure BDA00016299618100000415
Figure BDA00016299618100000416
Figure BDA00016299618100000417
where $\psi(\cdot)$ denotes the digamma function, $I$ denotes the identity matrix of the corresponding dimension,
Figure BDA0001629961810000051
$\mathbf{1}$ is the all-ones column vector, $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix,
Figure BDA0001629961810000052
Figure BDA0001629961810000053
denotes the estimation error of the $k$-th mixture component,
Figure BDA0001629961810000054
Figure BDA0001629961810000055
Here,
Figure BDA0001629961810000056
denotes, under the distribution
Figure BDA0001629961810000057
the expectation of
Figure BDA0001629961810000058
The quantities $\kappa_{ik}$ and $\kappa_{jk}$ are computed as
Figure BDA0001629961810000059
Figure BDA00016299618100000510
where
Figure BDA00016299618100000511
Figure BDA00016299618100000512
Further, the step (7) is specifically as follows:
According to the posterior distribution of $\alpha$ computed in step (6) and the properties of the Dirichlet process, the posterior distribution of the mixing coefficients $\pi = (\pi_1, \dots, \pi_M)$ can be computed as
$q(\pi) = \mathrm{Dir}(\pi \mid \phi_1, \dots, \phi_M)$
where $\mathrm{Dir}(\pi \mid \phi_1, \dots, \phi_M)$ denotes the Dirichlet distribution with parameters $(\phi_1, \dots, \phi_M)$, and
Figure BDA00016299618100000513
Then, from the posterior distributions of the model parameters computed in step (6), the marginal distribution of the scaled auxiliary variable $x_t$ is obtained as
Figure BDA00016299618100000514
where
Figure BDA0001629961810000061
Figure BDA0001629961810000062
denotes the Student's t distribution with parameters
Figure BDA0001629961810000063
Further, the posterior distribution of the hidden variable $z_t = (z_{t1}, \dots, z_{tM})$ corresponding to $x_t$ is obtained as
Figure BDA0001629961810000064
where $z_{t1}, \dots, z_{tM}$ are all binary (0-1) variables and satisfy
Figure BDA0001629961810000065
The probability distribution of the oxygen content can then be found, from which an estimate of the oxygen content is obtained.
Further, the probability distribution of the oxygen content $y_t$ is:
Figure BDA0001629961810000066
wherein
Figure BDA0001629961810000067
Thus, an estimate of the oxygen content can be obtained as
Figure BDA0001629961810000068
Compared with the prior art, the invention has the following beneficial effects:
1. A mathematical model of the oxygen content and the auxiliary variables is established in the form of a mixture model, which effectively handles the multimodality and strong nonlinearity caused by operating-condition switching and the complex combustion process;
2. Through semi-supervised learning, labeled and unlabeled samples are used simultaneously, which mitigates the poor learning of model parameters caused by insufficient labeled samples and thereby improves the estimation accuracy of the oxygen content;
3. Parameter learning and model selection are solved together in a single round of training, without traversing all candidate numbers of mixture components, which improves training efficiency.
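As an illustration of the third advantage, the following minimal sketch shows how a truncated stick-breaking representation lets superfluous mixture components switch themselves off. It assumes the standard stick-breaking construction $\pi_k = \chi_k \prod_{j<k} (1 - \chi_j)$ with independent posteriors $q(\chi_k) = \mathrm{Beta}(\chi_k \mid h_k, l_k)$; the function name, the example values of $h_k$ and $l_k$, and the 0.01 pruning threshold are hypothetical and are not taken from the patent.

```python
import numpy as np

def expected_stick_breaking_weights(h, l):
    """E[pi_k] = E[chi_k] * prod_{j<k} E[1 - chi_j] for independent chi_k ~ Beta(h_k, l_k)."""
    h = np.asarray(h, dtype=float)
    l = np.asarray(l, dtype=float)
    mean_chi = h / (h + l)    # E[chi_k]
    mean_rest = l / (h + l)   # E[1 - chi_k]
    # Expected stick length left before component k (equal to 1 for the first component).
    remaining = np.concatenate(([1.0], np.cumprod(mean_rest[:-1])))
    return mean_chi * remaining

# Hypothetical posterior parameters for a truncation level M = 4.
h = np.array([50.0, 30.0, 1.0, 1.0])
l = np.array([50.0, 20.0, 200.0, 200.0])
weights = expected_stick_breaking_weights(h, l)

# Assumed pruning rule: components with negligible expected weight are treated as inactive.
active = weights > 0.01
print(weights, active)
```

In this toy case only the first two components keep a non-negligible expected weight, so the effective number of components is found during training rather than by fitting one model per candidate number. The weights sum to less than 1 here because the last stick is not forced to 1.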
Drawings
FIG. 1 is a flow chart of the method for on-line estimation of oxygen content in a first-stage furnace in an ammonia synthesis process based on a semi-supervised Bayesian Gaussian mixture model;
FIG. 2 is a schematic diagram of a process of a first stage furnace apparatus of a certain ammonia synthesis plant;
FIG. 3 is a diagram illustrating the estimation result of oxygen content according to the present invention, wherein the abscissa represents the oxygen content in mol%, the ordinate represents the serial number of the test sample, the solid line represents the true value of oxygen content, and the dotted line represents the estimated value of oxygen content;
FIG. 4 is a schematic diagram of the oxygen content estimation result of the Gaussian mixture model, wherein the abscissa represents the oxygen content in mol%, the ordinate represents the serial number of the test sample, the solid line represents the true value of the oxygen content, and the dotted line represents the estimated value of the oxygen content;
FIG. 5 is a schematic diagram of the oxygen content estimation result of the partial least squares model, wherein the abscissa represents the oxygen content in mol%, the ordinate represents the serial number of the test sample, the solid line represents the true value of the oxygen content, and the dotted line represents the estimated value of the oxygen content.
Detailed Description
The method for estimating the oxygen content of the first-stage furnace in the ammonia synthesis process based on the semi-supervised Bayesian Gaussian mixture model is further explained by combining a specific embodiment. It should be noted that the described embodiments are only intended to enhance the understanding of the present invention, and do not have any limiting effect on the present invention.
An online estimation method for oxygen content of a first-stage furnace in an ammonia synthesis process based on a semi-supervised Bayesian Gaussian mixture model is shown in figure 1, and specifically comprises the following steps:
(1) selection of auxiliary variables associated with oxygen content y in a one-stage furnace production plant
Figure BDA0001629961810000071
Wherein d represents the number of auxiliary variables;
In this embodiment, based on a process mechanism analysis of the first-stage furnace unit (shown in FIG. 2) in an ammonia production process using the ICI-AMV process (capacity 1000 t/d), the 13 variables with the greatest influence on the oxygen content are selected as auxiliary variables:
$x_1$: fuel gas flow to 03B001 (tag: FR03001.PV);
$x_2$: fuel exhaust gas flow to 03B001 (tag: FR03002.PV);
$x_3$: pressure of fuel exhaust gas at the 03E005 outlet (tag: PC03002.PV);
$x_4$: fuel gas pressure at the 03B001 outlet (tag: PC03007.PV);
$x_5$: temperature of fuel exhaust gas at the 03E005 outlet (tag: TI03001.PV);
$x_6$: temperature of fuel gas at the 03B002E06 outlet (tag: TI03009.PV);
$x_7$: temperature of process gas at the 03B001 inlet (tag: TR03012.PV);
$x_8$: temperature of fuel gas above and to the left of 03B001 (tag: TI03013.PV);
$x_9$: temperature of fuel gas right above 03B001 (tag: TI03014.PV);
$x_{10}$: temperature of the gas mixture just above 03B001 (tag: TR03015.PV);
$x_{11}$: temperature of the transfer gas at the left outlet of 03B001 (tag: TR03016.PV);
$x_{12}$: temperature of the transfer gas at the right outlet of 03B001 (tag: TR03017.PV);
$x_{13}$: temperature of the transfer gas at the 03B001 outlet (tag: TR03020.PV).
Thus the auxiliary variable is $x = [x_1, \dots, x_{13}]$, i.e.,
Figure BDA0001629961810000072
(2) Collecting labeled sample sets containing both auxiliary variables and oxygen content
Figure BDA0001629961810000073
And unlabeled sample set containing only auxiliary variables
Figure BDA0001629961810000074
where $n_l$ and $n_u$ denote the numbers of labeled and unlabeled samples, respectively.
In this embodiment, 2000 labeled samples containing both auxiliary variables and oxygen content were collected from the database of the computer distributed control system, denoted
Figure BDA0001629961810000081
together with 5000 unlabeled samples containing only auxiliary variables, denoted
Figure BDA0001629961810000082
These samples form the training data set, with $n_l = 2000$ and $n_u = 5000$ the numbers of labeled and unlabeled samples, respectively.
(3) Applying dimensionless scaling to $(X_l, Y_l)$ and $X_u$, so that the sample variances of the auxiliary-variable samples and of the oxygen content samples become unit variance.
The scaling is performed as follows:
Figure BDA0001629961810000083
where
Figure BDA0001629961810000084
denote the sample standard deviations of the $l$-th auxiliary variable and of the oxygen content, respectively, and $x_n(l)$ denotes the value of the $l$-th auxiliary variable in the $n$-th sample.
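The scaling of step (3) can be sketched as follows. This is a minimal illustration, assuming that each variable is divided by its sample standard deviation without mean-centering (the text only requires unit variance) and that the auxiliary-variable statistics are computed over the labeled and unlabeled samples together; both points are assumptions, since the exact formula is not legible here. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_scales(X_l, Y_l, X_u):
    """Sample standard deviations of each auxiliary variable and of the oxygen content."""
    X_all = np.vstack([X_l, X_u])        # assumption: pool labeled and unlabeled inputs
    sigma_x = X_all.std(axis=0, ddof=1)  # one value per auxiliary variable
    sigma_y = Y_l.std(ddof=1)            # oxygen content exists only for labeled samples
    return sigma_x, sigma_y

def scale_inputs(X, sigma_x):
    """Apply the same scaling to training data and, later, to online samples."""
    return X / sigma_x

# Synthetic stand-in data with d = 13 auxiliary variables.
rng = np.random.default_rng(0)
X_l = rng.normal(size=(20, 13)) * 5.0
Y_l = rng.normal(size=20) * 2.0
X_u = rng.normal(size=(50, 13)) * 5.0

sigma_x, sigma_y = fit_scales(X_l, Y_l, X_u)
X_all_scaled = scale_inputs(np.vstack([X_l, X_u]), sigma_x)
Y_l_scaled = Y_l / sigma_y
```

The scales fitted offline would be stored and reused in step (7), so that online samples are expressed in the same units as the training data.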
(4) Given the truncation level $M$ of the Dirichlet process, initializing, for the model parameters
Figure BDA0001629961810000085
the conjugate prior distribution parameters and the posterior distribution parameters. The meanings of the model parameters are:
$\alpha$ denotes the concentration parameter of the Dirichlet process;
$\chi_k$ denotes the parameter of the $k$-th mixing coefficient;
$\mu_k$ and $\Lambda_k$ denote the mean vector and the precision matrix, respectively, of the distribution of the auxiliary variable $x$ in the $k$-th mixture component;
Figure BDA0001629961810000086
denotes the linear regression coefficient between the auxiliary variable $x$ and the oxygen content $y$ in the $k$-th mixture component;
$\tau_k$ denotes the precision matrix parameter of
Figure BDA0001629961810000087
and $\eta_k$ denotes the precision parameter of the measurement noise in the $k$-th mixture component.
In the invention, the conjugate prior distribution and the posterior distribution of each model parameter are determined as follows:
Both the prior distribution $p(\alpha)$ and the posterior distribution $q(\alpha)$ of $\alpha$ are gamma distributions, i.e., $p(\alpha) = \mathrm{Gam}(\alpha \mid a_0, b_0)$ and $q(\alpha) = \mathrm{Gam}(\alpha \mid a, b)$, where $\mathrm{Gam}(\alpha \mid a_0, b_0)$ and $\mathrm{Gam}(\alpha \mid a, b)$ denote the gamma distributions with parameters $(a_0, b_0)$ and $(a, b)$, respectively;
Both the prior distribution $p(\chi_k)$ and the posterior distribution $q(\chi_k)$ of $\chi_k$ are beta distributions, i.e., $p(\chi_k) = \mathrm{Beta}(\chi_k \mid 1, \alpha)$ and $q(\chi_k) = \mathrm{Beta}(\chi_k \mid h_k, l_k)$, where $\mathrm{Beta}(\chi_k \mid 1, \alpha)$ and $\mathrm{Beta}(\chi_k \mid h_k, l_k)$ denote the beta distributions with parameters $(1, \alpha)$ and $(h_k, l_k)$, respectively;
Both the prior distribution $p(\mu_k, \Lambda_k)$ and the posterior distribution $q(\mu_k, \Lambda_k)$ of $(\mu_k, \Lambda_k)$ are Gaussian-Wishart distributions, i.e.,
Figure BDA0001629961810000091
where
Figure BDA0001629961810000092
and
Figure BDA0001629961810000093
denote the Gaussian-Wishart distributions with parameters $(m_0, \beta_0, W_0, v_0)$ and $(m_k, \beta_k, W_k, v_k)$, respectively;
Figure BDA0001629961810000094
has the prior distribution
Figure BDA0001629961810000095
and the posterior distribution
Figure BDA0001629961810000096
which are both Gaussian, i.e.,
Figure BDA0001629961810000097
Figure BDA0001629961810000098
where
Figure BDA0001629961810000099
denotes the Gaussian distribution with mean vector 0 and covariance matrix
Figure BDA00016299618100000910
Figure BDA00016299618100000911
denotes the Gaussian distribution with mean vector $\omega_k$ and covariance matrix $\Omega_k$, and $I$ denotes the identity matrix of the corresponding dimension;
Both the prior distribution $p(\tau_k)$ and the posterior distribution $q(\tau_k)$ of $\tau_k$ are gamma distributions, i.e., $p(\tau_k) = \mathrm{Gam}(\tau_k \mid c_0, d_0)$ and $q(\tau_k) = \mathrm{Gam}(\tau_k \mid c_k, d_k)$, where $\mathrm{Gam}(\tau_k \mid c_0, d_0)$ and $\mathrm{Gam}(\tau_k \mid c_k, d_k)$ denote the gamma distributions with parameters $(c_0, d_0)$ and $(c_k, d_k)$, respectively;
Both the prior distribution $p(\eta_k)$ and the posterior distribution $q(\eta_k)$ of $\eta_k$ are gamma distributions, i.e., $p(\eta_k) = \mathrm{Gam}(\eta_k \mid e_0, f_0)$ and $q(\eta_k) = \mathrm{Gam}(\eta_k \mid e_k, f_k)$, where $\mathrm{Gam}(\eta_k \mid e_0, f_0)$ and $\mathrm{Gam}(\eta_k \mid e_k, f_k)$ denote the gamma distributions with parameters $(e_0, f_0)$ and $(e_k, f_k)$, respectively.
Therefore, this step requires initializing the prior distribution parameters, including
Figure BDA00016299618100000912
Figure BDA00016299618100000913
and the posterior distribution parameters, including
Figure BDA00016299618100000914
Figure BDA00016299618100000915
Figure BDA00016299618100000916
In this embodiment, the prior distribution parameters are set to $a_0 = 1$, $b_0 = 1$, $c_0 = 1$, $d_0 = 1$, $e_0 = 1$, $f_0 = 1$, $\beta_0 = 1$, $v_0 = 1$, $m_0 = 0$, $W_0 = I$; the posterior distribution parameters $a, b, h_k, l_k, c_k, d_k, e_k, f_k, \beta_k, v_k, m_k, W_k, \omega_k, \Omega_k$ are initialized to random values.
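The prior settings of this embodiment can be collected in a small configuration structure, as in the sketch below. The dictionary layout and the function name are hypothetical; only the numerical values ($a_0 = b_0 = c_0 = d_0 = e_0 = f_0 = \beta_0 = v_0 = 1$, $m_0 = 0$, $W_0 = I$) and $d = 13$ come from the text. The random initialization of the posterior parameters is omitted because its ranges are not specified.

```python
import numpy as np

def init_prior_parameters(d):
    """Prior hyperparameters as stated in the embodiment, for d auxiliary variables."""
    return {
        "a0": 1.0, "b0": 1.0,    # gamma prior on the concentration parameter alpha
        "c0": 1.0, "d0": 1.0,    # gamma prior on tau_k
        "e0": 1.0, "f0": 1.0,    # gamma prior on eta_k
        "beta0": 1.0, "v0": 1.0, # Gaussian-Wishart scaling and degrees of freedom
        "m0": np.zeros(d),       # Gaussian-Wishart prior mean (m0 = 0)
        "W0": np.eye(d),         # Gaussian-Wishart scale matrix (W0 = I)
    }

priors = init_prior_parameters(d=13)
print(priors["m0"].shape, priors["W0"].shape)
```

Textbook Wishart densities require $v_0 > d - 1$, so an implementation may need to check how $v_0 = 1$ is intended for $d = 13$; the sketch simply copies the stated value.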
(5) Constructing the likelihood function of the labeled samples $(X_l, Y_l)$, the unlabeled samples $X_u$, and their corresponding hidden variables
Figure BDA00016299618100000917
Figure BDA00016299618100000918
where $z_i = (z_{i1}, \dots, z_{iM})^T$ and $z_j = (z_{j1}, \dots, z_{jM})^T$ denote the binary hidden variables corresponding to the $i$-th labeled sample $(x_i, y_i)$ and the $j$-th unlabeled sample $x_j$, respectively, and satisfy
Figure BDA00016299618100000919
The likelihood function has the following form:
Figure BDA0001629961810000101
Figure BDA0001629961810000102
Figure BDA0001629961810000103
Figure BDA0001629961810000104
Figure BDA0001629961810000105
(6) Inputting the training sample set processed in step (3), the initial model parameters from step (4), and the likelihood function constructed in step (5) into the semi-supervised Bayesian Gaussian mixture model, and learning through variational inference the optimal posterior distributions of the model parameters, $q(\alpha)$ and
Figure BDA0001629961810000106
The specific process comprises a variational expectation part and a variational maximization part.
In the variational expectation part, the posterior distributions $q(Z_l)$ and $q(Z_u)$ of the hidden variables $Z_l$ and $Z_u$ are computed. According to the principle of variational inference, we obtain
Figure BDA0001629961810000107
where
Figure BDA0001629961810000108
denotes, under the distribution
Figure BDA0001629961810000109
the expectation of
Figure BDA00016299618100001010
In addition, $\chi = (\chi_1, \dots, \chi_M)$, $\mu = (\mu_1, \dots, \mu_M)$, $\Lambda = (\Lambda_1, \dots, \Lambda_M)$,
Figure BDA00016299618100001011
$\eta = (\eta_1, \dots, \eta_M)$,
Figure BDA00016299618100001012
denotes the Gaussian probability density function with mean $\mu_k$ and covariance matrix
Figure BDA00016299618100001013
Figure BDA00016299618100001014
and
Figure BDA00016299618100001015
where $\psi(\cdot)$ denotes the digamma function. Therefore,
Figure BDA00016299618100001016
wherein
Figure BDA0001629961810000111
For simplicity, the constant term in equation (7) is omitted; constant terms are likewise omitted in the subsequent computation of the posterior distribution of each parameter.
Similarly, the posterior distribution $q(Z_u)$ of $Z_u$ is obtained as follows:
Figure BDA0001629961810000112
wherein
Figure BDA0001629961810000113
Thereby obtaining
Figure BDA0001629961810000114
Wherein
Figure BDA0001629961810000115
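In the variational expectation part, each sample's posterior component probabilities ($\kappa_{ik}$ for labeled and $\kappa_{jk}$ for unlabeled samples) are obtained by normalizing per-component terms over the $M$ components. The sketch below shows a numerically stable way to carry out that normalization, assuming the unnormalized log terms are already available as a matrix; the name `log_rho` and the example values are hypothetical, and the expressions that produce the log terms are not reproduced here.

```python
import numpy as np

def normalize_log_rho(log_rho):
    """Turn unnormalized log terms (n_samples x M) into probabilities that sum to 1 per row."""
    log_rho = np.asarray(log_rho, dtype=float)
    # Subtract the row maximum before exponentiating to avoid underflow (log-sum-exp trick).
    shifted = log_rho - log_rho.max(axis=1, keepdims=True)
    rho = np.exp(shifted)
    return rho / rho.sum(axis=1, keepdims=True)

# Hypothetical log terms for two samples and M = 3 components.
log_rho = np.array([[-1000.0, -1001.0, -1005.0],
                    [0.0, 0.0, 0.0]])
kappa = normalize_log_rho(log_rho)
print(kappa)
```

Exponentiating values such as -1000 directly would underflow to zero and produce a division by zero; the shift by the row maximum leaves the normalized result unchanged while avoiding that failure.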
In the variational maximization part, the posterior distribution $q(\Theta)$ of the model parameters
Figure BDA0001629961810000116
is computed, again following the principle of variational inference. Specifically, $q(\alpha)$ is solved by
Figure BDA0001629961810000117
Therefore, the parameter update formulas of the posterior distribution $q(\alpha) = \mathrm{Gam}(\alpha \mid a, b)$ of $\alpha$ are
Figure BDA0001629961810000118
$\ln q(\chi_k)$ can be computed as follows
Figure BDA0001629961810000121
Therefore, the parameter update formulas of the posterior distribution $q(\chi_k) = \mathrm{Beta}(\chi_k \mid h_k, l_k)$ of $\chi_k$ are
Figure BDA0001629961810000122
$\ln q(\mu_k, \Lambda_k)$ can be computed as follows
Figure BDA0001629961810000123
where
Figure BDA0001629961810000124
The above are the parameter update formulas of the posterior distribution of $(\mu_k, \Lambda_k)$
Figure BDA0001629961810000125
and $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix;
Figure BDA0001629961810000126
can be calculated as follows
Figure BDA0001629961810000131
where
Figure BDA0001629961810000132
$\mathbf{1}$ is the all-ones column vector, and
Figure BDA0001629961810000133
Figure BDA0001629961810000134
denotes the estimation error of the $k$-th mixture component. Therefore, for
Figure BDA0001629961810000135
the parameter update formulas of the posterior distribution
Figure BDA0001629961810000136
are
Figure BDA0001629961810000137
$\ln q(\tau_k)$ can be computed as follows
Figure BDA0001629961810000138
Therefore, the parameter update formulas of the posterior distribution $q(\tau_k) = \mathrm{Gam}(\tau_k \mid c_k, d_k)$ of $\tau_k$ are
Figure BDA0001629961810000139
$\ln q(\eta_k)$ can be computed as follows
Figure BDA00016299618100001310
Therefore, the parameter update formulas of the posterior distribution $q(\eta_k) = \mathrm{Gam}(\eta_k \mid e_k, f_k)$ of $\eta_k$ are
Figure BDA0001629961810000141
By iteratively performing the variational expectation part and the variational maximization part, the posterior distributions of the model parameters converge. The convergence criterion in this embodiment is that the relative increment of the variational lower bound falls below a set threshold ($10^{-7}$).
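The iteration and its stopping rule can be sketched as follows. The sketch assumes a callable that performs one variational expectation part plus one variational maximization part and returns the current value of the variational lower bound; that callable is replaced here by a toy sequence, and all function names and the iteration cap are hypothetical. Only the threshold $10^{-7}$ on the relative increment comes from the text.

```python
def has_converged(bound_prev, bound_curr, tol=1e-7):
    """True when the relative increment of the lower bound is below tol."""
    return abs(bound_curr - bound_prev) < tol * abs(bound_prev)

def run_until_converged(update_step, tol=1e-7, max_iter=1000):
    """Call update_step repeatedly; return (number of calls, last lower bound)."""
    bound_prev = update_step()
    for n_calls in range(2, max_iter + 1):
        bound_curr = update_step()
        if has_converged(bound_prev, bound_curr, tol):
            return n_calls, bound_curr
        bound_prev = bound_curr
    return max_iter, bound_prev

def make_toy_step():
    """Stand-in for one E-part plus M-part: a lower bound that rises toward -100."""
    state = {"n": 0}
    def step():
        state["n"] += 1
        return -100.0 - 0.5 ** state["n"]
    return step

n_calls, final_bound = run_until_converged(make_toy_step())
print(n_calls, final_bound)
```

A cap on the number of iterations is a common safeguard in case the lower bound changes very slowly; whether the embodiment uses one is not stated.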
(7) In the online phase, an unknown sample $x_t$ containing only auxiliary variables is acquired and scaled as in step (3), and the oxygen content is estimated using the optimal posterior distributions of the model parameters obtained in step (6). Specifically, according to the posterior distribution of $\alpha$ computed in step (6) and the properties of the Dirichlet process, the posterior distribution of the mixing coefficients $\pi = (\pi_1, \dots, \pi_M)$ can be computed as
$q(\pi) = \mathrm{Dir}(\pi \mid \phi_1, \dots, \phi_M)$ (25)
where $\mathrm{Dir}(\pi \mid \phi_1, \dots, \phi_M)$ denotes the Dirichlet distribution with parameters $(\phi_1, \dots, \phi_M)$, and
Figure BDA0001629961810000142
Then, from the posterior distributions of the model parameters computed in step (6), the marginal distribution of the scaled auxiliary variable $x_t$ is obtained as
Figure BDA0001629961810000143
where
Figure BDA0001629961810000144
Figure BDA0001629961810000145
denotes the Student's t distribution with parameters
Figure BDA0001629961810000146
Further, the posterior distribution of the hidden variable $z_t = (z_{t1}, \dots, z_{tM})$ corresponding to $x_t$ is obtained as
Figure BDA0001629961810000147
where $z_{t1}, \dots, z_{tM}$ are all binary (0-1) variables and satisfy
Figure BDA0001629961810000148
Finally, the probability distribution of the oxygen content $y_t$ is obtained as
Figure BDA0001629961810000151
Wherein
Figure BDA0001629961810000152
Therefore, according to equation (29), an estimated value of the oxygen content can be obtained as
Figure BDA0001629961810000153
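The online estimate of step (7) can be illustrated with the sketch below. It assumes that the point estimate is the posterior-probability-weighted sum of the local linear predictions of the mixture components, that each regression mean $\omega_k$ acts on the scaled input augmented with a trailing 1, and that the result is mapped back to mol% by multiplying by the oxygen content standard deviation from step (3). These are assumptions, since the exact expressions are not legible here; the function name and all numerical values are hypothetical.

```python
import numpy as np

def estimate_oxygen(x_t, kappa_t, omega, sigma_y):
    """Weighted combination of local linear predictions for one scaled sample.

    x_t:     scaled auxiliary variables, shape (d,)
    kappa_t: posterior component probabilities of the sample, shape (M,)
    omega:   posterior means of the regression coefficients, shape (M, d + 1)
    sigma_y: sample standard deviation of the oxygen content used in step (3)
    """
    x_aug = np.append(x_t, 1.0)            # assumed augmentation: bias term last
    local = omega @ x_aug                  # one prediction per mixture component
    y_scaled = float(kappa_t @ local)      # weight by component probabilities
    return y_scaled * sigma_y              # undo the unit-variance scaling

# Hypothetical toy values with d = 2 and M = 2.
x_t = np.array([2.0, 3.0])
kappa_t = np.array([0.25, 0.75])
omega = np.array([[1.0, 0.0, 0.5],
                  [0.0, 2.0, -1.0]])
y_hat = estimate_oxygen(x_t, kappa_t, omega, sigma_y=2.0)
print(y_hat)
```

Because the weights depend on where $x_t$ lies, the combined model is nonlinear overall even though each component is linear, which is how a mixture can follow switching operating conditions.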
To verify the effectiveness of the invention, an additional 4000 labeled samples were collected from the computer distributed control system of the first-stage furnace unit of the ammonia plant as a check sample set, and the oxygen content was estimated according to step (7); the average estimation results are shown in FIG. 3. FIG. 4 and FIG. 5 show the average estimation results of the conventional Gaussian mixture model and the partial least squares model, respectively. In the Gaussian mixture model, the number of mixture components was set to 12 by the Bayesian information criterion; in the partial least squares model, the number of principal components was set to 10 by cross-validation. It can be seen that the oxygen content estimates of the partial least squares model deviate significantly from the true values because the model cannot handle nonlinear objects. The estimates of the conventional Gaussian mixture model, although better than those of the partial least squares model, are still unsatisfactory, especially in the third and fourth operating regions (samples 2500 to 4000). In contrast, the oxygen content estimates provided by the method of the invention closely match the true values in all operating regions.
The estimation accuracy of the invention and the traditional Gaussian mixture model and partial least square model is quantified by using Root Mean Square Error (RMSE), and is defined as follows
Figure BDA0001629961810000154
where $y_t$ and
Figure BDA0001629961810000155
denote the true and estimated oxygen content, respectively, of the t-th validation sample. The RMSEs of the method provided by the invention, the Gaussian mixture model, and the partial least squares model are 0.6933, 1.1515, and 1.7143, respectively. Compared with the Gaussian mixture model and the partial least squares model, the invention therefore markedly improves the estimation accuracy of the oxygen content, reducing the estimation error by about 40% and 60%, respectively.
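The RMSE and the quoted error reductions can be checked with a few lines of Python. This is a minimal sketch: the function names `rmse` and `relative_reduction` are illustrative, and only the three RMSE values are taken from the text above.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and estimated values."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, proposed):
    """Fraction by which the proposed error is lower than the baseline."""
    return (baseline - proposed) / baseline


# RMSE values reported in the text
rmse_proposed, rmse_gmm, rmse_pls = 0.6933, 1.1515, 1.7143

print(round(100 * relative_reduction(rmse_gmm, rmse_proposed), 1))  # 39.8
print(round(100 * relative_reduction(rmse_pls, rmse_proposed), 1))  # 59.6
```

The computed reductions of 39.8% and 59.6% agree with the rounded figures of about 40% and 60%.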
The above embodiments are intended to illustrate rather than limit the invention; any modification or variation made to the invention within the spirit of the invention and the scope of the claims falls within the scope of protection of the invention.

Claims (5)

1. A method for online estimation of the oxygen content of a one-stage furnace in an ammonia synthesis process based on a semi-supervised Bayesian Gaussian mixture model, characterized by comprising the following steps:
(1) selecting an auxiliary variable associated with the oxygen content y of the primary furnace
Figure FDA0002478273170000011
Wherein d represents the number of auxiliary variables;
(2) collecting labeled sample sets containing both auxiliary variables and oxygen content
Figure FDA0002478273170000012
And unlabeled sample set containing only auxiliary variables
Figure FDA0002478273170000013
where $n_l$ and $n_u$ denote the numbers of labeled and unlabeled samples, respectively;
(3) normalizing $(X_l, Y_l)$ and $X_u$ to make them dimensionless, so that the auxiliary variable samples and the oxygen content samples have unit sample variance;
(4) given a truncation level $M$ of the Dirichlet process, initializing, for the model parameters
Figure FDA0002478273170000014
the conjugate prior distribution parameters $a_0, b_0, c_0, d_0, e_0, f_0, \beta_0, v_0, m_0, W_0$ and the posterior distribution parameters $a, b, h_k, l_k, c_k, d_k, e_k, f_k, \beta_k, v_k, m_k, W_k, \omega_k, \Omega_k$, wherein
Figure FDA0002478273170000015
Figure FDA0002478273170000016
Figure FDA0002478273170000017
and:
$\alpha$ denotes the concentration parameter of the Dirichlet process;
$\chi_k$ denotes the parameter of the mixing coefficient of the $k$-th mixture component;
$\mu_k$ and $\Lambda_k$ denote the mean vector and the precision matrix, respectively, of the distribution of the auxiliary variable $x$ in the $k$-th mixture component;
Figure FDA0002478273170000018
denotes the linear regression coefficients between the auxiliary variable $x$ and the oxygen content $y$ in the $k$-th mixture component;
$\tau_k$ denotes the precision matrix parameter of the following quantity:
Figure FDA0002478273170000019
$\eta_k$ denotes the precision coefficient of the measurement noise in the $k$-th mixture component;
the conjugate prior distribution parameters and the posterior distribution parameters have the following meanings:
$(a_0,b_0)$ and $(a,b)$ denote the prior and posterior distribution parameters of $\alpha$, respectively;
$(h_k,l_k)$ denotes the posterior distribution parameters of $\chi_k$;
$(m_0,\beta_0,W_0,v_0)$ and $(m_k,\beta_k,W_k,v_k)$ denote the prior and posterior distribution parameters of $(\mu_k,\Lambda_k)$, respectively;
$(c_0,d_0)$ and $(c_k,d_k)$ denote the prior and posterior distribution parameters of $\tau_k$, respectively;
$(e_0,f_0)$ and $(e_k,f_k)$ denote the prior and posterior distribution parameters of $\eta_k$, respectively;
$\omega_k$ and $\Omega_k$ denote the posterior distribution parameters of the following quantity:
Figure FDA0002478273170000021
(5) constructing the likelihood function of the labeled samples $(X_l,Y_l)$, the unlabeled samples $X_u$, and their corresponding hidden variables
Figure FDA0002478273170000022
Figure FDA0002478273170000023
where $z_i=(z_{i1},\dots,z_{iM})^T$ and $z_j=(z_{j1},\dots,z_{jM})^T$ denote the binary hidden variables corresponding to the $i$-th labeled sample $(x_i,y_i)$ and the $j$-th unlabeled sample $x_j$, respectively, and satisfy
Figure FDA0002478273170000024
(6) inputting the training sample set processed in step (3), the initial model parameters from step (4), and the likelihood function constructed in step (5) into the semi-supervised Bayesian Gaussian mixture model, and learning, through variational inference, the optimal posterior distributions of the model parameters, $q(\alpha)$ and
Figure FDA0002478273170000025
where $q(\cdot)$ denotes the optimal posterior distribution of the corresponding variable;
(7) collecting unknown samples containing only the auxiliary variables, normalizing the auxiliary variables according to step (3), and estimating the oxygen content using the optimal posterior distributions of the model parameters obtained in step (6).
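The normalization of step (3) and its reuse in step (7) can be sketched as follows. This is a minimal illustration under assumptions that the claim does not state: the statistics of the auxiliary variables are pooled over labeled and unlabeled samples, the data are mean-centered in addition to being scaled to unit variance, and the names `fit_scaler`, `scale_x`, `scale_y`, and `unscale_y` are hypothetical.

```python
import numpy as np


def fit_scaler(X_l, Y_l, X_u):
    """Estimate scaling statistics from the training data.

    Assumption: auxiliary-variable statistics are pooled over labeled
    and unlabeled samples; oxygen content uses labeled samples only.
    """
    X_all = np.vstack([X_l, X_u])
    return {
        "x_mean": X_all.mean(axis=0),
        "x_std": X_all.std(axis=0, ddof=1),   # sample standard deviation
        "y_mean": float(np.mean(Y_l)),
        "y_std": float(np.std(Y_l, ddof=1)),
    }


def scale_x(X, s):
    """Make auxiliary variables dimensionless (unit sample variance)."""
    return (np.asarray(X, dtype=float) - s["x_mean"]) / s["x_std"]


def scale_y(Y, s):
    """Make oxygen content dimensionless."""
    return (np.asarray(Y, dtype=float) - s["y_mean"]) / s["y_std"]


def unscale_y(Y_scaled, s):
    """Map a dimensionless estimate back to physical units."""
    return np.asarray(Y_scaled, dtype=float) * s["y_std"] + s["y_mean"]
```

At estimation time, a new sample would be passed through `scale_x` with the stored training statistics, and the model output would be converted back with `unscale_y`.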
2. The method for on-line estimation of oxygen content in one-stage furnace in ammonia synthesis process based on semi-supervised Bayesian Gaussian mixture model as recited in claim 1, wherein the likelihood function, constructed in step (5), of the labeled samples $(X_l,Y_l)$, the unlabeled samples $X_u$, and their corresponding hidden variables $Z_l$ and $Z_u$ is:
Figure FDA0002478273170000026
Figure FDA0002478273170000027
Figure FDA0002478273170000028
Figure FDA0002478273170000029
Figure FDA00024782731700000210
where $\chi=(\chi_1,\dots,\chi_M)$, $\mu=(\mu_1,\dots,\mu_M)$, $\Lambda=(\Lambda_1,\dots,\Lambda_M)$,
Figure FDA00024782731700000211
$\eta=(\eta_1,\dots,\eta_M)$,
Figure FDA00024782731700000212
denotes the Gaussian probability density function with mean $\mu_k$ and the following covariance matrix:
Figure FDA00024782731700000213
Figure FDA00024782731700000214
3. The method for on-line estimation of oxygen content in one-stage furnace in ammonia synthesis process based on semi-supervised Bayesian Gaussian mixture model as recited in claim 1 or 2, wherein the parameters $a, b, h_k, l_k, c_k, d_k, e_k, f_k, \beta_k, v_k, m_k, W_k, \omega_k$, and $\Omega_k$ of the optimal posterior distributions of the model parameters in step (6) have the following form:
$a = a_0 + M - 1$
Figure FDA0002478273170000031
Figure FDA0002478273170000032
Figure FDA0002478273170000033
Figure FDA0002478273170000034
Figure FDA0002478273170000035
Figure FDA0002478273170000036
Figure FDA0002478273170000037
Figure FDA0002478273170000038
Figure FDA0002478273170000039
$c_k = c_0 + (d+1)/2$
Figure FDA00024782731700000310
Figure FDA00024782731700000311
Figure FDA00024782731700000312
where $\psi(\cdot)$ denotes the digamma function, $I$ denotes the identity matrix of the corresponding dimension,
Figure FDA00024782731700000313
$\mathbf{1}$ is the all-ones column vector, $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix,
Figure FDA00024782731700000314
denotes the estimation error of the $k$-th mixture component,
Figure FDA00024782731700000315
Figure FDA00024782731700000316
here,
Figure FDA00024782731700000317
denotes the expectation, computed with respect to the distribution
Figure FDA00024782731700000318
of the quantity
Figure FDA00024782731700000319
$\kappa_{ik}$ and $\kappa_{jk}$ are calculated as
Figure FDA0002478273170000041
Figure FDA0002478273170000042
Wherein
Figure FDA0002478273170000043
Figure FDA0002478273170000044
4. The method for estimating the oxygen content of the primary furnace in the ammonia synthesis process based on the semi-supervised Bayesian Gaussian mixture model according to claim 1 or 2, wherein the step (7) specifically comprises the following steps:
according to the posterior distribution of $\alpha$ calculated in step (6) and the properties of the Dirichlet process, the posterior distribution of the mixing coefficients $\pi=(\pi_1,\dots,\pi_M)$ can be calculated as
$q(\pi)=\mathrm{Dir}(\pi\mid\phi_1,\dots,\phi_M)$
where $\mathrm{Dir}(\pi\mid\phi_1,\dots,\phi_M)$ denotes the Dirichlet distribution with parameters $(\phi_1,\dots,\phi_M)$, and
Figure FDA0002478273170000045
then, from the posterior distributions of the model parameters calculated in step (6), the marginal distribution of the normalized auxiliary variable $x_t$ can be obtained as
Figure FDA0002478273170000046
where
Figure FDA0002478273170000047
denotes the Student's t distribution with parameters
Figure FDA0002478273170000048
further, the posterior distribution of the hidden variable $z_t=(z_{t1},\dots,z_{tM})$ corresponding to $x_t$ can be obtained as
Figure FDA0002478273170000051
where $z_{t1},\dots,z_{tM}$ are all binary (0-1) variables and satisfy
Figure FDA0002478273170000052
The probability distribution of the oxygen content can then be found, thereby obtaining an estimate of the oxygen content.
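The chain from the Dirichlet posterior to the hidden-variable posterior can be sketched in Python as follows. This is a hedged illustration, not the formulas of the claim, which are given in the figures: it assumes the standard predictive result for a Gaussian-Wishart posterior, namely a Student's t density with mean $m_k$, $v_k + 1 - d$ degrees of freedom, and precision $\frac{(v_k+1-d)\beta_k}{1+\beta_k} W_k$, and it uses the Dirichlet mean $\phi_k/\sum_j \phi_j$ as the mixing weight. The function names `student_t_logpdf` and `responsibilities` are hypothetical.

```python
import math

import numpy as np


def student_t_logpdf(x, mu, L, nu):
    """Log density of a multivariate Student's t with precision matrix L."""
    d = mu.shape[0]
    diff = x - mu
    maha = float(diff @ L @ diff)
    _, logdet = np.linalg.slogdet(L)
    return (math.lgamma(0.5 * (nu + d)) - math.lgamma(0.5 * nu)
            + 0.5 * logdet - 0.5 * d * math.log(nu * math.pi)
            - 0.5 * (nu + d) * math.log1p(maha / nu))


def responsibilities(x_t, phi, m, beta, W, v):
    """Posterior probability that x_t belongs to each mixture component.

    phi: (M,) Dirichlet parameters; m, beta, W, v: per-component
    Gaussian-Wishart posterior parameters (assumed parameterization).
    """
    d = x_t.shape[0]
    pi_mean = phi / phi.sum()                 # mean of Dir(pi | phi)
    log_r = np.empty(len(phi))
    for k in range(len(phi)):
        dof = v[k] + 1.0 - d                  # must be positive
        L = dof * beta[k] / (1.0 + beta[k]) * W[k]
        log_r[k] = math.log(pi_mean[k]) + student_t_logpdf(x_t, m[k], L, dof)
    log_r -= log_r.max()                      # stabilize the exponentiation
    r = np.exp(log_r)
    return r / r.sum()
```

The resulting responsibilities sum to one and can be used to weight the component-wise predictions of the oxygen content.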
5. The method for on-line estimation of oxygen content in one-stage furnace in ammonia synthesis process based on semi-supervised Bayesian Gaussian mixture model as recited in claim 4, wherein the probability distribution of the oxygen content $y_t$ is:
Figure FDA0002478273170000053
wherein
Figure FDA0002478273170000054
Thus, an estimate of the oxygen content can be obtained as
Figure FDA0002478273170000055
CN201810338582.2A 2018-04-16 2018-04-16 Semi-supervised Bayesian Gaussian mixture model-based online estimation method for oxygen content of one-stage furnace in ammonia synthesis process Active CN108664706B (en)

Publications (2)

Publication Number Publication Date
CN108664706A CN108664706A (en) 2018-10-16
CN108664706B true CN108664706B (en) 2020-11-03

Family

ID=63783484

Country Status (1)

Country Link
CN (1) CN108664706B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451102A (en) * 2017-07-28 2017-12-08 江南大学 A kind of semi-supervised Gaussian process for improving self-training algorithm returns soft-measuring modeling method
CN107464247A (en) * 2017-08-16 2017-12-12 西安电子科技大学 One kind is based on G0Stochastic gradient variation Bayes's SAR image segmentation method of distribution
CN107505837A (en) * 2017-07-07 2017-12-22 浙江大学 A kind of semi-supervised neural network model and the soft-measuring modeling method based on the model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018757A1 (en) * 2016-07-13 2018-01-18 Kenji Suzuki Transforming projection data in tomography by means of machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multimode process data modeling: A Dirichlet process mixture model based Bayesian robust factor analyzer approach;Zhu, JL 等;《CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS》;20150331;第142卷;全文 *
Quality variable prediction for chemical processes based on semisupervised Dirichlet process mixture of Gaussians;Weiming Shao等;《CHEMICAL ENGINEERING SCIENCE》;20190116;第193卷;全文 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant