CN112001115A

CN112001115A - Soft measurement modeling method of semi-supervised dynamic soft measurement network

Info

Publication number: CN112001115A
Application number: CN202010690011.2A
Authority: CN
Inventors: 刘涵; 郭润元
Original assignee: Xian University of Technology
Current assignee: Xian University of Technology
Priority date: 2020-07-17
Filing date: 2020-07-17
Publication date: 2020-11-27
Anticipated expiration: 2040-07-17
Also published as: CN112001115B

Abstract

The invention discloses a soft measurement modeling method of a semi-supervised dynamic soft measurement network, which is implemented according to the following steps: denoising and redundancy removal are carried out on the training set data based on the CEEMD and Isomap methods; the training set data is serialized and normalized; and the soft measurement model of the semi-supervised dynamic soft measurement network is established based on the training set data. The method removes noise and redundancy from the data by combining CEEMD and Isomap: CEEMD is complete and its result shows no obvious mode mixing, while Isomap has a strong nonlinear feature transformation capability. Combining the two effectively removes the noise and redundancy in the original data while keeping information loss to a minimum. The data is also serialized so that historical data is introduced for dynamic modeling. Compared with traditional soft measurement methods, the model of the invention predicts variables more accurately.

Description

Soft measurement modeling method of semi-supervised dynamic soft measurement network

Technical Field

The invention belongs to the technical field of intelligent signal processing and industrial artificial intelligence, and particularly relates to a soft measurement modeling method of a semi-supervised dynamic soft measurement network.

Background

In a complex industrial process, key quality variables often cannot be measured reliably and in real time because of harsh production environments, complex working conditions, and limited or costly detection technology. Soft measurement technology was developed to overcome this problem: it takes easily measured auxiliary variables of the process as input and the main variable to be measured as output, and establishes a model that predicts the main variable, thereby realizing accurate estimation of the key quality variable.

With the development of information technology and the wide application of distributed control systems in complex industrial processes, massive industrial process data can be collected and used to establish soft measurement models. However, external disturbances and process fluctuations, including changes in raw material composition and random interference during data transmission and storage, mean that industrial data often contains a large amount of noise. Massive industrial production data also inevitably introduces data redundancy, namely production conditions that recur during production and collinearity among different auxiliary variables. The noise and redundancy contained in the data can seriously degrade the accuracy of a data-driven soft measurement model and need to be removed in a preprocessing stage. In addition, industrial process data is usually a continuous time series; if static soft measurement modeling is applied to it, the temporal dependence between successive data cannot be captured, so the model has low estimation accuracy and poor robustness in practical application. Therefore, building an accurate data-driven soft measurement model requires dynamic modeling.

Disclosure of Invention

The invention aims to provide a soft measurement modeling method of a semi-supervised dynamic soft measurement network, which can remove the noise and redundancy of variable data and capture the dynamic characteristics among data so as to establish an accurate dynamic soft measurement prediction model.

In order to solve the technical problem, the invention discloses a soft measurement modeling method of a semi-supervised dynamic soft measurement network, which is implemented according to the following steps:

step 1, denoising and redundancy removal are carried out on the training set data based on Complementary Ensemble Empirical Mode Decomposition (CEEMD) and the Isomap method;

step 2, carrying out serialization and normalization processing on the training set data processed in the step 1;

step 3, completing the establishment of the soft measurement model of the semi-supervised dynamic soft measurement network (SSDGRU-MLR) based on the training set data processed in step 2.

Further, step 1, denoising and redundancy removing processing are carried out on the training set data based on a CEEMD and Isomap method, and the specific steps are as follows:

step 1.1, applying a CEEMD algorithm to an original auxiliary variable training data set X to obtain IMFs of various orders;

step 1.2, calculating the correlation coefficient between each order of IMF and the original variable signal, judging whether each IMF is noise based on a set threshold constant, and eliminating the IMFs judged to be noise, wherein the correlation coefficient is calculated as follows:

$$\rho\left(X_v(t), c_{vi}(t)\right) = \frac{\mathrm{Cov}\left(X_v(t), c_{vi}(t)\right)}{\sigma_{X_v(t)}\,\sigma_{c_{vi}(t)}} \tag{1}$$

in formula (1), Cov(X_v(t), c_vi(t)) represents the covariance between the original auxiliary variable X_v(t) and its i-th IMF c_vi(t), i = 1, ..., N; σ_{X_v(t)} and σ_{c_vi(t)} are the standard deviations of X_v(t) and of the i-th IMF, respectively; the value of |ρ| lies between 0 and 1, and the closer it is to 1, the higher the similarity;

and step 1.3, performing nonlinear feature transformation on the residual IMF through an Isomap algorithm, and then performing data reconstruction by using a new mode function and an original remainder item obtained after dimensionality reduction to finally obtain an auxiliary variable X' after denoising and redundancy removal.

Further, in step 1.3, the geodesic distance in the Isomap algorithm is calculated as follows: the geodesic distance between a sample point and a point in its neighborhood is replaced by the Euclidean distance between them; the geodesic distance between a sample point and a point outside its neighborhood is replaced by the shortest path between them on the manifold.
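As an illustrative, non-limiting sketch of the screening in step 1.2, the following Python code computes the correlation coefficient of formula (1) for each IMF and discards the IMFs whose |ρ| falls below a threshold. The function name `screen_imfs`, the threshold value and the synthetic signal are assumptions made for illustration only; the IMFs are assumed to have been produced beforehand by a CEEMD implementation, which is not shown.

```python
import numpy as np

def screen_imfs(x, imfs, threshold=0.1):
    """Keep the IMFs whose |rho| with the original signal reaches the threshold.

    x:     original auxiliary variable, shape (T,)
    imfs:  IMFs of that variable, shape (N, T)
    threshold: hypothetical value; the text only calls it "a set threshold constant"
    Returns (kept_imfs, rho), where rho holds the coefficient of every IMF.
    """
    x = np.asarray(x, dtype=float)
    imfs = np.asarray(imfs, dtype=float)
    xc = x - x.mean()
    ic = imfs - imfs.mean(axis=1, keepdims=True)
    cov = ic @ xc / x.size                         # Cov(X_v, c_vi) for each i
    rho = cov / (x.std() * imfs.std(axis=1))       # formula (1)
    keep = np.abs(rho) >= threshold
    return imfs[keep], rho

# Synthetic stand-in for a decomposed signal: one noise-like and one trend-like component.
t = np.linspace(0.0, 1.0, 500)
rng = np.random.default_rng(0)
trend = np.sin(2 * np.pi * 3 * t)
noise = 0.05 * rng.standard_normal(t.size)
signal = trend + noise
kept, rho = screen_imfs(signal, np.vstack([noise, trend]), threshold=0.3)
```

In this toy case only the trend-like component survives the screening; in practice the threshold would be tuned for the process at hand.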

Further, step 2, performing serialization and normalization processing on the training set data processed in step 1, specifically comprising the following steps:

step 2.1, after the denoising and redundancy removal operation, serializing the auxiliary variable data, wherein the dominant variable at time t + ts + z is predicted from the auxiliary variable data of the ts time steps from time t to time t + ts, ts is the time window length of the input data, and z is the number of time steps ahead at which the dominant variable needs to be predicted, so as to obtain the serialized form of the input data X';

step 2.2, carrying out standardization using the Z-score method, converting the data into data with a mean of 0 and a variance of 1, according to the following formula:

$$X_{in} = \frac{X' - \mu}{\sigma} \tag{2}$$

wherein in formula (2), X' is the data after serialization, μ and σ respectively represent the mean and the standard deviation, and X_in represents the standardized sequence data that is input into the neural network; only the feature data is standardized, and the label data keeps its original values;
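As an illustrative, non-limiting sketch of step 2, the following Python code builds sliding windows of length ts and standardizes the features with the Z-score of formula (2). The function names and the exact index convention (window rows t to t + ts - 1, label taken z steps after the last window row, so that z = 0 yields n - ts + 1 samples) are assumptions made for illustration.

```python
import numpy as np

def make_sequences(x, y, ts, z=0):
    """Slice (n, d) features into windows of ts steps and pair each with a label.

    Assumed convention: window t covers rows t .. t+ts-1 and its label is
    y[t + ts - 1 + z]; with z = 0 this gives n - ts + 1 samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    count = x.shape[0] - ts + 1 - z
    windows = np.stack([x[t:t + ts] for t in range(count)])
    targets = y[ts - 1 + z: ts - 1 + z + count]
    return windows, targets

def z_score(windows):
    """Standardize each feature to zero mean and unit variance (labels untouched)."""
    mu = windows.mean(axis=(0, 1), keepdims=True)
    sigma = windows.std(axis=(0, 1), keepdims=True)
    return (windows - mu) / sigma, mu, sigma

# Toy data: 10 time steps, 2 auxiliary variables.
x_demo = np.arange(20, dtype=float).reshape(10, 2)
y_demo = np.arange(10, dtype=float)
seq, labels = make_sequences(x_demo, y_demo, ts=3, z=0)
x_in, mu, sigma = z_score(seq)
```

The returned mu and sigma would be reused unchanged to standardize validation and test windows.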

Further, step 3, completing the establishment of the SSDGRU-MLR soft measurement model based on the training set data processed in step 2, specifically comprises the following steps:

step 3.1, the SSDGRU-MLR is a network formed by stacking GRU units in multiple layers (the DGRU) and combining them with a supervised-learning MLP; the output of the last DGRU layer passes through a fully connected MLP network, which outputs the soft measurement prediction result, wherein the MLP is a neural network with a single hidden layer and is used for regression fitting of the final key quality variable;

step 3.2, before training the model, first initializing the parameters of the model using Xavier initialization: letting the number of input nodes of the current network layer be n_in and the number of output nodes be n_out, Xavier initialization draws the weights W from the following uniform distribution:

$$W \sim U\left[-\frac{\sqrt{6}}{\sqrt{n_{in} + n_{out}}},\ \frac{\sqrt{6}}{\sqrt{n_{in} + n_{out}}}\right] \tag{3}$$

after initialization is completed, the serialized data X_in is input into the soft measurement model, and the loss function of the whole model training process can be defined as follows:

$$L = \frac{1}{n - ts + 1}\sum_{t=1}^{n - ts + 1}\left(y_t - y_t^{predict}\right)^2 \tag{4}$$

wherein in formula (4), y_t represents the label output corresponding to the t-th sequence sample, n - ts + 1 represents the number of samples after serialization, and y_t^predict represents the prediction output of the t-th sequence sample; with minimization of the loss function as the optimization target, the parameters of the whole model are updated and adjusted through the BPTT algorithm, and batch training of the whole SSDGRU-MLR soft measurement model is finally completed over multiple iterations.
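As an illustrative, non-limiting sketch of step 3.2, the following Python code draws a weight matrix from the Xavier uniform distribution and evaluates a loss over the serialized samples. The function names are hypothetical, and the loss is assumed here to be the mean of the squared differences between y_t and the prediction; any other scaling constant would leave the minimizer unchanged.

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng):
    """Draw an (n_in, n_out) weight matrix from U[-limit, limit],
    with limit = sqrt(6) / sqrt(n_in + n_out)."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

def sequence_loss(y_true, y_pred):
    """Mean squared error over the serialized samples (assumed form of the loss)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(0)
w_demo = xavier_uniform(4, 8, rng)
loss_demo = sequence_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Biases are commonly initialized to zero alongside Xavier-initialized weights; that choice is likewise an assumption here.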

Compared with the prior art, the invention can obtain the following technical effects:

1) the soft measurement modeling method of the semi-supervised dynamic soft measurement network removes noise and redundancy from the data by combining CEEMD and Isomap. CEEMD is complete and its result shows no obvious mode mixing, and Isomap has a strong nonlinear feature transformation capability; combining the advantages of the two effectively removes the noise and redundancy in the original data while keeping information loss to a minimum. The data is also serialized so that historical data is introduced for dynamic modeling. A semi-supervised deep gated recurrent perceptron network (SSDGRU-MLR) formed by the DGRU and the MLP carries out semi-supervised dynamic modeling on the preprocessed serialized data. The SSDGRU-MLR can make use of the large number of unlabeled samples in the process and helps to extract high-level representations of the variables, while the GRU units in the structure capture the dynamic characteristics of the data and propagate them through time, which improves the modeling effect.

2) In the training process of the SSDGRU-MLR, dropout is used to avoid overfitting, and a callback function is designed to ensure that the model trains smoothly. Analysis of the soft measurement prediction experiments shows that, compared with traditional soft measurement methods, the model of the invention predicts variables more accurately; the effectiveness and superiority of the method are demonstrated in a comparison experiment based on the industrial example of an air preheater.
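As an illustrative, non-limiting sketch of these two training aids, the following Python code shows a callback-style monitor of the validation loss and a dropout mask that is drawn once and shared by every time step of a sequence. The class and function names, the patience value and the stop-on-no-improvement policy are assumptions made for illustration; the text only states that a callback monitors the validation loss.

```python
import numpy as np

class ValidationMonitor:
    """Tracks the validation loss and signals a stop after `patience`
    consecutive epochs without improvement (assumed policy)."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = np.inf
        self.wait = 0

    def update(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience

def first_stop_epoch(val_losses, patience=3):
    """Return the epoch index at which the monitor would stop, or None."""
    monitor = ValidationMonitor(patience)
    for epoch, loss in enumerate(val_losses):
        if monitor.update(loss):
            return epoch
    return None

def apply_shared_dropout(x_seq, rate, rng):
    """Drop the same features at every time step of a (ts, d) sequence,
    rescaling the kept features by 1 / (1 - rate)."""
    keep = rng.random(x_seq.shape[1]) >= rate
    return x_seq * keep / (1.0 - rate)

rng = np.random.default_rng(0)
dropped = apply_shared_dropout(np.ones((5, 6)), rate=0.5, rng=rng)
stop_at = first_stop_epoch([1.0, 0.8, 0.9, 0.85, 0.81], patience=3)
```

Deep learning frameworks usually provide equivalent facilities, so a hand-written monitor like this is only needed when such facilities are unavailable.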

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a flow chart of a soft measurement modeling method of a semi-supervised dynamic soft measurement network according to the present invention;

FIG. 2 is a SSDGRU-MLR model diagram of a soft measurement modeling method via a semi-supervised dynamic soft measurement network of the present invention;

FIG. 3 is a cross-sectional view of an air preheater in an industrial example of a method of soft measurement modeling via a semi-supervised dynamic soft measurement network of the present invention;

FIG. 4 is a graph of a loss function obtained when trained by the soft measurement modeling method of a semi-supervised dynamic soft measurement network of the present invention;

FIG. 5 is a CEEMD decomposition result diagram of an inlet temperature variable obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network of the present invention;

FIG. 6 is a diagram of the predicted rotor thermal deformation results obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network of the present invention;

FIG. 7 is a diagram of soft measurement prediction error analysis obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network of the present invention.

Detailed Description

The following embodiments are described in detail with reference to the accompanying drawings, so that how to implement the technical features of the present invention to solve the technical problems and achieve the technical effects can be fully understood and implemented.

The invention discloses a soft measurement modeling method of a semi-supervised dynamic soft measurement network, which is implemented according to the following steps as shown in figure 1:

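step 1, denoising and redundancy removal are carried out on the training set data based on the CEEMD and Isomap methods, with steps 1.1 and 1.2 and formula (1) as set out above; the remaining specific step is as follows: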

step 1.3, performing nonlinear feature transformation on the residual IMF through an Isomap algorithm, and then performing data reconstruction by using a new mode function and an original remainder obtained after dimensionality reduction to finally obtain an auxiliary variable X' after denoising and redundancy removal;

in the Isomap algorithm, the geodesic distance is calculated as follows: the geodesic distance between a sample point and a point in its neighborhood is replaced by the Euclidean distance between them; the geodesic distance between a sample point and a point outside its neighborhood is replaced by the shortest path between them on the manifold;
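As an illustrative, non-limiting sketch of this geodesic distance rule, the following Python code connects each sample point to its k nearest neighbors with Euclidean edge lengths and then obtains all remaining distances as shortest paths through that neighborhood graph. The function name, the choice of a k-nearest-neighbor neighborhood and the Floyd-Warshall shortest-path routine are assumptions made for illustration; the subsequent embedding step of Isomap is not shown.

```python
import numpy as np

def geodesic_distances(points, k=2):
    """Approximate geodesic distances between all pairs of sample points.

    Neighbors (k nearest, an assumed neighborhood rule) keep their Euclidean
    distance; all other pairs get the shortest path through the graph.
    """
    pts = np.asarray(points, dtype=float)
    n = pts.shape[0]
    diff = pts[:, None, :] - pts[None, :, :]
    eucl = np.sqrt((diff ** 2).sum(axis=-1))

    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(eucl, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    for i in range(n):
        graph[i, nearest[i]] = eucl[i, nearest[i]]
        graph[nearest[i], i] = eucl[i, nearest[i]]   # keep the graph symmetric

    # Floyd-Warshall: allow paths through each intermediate point m in turn.
    for m in range(n):
        graph = np.minimum(graph, graph[:, [m]] + graph[[m], :])
    return graph

# Points on a half circle: the straight-line distance between the two ends is 2,
# while the distance along the curve is close to pi.
angles = np.linspace(0.0, np.pi, 20)
arc = np.column_stack([np.cos(angles), np.sin(angles)])
geo = geodesic_distances(arc, k=2)
```

The example illustrates why the geodesic distance, rather than the Euclidean one, reflects the structure of data lying on a curved manifold.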

step 2, carrying out serialization and normalization processing on the training set data processed in step 1; the specific steps are as follows:

step 2.1, after the denoising and redundancy removal operation, serializing the auxiliary variable data, wherein the dominant variable at time t + ts + z is predicted from the auxiliary variable data of the ts time steps from time t to time t + ts, ts is the time window length of the input data, z is the number of time steps ahead at which the dominant variable needs to be predicted, and both parameters (ts and z) need to be set according to the industrial background, so as to obtain the serialized form of the input data X';

step 3, completing the soft measurement model establishment of a semi-supervised dynamic soft measurement network (SSDGRU-MLR) based on the training set data processed in the step 2;

the method comprises the following specific steps:

step 3.1, the SSDGRU-MLR is a network formed by stacking GRU units in multiple layers (the DGRU) and combining them with a supervised-learning MLP; the overall structure of the network is shown in FIG. 2. The output of the last DGRU layer passes through a fully connected MLP network, which outputs the soft measurement prediction result. The MLP is a neural network with a single hidden layer and is used for regression fitting of the final key quality variable;

step 3.2, before training the model, the parameters of the model first need to be initialized, including the parameters W_r, U_r, W_z, U_z, W and U inside each GRU unit and the weights and biases W_1, b_1, W_j, b_j, W_l, b_l between the layers of the network; Xavier initialization is adopted: letting the number of input nodes of the current network layer be n_in and the number of output nodes be n_out, the weights W are drawn from the following uniform distribution:

$$W \sim U\left[-\frac{\sqrt{6}}{\sqrt{n_{in} + n_{out}}},\ \frac{\sqrt{6}}{\sqrt{n_{in} + n_{out}}}\right] \tag{3}$$

after initialization is finished, the serialized data is input into the soft measurement model, and the hidden layer output h of the last GRU layer at the last time step is obtained through DGRU forward propagation; the data set

$$\left\{(h_t, y_t)\right\}_{t=1}^{n - ts + 1}$$

is then formed and input into the MLP network, and the predicted values y_t^predict of the key dominant variable are obtained through MLP forward propagation, wherein h_t represents the hidden layer output obtained from the DGRU when the t-th sequence sample is input into the model, y_t represents the label output corresponding to the t-th sequence sample, n - ts + 1 represents the number of serialized samples, and y_t^predict represents the predicted output of the t-th sequence sample. The prediction error is obtained from the label value and the predicted value, so the loss function of the whole model training process can be defined as follows:

$$L = \frac{1}{n - ts + 1}\sum_{t=1}^{n - ts + 1}\left(y_t - y_t^{predict}\right)^2 \tag{4}$$

With minimization of the loss function as the optimization target, the parameters of the whole model are updated and adjusted through the BPTT algorithm, and batch training of the whole SSDGRU-MLR soft measurement model is finally completed over multiple iterations.
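As an illustrative, non-limiting sketch of the forward propagation in step 3.2, the following Python code passes one serialized sample through two stacked GRU layers and a single-hidden-layer MLP. The gate equations are the commonly used GRU formulation with the parameters W_r, U_r, W_z, U_z, W and U; the omission of GRU biases, the tanh hidden activation of the MLP, the layer sizes and the names `W_out` and `b_out` are assumptions made for illustration. Training by BPTT is not shown.

```python
import numpy as np

def xavier(rows, cols, rng):
    limit = np.sqrt(6.0 / (rows + cols))
    return rng.uniform(-limit, limit, size=(rows, cols))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_gru(n_in, n_hidden, rng):
    """W_* act on the input x_t, U_* act on the previous hidden state."""
    return {name: xavier(n_in if name.startswith("W") else n_hidden, n_hidden, rng)
            for name in ("W_r", "U_r", "W_z", "U_z", "W", "U")}

def gru_layer(x_seq, p):
    """Run one GRU layer over a (ts, n_in) sequence; returns (ts, n_hidden)."""
    h = np.zeros(p["U"].shape[0])
    states = []
    for x_t in x_seq:
        r = sigmoid(x_t @ p["W_r"] + h @ p["U_r"])          # reset gate
        z = sigmoid(x_t @ p["W_z"] + h @ p["U_z"])          # update gate
        h_tilde = np.tanh(x_t @ p["W"] + (r * h) @ p["U"])  # candidate state
        h = (1.0 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)

def dgru_mlp_forward(x_seq, gru_layers, mlp):
    """Stacked GRU layers, then an MLP applied to the last hidden state."""
    seq = x_seq
    for p in gru_layers:
        seq = gru_layer(seq, p)
    h_last = seq[-1]                                        # last layer, last time step
    hidden = np.tanh(h_last @ mlp["W_1"] + mlp["b_1"])
    return hidden @ mlp["W_out"] + mlp["b_out"]

rng = np.random.default_rng(0)
layers = [init_gru(4, 8, rng), init_gru(8, 8, rng)]
mlp = {"W_1": xavier(8, 16, rng), "b_1": np.zeros(16),
       "W_out": xavier(16, 1, rng), "b_out": np.zeros(1)}
x_sample = rng.standard_normal((5, 4))                      # ts = 5, 4 auxiliary variables
y_hat = dgru_mlp_forward(x_sample, layers, mlp)
```

Only the hidden state of the last GRU layer at the last time step reaches the MLP, which matches the description of h above.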

The following experiments show that the soft measurement modeling method of the semi-supervised dynamic soft measurement network is effective and feasible and has certain superiority:

Based on the industrial example of soft measurement of air preheater rotor thermal deformation, the test set data is input into the established soft measurement model, and the deformation prediction results obtained by other soft measurement methods and by the method of the invention are compared to analyze the effectiveness and superiority of the modeling method of the invention.

The specific steps are as follows:

(1) FIG. 3 is a cross-sectional view of the air preheater rotor. Based on the industrial example of soft measurement of air preheater rotor thermal deformation, the serialized real-time preheater temperature data is divided into a training set of 8571 samples, a validation set of 2143 samples and a test set of 1000 samples. Model training is completed according to steps 1 to 3, and the following tricks are used during training to ensure that the model is trained effectively: shuffling the data; using the same dropout mask for each time step; and setting a callback function in the program to monitor the change of the validation loss;

(2) the test set is input to test the effectiveness and superiority of the prediction capability of the soft measurement model. Ten soft measurement models are established for a comparison experiment: the traditional machine learning methods MLP and support vector regression (SVR), the deep learning methods DBN, DLSTM and DGRU, and the corresponding deep learning methods with denoising and redundancy removal. Among the 10 comparative experiments, the effects of EMD and CEEMD on the decomposition of the auxiliary variables are also compared. The SNR, the overall orthogonality index I_OT and the root mean square error RMSE are used to measure the denoising and redundancy removal results numerically; the mean absolute error (MAE), the mean square error (MSE) and the coefficient of determination R² are used as evaluation indices of the predictive performance of the dynamic soft measurement model.
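As an illustrative, non-limiting sketch of these evaluation indices, the following Python code computes MAE, MSE, RMSE and R² for a set of predictions, together with an SNR in decibels for a denoised signal. The function names are hypothetical, and the SNR is assumed here to be 10·log10 of the ratio between the energy of the reference signal and the energy of the residual; the I_OT index is not shown.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and the coefficient of determination R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "R2": 1.0 - ss_res / ss_tot,
    }

def snr_db(reference, denoised):
    """Signal-to-noise ratio in dB (assumed definition: reference energy
    divided by the energy of reference - denoised)."""
    reference = np.asarray(reference, dtype=float)
    denoised = np.asarray(denoised, dtype=float)
    residual = reference - denoised
    return float(10.0 * np.log10(np.sum(reference ** 2) / np.sum(residual ** 2)))

metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
snr_demo = snr_db([3.0, 4.0], [3.0, 3.5])
```

Lower MAE, MSE and RMSE and a higher R² and SNR indicate better prediction and denoising, respectively.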

Table 1 shows the test results of comparative experiments using 10 soft measurement models, and the results verify the necessity and superiority of introducing deep learning into soft measurements to perform semi-supervised modeling.

For the 6 models No. 3 to No. 8, each deep learning model has two groups of experiments, one with preprocessing and one without. As can be seen from Table 1, whichever of DBN, DLSTM and DGRU is adopted, the model with denoising and redundancy removal achieves higher prediction accuracy than the corresponding model without preprocessing, which shows that denoising and redundancy removal of the auxiliary variable data is necessary. Meanwhile, the DBN in models No. 3 and No. 4 is established on a static assumption about the industrial process. Compared with the static DBN model, the estimation accuracy of the dynamic DLSTM and DGRU models is greatly improved, with clearly smaller MAE and MSE values: the feedback structure of the DLSTM and DGRU models captures the dynamic characteristics in the data and thereby improves the prediction performance of the soft measurement model;

in addition, good dynamic performance depends on effective training of the dynamic model. Among the 6 models, model No. 8 achieves the best prediction result, and FIG. 4 is drawn taking this model as an example. FIG. 4 is the loss function curve during training obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network; the curve marked with triangles represents the training set loss and the curve marked with crosses represents the validation set loss. As the number of iteration rounds increases, both loss curves decrease continuously and then stabilize, staying close to each other, which indicates a good fit. The prediction result of the DGRU is better than that of the DLSTM, so the DGRU structure is better suited to this problem; compared with the widely used DLSTM, the SSDGRU-MLR model is worth further promotion in the field of soft measurement modeling;

FIG. 5 shows the CEEMD decomposition result of the inlet temperature variable obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network. Taking the decomposition of the inlet temperature variable as an example, the first row is the inlet temperature signal and the remaining rows are the IMFs obtained by CEEMD decomposition. The IMFs obtained by CEEMD show no obvious mode mixing, so the noise signals can be represented accurately. The denoising and redundancy removal results in Table 2 show that the RMSE of model 8 and model 10 is one order of magnitude smaller than that of model 9 and that the SNR is one order of magnitude larger, which shows that the favorable CEEMD decomposition result gives model 8 and model 10 better denoising performance. The I_OT index is then analyzed: although both PCA and Isomap are suitable for dimension reduction, the linear PCA method emphasizes the orthogonality of the principal components, whereas for strongly nonlinear IMF data the Isomap method keeps the geodesic distance unchanged during feature transformation, so more of the important information of the original variables is retained and a high-quality reconstructed signal with less information loss is obtained. This analysis further verifies the effectiveness and superiority of the denoising and redundancy removal method combining CEEMD and Isomap.

FIG. 6 shows the rotor thermal deformation prediction results obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network. 1000 actual consecutive test samples are selected, and model 1, model 3, model 4 and model 8 in Table 1 are used for the comparison experiment. As can be seen from FIG. 6, compared with the other 3 static models, the rotor deformation predicted by model 8 tracks the changes of the true value better, has a smaller prediction error, and is the most accurate prediction. Model 8 is not only built on serialized data but also adopts the SSDGRU-MLR model to further capture the dynamic characteristics among the data, so its prediction accuracy is markedly improved and its dynamic performance is the strongest of the four models.

In order to further evaluate the dynamic prediction performance of the model, the rotor thermal deformation prediction error of model 8 is also analyzed. FIG. 7 is the prediction error analysis diagram obtained by the soft measurement modeling method of the semi-supervised dynamic soft measurement network, wherein FIG. 7(a) is the prediction error at each moment, FIG. 7(b) is the frequency histogram (with a kernel density estimation curve) of the prediction error, and FIG. 7(c) is the time-delay scatter plot of the prediction error. As can be seen from FIG. 7, the prediction error drifts little from point to point, and the kernel density curve is bell-shaped with approximately zero mean, so the prediction error of the model is approximately normally distributed and the dynamic soft measurement result is reliable. The points in the scatter plot show no overall correlation, which indicates that the proposed model makes effective use of the useful information in each auxiliary variable and of the dynamic characteristics among the data, and that no useful information remains in the errors that could still be used for prediction.
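As an illustrative, non-limiting sketch of this kind of error analysis, the following Python code summarizes a series of prediction errors by its mean and by the correlation between each error and the error `lag` steps earlier, which is a numerical counterpart of a time-delay scatter plot. The function name and the synthetic error series are assumptions made for illustration; they are not the experimental data.

```python
import numpy as np

def lag_correlation(errors, lag=1):
    """Correlation between e[t] and e[t + lag]; values near 0 suggest that
    no linear structure is left in the errors."""
    e = np.asarray(errors, dtype=float)
    return float(np.corrcoef(e[:-lag], e[lag:])[0, 1])

rng = np.random.default_rng(0)
white_errors = rng.standard_normal(1000)      # uncorrelated errors
drifting_errors = np.cumsum(white_errors)     # strongly correlated errors

white_lag1 = lag_correlation(white_errors)
drift_lag1 = lag_correlation(drifting_errors)
white_mean = float(white_errors.mean())
```

Errors that behave like the first series scatter without structure in a time-delay plot, whereas errors like the second series line up along the diagonal and indicate information the model has not yet used.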

By observing fig. 4-fig. 7 and table 1-table 2 and combining the above analysis, it is clear that the soft measurement modeling method of the semi-supervised dynamic soft measurement network of the present invention is effective and feasible and has certain advantages.

The invention relates to a soft measurement modeling method of a semi-supervised dynamic soft measurement network based on CEEMD, Isomap and DGRU, which removes noise and redundancy from the data by combining CEEMD and Isomap. A semi-supervised deep gated recurrent perceptron network (SSDGRU-MLR) formed by the DGRU and the MLP carries out semi-supervised dynamic modeling on the preprocessed serialized data. The SSDGRU-MLR can make use of the large number of unlabeled samples in the process and helps to extract high-level representations of the variables, while the GRU units in the structure capture the dynamic characteristics of the data and propagate them through time, which improves the modeling effect. In the training process of the SSDGRU-MLR, dropout is used to avoid overfitting, and a callback function is designed to ensure that the model trains smoothly. Analysis of the soft measurement prediction experiments shows that, compared with traditional soft measurement methods, the semi-supervised dynamic soft measurement method of this patent predicts variables more accurately; the effectiveness and superiority of the method are demonstrated in the comparison experiment based on the air preheater industrial example.

Table 1 shows the results of comparative experiments using 10 soft measurement models in the examples

Table 2 shows the results of the de-noising and de-redundancy experiments in the examples

While the foregoing description shows and describes several preferred embodiments of the invention, it is to be understood, as noted above, that the invention is not limited to the forms disclosed herein; these are not to be construed as excluding other embodiments, and the invention is capable of use in various other combinations, modifications and environments and of changes within the scope of the inventive concept expressed herein, commensurate with the above teachings or with the skill or knowledge of the relevant art. Modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A soft measurement modeling method of a semi-supervised dynamic soft measurement network is characterized by comprising the following steps:

step 1, denoising and redundancy removing processing are carried out on training set data based on a CEEMD and Isomap method;

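step 2, carrying out serialization and normalization processing on the training set data processed in step 1;

step 3, completing the establishment of the soft measurement model of the semi-supervised dynamic soft measurement network based on the training set data processed in step 2.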

2. The soft measurement modeling method of the semi-supervised dynamic soft measurement network as recited in claim 1, wherein step 1, denoising and redundancy removing processing are performed on training set data based on a CEEMD and Isomap method, and the specific steps are as follows:

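$$\rho\left(X_v(t), c_{vi}(t)\right) = \frac{\mathrm{Cov}\left(X_v(t), c_{vi}(t)\right)}{\sigma_{X_v(t)}\,\sigma_{c_{vi}(t)}} \tag{1}$$

wherein in formula (1), Cov(X_v(t), c_vi(t)) represents the covariance between the original auxiliary variable X_v(t) and its i-th IMF c_vi(t), and σ_{X_v(t)} and σ_{c_vi(t)} are the standard deviations of X_v(t) and of the i-th IMF, respectively.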

3. The soft measurement modeling method of the semi-supervised dynamic soft measurement network of claim 2, wherein in step 1.3, the geodesic distance in the Isomap algorithm is calculated as follows: the geodesic distance between a sample point and a point in its neighborhood is replaced by the Euclidean distance between them; the geodesic distance between a sample point and a point outside its neighborhood is replaced by the shortest path between them on the manifold.

4. The soft measurement modeling method of the semi-supervised dynamic soft measurement network as recited in claim 3, wherein in step 2, the training set data processed in step 1 is serialized and normalized, and the specific steps are as follows:

$$X_{in} = \frac{X' - \mu}{\sigma} \tag{2}$$

wherein in formula (2), X' is the data after serialization, μ and σ respectively represent the mean and the standard deviation, and X_in is the standardized sequence data that is input into the neural network; only the feature data is standardized, and the label data is kept unchanged.

5. The soft measurement modeling method of the semi-supervised dynamic soft measurement network as recited in claim 4, wherein in step 3, the establishment of the SSDGRU-MLR soft measurement model is completed based on the training set data processed in step 2, and the specific steps are as follows:

$$L = \frac{1}{n - ts + 1}\sum_{t=1}^{n - ts + 1}\left(y_t - y_t^{predict}\right)^2 \tag{4}$$

wherein in formula (4), y_t represents the label output corresponding to the t-th sequence sample, n - ts + 1 represents the number of serialized samples, and y_t^predict represents the prediction output of the t-th sequence sample; with minimization of the loss function as the optimization target, the parameters of the whole model are updated and adjusted through the BPTT algorithm, and batch training of the whole SSDGRU-MLR soft measurement model is finally completed over multiple iterations.