CN117112992A - Fault diagnosis method for polyester esterification stage

Info

Publication number: CN117112992A
Application number: CN202310948985.XA
Authority: CN
Inventors: 陈磊; 彭闯; 曹广浩; 郝矿荣; 蔡欣; 隗兵
Original assignee: Donghua University
Current assignee: Donghua University
Priority date: 2023-07-31
Filing date: 2023-07-31
Publication date: 2023-11-24

Abstract

The invention discloses a fault diagnosis method for the polyester esterification stage that combines global supervised learning with episodic metric-based meta-learning, using both the attribute information of individual samples and the similarity information from sample groups. In the global supervised training stage, variational mode decomposition is first used to obtain multi-scale data, the multi-scale components carrying fault characteristics are extracted and fused by multi-scale feature fusion learning, and a triplet loss is used to learn better fine-grained features. The multi-scale feature fusion module is then fixed for episodic meta-learning training, which learns a feature representation so that the original data of each meta-task are mapped into a basic feature space. Finally, a dimension variation prototype module adaptively measures the feature similarity of sample pairs and automatically learns the metric scaling parameters by variational inference to transform the embedding space. The method is simple and solves the problem of fault diagnosis in the limited-data, fully open-set scenario.

Description

Fault diagnosis method for polyester esterification stage

Technical Field

The invention belongs to the technical field of automatic control, and relates to a fault diagnosis method for a polyester esterification stage.

Background

In modern industrial processes, which are increasingly large-scale, high-end, complex and intelligent, process faults can cause serious casualties and economic losses. To prevent this, a series of fault diagnosis techniques have been developed. Traditional methods based on signal processing often require excessive manual intervention and struggle to meet the accuracy and efficiency requirements imposed by large-scale, automated modern equipment. With the large-scale deployment of sensors and the availability of large amounts of data, data-driven approaches have become an effective technique for industrial fault diagnosis.

In practical industrial scenarios, changes in process equipment and operating conditions mean that fault classes may appear under new operating conditions that are not exactly the same as those under the source conditions; this is referred to herein as the open-set fault diagnosis problem. Methods for solving this problem fall mainly into two types. The first is based on a discriminative model and uses a threshold classification scheme: according to empirically set thresholds, the decision maker either rejects an input sample or assigns it to a known category. The second is instance generation, which addresses the challenges posed by spatial diversity in data analysis by generating domain-specific and class-specific data. Although the open-set fault diagnosis problem takes into account that new fault types may occur during the test phase, in practice many faults are destructive and cause significant losses, so few factories allow faults to occur merely to collect samples for training a fault diagnosis system. It is therefore difficult to collect sufficient fault data. Once the amount of labeled data is drastically reduced, the performance of these methods may degrade; this is referred to herein as the fault diagnosis problem in an under-data scenario.

To address the fault diagnosis problem in the under-data scenario, researchers have proposed many solutions from different angles. One approach is data augmentation: a general augmentation function is learned on an auxiliary dataset, or an augmentation strategy is applied directly to the test dataset to increase the number of training samples. However, because data are scarce, the original dataset may not cover all possible fault situations, which limits the generalization ability of the model. Methods based on transfer learning, in turn, obtain transferable fault knowledge from one domain and apply it to different but related domains. However, if the source task and the target fault diagnosis task differ greatly, the pre-trained model may not provide an effective feature representation, which reduces the effectiveness of transfer learning.

In recent years, metric-based meta-learning models have received a great deal of attention; they achieve faster and more accurate classification in few-shot learning by learning a unified, class-independent distance feature space. Document 1 (Reweighted Regularized Prototypical Network for Few-Shot Fault Diagnosis [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022.) proposes a multi-scale dynamic fusion prototype network based on a fuzzy c-means clustering algorithm, which provides a more accurate distance metric reference for nearest neighbor classifiers. Document 2 (Metric-based meta-learning model for few-shot fault diagnosis under multiple limited data conditions [J]. Mechanical Systems and Signal Processing, 2021, 155: 107510.) constructs a convolutional Siamese neural network that accurately learns the classification boundaries between sample pair features by maximizing inter-class distances and minimizing intra-class distances. Document 3 (TRNet: ACross-Component Few-Shot Mechanical Fault Diagnosis [J]. IEEE Transactions on Industrial Informatics, 2022.) proposes a re-weighted regularized prototype network that uses an intra-class re-weighting strategy to reduce the effects of noise and outliers and obtain a stable prototype estimate. However, these methods apply meta-learning directly to fault diagnosis in under-data scenarios; because the models focus on task-level learning and ignore learning at the fault-feature level, their diagnostic performance is generally poor.

The invention addresses the situation in actual industrial engineering where production line A has abundant fault samples, while production line B exhibits new fault categories with very few samples each; this is referred to herein as the fault diagnosis problem in the limited-data, fully open-set scenario. Although metric-based meta-learning offers a possible solution, a large number of hard negative samples are encountered when processing fault data from real process industries. In particular, samples from different categories may be more similar to each other than samples from the same category. This leads to poor fault feature extraction and to misclassification. Furthermore, for high-dimensional, strongly coupled process faults, using only a simple adaptive distance metric mapping exacerbates the difficulty of identifying hard negative samples.

Disclosure of Invention

The invention aims to solve the problems in the prior art and provides a fault diagnosis method for the polyester esterification stage. Specifically, the invention uses variational mode decomposition (VMD) to decompose the original data into modes and obtain multi-scale data; trains a multi-scale feature fusion module in a globally supervised manner to fuse the fault features of the multi-scale data, using a hard negative sample identification module to alleviate the difficulty of identifying hard samples and to learn fine-grained features effectively; fixes the multi-scale feature fusion module and shares it with the meta-task training stage, so that the original data of each meta-task are converted into a basic feature space; and uses variational inference to transform the embedding space into a space that better fits the data, adaptively learning prototype vectors to measure the feature similarity of fault sample pairs.

In order to achieve the above purpose, the invention adopts the following technical scheme:

a method for fault diagnosis in the esterification stage of polyester, comprising the steps of:

(1) Collecting data in the polyester esterification stage of the production line B in the actual production process, and obtaining a fault sample B to be diagnosed;

(2) Experts and technicians perform fault diagnosis on the fault sample B to be diagnosed and divide it into diagnosed fault samples B1, whose fault types are known, and fault samples B2 to be diagnosed, whose fault types are unknown; all B1 samples form a support set S*, and S* contains C fault types;

(3) Decomposing B1 and B2 by utilizing a VMD technology to obtain multi-scale characteristics with the scale number of H;

(4) Group the samples in S* that have the same fault type into one class, and calculate the fault prototype τ_c* of the c-th class (i.e., the cluster center), c = 1, 2, ..., C;

where S_C* denotes the samples of the C fault types in S*, |S_C*| denotes the number of samples of the C fault types in S*, x* denotes the D-dimensional feature vector of a sample in S*, y* denotes the fault type label of a sample in S*, f_θ(x*) denotes the feature vector obtained by extracting features from x* with the trained multi-scale feature fusion module, and h_φ(f_θ(x*)) denotes the metric feature obtained by further processing f_θ(x*) with the trained dimension variation prototype module;

(5) Calculate the Euclidean distance between h_φ(f_θ(x^q)) and τ_c*, where x^q denotes the D-dimensional feature vector of B2, f_θ(x^q) denotes the feature vector obtained by extracting features from x^q with the trained multi-scale feature fusion module, and h_φ(f_θ(x^q)) denotes the metric feature obtained by further processing f_θ(x^q) with the trained dimension variation prototype module;

(6) Calculate the probability that the fault type of B2 belongs to the c-th class, where ŷ^q denotes the predicted label of x^q, α denotes the dimension scaling parameter of the dimension variation prototype module, and τ_u* denotes the fault prototypes of the classes other than class c;

(7) Taking the fault type corresponding to the maximum probability as the fault type of B2;

the multi-scale feature fusion module is formed by sequentially connecting a first one-dimensional convolution layer, a first pooling layer, a second one-dimensional convolution layer, a second pooling layer, a third one-dimensional convolution layer and a fourth one-dimensional convolution layer; the first one-dimensional convolution layer adopts H convolution kernels with the same size and is used for fusing and overlapping multi-scale features with the scale number of H into a single-scale feature map; the purpose of not arranging a pooling layer after the third one-dimensional convolution layer and the fourth one-dimensional convolution layer is to reserve enough information for a subsequent dimension variation prototype module to carry out convolution operation;

The dimension variation prototype module is formed by sequentially connecting a first one-dimensional convolution layer, a first pooling layer, a second one-dimensional convolution layer and a second pooling layer;

the training method of the multi-scale feature fusion module and the dimension variation prototype module is as follows: first, the fault types of the source-domain data, i.e., the fault samples of production line A, are identified in a globally supervised training manner; then the multi-scale feature fusion module is fixed for episodic meta-learning training, which learns a feature representation in which faults of the same type lie closer together in the embedding space than faults of different types; finally, the dimension variation prototype module adaptively measures the feature similarity of sample pairs and automatically learns the metric scaling parameters by variational inference to transform the embedding space; production line A and production line B are two production lines that use the same technological process and run simultaneously.
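The formulas belonging to steps (4) to (6) are not legible in this text. As a hedged reconstruction only, assuming the standard prototypical-network form with a dimension-wise scaling vector α applied to the embedding difference, they could be written as below. The placement of α, the squared distance inside the exponent, the symbol x^q for a B2 sample, and the reading of S_c* as the support samples of class c are all assumptions, not confirmed by the source.

```latex
\begin{aligned}
\tau_c^{*} &= \frac{1}{|S_c^{*}|} \sum_{(x^{*},\,y^{*}) \in S_c^{*}} h_\phi\big(f_\theta(x^{*})\big), \qquad c = 1, \dots, C \\
d\big(x^{q}, \tau_c^{*}\big) &= \big\| h_\phi\big(f_\theta(x^{q})\big) - \tau_c^{*} \big\|_2 \\
p\big(\hat{y}^{q} = c \mid x^{q}\big) &=
\frac{\exp\!\big(-\big\| \alpha \odot \big(h_\phi(f_\theta(x^{q})) - \tau_c^{*}\big) \big\|_2^{2}\big)}
     {\sum_{u=1}^{C} \exp\!\big(-\big\| \alpha \odot \big(h_\phi(f_\theta(x^{q})) - \tau_u^{*}\big) \big\|_2^{2}\big)}
\end{aligned}
```

Step (7) then corresponds to taking the class c with the largest probability.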

As a preferable technical scheme:

In the above fault diagnosis method for the polyester esterification stage, in step (2), experts and technicians diagnose the fault sample B to be diagnosed as follows: they collect and record key data and information, analyze the characteristics and manifestations of the fault according to the process parameters and fault phenomena, compare them with the known fault database, and determine the fault type based on professional domain knowledge and the specific production conditions.

In the above fault diagnosis method for the polyester esterification stage, in step (3), the number of modes is set to 3 when B1 is decomposed by VMD: when the original signal is decomposed into 4 modes, the center frequencies of two adjacent modal components lie closest together, which is regarded as over-decomposition.

In the above fault diagnosis method for the polyester esterification stage, in the multi-scale feature fusion module, the first one-dimensional convolution layer has a kernel size of 64×1×3, a stride of 2×1, 16 kernels and a ReLU activation function; the first pooling layer has a kernel size of 2×1, a stride of 2×1, 16 channels and uses max pooling; the second one-dimensional convolution layer has a kernel size of 3×1, a stride of 2×1, 32 kernels and a ReLU activation function; the second pooling layer has a kernel size of 2×1, a stride of 2×1, 32 channels and uses max pooling; the third one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels and a ReLU activation function; the fourth one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels and a ReLU activation function.

The signal is decomposed into different modes, which serve as multi-scale data; the first layer extracts and fuses the features of the multiple scales, after which the result is treated as a single-scale signal for further feature extraction. The large convolution kernels first extract features at the different scales, which helps capture the more general characteristics of faults, and fuse them; smaller convolution kernels then extract finer-grained fault feature information, which benefits overall feature extraction.

In the above fault diagnosis method for the polyester esterification stage, in the dimension variation prototype module, the first one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels and a ReLU activation function; the first pooling layer has a kernel size of 2×1, a stride of 2×1, 64 channels and uses max pooling; the second one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 128 kernels and a ReLU activation function; the second pooling layer has a kernel size of 2×1, a stride of 2×1, 128 channels and uses max pooling.
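To make the layer specifications above concrete, the following sketch traces how the sequence length shrinks through both modules. It is only an illustration: it assumes unpadded layers, and the input length of 2048 points is hypothetical, since the text does not state the sample length.

```python
def out_len(length: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded 1-D convolution or pooling layer."""
    return (length - kernel) // stride + 1

# (name, kernel, stride, channels) as listed in the text.
FUSION_LAYERS = [
    ("conv1", 64, 2, 16), ("pool1", 2, 2, 16),
    ("conv2", 3, 2, 32), ("pool2", 2, 2, 32),
    ("conv3", 2, 2, 64), ("conv4", 2, 2, 64),
]
PROTOTYPE_LAYERS = [
    ("conv1", 2, 2, 64), ("pool1", 2, 2, 64),
    ("conv2", 2, 2, 128), ("pool2", 2, 2, 128),
]

def trace(length, layers):
    """Return (layer name, channels, output length) for each layer in order."""
    shapes = []
    for name, kernel, stride, channels in layers:
        length = out_len(length, kernel, stride)
        shapes.append((name, channels, length))
    return shapes

INPUT_LENGTH = 2048  # hypothetical; not given in the source
fusion = trace(INPUT_LENGTH, FUSION_LAYERS)
proto = trace(fusion[-1][2], PROTOTYPE_LAYERS)

for row in fusion + proto:
    print(row)
```

Under these assumptions the fusion module ends with 64 channels of length 30 and the prototype module with 128 channels of length 1. With extra pooling after the third and fourth convolution layers the same input would be exhausted before the prototype module finishes, which is consistent with the stated reason for omitting those pooling layers.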

The fault diagnosis method for the polyester esterification stage comprises the following training processes of the multi-scale feature fusion module and the dimension variation prototype module:

(a) Collecting data in the polyester esterification stage of the production line A in the actual production process, and obtaining a fault sample A to be diagnosed;

(b) Performing fault diagnosis on the fault sample A to be diagnosed by experts and technicians to obtain the fault type of the fault sample;

(c) Decomposing the training sample by using a VMD technology to respectively obtain multi-scale features with the scale number of H;

(d) Construct triplet data (x^a, x^p, x^n), where x^a, x^p and x^n are all training samples and serve as the anchor, the positive sample and the negative sample, respectively; x^a and x^p have the same fault type, and x^a and x^n have different fault types; in the feature extraction process, the invention uses the triplet loss to distinguish the least similar samples in the same category (i.e., hard positive samples) from the most similar samples in different categories (i.e., hard negative samples), so that better fine-grained features are learned;

(e) Input the triplet data (x^a, x^p, x^n) into the multi-scale feature fusion module, which outputs the feature vectors f_θ(x^a), f_θ(x^p) and f_θ(x^n);

(f) Introduce two nonlinear fully connected layers to reduce the dimension of the feature vectors so that the Euclidean distance can be used directly as the metric; the reduced feature vectors are FC₂(f_θ(x^a)), FC₂(f_θ(x^p)) and FC₂(f_θ(x^n)), where FC₂ denotes the operation of the two fully connected layers;

(g) Calculate the distance d⁺ between the reduced features of the anchor and the positive sample and the distance d⁻ between the reduced features of the anchor and the negative sample;

(h) Judge whether d⁻ ≥ d⁺ + γ holds, where 0 < γ < 1; if so, fix the parameters of the multi-scale feature fusion module and finish its training; otherwise, go to the next step; the invention introduces the parameter γ to avoid the case d⁺ = d⁻, in which the model cannot determine the category; setting γ increases the distance between the anchor and the negative sample and decreases the distance between the anchor and the positive sample;

(i) Train the parameters of the multi-scale feature fusion module by gradient back-propagation until the loss function L_g converges, where N_s denotes the total number of training samples and n denotes the index of a training sample; then fix the parameters of the multi-scale feature fusion module and finish its training;

(j) Forming a support set S by using part of training samples, forming a query set Q by using part of training samples, wherein no identical training samples exist in S and Q;

(k) Construct M fault diagnosis tasks from S and Q, where the S of each fault diagnosis task contains C fault types, each fault type has K training samples in S and L training samples in Q, and K < L; in the i-th fault diagnosis task T_i, x_{i,j} denotes the D-dimensional feature vector of a training sample in S, y_{i,j} denotes the fault type label of a training sample in S, x^q_{i,j} denotes the D-dimensional feature vector of a training sample in Q, y^q_{i,j} denotes the fault type label of a training sample in Q, (x_{i,j}, y_{i,j}) denotes a training sample in S of the i-th fault diagnosis task, and (x^q_{i,j}, y^q_{i,j}) denotes a training sample in Q of the i-th fault diagnosis task;

(l) Re-parameterize the dimension scaling parameter α of the dimension variation prototype module as α = μ + εσ, where μ denotes the expectation of α, σ denotes the standard deviation of α, and ε denotes random noise with ε_i ~ N(0, 1);

(m) setting initial values of μ, σ;

(n) Group the training samples in S that have the same fault type into one class, and calculate the fault prototype τ_c of the c-th class (i.e., the cluster center), c = 1, 2, ..., C;

where S_C denotes the training samples of the C fault types in S, |S_C| = C × K, f_θ(x_{i,j}) denotes the feature vector obtained by extracting features from x_{i,j} with the trained multi-scale feature fusion module, and h_φ(f_θ(x_{i,j})) denotes the metric feature obtained by further processing f_θ(x_{i,j}) with the dimension variation prototype module;

(o) Calculate the Euclidean distance between h_φ(f_θ(x^q_{i,j})) and τ_c, where f_θ(x^q_{i,j}) denotes the feature vector obtained by extracting features from x^q_{i,j} with the trained multi-scale feature fusion module, and h_φ(f_θ(x^q_{i,j})) denotes the metric feature obtained by further processing f_θ(x^q_{i,j}) with the dimension variation prototype module;

(p) Calculate the probability that the fault type of x^q_{i,j} belongs to class c, where τ_u denotes the fault prototypes of the classes other than class c;

(q) Calculate the total loss L_p of the M tasks, using the following formula:

where the summand denotes the loss of the j-th query sample in Q of the i-th task;

(r) Judge whether L_p has converged; if so, fix μ and σ and finish the training of the dimension variation prototype module; otherwise, go to the next step;

(s) Update μ and σ according to the following rule, and return to step (n);

where I_ψ is the update step size; U denotes the number of fault prototypes of the classes other than class c; and μ₀, σ₀ are the initial values input to the network.
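Steps (j) and (k) describe C-way K-shot task construction with L query samples per class. The sketch below shows one way such tasks could be drawn. It is a simplified assumption rather than the patent's procedure: it samples the support and query samples of each task in a single draw per class, and the names `build_episode`, `n_way`, `k_shot` and `l_query` are hypothetical.

```python
import random
from collections import defaultdict

def build_episode(samples, n_way, k_shot, l_query, rng):
    """Draw one C-way task: K support and L query samples per class, disjoint."""
    by_class = defaultdict(list)
    for x, y in samples:
        by_class[y].append(x)
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        # One draw without replacement keeps support and query disjoint.
        picked = rng.sample(by_class[c], k_shot + l_query)
        support += [(x, c) for x in picked[:k_shot]]
        query += [(x, c) for x in picked[k_shot:]]
    return support, query

rng = random.Random(0)
# Toy labelled data: 4 fault types with 30 placeholder samples each.
data = [((c, i), c) for c in range(4) for i in range(30)]
# M = 8 tasks, C = 3 classes, K = 5 support and L = 10 query samples (K < L).
tasks = [build_episode(data, n_way=3, k_shot=5, l_query=10, rng=rng)
         for _ in range(8)]
```

Each task then holds C × K support samples and C × L query samples, matching the counts given in step (k).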

In the above fault diagnosis method for the polyester esterification stage, in step (b), experts and technicians diagnose the fault sample A to be diagnosed as follows: they collect and record key data and information, analyze the characteristics and manifestations of the fault according to the process parameters and fault phenomena, compare them with the known fault database, and determine the fault type based on professional domain knowledge and the specific production conditions.

In the above fault diagnosis method for the polyester esterification stage, in step (c), the number of modes is set to 3 when the training samples are decomposed by VMD: when the original signal is decomposed into 4 modes, the center frequencies of two adjacent modal components lie closest together, which is regarded as over-decomposition.

The principle of the invention is as follows:

To solve the problem of fault diagnosis in the limited-data, fully open-set scenario, the invention provides a fault diagnosis method for the polyester esterification stage. First, a multi-scale feature fusion module with parameters θ, denoted as the function f_θ, is obtained from the fully labeled source-domain data through global supervised training of the multi-scale feature fusion module together with the hard negative sample identification module. Then f_θ is fixed, and meta-tasks constructed from the source-domain data are used for episodic training of the dimension variation prototype module with parameters φ, denoted as the function h_φ. Finally, the model is verified on the target-domain data. The overall algorithm, a dimension variation prototype network framework based on multi-scale feature fusion, is shown in Fig. 1. The invention makes full use of the attribute information of individual samples and the similarity information from sample groups, and overcomes a limitation of existing fault diagnosis techniques, which assume that every fault type has enough labeled samples, an assumption that does not hold in real, complex process industries with strong noise and strong coupling. The details are as follows:

Modern industrial process signals are often high-dimensional, nonlinear and multi-scale, and are accompanied by strong noise. Most current deep-learning-based fault diagnosis methods consider only some of these characteristics of industrial data, which can cause feature information to be lost during training, and the final diagnostic performance is generally degraded by the strong noise. VMD decomposes the signal into a series of intrinsic mode functions (IMFs) with different frequencies and amplitudes, which filters out noise interference. High-dimensional nonlinear spatial features at different scales are extracted from each intrinsic mode function, providing effective fault features in the time and frequency domains of the process signal.

VMD is an existing mode decomposition technique and a basic data preprocessing method for which code packages are available; only the required number of modes has to be set, and the invention sets it to 3. According to experimental analysis, when VMD decomposes the signal into 4 modes, the center frequencies of two adjacent modal components lie closest together, which can be regarded as mode over-decomposition;
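The over-decomposition check described above can be sketched as a comparison of adjacent center frequencies across trial mode counts. The example below is an illustration only: the center frequencies are made-up placeholders, the threshold `min_gap` is an assumed tuning value that the text does not specify, and the VMD step that would produce the frequencies is left out.

```python
def min_adjacent_gap(center_freqs):
    """Smallest spacing between neighbouring center frequencies."""
    freqs = sorted(center_freqs)
    return min(b - a for a, b in zip(freqs, freqs[1:]))

def choose_mode_count(freqs_by_k, min_gap):
    """Largest mode count whose center frequencies stay at least min_gap apart."""
    ok = [k for k, f in freqs_by_k.items() if min_adjacent_gap(f) >= min_gap]
    return max(ok)

# Hypothetical normalized center frequencies from trial decompositions.
trial = {
    2: [0.02, 0.21],
    3: [0.02, 0.12, 0.31],
    4: [0.02, 0.11, 0.13, 0.31],  # two modes nearly coincide
}
best = choose_mode_count(trial, min_gap=0.05)
print(best)
```

With these placeholder values the four-mode trial is rejected because two center frequencies almost coincide, so three modes are kept.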

the specific theory is as follows:

The goal of VMD is to decompose a real-valued input signal into a discrete number of sub-signals (modes) u_h. The process consists mainly of two stages, constructing the variational problem and solving it:

1) Constructing the variational problem: assume that the original signal is decomposed into H components, each of which is a modal component with limited bandwidth around its center frequency; the bandwidths of the modal signals are then estimated, and the constrained variational problem is as follows:

where H denotes the total number of intrinsic mode functions (IMFs); {u_h} denotes the set of modes; {ω_h} denotes the center frequencies of the modes; δ(t) denotes the impulse function; * denotes convolution; and m denotes the original signal;
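The constrained problem itself is not legible in this text. In the standard VMD formulation, rewritten here with the symbols defined above, it takes the following form. This is the textbook expression, given as a likely reconstruction rather than a confirmed copy of the patent's formula.

```latex
\min_{\{u_h\},\,\{\omega_h\}}
\left\{ \sum_{h=1}^{H} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_h(t) \right] e^{-j \omega_h t} \right\|_2^{2} \right\}
\quad \text{s.t.} \quad \sum_{h=1}^{H} u_h(t) = m(t)
```

Each summand measures the bandwidth of one mode after its spectrum has been shifted to baseband, and the constraint requires the modes to reconstruct the original signal.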

2) Solving the variational problem: to find the optimal solution of the constrained variational problem, a Lagrangian multiplier λ(t) and a quadratic penalty factor are introduced to convert the constrained problem into an unconstrained one, namely:

To find the saddle point of the above Lagrangian function, i.e., the optimal solution of the variational problem, the alternating direction method of multipliers (ADMM) is used to iteratively update each mode u_h and its center frequency ω_h.
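For reference, the augmented Lagrangian and the ADMM updates of standard VMD can be written as follows. This is a sketch of the textbook form, not a confirmed copy of the patent's formulas. The penalty factor is written as β, an assumed symbol chosen only to avoid a clash with the dimension scaling parameter α used elsewhere in this document, and hats denote Fourier transforms.

```latex
\begin{aligned}
\mathcal{L}\big(\{u_h\},\{\omega_h\},\lambda\big)
 &= \beta \sum_{h=1}^{H} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_h(t) \right] e^{-j \omega_h t} \right\|_2^{2}
  + \left\| m(t) - \sum_{h=1}^{H} u_h(t) \right\|_2^{2}
  + \left\langle \lambda(t),\; m(t) - \sum_{h=1}^{H} u_h(t) \right\rangle \\
\hat{u}_h^{\,n+1}(\omega)
 &= \frac{\hat{m}(\omega) - \sum_{i \neq h} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}
         {1 + 2\beta\,(\omega - \omega_h)^{2}} \\
\omega_h^{\,n+1}
 &= \frac{\int_0^{\infty} \omega\, \big|\hat{u}_h^{\,n+1}(\omega)\big|^{2}\, d\omega}
         {\int_0^{\infty} \big|\hat{u}_h^{\,n+1}(\omega)\big|^{2}\, d\omega}
\end{aligned}
```

The mode update acts as a Wiener-type filter centered on ω_h, and the new center frequency is the center of gravity of the mode's power spectrum.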

VMD yields multi-scale features with H scales. The multi-scale one-dimensional process signals serve as the network input, and H learnable convolution kernels of the same size automatically extract features from the multi-scale feature maps, so that comprehensive information in the process signals is learned and effective information fusion is achieved. The multi-scale feature fusion module has a multi-scale input layer; the first convolution layer uses a larger kernel to obtain a large receptive field, the other convolution layers use smaller kernels to extract finer-grained feature information, and the final convolution layers (the third and fourth) are not followed by pooling, which preserves enough information for the convolution operations of the subsequent dimension variation prototype module. Fig. 2 illustrates the first-layer convolution computing architecture of the multi-scale feature fusion module.

So that f_θ learns better fine-grained features during training and can distinguish the least similar samples within a class (hard positive samples) from the most similar samples of different classes (hard negative samples), the invention builds a hard negative sample identification module. Using the source-domain fault data, the module first selects a sample of any fault type as the anchor x^a, then selects another sample of the same fault type as the positive sample x^p and a sample of a different fault type as the negative sample x^n, forming the triplet (x^a, x^p, x^n). Feature extraction by f_θ yields f_θ(x^a), f_θ(x^p) and f_θ(x^n). Because these features are high-dimensional, the Euclidean distance cannot be used on them directly, so two nonlinear fully connected layers are added to reduce the dimension of the feature vectors; the reduced feature vectors are FC₂(f_θ(x^a)), FC₂(f_θ(x^p)) and FC₂(f_θ(x^n)), where FC₂ denotes the operation of the two fully connected layers. The distance d⁺ between the anchor and the positive sample and the distance d⁻ between the anchor and the negative sample are then calculated; d⁺ should be as small as possible and d⁻ as large as possible. When d⁻ ≥ d⁺ + γ, loss = 0 and no optimization is needed; otherwise the model cannot yet separate the positive and negative samples, and loss = d⁺ + γ − d⁻. The loss of global supervised training, L_g, is obtained from this per-triplet loss over the training samples. The relative distance parameter γ is set mainly to avoid the case d⁺ = d⁻, in which the model cannot judge which class a sample belongs to; it makes the distance between the anchor and the negative sample larger and the distance between the anchor and the positive sample smaller. Through the optimization of the hard negative sample identification module, hard positive and hard negative samples can be identified effectively, so that better fine-grained features are learned.
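The per-triplet rule stated above (zero loss once d⁻ ≥ d⁺ + γ, otherwise d⁺ + γ − d⁻) can be sketched in a few lines. The feature vectors below are toy placeholders standing in for the reduced features after the two fully connected layers, and averaging over triplets in `global_loss` is an assumption, since the exact form of L_g is not legible in this text.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, gamma):
    """max(0, d+ + gamma - d-): zero once d- >= d+ + gamma."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos + gamma - d_neg)

def global_loss(triplets, gamma):
    """Mean triplet loss over a batch (the averaging is an assumption)."""
    return sum(triplet_loss(a, p, n, gamma) for a, p, n in triplets) / len(triplets)

# Toy reduced features: one easy triplet and one hard triplet.
easy = ((0.0, 0.0), (0.1, 0.0), (2.0, 0.0))   # negative is far away
hard = ((0.0, 0.0), (1.0, 0.0), (0.5, 0.0))   # negative is closer than positive
gamma = 0.3  # margin, chosen inside the stated range 0 < gamma < 1

print(triplet_loss(*easy, gamma))
print(triplet_loss(*hard, gamma))
print(global_loss([easy, hard], gamma))
```

The easy triplet contributes no loss, so the gradient comes only from the hard triplet, which is how the margin pushes hard negatives away from the anchor.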

f_θ is trained using the supervision information of individual samples in the source domain. These layers are then fixed and shared with the episodic meta-training stage, where they convert the raw data of the meta-tasks into a basic feature space.

After f_θ has been obtained by global supervised training, the invention addresses the fault diagnosis problem in the limited-data, fully open-set scenario through episodic meta-training. To train the dimension variation prototype module, the invention uses a prototype network with scaling parameters to learn prototype vectors that represent the different classes of data. However, a scaling parameter in scalar form can only change the scale of the embedding space and cannot adjust the relative positions of the query set and the support set. The invention therefore extends the scalar scaling parameter to a dimension scaling vector, which makes it possible to transform the embedding space into a space that fits the data.
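A small numerical illustration of this point: a single scalar factor rescales every distance equally and so cannot change which prototype is nearest, whereas a per-dimension scaling vector can. The sketch assumes that α multiplies the embedding difference element-wise before the squared Euclidean distance is taken; the points and scaling values are toy placeholders.

```python
def scaled_sq_dist(query, proto, alpha):
    """Squared Euclidean distance after element-wise scaling by alpha."""
    return sum((a * (q - p)) ** 2 for a, q, p in zip(alpha, query, proto))

def nearest(query, protos, alpha):
    """Name of the prototype with the smallest scaled distance to the query."""
    return min(protos, key=lambda name: scaled_sq_dist(query, protos[name], alpha))

query = (0.0, 0.0)
protos = {"A": (1.0, 0.0), "B": (0.0, 1.5)}

print(nearest(query, protos, (1.0, 1.0)))  # A: unscaled baseline
print(nearest(query, protos, (3.0, 3.0)))  # A: scalar-like scaling keeps the ranking
print(nearest(query, protos, (2.0, 0.5)))  # B: per-dimension scaling changes it
```

A scalar factor still affects how sharp the resulting class probabilities are, but only the vector form can reweight individual embedding dimensions.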

Selecting an appropriate dimension is critical for ensuring that data points can be projected into a linearly separable space. Too low a dimension may cause information loss, while too high a dimension may introduce redundancy. The best dimension depends on the correlations in the data and, as a hyper-parameter, is difficult to determine before training. The invention therefore redefines the prototype network based on dimension scaling from a Bayesian viewpoint and automatically learns the dimension scaling parameter α by variational inference.

Assuming that the posterior distribution of α is p_φ(α|T_s), the predictive distribution can be expressed as:

where the conditional distribution is a discriminative classifier parameterized by φ.

The goal of variational inference is to find an approximate posterior distribution whose KL divergence from the true posterior distribution p_φ(α|T_s) is minimal, expressed as:

The second term above, log p(T_s), does not depend on the approximate posterior, so minimizing the KL divergence is equivalent to maximizing the evidence lower bound (ELBO), expressed as:
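The formulas in this passage are not legible in this text. The standard variational-inference identities that match the description are sketched below. The symbol q_ψ for the approximate posterior (with ψ standing for the variational parameters) and the query notation (x^q, y^q) are assumptions; the patent's exact notation may differ.

```latex
\begin{aligned}
p\big(y^{q} \mid x^{q}, T_s\big)
 &= \int p_\phi\big(y^{q} \mid x^{q}, T_s, \alpha\big)\, p_\phi(\alpha \mid T_s)\, d\alpha \\
\mathrm{KL}\big(q_\psi(\alpha)\,\|\,p_\phi(\alpha \mid T_s)\big)
 &= \mathbb{E}_{q_\psi}\big[\log q_\psi(\alpha) - \log p_\phi(T_s \mid \alpha) - \log p(\alpha)\big] + \log p(T_s) \\
\mathrm{ELBO}(\psi)
 &= \mathbb{E}_{q_\psi}\big[\log p_\phi(T_s \mid \alpha)\big] - \mathrm{KL}\big(q_\psi(\alpha)\,\|\,p(\alpha)\big)
\end{aligned}
```

The second line follows from Bayes' rule, p_φ(α|T_s) = p_φ(T_s|α) p(α) / p(T_s). Because log p(T_s) is constant with respect to ψ, minimizing the KL divergence and maximizing the ELBO are the same optimization.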

dimension variation prototype module h _φ Is represented as:

since sampling directly from the variation parameters is a nonlinear operation, the gradient cannot be calculated directly.

To solve this problem, alpha is re-parameterized asε _i N (0, 1). Let->Obeys N (mu, sigma) ² ) The variation parameter at this time is represented as μ= (μ) ¹ ,μ ² ,...,μ ^W ) ^Τ ,σ＝(σ ¹ ,σ ² ,...,σ ^W ) ^Τ Where W is the variation parameter dimension, and the reparameterization is represented as:

$$\alpha = g_{\mu,\sigma}(\varepsilon) = \mu + \varepsilon\sigma;$$
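The reparameterization above can be sketched in NumPy. The product of the noise and $\sigma$ is taken element-wise, which is an assumption about the intended operation; the function name and the dimension W = 4 are illustrative only.

```python
import numpy as np

def sample_alpha(mu, sigma, rng):
    """Draw alpha = mu + eps * sigma with eps ~ N(0, 1), element-wise."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    eps = rng.standard_normal(mu.shape)  # noise does not depend on mu or sigma
    return mu + eps * sigma, eps

rng = np.random.default_rng(0)
W = 4  # illustrative dimension of the variational parameters
mu = np.full(W, 1.0)
sigma = np.full(W, 0.2)
alpha, eps = sample_alpha(mu, sigma, rng)

# With eps held fixed, alpha is a deterministic function of (mu, sigma),
# so its partial derivatives are simply:
grad_alpha_wrt_mu = np.ones(W)  # d(alpha)/d(mu) = 1
grad_alpha_wrt_sigma = eps      # d(alpha)/d(sigma) = eps
```

Moving the randomness into the noise term is what allows gradients of a loss evaluated at `alpha` to flow back to `mu` and `sigma`.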

Let $q_{\mu,\sigma}(\alpha) \sim N(\mu, \sigma^2)$ and let the prior distribution be $p(\alpha) \sim N(\mu_0, \sigma_0)$; the objective function is expressed as:

The partial derivatives with respect to $\mu$ and $\sigma$ are:

With the Euclidean distance selected as the metric, the distance between the embedded query sample and the class prototype $c_c$ is expressed as:

The update rules for the variational parameters $\mu$ and $\sigma$ are, respectively:

The total loss $L_p$ of the M tasks is given by:

where the summand denotes the loss of the $j$-th sample of $Q$ in the $i$-th task.
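When both the approximate posterior and the prior are diagonal Gaussians, as assumed above, the KL regularizer that appears in a variational lower bound has a closed form. The sketch below evaluates it; it treats the second parameter of each Gaussian as a standard deviation, which is an assumption, and the function name is illustrative. The prior N(1, 1) and the value 0.2 for the posterior spread are the settings reported in the experiments.

```python
import numpy as np

def kl_diag_gaussians(mu, sigma, mu0, sigma0):
    """KL( N(mu, sigma^2) || N(mu0, sigma0^2) ), summed over dimensions."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    sigma0 = np.asarray(sigma0, dtype=float)
    per_dim = (
        np.log(sigma0 / sigma)
        + (sigma ** 2 + (mu - mu0) ** 2) / (2.0 * sigma0 ** 2)
        - 0.5
    )
    return float(np.sum(per_dim))

# Posterior centred on the prior mean but narrower than the prior.
kl_value = kl_diag_gaussians(np.full(4, 1.0), np.full(4, 0.2), mu0=1.0, sigma0=1.0)

# The divergence vanishes when posterior and prior coincide.
kl_zero = kl_diag_gaussians(np.full(4, 1.0), np.full(4, 1.0), mu0=1.0, sigma0=1.0)
```

The term grows as the learned $\mu$ and $\sigma$ move away from the prior, which is how the prior regularizes the scaling vector.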

Advantageous effects

(1) The invention designs a variational prototype network based on multi-scale feature fusion that makes full use of attribute information from individual samples and similarity information from sample groups to solve fault diagnosis in the limited-data, fully open-set scenario.

(2) The invention redefines the prototype network based on a dimension scaling metric from a Bayesian perspective, transforms the embedding space into a space that fits the data, and adaptively measures the feature similarity of sample pairs.

(3) The invention constructs, through analysis, a fault data set for the esterification stage of polyester polymerization and carries out extensive experiments on the method. The results show that MFF-VPNet (the algorithm of the invention) can solve fault diagnosis in the limited-data, fully open-set scenario. Comparison with existing methods verifies its effectiveness and superiority, and it provides a feasible solution for fault diagnosis in real, complex industrial processes.

Drawings

FIG. 1 is a schematic diagram of a dimension variation prototype network framework for multi-scale feature fusion in accordance with the present invention;

FIG. 2 is a schematic diagram of a multi-channel feature fusion module according to the present invention; only three channels are shown in the figure, and the processing modes of the other channels are the same;

FIG. 3 is a schematic diagram of an industrial process for the polyester polymerization esterification stage of the present application.

Detailed Description

The application is further described below in conjunction with the detailed description. It is to be understood that these examples are illustrative of the present application and are not intended to limit the scope of the present application. Furthermore, it should be understood that various changes and modifications can be made by one skilled in the art after reading the teachings of the present application, and such equivalents are intended to fall within the scope of the application as defined in the appended claims.

(2) Experts and technicians collect and record key data and information, analyze the characteristics and manifestations of the faults according to process parameters and fault phenomena, compare them with known fault databases, and assign fault types according to domain expertise and the specific production conditions. The fault samples B to be diagnosed are thereby divided into diagnosed fault samples B1 with known fault types and fault samples B2 with unknown fault types; all B1 form a support set $S^*$, and $S^*$ contains $C^*$ fault types;

(3) Decompose B1 and B2 using variational mode decomposition (VMD) to obtain multi-scale features with H scales; when B1 is decomposed, the number of modes is set to 3;

(4) Group the samples in $S^*$ that have the same fault type into one class, and compute the fault prototype $\tau_{c^*}$ of the $c^*$-th class, $c^* = 1, 2, \ldots, C^*$;

where the symbols denote, in order: the samples of the $C^*$ fault types in $S^*$; the number of samples of the $C^*$ fault types in $S^*$; the D-dimensional feature vector of a sample in $S^*$; the fault type label of that sample; the feature vector obtained by applying the trained multi-scale feature fusion module to the sample; and the metric feature obtained by further processing that feature vector with the trained dimension variation prototype module;
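The prototype step can be sketched as follows. The sketch makes two assumptions that follow the usual prototypical-network recipe and are not confirmed by the text: each prototype is the mean metric feature of its class, and class probabilities are a softmax over negative scaled squared Euclidean distances. The function names and toy features are hypothetical; the labels II1 and II5 are two of the production line B fault types named in the experiments.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Mean embedding per fault type; returns (sorted class ids, prototype matrix)."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def class_probabilities(query, protos, alpha):
    """Softmax over negative alpha-scaled squared Euclidean distances."""
    diff = np.asarray(alpha, dtype=float) * (protos - np.asarray(query, dtype=float))
    logits = -np.sum(diff ** 2, axis=1)
    logits = logits - logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Toy metric features standing in for h_phi(f_theta(x)) of labelled B1 samples.
support = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0]])
support_labels = np.array(["II1", "II1", "II5", "II5"])

classes, protos = class_prototypes(support, support_labels)
probs = class_probabilities([0.1, 0.1], protos, alpha=np.ones(2))
predicted = classes[int(np.argmax(probs))]
```

An unlabelled B2 sample would be embedded in the same way and assigned the fault type whose prototype yields the highest probability.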

the multi-scale feature fusion module is formed by sequentially connecting a first one-dimensional convolution layer, a first pooling layer, a second one-dimensional convolution layer, a second pooling layer, a third one-dimensional convolution layer and a fourth one-dimensional convolution layer; the first one-dimensional convolution layer adopts H convolution kernels with the same size and is used for fusing and overlapping multi-scale features with the scale number of H into a single-scale feature map;

In the multi-scale feature fusion module, the first one-dimensional convolution layer has a kernel size of 64×1×3, a stride of 2×1, 16 kernels, and a ReLU activation function; the first pooling layer has a kernel size of 2×1, a stride of 2×1, 16 kernels, and uses max pooling; the second one-dimensional convolution layer has a kernel size of 3×1, a stride of 2×1, 32 kernels, and a ReLU activation function; the second pooling layer has a kernel size of 2×1, a stride of 2×1, 32 kernels, and uses max pooling; the third one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and a ReLU activation function; the fourth one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and a ReLU activation function;

In the dimension variation prototype module, the first one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and a ReLU activation function; the first pooling layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and uses max pooling; the second one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 128 kernels, and a ReLU activation function; the second pooling layer has a kernel size of 2×1, a stride of 2×1, 128 kernels, and uses max pooling;

The multi-scale feature fusion module and the dimension variation prototype module are trained as follows: first, the fault types of the source-domain data, i.e. the fault samples of production line A, are identified through global supervised training; then the multi-scale feature fusion module is fixed and task-level meta-learning is performed to learn a feature representation in which faults of the same type lie closer together in the embedding space than faults of different types; finally, the dimension variation prototype module adaptively measures the feature similarity of sample pairs and, through variational inference, automatically learns the metric scaling parameters that reshape the embedding space. Production line A and production line B are two production lines that use the same process and run simultaneously. The specific training steps are as follows:

(b) Experts and technicians collect and record key data and information, analyze the characteristics and manifestations of the faults according to process parameters and fault phenomena, compare them with known fault databases, and assign fault types according to domain expertise and the specific production conditions;

(c) Decompose the training samples using VMD to obtain multi-scale features with H scales; when the training samples are decomposed, the number of modes is set to 3;

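(d) Construct triplet data consisting of an anchor sample, a positive sample, and a negative sample, all of which are training samples; the anchor and the positive sample have the same fault type, while the anchor and the negative sample have different fault types;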

(f) Introduce two nonlinear fully connected layers to reduce the dimension of the feature vector and obtain the reduced feature vector, where $FC_2$ denotes the operation of passing through the two fully connected layers;

(g) Compute the distances $d^+$ and $d^-$;

(h) Determine whether $d^- \ge d^+ + \gamma$ holds, where $0 < \gamma < 1$; if so, fix the parameters of the multi-scale feature fusion module, which completes its training; otherwise, proceed to the next step;

(i) Train the parameters of the multi-scale feature fusion module by gradient backpropagation until the loss function $L_g$ converges, where $N_s$ denotes the total number of training samples and $n$ denotes the index of a training sample; then fix the parameters of the multi-scale feature fusion module, which completes its training;
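The margin condition in step (h) corresponds to a hinge-style triplet loss. The following is a minimal sketch, assuming that $d^+$ and $d^-$ are Euclidean distances from the anchor to the positive and negative samples; the function name is illustrative, and $\gamma = 0.3$ is the value reported in the experiments.

```python
import numpy as np

def triplet_hinge(anchor, positive, negative, gamma=0.3):
    """Return (d_pos, d_neg, loss) with loss = max(0, d_pos - d_neg + gamma)."""
    anchor, positive, negative = (
        np.asarray(v, dtype=float) for v in (anchor, positive, negative)
    )
    d_pos = float(np.linalg.norm(anchor - positive))  # anchor to same-type sample
    d_neg = float(np.linalg.norm(anchor - negative))  # anchor to different-type sample
    return d_pos, d_neg, max(0.0, d_pos - d_neg + gamma)

# Well-separated triplet: d_neg >= d_pos + gamma, so the loss is zero.
d_pos_ok, d_neg_ok, loss_ok = triplet_hinge([0.0, 0.0], [0.1, 0.0], [1.0, 0.0])

# Violating triplet: the negative sample is too close to the anchor.
d_pos_bad, d_neg_bad, loss_bad = triplet_hinge([0.0, 0.0], [0.1, 0.0], [0.2, 0.0])
```

A zero loss means the triplet already satisfies the condition checked in step (h); a positive loss provides the gradient signal used in step (i).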

(k) Construct M fault diagnosis tasks from $S$ and $Q$, where the $S$ of each fault diagnosis task contains $C$ fault types, and each fault type has $K$ training samples in $S$ and $L$ training samples in $Q$, with $K < L$. In the $i$-th fault diagnosis task, $x_{i,j}$ denotes the D-dimensional feature vector of a training sample in $S$ and $y_{i,j}$ its fault type label, so that $(x_{i,j}, y_{i,j})$ denotes a training sample in $S$ of the $i$-th task; the D-dimensional feature vectors, fault type labels, and training samples in $Q$ of the $i$-th task are denoted analogously;

(m) setting initial values of μ, σ;

(n) Group the training samples in $S$ that have the same fault type into one class, and compute the fault prototype $\tau_c$ of the $c$-th class, $c = 1, 2, \ldots, C$;

where $S_C$ denotes the training samples of the $C$ fault types in $S$, $|S_C| = C \times K$; $f_\theta(x_{i,j})$ denotes the feature vector obtained by applying the trained multi-scale feature fusion module to $x_{i,j}$; and $h_\phi(f_\theta(x_{i,j}))$ denotes the metric feature obtained by further processing $f_\theta(x_{i,j})$ with the dimension variation prototype module;

(q) Compute the total loss $L_p$ of the M tasks using the following formula:

The validity of the method model of the invention is verified by specific experimental data:

the esterification stage is the first stage of the polyester polymerization process, and is a complex-flow industrial process with complex chemical process and various production equipment; for the product viscosity, as long as the esterification rate of the esterification reaction reaches within a required range, the viscosity control of the product can reach within the required range; the real-time state monitoring diagnosis and fault data analysis of the system process should be started from the esterification reaction at first;

According to the process principle of Zimmer polyester esterification production, PTA and EG are mixed in proportion and fed through a pipeline into the bottom of the esterification reactor, with the material flow controlled by a feed pump; the esterification reactor is a two-chamber vertical reactor with no stirring device, and the esterification reaction is kept uniform by the natural circulation formed by the excess EG; under high-temperature conditions, the vaporized ethylene glycol and water vapor are separated in the esterification separation column; the water vapor leaves from the top of the column and is recycled after condensation; the fractionated ethylene glycol leaves from the bottom of the column and can be reused; the process flow of the esterification stage is shown in FIG. 3.

The esterification reaction is carried out at high temperature and positive pressure and is governed by several reaction parameters: temperature, liquid level, pressure, the molar ratio n(EG)/n(PTA), and the reflux EG can all affect the reaction; based on these characteristics and field experience, the 20 monitoring variables shown in Table 1 are selected on each production line (production line A and production line B), comprising 11 control variables and 9 process variables, which together comprehensively characterize the production conditions of the esterification stage;

TABLE 1 polyester esterification stage monitoring variables

The experimental data are effective sensor data collected by DSC equipment of the two production lines in the esterification stage;

variable analysis is carried out on the collected sensor data according to the existing literature data and reaction mechanism, and finally, the description of the fault type and the sample size of the polyester esterification stage are obtained, as shown in table 2:

TABLE 2 description of polyester esterification stage fault types


As shown in Table 2, production lines A and B each contain 8 fault types, 6 of which are shared by both production lines;

In the training stage, the meta-learning of the invention is carried out on production line A, which provides the support set S and the query set Q; after the network has been trained, the fault samples on production line B whose types have been determined by expert and technician analysis (for example, the labelled samples of the three types II1, II5 and II7) are used to compute the fault prototypes, and the network is then tested on the remaining fault samples of the three types II1, II5 and II7. Specifically, the fault samples of production line A (I1, I2, I3, I4 and I6) are used to train the network, the fault samples of production line B (II1, II5 and II7) are used for testing, and the accuracy of the model is verified in two limited-data open-set diagnosis scenarios, 3-way-10-shot and 3-way-20-shot; although the sensor layouts of the two production lines are identical, their operating conditions differ considerably; each training fault class uniformly contains 750 samples and each test fault class uniformly contains 350 samples, and each sample contains 32 consecutive time steps; the samples are normalized using Z-Score standardization to obtain high-quality training and test results;
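An N-way K-shot task such as the 3-way-10-shot setting above can be assembled as follows. This is an illustrative sketch: `sample_episode`, the integer placeholders that stand in for 32-step sensor windows, and the choice of 15 query samples per class are assumptions; the text only requires that each class has more query samples than support samples (K < L).

```python
import random

def sample_episode(samples_by_class, n_way, k_shot, n_query, rng):
    """Pick n_way classes, then k_shot support and n_query query samples per class."""
    classes = rng.sample(sorted(samples_by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(samples_by_class[c], k_shot + n_query)
        support += [(x, c) for x in picked[:k_shot]]
        query += [(x, c) for x in picked[k_shot:]]
    return support, query

# 750 placeholder samples for each production line A training fault class.
train_pool = {name: list(range(750)) for name in ["I1", "I2", "I3", "I4", "I6"]}

rng = random.Random(0)
support, query = sample_episode(train_pool, n_way=3, k_shot=10, n_query=15, rng=rng)
```

Calling `sample_episode` M times would yield the M fault diagnosis tasks; sampling without replacement keeps the support and query sets of each class disjoint.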

The fault samples are decomposed using VMD with the number of scales set to 3; the triplet loss hyperparameter is set to $\gamma = 0.3$; the prior distribution of the dimension scaling parameter is set to $p(\alpha) = N(1, 1)$, the variational parameters are initialized to $\mu_{init} = 100$ and $\sigma = 0.2$, and the regularization parameter is set to $\eta = 0.7$; to avoid overfitting, the dropout rate is set to 0.2; the whole model is optimized with the Adam optimizer with an initial learning rate of 1e-4 and a learning-rate decay rate of 0.8, the learning rate being decayed once every 20 iterations;
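The learning-rate schedule described above can be read as a step decay. The sketch below assumes that the rate is multiplied by 0.8 after every 20 iterations, counted from zero; whether "iteration" refers to an optimizer step or an epoch is not stated, and the function name is illustrative.

```python
def learning_rate(iteration, base_lr=1e-4, decay=0.8, step=20):
    """Step decay: multiply base_lr by `decay` once per completed block of `step` iterations."""
    return base_lr * decay ** (iteration // step)

# Rate in effect at the start of the first four 20-iteration blocks.
schedule = [learning_rate(i) for i in (0, 20, 40, 60)]
```

In PyTorch the same schedule is commonly obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.8)` wrapped around an Adam optimizer, although the text does not say which framework was used.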

To evaluate the effectiveness and advancement of the algorithm model of the invention (MFF-VPNet), several classical machine learning algorithms for fault diagnosis with limited data, namely FDDPN, MRN, CFDM, FSM and TRNet, are also applied to identify the test samples from the fault samples; the identification results are shown in Table 3:

TABLE 3 Identification accuracy on the target production line test faults at different sample sizes

Compared with the classical machine learning algorithm, the algorithm model disclosed by the invention can more accurately identify the fault type, achieves a considerable effect in the reaction process of the real polyester esterification stage, further proves the effectiveness and the advancement of the proposed model, and can solve the problem of fault diagnosis under the full open set scene of limited data.

The data used in the invention were constructed by analyzing real, complex process-industry data; the process is complicated, the working environment and operating conditions are harsh, and the data are characterized by strong noise and strong coupling. The comparison methods (FDDPN, MRN, CFDM, FSM, TRNet), by contrast, were mostly validated on simulated data sets, and simulated data cannot adequately reflect the actual behavior of an industrial process.

Claims

1. A method for diagnosing faults in the esterification stage of polyester, comprising the steps of:

(4) Group the samples in $S^*$ that have the same fault type into one class, and compute the fault prototype $\tau_{c^*}$ of the $c^*$-th class, $c^* = 1, 2, \ldots, C^*$;

where the symbols denote, in order: the samples of the $C^*$ fault types in $S^*$; the number of samples of the $C^*$ fault types in $S^*$; the D-dimensional feature vector of a sample in $S^*$; the fault type label of that sample; the feature vector obtained by applying the trained multi-scale feature fusion module to the sample; and the metric feature obtained by further processing that feature vector with the trained dimension variation prototype module;

(5) Compute the Euclidean distance between the metric feature of B2 and the fault prototype $\tau_{c^*}$, where the symbols denote, in order: the D-dimensional feature vector of B2; the feature vector obtained by applying the trained multi-scale feature fusion module to it; and the metric feature obtained by further processing that feature vector with the trained dimension variation prototype module;

(6) Compute the probability that the fault type of B2 belongs to the $c^*$-th class, where the symbols denote, in order: the predicted label of B2; $\alpha$, the dimension scaling parameter of the dimension variation prototype module; and the fault prototypes of the classes other than the $c^*$-th class;

2. The method for diagnosing faults in the esterification stage of polyester according to claim 1, wherein in step (2), the experts and technicians diagnose the fault samples B to be diagnosed as follows: they collect and record key data and information, analyze the characteristics and manifestations of the faults according to process parameters and fault phenomena, compare them with known fault databases, and assign fault types according to domain expertise and the specific production conditions.

3. The method for diagnosing a failure in an esterification stage of polyester according to claim 1, wherein in the step (3), the number of modes is set to 3 when decomposing B1 by VMD technique.

4. The method for diagnosing faults in the esterification stage of polyester according to claim 1, wherein in the multi-scale feature fusion module, the first one-dimensional convolution layer has a kernel size of 64×1×3, a stride of 2×1, 16 kernels, and a ReLU activation function; the first pooling layer has a kernel size of 2×1, a stride of 2×1, 16 kernels, and uses max pooling; the second one-dimensional convolution layer has a kernel size of 3×1, a stride of 2×1, 32 kernels, and a ReLU activation function; the second pooling layer has a kernel size of 2×1, a stride of 2×1, 32 kernels, and uses max pooling; the third one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and a ReLU activation function; the fourth one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and a ReLU activation function.

5. The method for diagnosing faults in the esterification stage of polyester according to claim 1, wherein in the dimension variation prototype module, the first one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and a ReLU activation function; the first pooling layer has a kernel size of 2×1, a stride of 2×1, 64 kernels, and uses max pooling; the second one-dimensional convolution layer has a kernel size of 2×1, a stride of 2×1, 128 kernels, and a ReLU activation function; the second pooling layer has a kernel size of 2×1, a stride of 2×1, 128 kernels, and uses max pooling.

6. The method for diagnosing faults in the esterification stage of polyester according to claim 1, wherein the multi-scale feature fusion module and the dimension variation prototype module are trained as follows:

(d) Construct triplet data consisting of an anchor sample, a positive sample, and a negative sample, all of which are training samples; the anchor and the positive sample have the same fault type, while the anchor and the negative sample have different fault types;

(g) Compute the distances $d^+$ and $d^-$;

(h) Determine whether $d^- \ge d^+ + \gamma$ holds, where $0 < \gamma < 1$; if so, fix the parameters of the multi-scale feature fusion module, which completes its training; otherwise, proceed to the next step;

(m) setting initial values of μ, σ;

(q) Compute the total loss $L_p$ of the M tasks using the following formula:

7. The method for diagnosing faults in the esterification stage of polyester according to claim 6, wherein in step (b), the experts and technicians diagnose the fault samples A to be diagnosed as follows: they collect and record key data and information, analyze the characteristics and manifestations of the faults according to process parameters and fault phenomena, compare them with known fault databases, and assign fault types according to domain expertise and the specific production conditions.

8. The method of claim 6, wherein in step (c), the number of modes is set to 3 when decomposing the training sample by VMD technique.