CN112434729A - Fault intelligent diagnosis method based on layer regeneration network under class imbalance sample - Google Patents


Publication number
CN112434729A
CN112434729A
Authority
CN
China
Prior art keywords
model
new
output
training
sample
Prior art date
Legal status
Granted
Application number
CN202011244156.6A
Other languages
Chinese (zh)
Other versions
CN112434729B (en)
Inventor
陈景龙 (Chen Jinglong)
李芙东 (Li Fudong)
訾艳阳 (Zi Yanyang)
Current Assignee
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202011244156.6A
Publication of CN112434729A
Application granted
Publication of CN112434729B
Legal status: Active

Classifications

    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. likelihood ratio or false-acceptance versus false-rejection rate
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06N 3/045: Neural networks; combinations of networks
    • G06F 2218/08: Pattern recognition adapted for signal processing; feature extraction
    • G06F 2218/12: Pattern recognition adapted for signal processing; classification; matching


Abstract

The invention discloses an intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples. An acceleration sensor collects the raw signal of mechanical equipment in operation; the time series is cut into fixed-length segments to form a data sample set, and each sample is normalized. The samples are then labeled by class and divided into a pre-training set, a training set, and a test set. A fault diagnosis model based on a layer regeneration network is constructed and pre-trained on the pre-training set, yielding a model that recognizes the health states in the old-task data. The model is then trained on the training set, with the fully-connected-layer parameters fine-tuned, so that it also recognizes the health states in the new-task data. Finally, all network parameters are adjusted by knowledge distillation, which restores the network's ability to recognize the old-task data. The invention greatly reduces data storage costs and helps bring intelligent diagnosis methods into practical engineering use.

Description

Fault intelligent diagnosis method based on layer regeneration network under class imbalance sample
Technical Field
The invention relates to the technical field of fault diagnosis of mechanical equipment, and in particular to an intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples.
Background
Deep-learning-based fault diagnosis faces several obstacles on the way from theory to practice. One of them is how a deployed model should handle data of previously unseen classes. Once a model has been trained and put into service, the set of data classes it can recognize is fixed; yet real operating conditions are too complex for every failure mode to be anticipated during training. When data of a new, unanticipated class arises during equipment operation, the model cannot recognize it, so the model must be updated to do so.
The fine-tuning approach adapts the model directly from its old-task knowledge, but because the parameters change during training, performance on the old task typically drops sharply. Joint training uses all new-task and old-task data and therefore handles both tasks, but its data demands are enormous: as task data accumulate, storage costs grow and model updates slow down considerably. For the problem of updating model parameters under class-imbalanced samples, a new intelligent fault diagnosis technique for mechanical equipment is needed that requires little data yet preserves recognition performance on both the new and the old task.
Disclosure of Invention
The invention overcomes the defects of the prior art by providing an intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples. Using only new-task data, with no old-task samples, it constructs a layer regeneration network and trains it by distillation so that the network automatically extracts the data's own features together with features implicit to the other classes. The method balances the model's performance on the new and old tasks, realizes update training of the model parameters, greatly reduces data storage costs, and helps bring intelligent diagnosis methods into practical engineering use.
In order to achieve the purpose, the invention adopts the following technical scheme:
An intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples comprises the following steps:
Step 1: acquire raw signal data from mechanical equipment in operation using an acceleration sensor, cut the signal into fixed-length segments to obtain a data sample set, and apply time-series preprocessing to each sample so that all sequence samples share the same mean and variance;
Step 2: label the sequence samples by class and divide them into a pre-training set, a training set, and a test set; the pre-training set holds the old-task data and labels, the training set holds the new classes of data and labels encountered in application, called the new task, and the test set holds both old-task and new-task data and labels;
Step 3: construct a fault diagnosis model based on a layer regeneration network, consisting of a feature extraction module and a new-task state recognition module, and train it on the pre-training set from step 2 to obtain a model that recognizes the old-task health states;
Step 4: update the parameters of the model trained in step 3 using the training set from step 2, so that the model can recognize the new-task health states while its performance degradation on the old task is mitigated;
Step 5: diagnose the states of the test set samples from step 2 using the model trained in step 4.
Further, the time-series preprocessing in step 1 uses zero-mean normalization. For a sample $\{a_1, a_2, \ldots, a_n\}$ in the data sample set, the calculation is:

$$\bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i$$

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^2}$$

$$x_i = \frac{a_i - \bar{a}}{s}$$

where $a_i$ is the i-th data value of the sample, $n$ is the interception length of the raw signal data, $\bar{a}$ is the sample mean, and $s$ is the sample standard deviation. The new sequence $\{x_1, x_2, \ldots, x_n\}$ has mean 0 and variance 1, and is dimensionless.
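This normalization is easy to reproduce. Below is an illustrative sketch (not part of the patent) in Python with numpy; the function name and the synthetic sample are assumptions for the example:

```python
import numpy as np

def zero_mean_normalize(a):
    """Zero-mean normalization of one intercepted signal sample:
    x_i = (a_i - mean(a)) / std(a), with the 1/n formulas above."""
    a = np.asarray(a, dtype=float)
    a_bar = a.mean()
    s = a.std()  # population standard deviation, matching the 1/n form
    return (a - a_bar) / s

# Example: a synthetic vibration-like sample of length n = 1024
rng = np.random.default_rng(0)
sample = 3.0 + 0.5 * rng.standard_normal(1024)
x = zero_mean_normalize(sample)
```

The resulting sequence has mean 0 and unit variance regardless of the sensor's offset and scale, which is the point of the preprocessing step.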
Further, the feature extraction module in step 3 is built from a one-dimensional convolutional neural network and comprises four convolution-pooling stages and three fully-connected layers. The convolution kernel size decreases with depth: the four convolutional layers use kernel sizes 9, 7, 5 and 3, all with stride 1 and edge zero-padding so that input and output sizes are equal. All pooling layers use max pooling, with pooling sizes 4, 2 and 2 and strides 4, 2 and 2 respectively.
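The text lists three pooling sizes for the four convolution-pooling stages. The sketch below (not part of the patent) traces the sequence length through the stack, assuming a fourth pooling size of 2 to complete the four pairs:

```python
def output_length(n_in, pool_sizes=(4, 2, 2, 2)):
    """Trace the sequence length through conv-pool stages.
    The fourth pooling size of 2 is an assumption, not stated in the text."""
    n = n_in
    for p in pool_sizes:
        # conv: stride 1 with edge zero-padding keeps the length unchanged
        # max pool of size p with stride p divides the length by p
        n = n // p
    return n

print(output_length(1024))  # 1024 -> 256 -> 128 -> 64 -> 32
```

With samples of length 1024 (as in the embodiment below), the convolutional stack would hand a length-32 feature map to the fully-connected layers under this assumption.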
Further, the new-task state recognition module in step 3 consists of two fully-connected layers; its input is the output of the first fully-connected layer of the feature extraction module.
Further, the parameter update training in step 4 comprises two stages: a parameter fine-tuning stage and a distillation training stage. The fine-tuning stage makes the model converge quickly on the new task, but also makes its old-task performance drop sharply; the distillation training stage then uses knowledge distillation to recover the model's performance on the old task.
Further, in the parameter fine-tuning stage, the convolutional-layer parameters are frozen and the fully-connected-layer parameters are trained on the training-set sequence samples until the loss converges.
Further, during the parameter fine-tuning stage, the cross entropy of the model output is used as the loss, and the optimization objective is to minimize it:

$$\min\; -y \log\left(q(x_{new})\right)$$

where $y$ is the actual label of the training-set data, and $q(x_{new})$ is the prediction output by the model, obtained by concatenating the last-layer outputs of the new-task state recognition module and the feature extraction module and applying a Softmax operation.
Further, in the distillation training stage the optimization objective has two parts: minimizing the cross-entropy loss, which preserves the model's recognition accuracy on the new task, and minimizing the Euclidean distance between the distillation output and the original output, which mitigates the model's performance degradation on the old task.
Further, for minimizing the cross-entropy loss, the cross entropy between the model output and the training-set labels is the loss, and the optimization objective is to minimize it:

$$\min\; -y \log\left(q'(x_{new})\right)$$

where $q'(x_{new})$ is the output of the parameter-fine-tuned model after the Softmax operation.
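As a hypothetical illustration of this fine-tuning loss, the two heads' logits are spliced, passed through Softmax, and scored by cross entropy. The names and dimensions below are assumptions for the example, not part of the patent:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def fine_tune_loss(old_logits, new_logits, y_onehot):
    """Cross entropy over the spliced outputs: the old-task head of the
    feature extraction module and the new-task state recognition module
    are concatenated before the Softmax operation."""
    q = softmax(np.concatenate([old_logits, new_logits]))
    return -np.sum(y_onehot * np.log(q + 1e-12))

# Hypothetical example: three old health states plus one new state,
# with the true label being the new state.
y = np.array([0.0, 0.0, 0.0, 1.0])
loss = fine_tune_loss(np.array([0.2, -1.0, 0.1]), np.array([2.5]), y)
```

Raising the new-state logit lowers this loss, which is what drives the fine-tuning stage toward recognizing the new task.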
Further, for minimizing the Euclidean distance between the distillation output and the original output, the output of the model's feature extraction module $\{z_1^{o}, \ldots, z_m^{o}\}$ and the output of the new-task state recognition module $\{z_1^{n}, \ldots, z_k^{n}\}$ are concatenated into a new output. For the model output $z = \{z_1, z_2, \ldots, z_n\}$, a generalized Softmax function is used:

$$q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}$$

where $T$ is the temperature (the larger $T$, the softer the output probabilities), and $q_i$ is the probability of the i-th state after the Softmax operation.

The Euclidean distance between the distillation output of the model in training and the output of the parameter-fine-tuned model participates in training:

$$dist\left(q^{R}(x_{new})/T,\; q^{T}(x_{new})/T\right)$$

where $dist(\cdot)$ is the Euclidean distance function, $q^{R}(x_{new})$ is the output of the parameter-fine-tuned model, and $q^{T}(x_{new})$ is the distillation output of the model trained with the Euclidean distance metric.

The overall optimization objective of this stage is:

$$\min\left(-y \log\left(q^{T}(x_{new})\right) + \lambda\, dist\left(q^{R}(x_{new})/T,\; q^{T}(x_{new})/T\right)\right)$$

where $\lambda$ is a weighting factor.
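The staged objective can be sketched numerically as follows. This is illustrative only: the terms follow the formulas above literally, and applying the temperature inside the cross-entropy term is an assumption of the sketch:

```python
import numpy as np

def softened_softmax(z, T=1.0):
    """Generalized Softmax q_i = exp(z_i/T) / sum_j exp(z_j/T)."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_objective(z_train, z_ref, y_onehot, T=2.0, lam=1.0):
    """min( -y log(q_T) + lambda * dist(q_R/T, q_T/T) ), where q_T is the
    distillation output of the model in training and q_R is the output of
    the fine-tuned reference model."""
    q_T = softened_softmax(z_train, T)
    q_R = softened_softmax(z_ref, T)
    ce = -np.sum(y_onehot * np.log(q_T + 1e-12))
    dist = np.linalg.norm(q_R / T - q_T / T)  # Euclidean distance term
    return ce + lam * dist
```

When the two models produce identical logits, the distance term vanishes and only the cross entropy remains, so the distillation penalty is active exactly when the updated model drifts away from the fine-tuned reference on the old classes.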
Compared with the prior art, the invention has the following beneficial technical effects:
1) The invention uses a deep convolutional neural network for feature extraction and operating-state recognition on mechanical signals. It effectively extracts the sensitive features in the signal and removes the traditional feature-extraction process's dependence on manual experience.
2) The method balances the new task and the old task: it strengthens learning of the new task while slowing the model's performance decline on the old task. Compared with plain fine-tuning, it improves the model's old-task performance by roughly 30-40%.
3) The method needs no old-task data during training: a model already trained on the old task can mine old-task-related information from the new-task data and be trained with it. The method therefore saves substantial data storage cost and helps bring intelligent diagnosis methods into engineering practice.
4) During equipment operation, the method can keep learning autonomously from new-task data, updating itself with the ever-growing body of state data as the equipment runs.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic structural diagram of a detection model according to the present invention;
FIG. 3 is a diagram showing the results of the detection in the embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and examples, so that those skilled in the art may better understand it. The specific embodiments described here merely illustrate the invention and do not restrict it. For convenience, the drawings show only the parts relevant to the invention. The embodiments and their features may be combined with one another provided they do not conflict.
An intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples, shown in fig. 1, comprises the following steps:
Step 1: acquire raw signal data from mechanical equipment in operation using an acceleration sensor, cut the signal into fixed-length segments to obtain a data sample set, and apply time-series preprocessing to each sample so that all sequence samples share the same mean and variance.
the data normalization preprocessing uses zero-mean normalization on the sequence { a1,a2,...,anThe calculation formula is:
Figure BDA0002768267170000061
Figure BDA0002768267170000062
Figure BDA0002768267170000063
wherein, aiIs the ith data value of the sample; n is the length of intercepting original signal data; a is a sample mean value; s is the sample variance. New sequence { x1,x2,...,xnMean 0, variance 1, and dimensionless.
Step 2: label the sequence samples by class and divide them into a pre-training set, a training set, and a test set; the pre-training set holds the old-task data and labels, the training set holds the new classes of data and labels the model encounters in application, called the new task, and the test set holds both old-task and new-task data and labels.
Step 3: construct a fault diagnosis model based on a layer regeneration network, consisting of a feature extraction module and a new-task state recognition module, and train it on the pre-training set from step 2 to obtain a model that recognizes the old-task health states.
the feature extraction module is constructed by a one-dimensional convolutional neural network and comprises four convolutional layer-pooling layer structures and three full-connection layers. More specifically, the size of the convolution kernel decreases as the number of layers increases, the sizes of the convolution kernels used by the four convolution layers are respectively 9, 7, 5 and 3, the step length is 1, and the sizes of the input and the output are equal by adopting an edge zero filling measure; the pooling layers are all in a maximum pooling mode, the sizes of the pooling layers are 4, 2 and 2 respectively, and the step lengths are 4, 2 and 2 respectively.
The new-task state recognition module consists of two fully-connected layers. Its input is the output of the first fully-connected layer of the feature extraction module; apart from receiving that intermediate output as input, no gradient operations connect the recognition module to the feature extraction module.
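A minimal numerical sketch of this branching may clarify how the new-task head taps the first fully-connected layer. The layer widths below are hypothetical; the patent does not specify them:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(v):
    return np.maximum(v, 0.0)

# Hypothetical widths: flattened conv features -> fc1 -> fc2 -> fc3 (old classes)
W1 = rng.standard_normal((64, 32))
W2 = rng.standard_normal((32, 16))
W3 = rng.standard_normal((16, 3))
# New-task branch: two fully-connected layers fed by the output of fc1
Wn1 = rng.standard_normal((32, 16))
Wn2 = rng.standard_normal((16, 1))

features = rng.standard_normal(64)       # output of the convolutional stack (assumed size)
h1 = relu(features @ W1)                 # first fully-connected layer
old_logits = relu(h1 @ W2) @ W3          # old-task head (3 old health states)
new_logits = relu(h1 @ Wn1) @ Wn2        # new-task branch shares only h1
logits = np.concatenate([old_logits, new_logits])  # spliced output before Softmax
```

In a real implementation the branch would be detached from the shared trunk during backpropagation, matching the statement that no gradients flow between the two modules.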
Step 4: update the parameters of the model trained in step 3 using the training set from step 2, so that the model can recognize the new-task health states while its performance degradation on the old task is mitigated.
the parameter updating training comprises two steps: a parameter fine-tuning stage and a distillation training stage. The parameter tuning phase aims at making the model converge quickly on the new task, but at the same time causes the model to slide down sharply on the old task performance. The distillation training stage utilizes knowledge to distill and improve the performance of the model on old tasks.
In the parameter fine-tuning stage, the convolutional-layer parameters are frozen and the fully-connected-layer parameters are trained on training-set sequence samples until the loss converges; freezing the parameters slows the model's performance decline on the old task. The cross entropy between the model output and the data labels is the loss, and the optimization objective is to minimize it:

$$\min\; -y \log\left(q(x_{new})\right)$$

where $y$ is the actual label of the data and $q(x_{new})$ is the model's predicted output. Note that this output is obtained by concatenating the last-layer outputs of the new-task state recognition module and the feature extraction module and applying a Softmax operation.
In the distillation training stage, the optimization objective has two parts: minimizing the cross-entropy loss, which preserves the model's recognition accuracy on the new task, and minimizing the Euclidean distance between the distillation output and the original output, which mitigates the model's performance degradation on the old task.
For the cross-entropy part, the cross entropy between the model output and the data labels is the loss, and the optimization objective is to minimize it:

$$\min\; -y \log\left(q'(x_{new})\right)$$

where $q'(x_{new})$ is the output of the fine-tuned model after the Softmax operation.
In the part of the distillation training stage that minimizes the Euclidean distance between the distillation output and the original output, the output of the model's feature extraction module $\{z_1^{o}, \ldots, z_m^{o}\}$ and the output of the new-task state recognition module $\{z_1^{n}, \ldots, z_k^{n}\}$ are concatenated into a new output. For the model output $z = \{z_1, z_2, \ldots, z_n\}$, a generalized Softmax function is used:

$$q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}$$

where $T$ is the temperature (the larger $T$, the softer the output probabilities), and $q_i$ is the probability of the i-th state after the Softmax operation.

The Euclidean distance between the distillation output of the model in training and the output of the parameter-fine-tuned model participates in training:

$$dist\left(q^{R}(x_{new})/T,\; q^{T}(x_{new})/T\right)$$

where $dist(\cdot)$ is the Euclidean distance function, $q^{R}(x_{new})$ is the output of the parameter-fine-tuned model, and $q^{T}(x_{new})$ is the distillation output of the model trained with the Euclidean distance metric.

In summary, the overall optimization objective of this stage is:

$$\min\left(-y \log\left(q^{T}(x_{new})\right) + \lambda\, dist\left(q^{R}(x_{new})/T,\; q^{T}(x_{new})/T\right)\right)$$

where $\lambda$ is a weighting factor.
Step 5: diagnose the states of the test set samples from step 2 using the model trained in step 4.
Evaluating the test-set samples with the trained model shows that the model's performance on the old task improves markedly without harming its recognition accuracy on the new-task states.
The present invention is described in further detail below with reference to specific examples:
the used data set containing the four bearing operating states has four rolling bearing operating states of normal, ball fault, inner ring fault and outer ring fault, each operating state contains 2000 samples, and the total length of the samples is 8000 samples, and the length of the samples is 1024. Taking 1000 samples of normal, ball fault and inner ring fault as pre-training data (old task), taking 1000 outer ring fault samples as training data (new task), and taking 1000 samples of remaining normal, ball fault, inner ring fault and outer ring fault as test data. The model is first trained using training data, enabling the model to identify the type of data contained in the old task. The model is then update trained using the training data to simulate the update process of the model when new types of data are encountered in practice. And finally, measuring the performance of the model in the new task and the old task by using the test data. As shown in fig. 3, on the basis of fine tuning, the method only uses the new task data to improve the performance of the model in the old task by more than 40%.
Although illustrative embodiments of the invention have been described above to aid understanding, the invention is not limited to the scope of those embodiments. To those skilled in the art, various changes remain within the spirit and scope of the invention as defined by the appended claims, and everything that makes use of the inventive concept is protected.

Claims (10)

1. An intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples, characterized by comprising the following steps:
Step 1: acquire raw signal data from mechanical equipment in operation using an acceleration sensor, cut the signal into fixed-length segments to obtain a data sample set, and apply time-series preprocessing to each sample so that all sequence samples share the same mean and variance;
Step 2: label the sequence samples by class and divide them into a pre-training set, a training set, and a test set; the pre-training set holds the old-task data and labels, the training set holds the new classes of data and labels encountered in application, called the new task, and the test set holds both old-task and new-task data and labels;
Step 3: construct a fault diagnosis model based on a layer regeneration network, consisting of a feature extraction module and a new-task state recognition module, and train it on the pre-training set from step 2 to obtain a model that recognizes the old-task health states;
Step 4: update the parameters of the model trained in step 3 using the training set from step 2, so that the model can recognize the new-task health states while its performance degradation on the old task is mitigated;
Step 5: diagnose the states of the test set samples from step 2 using the model trained in step 4.
2. The intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples according to claim 1, characterized in that the time-series preprocessing in step 1 uses zero-mean normalization; for a sample $\{a_1, a_2, \ldots, a_n\}$ in the data sample set, the calculation is:

$$\bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i$$

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^2}$$

$$x_i = \frac{a_i - \bar{a}}{s}$$

where $a_i$ is the i-th data value of the sample, $n$ is the interception length of the raw signal data, $\bar{a}$ is the sample mean, and $s$ is the sample standard deviation; the new sequence $\{x_1, x_2, \ldots, x_n\}$ has mean 0, variance 1, and is dimensionless.
3. The intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples according to claim 1, characterized in that the feature extraction module in step 3 is built from a one-dimensional convolutional neural network and comprises four convolution-pooling stages and three fully-connected layers; the convolution kernel size decreases with depth, the four convolutional layers using kernel sizes 9, 7, 5 and 3, all with stride 1 and edge zero-padding so that input and output sizes are equal; all pooling layers use max pooling, with pooling sizes 4, 2 and 2 and strides 4, 2 and 2 respectively.
4. The method for intelligently diagnosing faults based on the layer regeneration network under the class imbalance sample according to claim 3, wherein the new task state identification module in the step 3 is composed of two fully-connected layers, and the input of the new task state identification module is the output of the first fully-connected layer of the feature extraction module.
5. The intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples according to claim 3, characterized in that the parameter update training in step 4 comprises two stages: a parameter fine-tuning stage and a distillation training stage, wherein the fine-tuning stage makes the model converge quickly on the new task but also makes its old-task performance drop sharply, and the distillation training stage uses knowledge distillation to recover the model's performance on the old task.
6. The method as claimed in claim 5, wherein in the parameter fine-tuning stage the convolutional-layer parameters are frozen and the fully-connected-layer parameters are trained on training-set sequence samples until the loss converges.
7. The intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples according to claim 6, characterized in that during the parameter fine-tuning stage the cross entropy of the model output is the loss, and the optimization objective is to minimize it:

$$\min\; -y \log\left(q(x_{new})\right)$$

where $y$ is the actual label of the training-set data, and $q(x_{new})$ is the model's predicted output, obtained by concatenating the last-layer outputs of the new-task state recognition module and the feature extraction module and applying the Softmax operation.
8. The intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples according to claim 7, characterized in that in the distillation training stage the optimization objective has two parts: minimizing the cross-entropy loss, which preserves the model's recognition accuracy on the new task, and minimizing the Euclidean distance between the distillation output and the original output, which mitigates the model's performance degradation on the old task.
9. The intelligent fault diagnosis method based on a layer regeneration network under class-imbalanced samples according to claim 8, characterized in that, for minimizing the cross-entropy loss, the cross entropy between the model output and the training-set labels is the loss, and the optimization objective is to minimize it:

$$\min\; -y \log\left(q'(x_{new})\right)$$

where $q'(x_{new})$ is the output of the parameter-fine-tuned model after the Softmax operation.
10. The method for intelligently diagnosing faults based on the layer regeneration network under the class imbalance sample according to claim 9, wherein for minimizing the Euclidean distance between the distillation output and the original output, the output of the feature extraction module of the model and the output of the new task state recognition module are spliced into a new output;
for the model output z = {z_1, z_2, ..., z_n}, a generalized Softmax function is used:
q_i = exp(z_i / T) / Σ_j exp(z_j / T)
in the formula, T is the temperature, and the larger T is, the softer the output probability distribution; q_i is the probability of the i-th state obtained after the Softmax operation;
the Euclidean distance is used to measure the distance between the distillation output of the model in training and the output recorded after parameter fine-tuning, and this distance participates in training:
dist(q_R(x_new)/T, q_T(x_new)/T)
in the formula, dist(·) is the Euclidean distance function; q_R(x_new) is the output of the model after parameter fine-tuning; q_T(x_new) is the distillation output of the model in training;
the overall optimization objective at this stage is:
min(−y log(q_T(x_new)) + λ·dist(q_R(x_new)/T, q_T(x_new)/T))
in the formula, λ is a proportional weighting factor.
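The pieces of claim 10 can be put together in a short pure-Python sketch (not the patent's implementation; the logit values, λ, and T are illustrative, and reading dist(q_R/T, q_T/T) as a Euclidean distance between temperature-softened Softmax outputs is an assumption about the notation): a generalized Softmax with temperature, a Euclidean distance, and the combined objective.

```python
import math

def generalized_softmax(z, T=1.0):
    """q_i = exp(z_i / T) / sum_j exp(z_j / T); larger T gives a softer distribution."""
    m = max(z)                                   # subtract max for numerical stability
    exps = [math.exp((v - m) / T) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(one_hot, q, eps=1e-12):
    """-sum(y_i * log q_i) between a one-hot label and predicted probabilities."""
    return -sum(y * math.log(p + eps) for y, p in zip(one_hot, q))

def euclidean(a, b):
    """dist(.,.): Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def overall_objective(y, logits_trained, logits_finetuned, T=2.0, lam=1.0):
    """min(-y log(q_T(x_new)) + lambda * dist(q_R(x_new)/T, q_T(x_new)/T))."""
    q_T = generalized_softmax(logits_trained)            # prediction for the CE term
    q_T_soft = generalized_softmax(logits_trained, T)    # softened distillation output
    q_R_soft = generalized_softmax(logits_finetuned, T)  # softened recorded output
    return cross_entropy(y, q_T) + lam * euclidean(q_R_soft, q_T_soft)

y = [0, 1, 0]
loss_same = overall_objective(y, [1.0, 2.0, 0.5], [1.0, 2.0, 0.5])  # distance term is 0
loss_diff = overall_objective(y, [1.0, 2.0, 0.5], [0.0, 0.0, 0.0])  # distance term > 0
```

When the trained model's output matches the recorded fine-tuned output, the distance term vanishes and only the cross entropy remains; any drift on the recorded responses adds a penalty, which is how the distillation term resists forgetting the old tasks.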
CN202011244156.6A 2020-11-09 2020-11-09 Intelligent fault diagnosis method based on layer regeneration network under unbalanced sample Active CN112434729B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011244156.6A CN112434729B (en) 2020-11-09 2020-11-09 Intelligent fault diagnosis method based on layer regeneration network under unbalanced sample


Publications (2)

Publication Number Publication Date
CN112434729A true CN112434729A (en) 2021-03-02
CN112434729B CN112434729B (en) 2023-09-19

Family

ID=74699868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011244156.6A Active CN112434729B (en) 2020-11-09 2020-11-09 Intelligent fault diagnosis method based on layer regeneration network under unbalanced sample

Country Status (1)

Country Link
CN (1) CN112434729B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162018A (en) * 2019-05-31 2019-08-23 天津开发区精诺瀚海数据科技有限公司 The increment type equipment fault diagnosis method that knowledge based distillation is shared with hidden layer
CN110647923A (en) * 2019-09-04 2020-01-03 西安交通大学 Variable working condition mechanical fault intelligent diagnosis method based on self-learning under small sample
CN110866365A (en) * 2019-11-22 2020-03-06 北京航空航天大学 Mechanical equipment intelligent fault diagnosis method based on partial migration convolutional network
WO2020073951A1 (en) * 2018-10-10 2020-04-16 腾讯科技(深圳)有限公司 Method and apparatus for training image recognition model, network device, and storage medium
US20200285896A1 (en) * 2019-03-09 2020-09-10 Tongji University Method for person re-identification based on deep model with multi-loss fusion training strategy


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BAO PING; LIU YUNJIE: "Research on fault identification with an improved deep model based on generative adversarial networks under imbalanced data sets", Journal of Electronic Measurement and Instrumentation, no. 03 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269174A (en) * 2021-07-21 2021-08-17 北京航空航天大学 Electrical actuator fault diagnosis test method based on extended convolution countermeasure self-encoder
CN113269174B (en) * 2021-07-21 2021-10-12 北京航空航天大学 Electrical actuator fault diagnosis test method based on extended convolution countermeasure self-encoder


Similar Documents

Publication Publication Date Title
CN111914883B (en) Spindle bearing state evaluation method and device based on deep fusion network
CN110728411B (en) High-low altitude area combined rainfall prediction method based on convolutional neural network
CN112651167A (en) Semi-supervised rolling bearing fault diagnosis method based on graph neural network
CN112418013A (en) Complex working condition bearing fault diagnosis method based on meta-learning under small sample
CN110263934B (en) Artificial intelligence data labeling method and device
CN117034143B (en) Distributed system fault diagnosis method and device based on machine learning
CN111949459B (en) Hard disk failure prediction method and system based on transfer learning and active learning
CN113268833A (en) Migration fault diagnosis method based on deep joint distribution alignment
CN113850161A (en) Flywheel fault identification method based on LSTM deep noise reduction self-encoder
CN116754231A (en) Method for rapidly diagnosing faults of rolling bearing based on RegNet-SES
CN114942140A (en) Rolling bearing fault diagnosis method based on multi-input parallel graph convolution neural network
CN112434729A (en) Fault intelligent diagnosis method based on layer regeneration network under class imbalance sample
CN114974306A (en) Transformer abnormal voiceprint detection and identification method and device based on deep learning
CN114325245A (en) Transmission line fault line selection and positioning method based on traveling wave data deep learning
CN110210326A (en) A kind of identification of online train and speed estimation method based on fiber-optic vibration signal
CN109000924B (en) Method for monitoring state of ball screw pair based on K mean value
CN116861343A (en) Bearing fault diagnosis method
CN116259161A (en) Power failure early warning system
CN115876467A (en) Pseudo label transfer type two-stage field self-adaptive rolling bearing fault diagnosis method
CN110908365A (en) Unmanned aerial vehicle sensor fault diagnosis method and system and readable storage medium
CN112347917B (en) Gas turbine fault diagnosis method, system, equipment and storage medium
Gao et al. Dynamic detection method for falling ears of maize harvester based on improved YOLO-V4
CN113051809A (en) Virtual health factor construction method based on improved restricted Boltzmann machine
CN118035873B (en) Fault diagnosis method for parallel convolution improved triple network
CN117664567B (en) Rolling bearing cross-domain fault diagnosis method for multi-source domain unbalanced data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant