CN115859142A - Small sample rolling bearing fault diagnosis method based on convolution transformer generation countermeasure network - Google Patents


Info

Publication number
CN115859142A
Authority
CN
China
Prior art keywords
convolution
data
signal
transformer
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211233344.8A
Other languages
Chinese (zh)
Inventor
高慧慧
张潇然
韩红桂
高学金
李方昱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202211233344.8A
Publication of CN115859142A
Legal status: Pending

Landscapes

  • Complex Calculations (AREA)

Abstract

A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network relates to the field of fault diagnosis of rolling bearings and other rotating equipment, and addresses the difficulty of achieving accurate fault diagnosis when operating data are scarce. First, signal data of a rolling bearing under actual operating conditions are acquired and standardized. Second, a generator and a discriminator with an interleaved convolution and transformer structure are constructed; the transformer layers extract the global time-domain features of the time-series signal, and the convolution layers then extract its local time-domain features. In addition, position encodings are embedded into the time-series signal so that the model can learn the positional information of the signal. Finally, high-quality time-series signal samples are generated to augment the original training samples, thereby improving fault diagnosis accuracy under small-sample conditions.

Description

Small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network
Technical Field
The invention relates to the field of fault diagnosis of rotating equipment such as rolling bearings, and in particular to a small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network.
Background
In recent years, deep neural networks have been successfully applied to rolling bearing fault diagnosis by virtue of their strong feature extraction capability. These methods assume that a large amount of valid data is available for training the fault diagnosis model. In actual engineering scenarios, however, because of operational safety concerns and complex, variable working conditions, the data acquisition system can often record only a small amount of operating data, which greatly degrades fault diagnosis performance. It is therefore necessary to design a fault diagnosis method that remains effective when operating data are scarce.
Researchers have proposed various methods to deal with limited data in fault diagnosis. Data sampling is one of them: the classes are balanced by undersampling classes with many samples and oversampling classes with few samples, which is a common way to handle class imbalance. Although data sampling works well when data are limited and the classes are imbalanced, it can only reuse the information in the existing data and cannot effectively model the original data distribution, so it cannot expand the data enough to meet the demand of intelligent fault diagnosis methods for large amounts of data. Transfer learning addresses the cross-domain problem by transferring knowledge acquired in a source domain to a target domain. Methods based on transfer learning typically use model pre-training and fine-tuning to perform fault diagnosis with limited data. Their greatest limitation is that they do not fundamentally solve the data shortage; in addition, pre-training the original model still requires a large number of samples.
With the development of generative models, solving the sample scarcity problem through data generation has received a great deal of attention. Among these models, the generative adversarial network (GAN) is a mainstream generative model in the field of artificial intelligence. A GAN can generate data whose distribution is similar to that of the raw data, and was originally applied in image processing. By virtue of its powerful data generation capability, the GAN has been successfully applied to rolling bearing fault diagnosis. Yang et al. developed a fusion diagnostic model, CGAN-2D-CNN, which converts vibration signals into two-dimensional grayscale images and uses a CGAN and a 2D-CNN to augment and classify the image data for small-sample bearing fault diagnosis. Liang et al. extracted time-frequency image features from the one-dimensional raw time-domain signal by wavelet transform and generated a large number of time-frequency image samples using a GAN. However, converting a one-dimensional time-series signal into a two-dimensional image cannot fully represent the vibration information carried by the signal, so the generated samples are of poor quality and the final fault diagnosis performance suffers. As GANs have developed in the field of time-series signal generation, methods that directly augment rolling bearing vibration signals with a GAN have also progressed rapidly. Guo et al. proposed a fault diagnosis framework called the multi-label 1D generative adversarial network (ML1D-GAN), which can directly generate one-dimensional vibration signal data. Sonal Dixit et al. proposed a novel one-dimensional conditional auxiliary classifier GAN fault diagnosis model to better generate bearing signal samples directly. Zhang et al. developed a small-sample intelligent fault diagnosis method based on a multi-module gradient penalty generative adversarial network (MGPGAN) to generate mechanical fault signals with high similarity.
However, the above solutions all have shortcomings. 1) A GAN built on fully connected layers has insufficient feature extraction capability, and its number of parameters becomes too large when processing long bearing signal sequences. 2) A GAN built on one-dimensional convolution has strong local feature extraction capability but seriously lacks global feature extraction capability, and cannot model long-sequence signals effectively. 3) The above GANs do not take the relative or absolute position information of the original vibration signal sequence into account when generating bearing signal samples, which affects the quality of the generated signals.
Disclosure of Invention
The invention aims to solve the loss of diagnosis accuracy caused by scarce rolling bearing operating data, and provides a small-sample rolling bearing fault diagnosis method based on a Convolutional Transformer Generative Adversarial Network (CoT-GAN). So that the model can better extract the global and local features of the vibration signal, a generator and a discriminator with an interleaved transformer and convolution structure are designed. The transformer is good at processing long-sequence signals and has strong global feature extraction capability, so it can model the vibration signal globally. Furthermore, adding a position encoding to the vibration signal sequence enables the model to learn the relative and absolute position information of the signal, thereby preserving its inherent vibration characteristics. On this basis, convolution layers further enhance the model's ability to learn the local features of the signal. The method starts from the characteristics of vibration signals, takes their time-series nature into account, combines the respective advantages of the transformer and convolution, models the bearing vibration signal both locally and globally, and makes full use of the position information the signal carries. Finally, sufficient vibration signal samples are generated and fault diagnosis performance is effectively improved.
To achieve this purpose, the invention adopts the following technical scheme:
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized by comprising the following steps:
(1) Historical operating data of the rolling bearing are acquired and standardized, and the standardized signal samples are then divided into training samples and test samples.
(2) A generative adversarial network with an interleaved convolution and transformer structure (CoT-GAN) is constructed. The generator turns random noise into generated signals whose distribution is similar to that of the real signals, and the discriminator performs real/fake discrimination and class discrimination on the generated and real signals. The generator and the discriminator are trained alternately as a zero-sum game to improve model performance until a Nash equilibrium is reached, and signal samples are finally generated. The generated signal samples are added to the original training samples to form an augmented data set for training a fault classifier.
(3) The fault classifier trained in step (2) is used to identify and classify faults in the test samples, completing the final fault diagnosis task.
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized in that the specific process of step (1) is as follows:
1) Obtain historical data of the rolling bearing under actual operating conditions, X = [x_1, x_2, ..., x_n] ∈ R^{n×m}, where n is the number of samples collected and m is the sample dimension. Compute the mean \bar{X} and the standard deviation σ of the historical data X, and normalize X to obtain \hat{X}:
\hat{x}_i = (x_i − \bar{X}) / σ (1)
where i = 1, 2, ..., n;
2) Divide the normalized data into a training sample set containing p samples and a test sample set containing q samples, where p + q = n;
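Step (1) can be illustrated with a minimal sketch in plain Python. It assumes that the mean and standard deviation are computed over the whole data set and that the population standard deviation is used; the text does not state either choice. The function names, the toy data and the random split are hypothetical.

```python
import random
import statistics

def standardize(samples):
    """Z-score every data point with the mean and standard deviation of the whole data set."""
    flat = [v for s in samples for v in s]
    mean = statistics.fmean(flat)
    std = statistics.pstdev(flat)  # population standard deviation (assumption)
    return [[(v - mean) / std for v in s] for s in samples]

def split(samples, p, seed=0):
    """Randomly pick p training samples; the remaining q = n - p become test samples."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    train = [samples[i] for i in idx[:p]]
    test = [samples[i] for i in idx[p:]]
    return train, test

# Toy data: n = 5 samples of dimension m = 4 (hypothetical values).
X = [[float(i + j) for j in range(4)] for i in range(5)]
X_hat = standardize(X)
train, test = split(X_hat, p=2)
```

After standardization the pooled data have zero mean and unit standard deviation, and the two subsets together contain all n samples.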
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized in that signals are generated using the generative adversarial network with an interleaved convolution and transformer structure, and the specific process of step (2) is as follows:
1) Set random noise z and embed the corresponding fault class label c into it to obtain random noise Z = [z, c] containing the fault class label. Specifically,
first, random noise following a standard normal distribution (mean 0, variance 1) is obtained, z = [z_1, z_2, ..., z_k] ∈ R^{k×l}, where k is the number of noise vectors and l is the dimension of each noise vector;
second, the corresponding fault class label c_i is embedded into the random noise z = [z_1, z_2, ..., z_k] to obtain random noise Z containing the fault class label, where i ∈ {1, 2, 3, 4};
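The construction of the labelled noise Z = [z, c] can be sketched as follows. This is a hedged illustration that assumes the label is embedded by concatenating a one-hot vector to each noise vector; the text only says that the label is "embedded". `one_hot` and `labelled_noise` are hypothetical helper names.

```python
import random

NUM_CLASSES = 4  # fault class index i in {1, 2, 3, 4}

def one_hot(label, num_classes=NUM_CLASSES):
    """One-hot code for a 1-based class label (assumed encoding)."""
    return [1.0 if c == label else 0.0 for c in range(1, num_classes + 1)]

def labelled_noise(k, l, label, seed=0):
    """Draw k standard-normal noise vectors of dimension l and append the label code."""
    rng = random.Random(seed)
    rows = []
    for _ in range(k):
        z = [rng.gauss(0.0, 1.0) for _ in range(l)]
        rows.append(z + one_hot(label))
    return rows

Z = labelled_noise(k=3, l=8, label=2)
```

Each row of `Z` has l + 4 entries: the noise values followed by the class code.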
2) So that the discriminator can process the signals later produced by the generator and the transformer module can process the input vectors, the input random noise Z is transformed and converted into patches of a fixed size. Specifically,
first, the dimension of the input random noise is changed to a fixed value L, so that the discriminator can later process the generated signals;
second, the random noise Z = [Z_1, Z_2, ..., Z_k] is converted into a number of fixed-size patches through one-dimensional convolution embedding. Specifically, the random noise is partitioned into N patches of dimension M,
Figure BDA0003881880810000033
where M is the patch size, N = L/M is the number of patches, and j ∈ {1, 2, ...}.
To reduce the number of parameters to be computed, the embedding module is built from a one-dimensional convolution, exploiting the weight sharing and good local feature extraction of convolutional neural networks. The kernel size of the one-dimensional convolution is set to M×1 and the stride to M, so that the kernel processes the random noise without overlap and finally yields N patches of dimension M. Specifically, using a learnable embedding matrix, each patch is projected by convolution into a vector of model dimension D_model. The one-dimensional convolution operation is:
Figure BDA0003881880810000035
where v_i and u_j are the input of the i-th channel and the output of the j-th channel, respectively, k is the convolution kernel, b is the bias, * is the convolution operation, and M_j is the set of input channels used to compute the j-th output;
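The non-overlapping convolution embedding (kernel size M, stride M) can be illustrated with a small pure-Python sketch. It assumes a single input channel and randomly initialized weights standing in for the learnable embedding matrix; the sizes `L`, `M` and `D_MODEL` are hypothetical.

```python
import random

def conv1d_patch_embed(signal, weights, bias):
    """Non-overlapping 1D convolution with kernel size M and stride M.

    weights has shape [D_model][M]; the result holds N = len(signal) // M
    patch vectors, each of length D_model.
    """
    M = len(weights[0])
    N = len(signal) // M
    patches = []
    for j in range(N):
        window = signal[j * M:(j + 1) * M]  # j-th window, no overlap with its neighbours
        patches.append([sum(w * v for w, v in zip(row, window)) + b
                        for row, b in zip(weights, bias)])
    return patches

L, M, D_MODEL = 16, 4, 3  # hypothetical sizes
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(L)]
weights = [[rng.uniform(-0.5, 0.5) for _ in range(M)] for _ in range(D_MODEL)]
bias = [0.0] * D_MODEL
patches = conv1d_patch_embed(signal, weights, bias)
```

Because the stride equals the kernel size, every input value contributes to exactly one patch, which gives N = L/M patches.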
Position information is then embedded into the fixed-size patches so that the generated signal carries position information closer to that of the real signal, which improves the quality of the generated samples. Specifically, a position information matrix of dimension D_model is encoded and added to the patches, giving the patches with position information:
Figure BDA0003881880810000037
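Adding position information to the patches can be sketched as below. The form of the position information matrix is not recoverable from the text, so this example swaps in the fixed sinusoidal encoding commonly used with transformers purely as an assumption; a learnable matrix would be added to the patches in the same way.

```python
import math

def sinusoidal_positions(n_patches, d_model):
    """Fixed sinusoidal position matrix of shape [n_patches][d_model] (assumed form)."""
    pe = []
    for pos in range(n_patches):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def add_positions(patches):
    """Add the position matrix element-wise to the patch embeddings."""
    pe = sinusoidal_positions(len(patches), len(patches[0]))
    return [[p + e for p, e in zip(patch, enc)] for patch, enc in zip(patches, pe)]

patches = [[0.0] * 4 for _ in range(3)]  # three zero patches with D_model = 4
T = add_positions(patches)
```

With all-zero patches the result is the position matrix itself, so each row of `T` differs by patch index.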
Finally, the patch sequence carrying the position information, T_Z = [T_{Z,1}, T_{Z,2}, ..., T_{Z,k}], is fed into the transformer module and passes sequentially through the generator with the interleaved convolution and transformer structure to produce the generated signal samples
Figure BDA0003881880810000038
where l is the number of generated samples. Specifically,
the patches carrying the position information are fed into the transformer module to extract the global features of the input. Specifically,
the multi-head attention mechanism in the transformer module dynamically captures the feature information of the input vectors, so that the generator can grasp global feature information to a large extent. Self-attention updates each component of a sequence by aggregating global context information from the complete input sequence. It can be expressed as:
Attention(Q, K, V) = softmax(QK^T / √d_k) V (4)
where d_k is the dimension of the key vectors into which the signal is converted, and Q, K and V are the matrices of the query, key and value vectors, respectively.
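Scaled dot-product self-attention can be written out in a few lines of plain Python. The sketch below uses small hand-picked matrices (hypothetical values) and lists of lists instead of a tensor library.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, V))
                    for c in range(len(V[0]))])
    return out

Q = K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = attention(Q, K, V)
```

Each output row is a weighted average of the rows of `V`, with larger weights for keys that are more similar to the query.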
Multi-head attention is a mechanism that combines several self-attention operations and can capture multiple complex relationships between different elements of a sequence. With h self-attention heads, multi-head attention transforms a given input vector into three different groups of vectors, each containing h vectors of dimension D/h. The vectors derived from the different inputs are then packed into three groups of matrices, {Q_i}, {K_i} and {V_i}, i = 1, ..., h. The multi-head attention mechanism can therefore be expressed as:
Multihead(Q′, K′, V′) = Concat(head_1, ..., head_h) W^O, where head_i = Attention(Q_i, K_i, V_i) (5)
where Q′, K′ and V′ are the concatenations of {Q_i}, {K_i} and {V_i}, respectively, and W^O is a linear projection matrix;
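The head-splitting and concatenation in multi-head attention can be sketched as follows. To keep the example short it assumes that the query, key and value projections and the output projection W^O are identity maps, which a trained model would replace with learned matrices; the input values are hypothetical.

```python
import math

def attention(Q, K, V):
    """Scaled dot-product attention for lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        out.append([sum(e / total * v[c] for e, v in zip(exps, V))
                    for c in range(len(V[0]))])
    return out

def split_heads(X, h):
    """Cut every D-dimensional row into h chunks of size D / h."""
    d = len(X[0]) // h
    return [[row[i * d:(i + 1) * d] for row in X] for i in range(h)]

def multihead(X, h):
    """Self-attention per head, then concatenation (identity projections assumed)."""
    heads = [attention(Xi, Xi, Xi) for Xi in split_heads(X, h)]
    return [sum((head[t] for head in heads), []) for t in range(len(X))]

X = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.2, 0.8], [1.0, 1.0, 0.0, 0.0]]
Y = multihead(X, h=2)
```

The output keeps the input shape: each head attends within its own D/h-dimensional slice and the slices are concatenated back to dimension D.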
The transformer module applies layer normalization before the multi-head attention operation, and then uses a residual connection to strengthen the information flow and achieve higher performance. Specifically:
x′=x+Multihead(LN(x)) (6)
where x is the input vector of the transformer module;
after these steps, the final output of the transformer module is produced by a multilayer perceptron, as follows:
Figure BDA00038818808100000410
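The pre-normalization residual pattern of equation (6) can be sketched as below. Layer normalization is written without learnable scale and shift parameters, and the attention sublayer is replaced by a stand-in function; both simplifications are assumptions made to keep the example self-contained.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (almost) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def pre_norm_residual(x, sublayer):
    """x' = x + sublayer(LN(x)); normalization comes before the sublayer."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

x = [1.0, 2.0, 3.0, 4.0]
# Stand-in sublayer: the identity map replaces multi-head attention here.
y = pre_norm_residual(x, lambda v: v)
```

The residual path passes `x` through unchanged, so the sublayer only has to learn a correction to its input.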
After processing by the transformer module, the output is fed into a deconvolution layer to obtain its local features. The feature vector output by the deconvolution layer is then fed into the next module with the interleaved transformer and deconvolution structure. The generator contains 4 such modules in total, and after the input vector passes through the last deconvolution layer it produces a generated sample with the same dimension as the real signal.
The signal produced by the generator and the real signal are then fed together into the discriminator. Specifically,
first, the generated signals and the real signals input to the discriminator are converted into a number of fixed-size patches by one-dimensional convolution embedding, in a way similar to step 2): a one-dimensional convolutional neural network processes the input signal of the discriminator without overlap to obtain the fixed-size patches.
Second, a corresponding position label is added to each patch, so that the discriminator pays more attention to the relative and absolute position information of the signal when learning its features, which in turn helps the generator produce better signals.
Then, the patches carrying the position information are fed into the subsequent transformer module and pass sequentially through the discriminator with the interleaved convolution and transformer structure. Specifically,
the vector output by the transformer module is fed into a convolution layer to extract the local features of the input vector. The output vector of the convolution layer is then fed into the next transformer module to obtain global features. The input vector is processed in the network in a way similar to the generator, and the discriminator contains 4 modules with the interleaved convolution and transformer structure in total.
Finally, the feature vector output by the last convolution layer is reshaped into vectors of size 1×1024. These vectors are passed through a binary-classification fully connected layer and a multi-class fully connected layer, giving output vectors of dimension 1 and of dimension equal to the number of fault classes, respectively. The two outputs are fed into the Sigmoid and Softmax activation functions for real/fake discrimination and class discrimination, respectively. The Sigmoid activation function is:
Sigmoid(x) = 1 / (1 + e^{−x}) (8)
where x is the input to the Sigmoid activation function.
The Softmax activation function is:
Softmax(z_i) = e^{z_i} / Σ_{k=1}^{K} e^{z_k} (9)
where z is the input vector, z_k and z_i are its k-th and i-th components, and K is the number of classes.
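The two output activations of the discriminator can be written directly from their definitions. The logits below are hypothetical values standing in for the outputs of the two fully connected layers.

```python
import math

def sigmoid(x):
    """Map a real/fake logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(z):
    """Map K class logits to a probability distribution (shifted for stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

real_prob = sigmoid(0.0)                       # binary head: real vs. generated
class_probs = softmax([2.0, 1.0, 0.1, -1.0])   # multi-class head: K = 4 fault classes
```

The predicted fault class is the index of the largest entry of `class_probs`.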
3) Finally, the generated signal samples are added to the original training samples to form an augmented data set for training the fault classifier.
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized in that the specific calculation process in step (2) is as follows:
1) The generator and the discriminator are trained alternately as a zero-sum game until a Nash equilibrium is reached. The objective function of CoT-GAN is:
Figure BDA0003881880810000054
Figure BDA0003881880810000055
where P_data is the real data distribution, P_g is the distribution of the generated samples, and D(s) is the probability that s comes from the real data;
Figure BDA0003881880810000061
represents the probability of coming from the real data;
Figure BDA0003881880810000062
denotes the expectation over the real data distribution;
Figure BDA0003881880810000063
denotes the expectation over the data generated from noise; and P(Y = y | S_real) is the conditional probability distribution over the class labels. The optimization of the generator and the discriminator is a two-player minimax problem, which can be formalized as:
Figure BDA0003881880810000064
2) Train the fault classifier with the augmented data set so that it generalizes better. The objective function of the fault classifier is:
Figure BDA0003881880810000065
The CoT-GAN network structure is as follows. CoT-GAN consists of a generator and a discriminator with an interleaved convolution and transformer structure. It can model both the global and the local features of the vibration signal, and takes the relative and absolute position information contained in the signal into account to generate sufficient signal data. The generator consists of L interleaved deconvolution and transformer modules. Its input consists of random noise and the existing fault class labels; the data points are converted into patches through one-dimensional convolution embedding, position information is embedded, and the result is fed into L network layers with the interleaved transformer and deconvolution structure, which finally output generated signals of the same dimension as the real signals. The discriminator consists of L interleaved convolution and transformer modules. Its input consists of the generated signals and the real signals; the input signal is converted into patches through one-dimensional convolution, position information is embedded, and the patches are fed into L network layers with the interleaved transformer and convolution structure. The output layer of the discriminator gives the binary-classification and multi-class classification probabilities.
The outputs of the transformer modules of the generator and the discriminator,
Figure BDA0003881880810000066
and
Figure BDA0003881880810000067
are expressed as follows for l = 1, 2, ..., L:
Figure BDA0003881880810000068
Figure BDA0003881880810000069
where
Figure BDA00038818808100000610
is the output vector of the (l-1)-th transformer module in the generator, and
Figure BDA00038818808100000611
is the output vector of the l-th transformer module in the generator. f_{G,l}(·) denotes the l-th group of transformer module and deconvolution operations in the generator. When l = 1,
Figure BDA00038818808100000612
is the fixed-size patch carrying the position encoding; when l = L,
Figure BDA00038818808100000613
is the output vector of the generator.
Figure BDA00038818808100000614
is the output vector of the (l-1)-th transformer module in the discriminator, and
Figure BDA00038818808100000615
is the output vector of the l-th transformer module in the discriminator. f_{D,l}(·) denotes the l-th group of transformer module and convolution operations in the discriminator. When l = 1,
Figure BDA00038818808100000616
is the fixed-size patch carrying the position encoding; when l = L,
Figure BDA00038818808100000617
is the output vector of the discriminator. More specifically, the generator consisting of L interleaved transformer and deconvolution modules can be expressed as:
Figure BDA0003881880810000071
Figure BDA0003881880810000072
advantageous effects
The invention designs a generative adversarial network with an interleaved transformer and convolution structure. It exploits the respective advantages of the transformer and convolution, using transformer layers and convolution layers to extract the global and local features of the time-series signal, so that the model can fully capture the time-domain characteristics of the vibration. In addition, position encodings are embedded into the vibration signal so that the model can learn the relative and absolute position information of the signal, which strengthens the inherent time-series characteristics of the generated signal; sufficient signal data are finally generated and fault diagnosis performance under small-sample conditions is effectively improved. The method takes the characteristics of time-series signals into account during sample generation and models them both globally and locally. It offers strong feature representation, strong specificity and high diagnostic accuracy, and is of great significance for rolling bearing fault diagnosis.
Drawings
FIG. 1 is a flow chart of the CoT-GAN method of the present invention;
FIG. 2 is a schematic diagram of a generator;
FIG. 3 is a schematic diagram of the discriminator;
FIG. 4 is a schematic view of the Case Western Reserve University (CWRU) bearing test stand;
FIG. 5 illustrates the results of the present invention generated for a CWRU bearing dataset;
FIG. 6 is a graph showing the effect of varying the number of training samples on the diagnostic performance of a model;
FIG. 7 shows the diagnostic effect of the model for 1 training sample;
FIG. 8 shows the diagnostic effect of the model for 2 training samples;
FIG. 9 shows the diagnostic effect of the model for 4 training samples;
FIG. 10 shows the diagnostic effect of the model for 8 training samples;
FIG. 11 shows the diagnostic effect of the model for 16 training samples;
FIG. 12 shows the diagnostic effect of the model for 32 training samples;
Detailed Description
Aiming at the shortcomings of the prior art, the invention provides a small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, which can generate time-series signal samples to augment the original training sample set and thereby improve rolling bearing fault diagnosis accuracy under small-sample conditions.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art based on the embodiments of the present invention without inventive step, are within the scope of the present invention.
Referring to FIG. 1, the invention provides a small-sample rolling bearing fault diagnosis method based on a Convolutional Transformer Generative Adversarial Network (CoT-GAN), which overcomes the difficulty of achieving accurate fault diagnosis when operating data are scarce. First, signal data of a rolling bearing under actual operating conditions are acquired and standardized. Next, a generator and a discriminator with an interleaved convolution and transformer structure are constructed, as shown in FIG. 2 and FIG. 3, respectively, so that both the local and the global time-domain features of the time-series signal can be extracted. Meanwhile, position encodings are embedded into the time-series signal so that the model can learn the positional information of the signal. Finally, sufficient time-series signal samples are generated to improve fault diagnosis accuracy under small-sample conditions.
The public Case Western Reserve University (CWRU) bearing data set is widely used to verify fault diagnosis performance. FIG. 4 shows the CWRU bearing test stand, which consists of two motors, a torque sensor, a dynamometer and other control devices. Single-point faults with diameters of 0.007, 0.014 and 0.021 inches were introduced on the inner race, outer race and rolling elements of the bearing by electro-discharge machining. Accelerometers collect vibration signals under loads of 0 to 3 horsepower. The vibration signals were acquired at sampling frequencies of 12 kHz and 48 kHz using a 16-channel DAT recorder. In this experiment, vibration data collected from the drive-end bearing with a fault diameter of 0.021 inches, a load of 0 hp and a sampling frequency of 12 kHz are used for analysis. Four bearing health conditions are selected for classification: healthy, outer race fault, inner race fault and ball fault. Each class contains 100 samples, and each sample contains 1024 data points. For training, 1 to 32 samples are randomly drawn from each class, and the remaining samples are used as test data.
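The sample layout described above (4 classes, 100 samples per class, 1024 points per sample, a few training samples per class) can be sketched as follows. Synthetic Gaussian noise stands in for the CWRU recordings, and cutting the record into non-overlapping segments is an assumption; the text does not say how the samples were segmented. The class names and helper functions are hypothetical.

```python
import random

CLASSES = ["healthy", "outer_race", "inner_race", "ball"]
SAMPLES_PER_CLASS, POINTS = 100, 1024

def segment(signal, points=POINTS, count=SAMPLES_PER_CLASS):
    """Cut one long 1D record into `count` non-overlapping samples of `points` values."""
    return [signal[i * points:(i + 1) * points] for i in range(count)]

def split_per_class(dataset, n_train, seed=0):
    """Randomly draw n_train samples per class for training; the rest are test data."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in dataset.items():
        idx = list(range(len(samples)))
        rng.shuffle(idx)
        train += [(samples[i], label) for i in idx[:n_train]]
        test += [(samples[i], label) for i in idx[n_train:]]
    return train, test

# Synthetic placeholder records; real use would load the vibration signals instead.
rng = random.Random(1)
dataset = {c: segment([rng.gauss(0.0, 1.0) for _ in range(SAMPLES_PER_CLASS * POINTS)])
           for c in CLASSES}
train, test = split_per_class(dataset, n_train=8)
```

With `n_train=8` the training set holds 32 labelled samples and the test set the remaining 368; repeating the split for 1, 2, 4, 8, 16 and 32 training samples per class reproduces the settings listed for FIG. 7 to FIG. 12.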
For the hyper-parameters, CoT-GAN is optimized with the Adam optimizer. To make training more stable, a label smoothing strategy is adopted in which the real label is set to 0.9 and the fake label to 0.1. The batch size is set to 4, the learning rate of the discriminator to 0.0003 and the learning rate of the generator to 0.0005, and the model is trained for 1000 iterations in total.
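These settings can be collected in a small configuration, together with a sketch of how smoothed labels enter a real/fake loss. The values come from the text above; the use of binary cross-entropy is an assumption, and the dictionary keys and function names are hypothetical.

```python
import math

CONFIG = {
    "optimizer": "Adam",
    "batch_size": 4,
    "lr_discriminator": 0.0003,
    "lr_generator": 0.0005,
    "iterations": 1000,
    "real_label": 0.9,  # smoothed target for real signals
    "fake_label": 0.1,  # smoothed target for generated signals
}

def bce(prob, target, eps=1e-12):
    """Binary cross-entropy between a predicted probability and a soft target."""
    return -(target * math.log(prob + eps) + (1.0 - target) * math.log(1.0 - prob + eps))

def discriminator_real_fake_loss(real_probs, fake_probs, cfg=CONFIG):
    """Mean binary cross-entropy of the real/fake head against the smoothed labels."""
    losses = [bce(p, cfg["real_label"]) for p in real_probs]
    losses += [bce(p, cfg["fake_label"]) for p in fake_probs]
    return sum(losses) / len(losses)

loss = discriminator_real_fake_loss([0.9, 0.8], [0.1, 0.3])
```

With a soft target of 0.9 the loss is smallest when the predicted probability is 0.9 rather than 1.0, which discourages the discriminator from becoming over-confident.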
Based on the above description, according to the invention, the specific process is implemented as follows:
1) Standardize the experimental data X = [x_1, x_2, ..., x_100] ∈ R^{1×1024}: compute the mean and the standard deviation of X, and normalize X by equation (1);
2) Divide the normalized data into training samples and test samples;
3) Set random noise following a standard normal distribution (mean 0, variance 1), and embed the corresponding fault class label c_i into the random noise z = [z_1, z_2, ..., z_k] to obtain random noise Z containing the fault class label, where i ∈ {1, 2, 3, 4};
4) According to formula (2), input the random noise Z containing the fault category into a one-dimensional convolution embedding module, which converts the data points into a sequence of fixed-size patches $Z_p = [Z_1, Z_2, \ldots, Z_K]$;
5) According to formula (3), add position information to the patches to obtain the patch sequence carrying position information, $T_Z = [T_{Z,1}, T_{Z,2}, \ldots, T_{Z,k}]$, and feed it into the network layers with the transformer and convolution cross structure;
6) Obtain the output vector of the transformer layer according to formulas (5), (6) and (7), and feed it into the convolution layer that follows the transformer to obtain the output vector of the transformer and convolution cross module; this operation is expressed by formula (10);
7) According to formula (12), obtain the final output vector of the generator, i.e. the generated signal samples;
8) Input the generated signals and the training samples into the discriminator to train the discriminator;
9) As in 4), convert the generated signal and the real signal into fixed-size patches by one-dimensional convolution embedding;
10) As in 5), add position information codes to the fixed-size patches and input the patches into the network layers with the transformer and convolution cross structure;
11) Obtain the output vector of the last transformer and convolution cross module of the discriminator according to formulas (15) and (17), and reshape the output vector;
12) Feed the final output vector into a binary-classification fully connected layer and a multi-class fully connected layer, and process the outputs of the fully connected layers with the activation functions of formulas (8) and (9) to obtain the probability that the data are real and the class probabilities;
13) Train the generator and the discriminator alternately until a Nash equilibrium state is eventually reached, and generate the signal samples. The generated signals are plotted in Fig. 5, where the upper part shows the original signal and the lower part shows the generated signal;
14) Add the generated signals to the original training samples to obtain the enhanced data set, where H denotes the total number of samples;
15) Use the enhanced data set to train the fault classifier, and use the test data set to perform fault diagnosis. Table 1 compares the diagnostic performance obtained with the enhanced data set and with the original data set. In the first column of Table 1, the 4 indicates that there are four classes in total, and the number it is multiplied by is the number of training samples in each class. As Table 1 shows, training the fault classifier with the enhanced data yields far better diagnostic performance than using only the original small-sample data set, and the performance improves further as the number of samples used to train CoT-GAN and the number of generated samples increase. Fig. 6 shows how the diagnostic performance of the model varies with the number of training samples, with 10 generated samples per class. As Fig. 6 shows, as the number of training samples increases, CoT-GAN effectively generates synthetic samples for training the classifier, thereby improving the fault diagnosis accuracy under small-sample conditions. To further show the classification accuracy of each fault class for different numbers of training samples, confusion matrices are used to present the per-class classification results. As shown in Figs. 7-12, the classification performance of each class also improves markedly as the number of training samples increases.
In summary, the method can effectively diagnose faults under small-sample conditions and is therefore highly beneficial for small-sample fault diagnosis of rolling bearings.
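The data-side operations of steps 1) to 3) and 14) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the implementation of the invention: the mean and standard deviation are taken over the whole data matrix, the label is embedded by concatenating a one-hot vector to the noise (following the notation Z = [z, c] in claim 3), and the function names `standardize`, `labelled_noise` and `augment` are hypothetical.

```python
import numpy as np

def standardize(X):
    """Step 1 (assumed global statistics): subtract the mean, divide by the standard deviation."""
    return (X - X.mean()) / X.std()

def labelled_noise(k, class_index, n_classes=4, rng=None):
    """Step 3 (assumed one-hot concatenation): standard normal noise z plus label c."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(k)           # mean 0, variance 1
    c = np.zeros(n_classes)
    c[class_index] = 1.0                 # one-hot fault class label
    return np.concatenate([z, c])        # Z = [z, c]

def augment(x_train, y_train, x_gen, y_gen):
    """Step 14: append generated samples to the original training samples."""
    return np.concatenate([x_train, x_gen]), np.concatenate([y_train, y_gen])

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1024)) * 3.0 + 5.0      # placeholder raw data
X_std = standardize(X)
Z = labelled_noise(k=128, class_index=2, rng=rng)     # k = 128 is a hypothetical noise length

# Placeholder "generated" samples stand in for the generator output here.
x_aug, y_aug = augment(X_std[:8], np.zeros(8, dtype=int),
                       rng.standard_normal((10, 1024)), np.zeros(10, dtype=int))
```

In an actual pipeline the placeholder generated samples would be replaced by the output of the trained generator for each fault class.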
TABLE 1. Comparison of diagnostic accuracy (%) using the enhanced and original data sets
Figure BDA0003881880810000101

Claims (4)

1. A small-sample rolling bearing fault diagnosis method based on a convolution-transformer generative adversarial network, characterized by comprising the following steps:
(1) Acquire historical operating data of a rolling bearing, standardize the data, and divide the standardized signal samples into training samples and test samples;
(2) Construct a generative adversarial network with a convolution and transformer cross structure; use the generator to transform random noise into generated signals whose distribution is similar to that of the real signals; use the discriminator to perform real/fake discrimination and category discrimination on the generated signals and the real signals; train the generator and the discriminator alternately in a zero-sum game to improve the performance of the model until a Nash equilibrium state is reached, and finally generate signal samples; add the generated signal samples to the original training samples to form an enhanced data set for training a fault classifier;
(3) Use the fault classifier trained in step (2) to identify and classify the faults of the test samples, completing the final fault diagnosis task.
2. The small-sample rolling bearing fault diagnosis method based on a convolution-transformer generative adversarial network as claimed in claim 1, characterized in that the specific steps of (1) are as follows:
1) Obtain historical data X of the rolling bearing under actual operating conditions, where n denotes the number of samples, i.e. the total number of collected samples, and m denotes the sample dimension; compute the mean and the standard deviation σ of the historical data X, and standardize X by subtracting the mean from each sample $x_i$ and dividing by the standard deviation σ, where i = 1, 2, ..., n;
2) Divide the standardized data into a training sample set and a test sample set containing p and q samples respectively, where p + q = n.
3. The small-sample rolling bearing fault diagnosis method based on a convolution-transformer generative adversarial network as claimed in claim 1, characterized in that in step (2) the signal samples are generated by the generative adversarial network with the convolution and transformer cross structure, with the following specific steps:
1) Set random noise z that follows a standard normal distribution with mean 0 and variance 1, and embed the corresponding fault class label c into the random noise to obtain random noise Z = [z, c] containing the fault class label;
2) Convert the input signal into a plurality of fixed-size patches by one-dimensional convolution embedding, and embed position coding information into each patch;
3) Construct a generative adversarial network with a convolution and transformer cross structure, in which the transformer layers extract the global features of the signal and the convolution layers extract its local features; feed the random-noise patch sequence carrying position information into the generator with the transformer and convolution cross structure to generate signal samples; apply the patch operation to the generated signals and the real signals and embed position information, then mix the generated and real signals and feed them into the discriminator with the transformer and convolution cross structure for learning; at the end of the discriminator, Sigmoid and Softmax activation functions output a binary prediction label and a multi-class prediction label, which are compared with the true labels to perform real/fake discrimination and category discrimination;
4) Train the generator and the discriminator alternately in a zero-sum game until a Nash equilibrium state is reached, and finally generate signal samples;
5) Add the generated signal samples to the original training samples to form an enhanced data set for training the fault classifier.
4. The small-sample rolling bearing fault diagnosis method based on a convolution-transformer generative adversarial network as claimed in claim 3, characterized in that in step (2) the specific calculation process is as follows:
1) Apply a one-dimensional convolution kernel to the input signal in a non-overlapping sliding manner so that the input signal is divided into a plurality of fixed-size patches, and embed in each patch a position code that is learned during model training; the one-dimensional convolution formula and the position coding formula are as follows:
Figure FDA0003881880800000021
where $v_i$ and $u_j$ are the input of the i-th channel and the output of the j-th channel, respectively; k is the convolution kernel, b is the bias, and * denotes the convolution operation; $M_j$ is the set of input channels used to compute the output of the j-th channel;
Figure FDA0003881880800000022
where $U_p$ denotes the individual patches, E denotes a learnable embedding matrix, $E_{pos}$ denotes a learnable position information matrix, and $T_U$ denotes the final patch sequence combined with the position codes;
2) Train the generator and the discriminator alternately in a zero-sum game until Nash equilibrium is reached; the objective function of CoT-GAN is expressed as follows:
Figure FDA0003881880800000023
Figure FDA0003881880800000024
where $P_{data}$ is the real data distribution, $P_g$ is the distribution of the generated samples, D(s) denotes the probability that a sample comes from the real data, and the corresponding term for generated samples denotes the probability that a sample comes from noise data; the two expectation terms are taken over the real data distribution and over the data generated from noise, respectively; $P(Y = y \mid S_{real})$ denotes the conditional probability distribution over the class labels; the optimization of the generator and the discriminator is a two-player min-max problem, formalized as the following equation:
Figure FDA0003881880800000028
3) Train a fault classifier with the enhanced data set; the objective function of the fault classifier is expressed as follows:
Figure FDA0003881880800000029
where x denotes the input sample of the fault classifier, y denotes the data label output by the classifier, and $P_{data}$ and $P_g$ denote the distributions of the real and generated samples, respectively; $P(Y = y \mid x)$ likewise denotes the conditional probability distribution over the class labels.
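The non-overlapping one-dimensional convolution embedding and learnable position code described in claim 4, step 1), can be sketched as follows. The sketch makes several assumptions that are not stated in the claims: a single input channel, a patch length of 16, an embedding dimension of 32, random values standing in for learned parameters, and the cross-correlation form of convolution used by common deep learning frameworks. The function name `conv1d_patch_embed` is hypothetical.

```python
import numpy as np

def conv1d_patch_embed(signal, kernel, bias):
    """Non-overlapping 1D convolution: stride equals the kernel length.

    signal: (L,) single-channel input signal
    kernel: (embed_dim, patch_len) one filter per output channel
    bias:   (embed_dim,)
    returns (n_patches, embed_dim) patch embeddings
    """
    embed_dim, patch_len = kernel.shape
    n_patches = signal.shape[0] // patch_len
    # Cut the signal into consecutive, non-overlapping patches.
    patches = signal[: n_patches * patch_len].reshape(n_patches, patch_len)
    # With stride == kernel length, the convolution reduces to one dot
    # product per patch and output channel.
    return patches @ kernel.T + bias

rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)          # one sample of 1024 data points
patch_len, embed_dim = 16, 32               # hypothetical sizes

kernel = rng.standard_normal((embed_dim, patch_len)) * 0.02   # stands in for learned weights
bias = np.zeros(embed_dim)
e_pos = rng.standard_normal((1024 // patch_len, embed_dim)) * 0.02  # learnable position matrix

tokens = conv1d_patch_embed(signal, kernel, bias)   # patch sequence
tokens_with_pos = tokens + e_pos                    # patch sequence combined with position codes
```

Because the kernel slides without overlap, each output row depends on exactly one patch, which is why the embedding can be written as a single matrix product.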

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211233344.8A CN115859142A (en) 2022-10-10 2022-10-10 Small sample rolling bearing fault diagnosis method based on convolution transformer generation countermeasure network

Publications (1)

Publication Number Publication Date
CN115859142A true CN115859142A (en) 2023-03-28

Family

ID=85661365

Country Status (1)

Country Link
CN (1) CN115859142A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination