CN116878885B - Bearing fault diagnosis method based on self-adaptive joint domain adaptive network - Google Patents

Bearing fault diagnosis method based on self-adaptive joint domain adaptive network

Info

Publication number
CN116878885B
CN116878885B CN202311127040.8A
Authority
CN
China
Prior art keywords
domain
data
loss
fault diagnosis
adaptation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311127040.8A
Other languages
Chinese (zh)
Other versions
CN116878885A (en)
Inventor
何俊
梁文生
陈丹凤
曾晨露
刘士亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan University
Original Assignee
Foshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan University filed Critical Foshan University
Priority to CN202311127040.8A priority Critical patent/CN116878885B/en
Publication of CN116878885A publication Critical patent/CN116878885A/en
Application granted granted Critical
Publication of CN116878885B publication Critical patent/CN116878885B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01M: TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M 13/00: Testing of machine parts
    • G01M 13/04: Bearings
    • G01M 13/045: Acoustic or vibration analysis
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a bearing fault diagnosis method based on a self-adaptive joint domain adaptive network, relating to the technical field of fault diagnosis. The method comprises: acquiring unlabeled target domain data; inputting the target domain data into a pre-trained bearing fault diagnosis training model for detection and diagnosis so as to obtain diagnosis evaluation data; and determining the fault type according to the diagnosis evaluation data. The bearing fault diagnosis training model comprises a feature extractor, a label classifier and a domain adaptation module. The invention can reduce the marginal distribution difference and the conditional distribution difference between the source domain and the target domain at the same time; the attention paid to the two distribution differences during training is adjusted in real time through an adaptive weighting factor, without tuning by manual experience, and the training trend of the feature extractor is constrained, so that the joint distributions of the two domains are better aligned and the unsupervised domain adaptation task is realized.

Description

Bearing fault diagnosis method based on self-adaptive joint domain adaptive network
Technical Field
The invention relates to the technical field of fault diagnosis, in particular to a bearing fault diagnosis method based on a self-adaptive joint domain adaptive network.
Background
In the unsupervised domain adaptation problem in the bearing fault field, aligning the marginal distributions of the source domain and the target domain effectively reduces the inter-domain distance, while aligning the conditional distributions effectively reduces the intra-domain distance. Aligning only one of the two distributions makes it difficult to achieve a good result under certain complex conditions, whereas aligning the joint distribution of the two domains reduces the inter-domain and intra-domain distances simultaneously at different levels and therefore achieves a better domain adaptation effect. Currently, most joint distribution adaptation works face two key problems: on the one hand, the reliability of the pseudo-labels of the target domain samples is not reasonably considered when reducing the conditional distribution difference, and the continuous accumulation of erroneously pseudo-labeled samples can bias the trend of the training process; on the other hand, the respective importance of the marginal distribution and the conditional distribution in different domain adaptation tasks is not well considered, and the balance coefficient between the two distributions is set by manual experience, which introduces great uncertainty.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a bearing fault diagnosis method based on a self-adaptive joint domain adaptation network which does not rely on manual experience to adjust the balance coefficients of the two distributions, reduces the influence of accumulated erroneous target domain pseudo-labels on the training trend, and reduces the inter-domain and intra-domain differences by aligning the joint distribution of the two domains and constraining the training trend of the feature extractor, thereby better realizing the unsupervised domain adaptation task.
In order to solve the above technical problem, the invention provides a bearing fault diagnosis method based on a self-adaptive joint domain adaptive network, comprising the following steps: acquiring unlabeled target domain data; inputting the target domain data into a pre-trained bearing fault diagnosis training model for detection and diagnosis so as to obtain diagnosis evaluation data; and determining the fault type according to the diagnosis evaluation data. The bearing fault diagnosis training model comprises a feature extractor, a label classifier and a domain adaptation module; the domain adaptation module adopts an AJMMD-based adaptive joint domain adaptation discrepancy to measure the marginal distribution difference and the conditional distribution difference between the source domain and the target domain and to adaptively adjust the importance degree between the two distribution differences. The step of adaptively adjusting the importance degree between the two distribution differences comprises: calculating an adaptive weighting factor from the marginal distribution difference and the conditional distribution difference, and adopting the adaptive weighting factor to adaptively adjust the importance degree between the two distribution differences.
As an improvement of the above scheme, the training of the bearing fault diagnosis training model comprises the following steps:
S1, obtaining n_s labeled source domain data and n_t unlabeled target domain data; S2, inputting the source domain data and the target domain data into the feature extractor for feature extraction so as to obtain source domain feature data and target domain feature data; S3, inputting the source domain feature data and the target domain feature data into the tag classifier for label classification so as to obtain the predicted pseudo-labels of the target domain; S4, inputting the source domain feature data, the target domain feature data, the predicted pseudo-labels of the target domain and the real labels of the source domain data into the domain adaptation module to calculate the domain adaptation loss so as to obtain the domain adaptation loss L_AJMMD; S5, calculating the classification loss L_C according to the real labels of the source domain data and the cross entropy loss function; S6, back-propagating gradients according to the classification loss L_C and the domain adaptation loss L_AJMMD so as to optimize and update the parameters of the feature extractor and the parameters of the tag classifier; and S7, outputting the trained bearing fault diagnosis training model when the current number of iterations is greater than or equal to the preset number of iterations, otherwise returning to step S2 to continue training.
As an improvement of the above scheme, the feature extractor is a one-dimensional residual feature extractor comprising four blocks; the first block comprises a convolution layer that adopts a one-dimensional convolution kernel, and each of the remaining blocks comprises two Bottleneck blocks; the tag classifier comprises a fully connected output layer.
As an improvement of the above scheme, the step of back-propagating gradients according to the classification loss L_C and the domain adaptation loss L_AJMMD to optimize and update the parameters of the feature extractor and the parameters of the tag classifier includes: constructing an overall loss function according to the classification loss L_C, the domain adaptation loss L_AJMMD and the overall domain adaptation factor, and calculating the overall loss L; and performing model parameter optimization calculation according to the gradient descent algorithm and the total loss L so as to optimize and update the parameters of the feature extractor and the parameters of the tag classifier.
As an improvement of the above scheme, the domain adaptation loss L_AJMMD is calculated through the AJMMD adaptive joint domain adaptation loss function in the domain adaptation module, the calculation formula of the function being:
L_AJMMD = w · L_MMD + (1 − w) · L_LMMD, with L_MMD = MMD(F(x_s), F(x_t)) and L_LMMD = LMMD(F(x_s), F(x_t), y_s, ŷ_t),
wherein L_AJMMD denotes the domain adaptation loss, L_MMD the marginal distribution loss, L_LMMD the conditional distribution loss, and w the adaptive weighting factor for each batch of data in training; MMD(·) is the maximum mean discrepancy loss function, LMMD(·) is the local maximum mean discrepancy loss function, F(x_s) denotes the source domain feature data, F(x_t) the target domain feature data, y_s the real labels of the source domain data, and ŷ_t the predicted pseudo-labels of the target domain.
As an improvement of the above scheme, the calculation formula of the cross entropy loss function is:
L_C = −(1/n_s) Σ_{i=1}^{n_s} Σ_{m=1}^{M} I(y_i^s = m) · log(a_{i,m}),
wherein L_C denotes the classification loss, ŷ_s the prediction result of the label classifier, y_s the real class labels of the source domain data, y_i^s the real label of the i-th source domain sample, M the total number of source domain fault types, I(·) the indicator function, m the current fault category, and a_{i,m} the predicted probability that the i-th sample belongs to class m.
As an improvement of the above scheme, the calculation formula of the overall loss function is:
L = L_C + λ · L_AJMMD,  (θ_F*, θ_C*) = argmin_{θ_F, θ_C} L,
wherein L denotes the total loss, θ_F* and θ_C* the ideal parameters of the feature extractor and of the tag classifier, θ_F and θ_C the parameters of the feature extractor and of the tag classifier to be optimized, and λ the overall domain adaptation factor.
As a modification of the above scheme, the one-dimensional convolution kernel is a convolution kernel of 32×1.
The invention also provides a computer device comprising a processor and a memory, the memory being used for storing a computer-executable program; the processor reads part or all of the computer-executable program from the memory and executes it, and when the processor executes part or all of the computer-executable program, the above bearing fault diagnosis method can be realized.
The invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the above bearing fault diagnosis method can be realized.
The implementation of the invention has the following beneficial effects:
according to the bearing fault diagnosis method, the computer equipment and the computer readable storage medium based on the self-adaptive joint domain adaptation network, the edge distribution difference and the condition distribution difference between the source domain and the target domain can be measured through the AJMMD self-adaptive joint domain adaptation difference energy, the importance degree between the two distribution differences can be adjusted in a self-adaptive mode, manual experience is not required to be relied on to adjust balance coefficients of the two distributions, the influence of accumulation of false labels of the target domain on the training trend can be reduced, the joint distribution of the two domains is pulled up, the training trend of the feature extractor is restrained, the inter-domain difference and the intra-domain difference are reduced, and therefore the non-supervision domain adaptation task is better achieved. And performing model parameter optimization updating processing through a gradient descent method and a total loss function so as to output a bearing fault diagnosis training model with trained model parameters.
Drawings
FIG. 1 is a flow chart of a method of bearing fault diagnosis based on an adaptive joint domain adaptation network of the present invention;
FIG. 2 is a training flow chart of the bearing fault diagnosis training model of the present invention;
FIG. 3 is a schematic diagram of accuracy data of different width convolution kernels of the present invention;
FIG. 4 is a data diagram of the present invention and other weighting schemes;
FIG. 5 is a data schematic diagram of the migration tasks performed by the method of the present invention and other model methods on the Case Western Reserve University (CWRU) dataset;
FIG. 6 is a data schematic diagram of the migration tasks performed by the method of the present invention and other model methods on the Jiangnan University (JNU) dataset.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. It is noted that orientation terms such as up, down, left, right, front, back, inner and outer used in this document are used only with reference to the drawings of the present invention and are not meant to be limiting in any way.
As shown in FIG. 1, a specific embodiment of the present invention provides a bearing fault diagnosis method based on an adaptive joint domain adaptive network, including:
s1, acquiring unlabeled target domain data;
s2, inputting target domain data into a pre-trained bearing fault diagnosis training model for detection and diagnosis so as to obtain diagnosis evaluation data;
s3, determining a fault type according to the diagnosis and evaluation data;
the bearing fault diagnosis training model comprises a feature extractor, a label classifier and a domain adaptation module. The domain adaptation module adopts an adaptation joint domain adaptation difference based on AJMMD to measure the edge distribution difference and the conditional distribution difference between a source domain and a target domain and adaptively adjust the importance degree between the two distribution differences; the step of adaptively adjusting the importance level between the two distribution differences comprises the following steps: and calculating an adaptive weighting factor through the edge distribution difference and the conditional distribution difference, and adopting the adaptive weighting factor to adaptively adjust the importance degree between the two distribution differences. According to the method, the balance coefficients of the two distributions can be adjusted without relying on manual experience, the influence of accumulation of false labels in the target domain on the training trend can be reduced, the joint distribution of the two domains is pulled up, the training trend of the feature extractor is restrained, the inter-domain difference and the intra-domain difference are reduced, and therefore the unsupervised domain adaptation task is better achieved.
When bearing fault diagnosis is carried out, unlabeled target domain data are acquired as test sample data and input into the bearing fault diagnosis training model for detection and diagnosis; the diagnosis evaluation data of the corresponding bearing fault categories are obtained, and the bearing fault type can then be determined rapidly and accurately from the diagnosis evaluation data, meeting the requirements of users. Herein, AJMMD is a coined abbreviation denoting the adaptive joint domain adaptation discrepancy of the invention.
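For illustration only, and not as part of the patented method itself, the diagnosis stage described above can be sketched in Python/PyTorch roughly as follows; the function name diagnose and the assumption that the trained model is available as a (feature extractor, label classifier) pair are placeholders introduced here:

    import torch

    def diagnose(model, target_signal):
        # model is assumed to be the pair (feature_extractor, label_classifier), both already trained
        feature_extractor, label_classifier = model
        feature_extractor.eval(); label_classifier.eval()
        with torch.no_grad():
            x = torch.as_tensor(target_signal, dtype=torch.float32)
            x = x.view(1, 1, -1)                    # (batch, channel, length) layout for 1-D convolution
            logits = label_classifier(feature_extractor(x))
            probs = torch.softmax(logits, dim=1)    # diagnosis evaluation data (class probabilities)
            fault_type = int(probs.argmax(dim=1))   # index of the predicted fault category
        return fault_type, probs

Calling diagnose on one unlabeled target domain vibration segment returns the predicted fault class index together with the class probabilities that play the role of the diagnosis evaluation data.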
As shown in FIG. 2, the training steps of the bearing fault diagnosis training model include:
S10, obtaining n_s labeled source domain data and n_t unlabeled target domain data;
in the fault diagnosis for the motor bearing, the bearing source vibration data includes unlabeled target data and labeled source domain data.
Obtain n_s labeled source domain samples D_s = {(x_i^s, y_i^s)}_{i=1}^{n_s} and n_t unlabeled target domain samples D_t = {x_i^t}_{i=1}^{n_t}, where x_i^s is the i-th sample in the source domain data, y_i^s is the label corresponding to the i-th source domain sample, and x_i^t is the i-th sample in the target domain data. The two domains have the same feature space and the same class space, but their marginal distributions and conditional distributions are different.
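As an illustrative sketch only, and assuming the samples have already been arranged into NumPy arrays xs, ys and xt (hypothetical names, e.g. of shapes (n_s, 784), (n_s,) and (n_t, 784)), the two domains could be wrapped into mini-batch loaders as follows:

    import torch
    from torch.utils.data import TensorDataset, DataLoader

    def make_loaders(xs, ys, xt, batch_size=128):
        # labeled source domain: samples plus integer class labels
        src = TensorDataset(torch.as_tensor(xs, dtype=torch.float32).unsqueeze(1),
                            torch.as_tensor(ys, dtype=torch.long))
        # unlabeled target domain: samples only
        tgt = TensorDataset(torch.as_tensor(xt, dtype=torch.float32).unsqueeze(1))
        return (DataLoader(src, batch_size=batch_size, shuffle=True, drop_last=True),
                DataLoader(tgt, batch_size=batch_size, shuffle=True, drop_last=True))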
S20, inputting the source domain data and the target domain data into a feature extractor for feature extraction so as to obtain the source domain feature data and the target domain feature data;
s30, inputting the source domain feature data and the target domain feature data into a tag classifier to perform tag classification so as to obtain a prediction pseudo tag of the target domain;
specifically, the feature extractor is a one-dimensional residual feature extractor, and comprises four Block blocks, wherein the first Block comprises a convolution layer, the convolution layer adopts a one-dimensional convolution kernel, and the rest Block blocks comprise two Bottleneck blocks which are used for realizing residual connection. The tag classifier includes a fully connected output layer for classifying the features extracted from the feature extractor and outputting the predicted pseudo tags of the target domain. A global average pooling layer is arranged before the full-connection output layer. And constructing a one-dimensional residual error network model (1D-ResNet) through the structural layer. The main structural parameters of the model are shown in the following table 1:
TABLE 1 Main Structure parameters of the model
As shown in Table 1, the convolution layer of the first block of the feature extractor uses a convolution kernel of size 32×1, and each of the remaining blocks uses two Bottleneck blocks, which effectively reduces the number of parameters and the training time of the model while still extracting deep features of the vibration signals. Here, Bottleneck1 denotes a regular residual block and Bottleneck2 a downsampling residual block; C denotes Conv1d, B denotes BatchNorm1d batch normalization, A denotes activation (the ReLU activation function is used throughout), P denotes MaxPool1d, and S denotes a shortcut residual connection. By using the 32×1 convolution kernel, the feature extractor widens the first-layer convolution kernel, which enlarges the receptive field, allows more data information to be taken into account, reduces the influence of noisy data, and makes the training process more stable.
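The following PyTorch sketch gives one possible reading of this structure; since Table 1 is not reproduced here, the channel widths, strides and padding are assumptions, and only the overall layout (a wide 32×1 first convolution, three stages of two Bottleneck blocks each, global average pooling and a fully connected output layer) follows the description above:

    import torch
    import torch.nn as nn

    class Bottleneck(nn.Module):
        # Simplified 1-D residual block; downsample=True (Bottleneck2) halves the length,
        # downsample=False (Bottleneck1) keeps it, both ending in a shortcut residual connection.
        def __init__(self, in_ch, out_ch, downsample=False):
            super().__init__()
            stride = 2 if downsample else 1
            self.body = nn.Sequential(
                nn.Conv1d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
                nn.BatchNorm1d(out_ch), nn.ReLU(),
                nn.Conv1d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm1d(out_ch))
            self.shortcut = (nn.Sequential(nn.Conv1d(in_ch, out_ch, 1, stride=stride),
                                           nn.BatchNorm1d(out_ch))
                             if downsample or in_ch != out_ch else nn.Identity())
            self.act = nn.ReLU()

        def forward(self, x):
            return self.act(self.body(x) + self.shortcut(x))

    class FeatureExtractor(nn.Module):
        # First block: wide 32x1 convolution to enlarge the receptive field, as described above.
        def __init__(self):
            super().__init__()
            self.stem = nn.Sequential(nn.Conv1d(1, 16, kernel_size=32, stride=2, padding=15),
                                      nn.BatchNorm1d(16), nn.ReLU(), nn.MaxPool1d(2))
            self.blocks = nn.Sequential(
                Bottleneck(16, 32, downsample=True),  Bottleneck(32, 32),
                Bottleneck(32, 64, downsample=True),  Bottleneck(64, 64),
                Bottleneck(64, 128, downsample=True), Bottleneck(128, 128))
            self.gap = nn.AdaptiveAvgPool1d(1)        # global average pooling before the classifier

        def forward(self, x):
            return self.gap(self.blocks(self.stem(x))).flatten(1)   # (batch, 128) feature vectors

    label_classifier = nn.Linear(128, 10)             # fully connected output layer, 10 fault classes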
S40, inputting the source domain feature data, the target domain feature data, the predicted pseudo-labels of the target domain and the real labels of the source domain data into the domain adaptation module to calculate the domain adaptation loss, so as to obtain the domain adaptation loss L_AJMMD;
It should be noted that, in the invention, the AJMMD adaptive joint domain adaptation part is embedded after the global average pooling layer. The maximum mean discrepancy (MMD) is adopted to calculate the marginal distribution difference, the conditional distribution loss between the two domains is measured by the local maximum mean discrepancy (LMMD) loss, the importance of the marginal distribution difference and of the conditional distribution difference during training is adjusted in real time by the adaptive weighting factor, and the weighted combination of the conditional distribution difference and the marginal distribution difference approximates the joint distribution difference.
MMD is one of the most widely used marginal distribution discrepancy metrics in domain adaptation; it is a kernel method that maps data points from the input space into a Hilbert feature space and compares the difference between the mean embeddings of the source domain and target domain features in that space.
LMMD (local maximum mean discrepancy) is a method for adapting the conditional distribution. It is an improvement on the existing CMMD: it uses not only the features of the samples but also their label information to weight the samples, and achieves conditional distribution alignment by acquiring the pseudo-labels of the target domain. The difference from CMMD is that LMMD does not use hard (absolute) weights; instead, it adopts the class probabilities output by the model on the target domain samples as the weighting matrix, thereby reducing the influence of accumulated erroneous target domain pseudo-labels on the training trend.
The source domain features F(x_s) and the target domain features F(x_t) are fed into the AJMMD domain adaptation part, and the real labels y_s of the source domain data and the predicted pseudo-labels ŷ_t of the target domain are taken as further input parameters; the domain adaptation loss L_AJMMD is then calculated through the AJMMD adaptive joint domain adaptation loss function in the domain adaptation module:
L_AJMMD = w · L_MMD + (1 − w) · L_LMMD, with L_MMD = MMD(F(x_s), F(x_t)) and L_LMMD = LMMD(F(x_s), F(x_t), y_s, ŷ_t),
wherein L_AJMMD denotes the domain adaptation loss, L_MMD the marginal distribution loss, L_LMMD the conditional distribution loss, and w the adaptive weighting factor for each batch of data in training; MMD(·) is the maximum mean discrepancy loss function, LMMD(·) is the local maximum mean discrepancy loss function, F(x_s) denotes the source domain feature data, F(x_t) the target domain feature data, y_s the real labels of the source domain data, ŷ_t the predicted pseudo-labels of the target domain, H the reproducing kernel Hilbert space, φ(·) the corresponding mapping function, D_s the source domain data and D_t the target domain data.
Here w is the adaptive weighting factor computed for each batch of data in training, and its calculation formula is:
w = e^{L_MMD} / (e^{L_MMD} + e^{L_LMMD}),
where e^{L_MMD} is the exponential, with base e, of the marginal distribution loss and e^{L_LMMD} the exponential, with base e, of the conditional distribution loss. When the marginal distribution difference is larger, the overall inter-domain distribution difference is larger; w then increases adaptively so that the marginal distribution receives relatively more attention in the optimization process, and conversely w decreases. In this way, the relative importance of the two distributions is adjusted adaptively according to the actual situation throughout training.
The AJMMD adaptive joint domain adaptation part does not need to rely on manual experience to adjust the balance coefficients of the two distributions, reduces the influence of accumulated erroneous target domain pseudo-labels on the training trend, aligns the joint distribution of the two domains and constrains the training trend of the feature extractor, and thus reduces the inter-domain and intra-domain differences, so that the unsupervised domain adaptation task is better realized.
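A minimal sketch of this adaptive combination, assuming the weighting formula reconstructed above and using the mmd_loss and lmmd_loss helpers sketched after the corresponding formulas below; treating w as a non-differentiable coefficient (the detach call) is an additional assumption:

    import torch

    def ajmmd_loss(src_feat, tgt_feat, src_label, tgt_prob):
        l_mmd = mmd_loss(src_feat, tgt_feat)                          # marginal distribution discrepancy
        l_lmmd = lmmd_loss(src_feat, tgt_feat, src_label, tgt_prob)   # conditional distribution discrepancy
        # adaptive weighting factor, recomputed for every mini-batch
        w = (torch.exp(l_mmd) / (torch.exp(l_mmd) + torch.exp(l_lmmd))).detach()
        return w * l_mmd + (1.0 - w) * l_lmmd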
Specifically, the expanded calculation formula of the maximum mean discrepancy loss function MMD(·) is:
MMD(F(x_s), F(x_t)) = || (1/n_s) Σ_{i=1}^{n_s} φ(F(x_i^s)) − (1/n_t) Σ_{j=1}^{n_t} φ(F(x_j^t)) ||_H^2,
wherein H denotes the reproducing kernel Hilbert space into which the source domain and target domain features are mapped by the mapping function φ(·) for the difference measurement; expanding the squared norm expresses the loss in terms of kernel inner products k(·,·) = ⟨φ(·), φ(·)⟩ between source domain and target domain sample features.
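A minimal sketch of this marginal discrepancy; the multi-bandwidth Gaussian kernel is an assumption, since the text only speaks of a kernel inner product without naming the kernel:

    import torch

    def gaussian_kernel(a, b, sigmas=(1.0, 2.0, 4.0, 8.0, 16.0)):
        # sum of RBF kernels with several assumed bandwidths
        d2 = torch.cdist(a, b).pow(2)
        return sum(torch.exp(-d2 / (2.0 * s ** 2)) for s in sigmas)

    def mmd_loss(src_feat, tgt_feat):
        # squared distance between the mean kernel embeddings of the two feature batches
        k_ss = gaussian_kernel(src_feat, src_feat).mean()
        k_tt = gaussian_kernel(tgt_feat, tgt_feat).mean()
        k_st = gaussian_kernel(src_feat, tgt_feat).mean()
        return k_ss + k_tt - 2.0 * k_st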
The expanded calculation formula of the local maximum mean discrepancy loss function LMMD(·) is:
LMMD(F(x_s), F(x_t), y_s, ŷ_t) = (1/M) Σ_{m=1}^{M} || Σ_{i=1}^{n_s} w_i^{sm} φ(F(x_i^s)) − Σ_{j=1}^{n_t} w_j^{tm} φ(F(x_j^t)) ||_H^2,
where M is the total number of classes in the class space (in the fault diagnosis of this embodiment M is preferably the 10 fault classes, but this is not a limitation), and w_i^{sm} and w_j^{tm} are the weights with which the source domain and target domain samples, respectively, belong to class m; the weighted sum over the M classes is used to balance the class imbalance that may occur within each batch under the mini-batch stochastic gradient descent method. For a sample x_i, the weight w_i^m is calculated as:
w_i^m = y_{im} / Σ_{(x_j, y_j) ∈ D} y_{jm},
where y_{im} is the m-th component of the label vector of the i-th sample and D denotes the source domain or the target domain. In domain adaptation, the whole LMMD module therefore needs four inputs: the source domain feature data, the target domain feature data, the real labels of the source domain data, and the output probability distribution of the target domain (i.e. the predicted pseudo-labels of the target domain).
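A corresponding sketch of the LMMD term, reusing the gaussian_kernel helper above; the clamping that guards against classes absent from a batch is an assumption:

    import torch
    import torch.nn.functional as F

    def class_weights(prob, num_classes=10):
        # w_i^m = y_im / sum_j y_jm; 'prob' is one-hot labels for the source batch
        # or the softmax output for the target batch (shape: batch x num_classes)
        return prob / torch.clamp(prob.sum(dim=0, keepdim=True), min=1e-8)

    def lmmd_loss(src_feat, tgt_feat, src_label, tgt_prob, num_classes=10):
        ws = class_weights(F.one_hot(src_label, num_classes).float(), num_classes)
        wt = class_weights(tgt_prob, num_classes)
        k_ss = gaussian_kernel(src_feat, src_feat)
        k_tt = gaussian_kernel(tgt_feat, tgt_feat)
        k_st = gaussian_kernel(src_feat, tgt_feat)
        loss = 0.0
        for m in range(num_classes):
            a, b = ws[:, m:m + 1], wt[:, m:m + 1]     # per-class weight columns
            loss = loss + (a @ a.t() * k_ss).sum() + (b @ b.t() * k_tt).sum() \
                   - 2.0 * (a @ b.t() * k_st).sum()
        return loss / num_classes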
S50, calculating the classification loss L_C according to the real labels of the source domain data and the cross entropy loss function;
After the label classifier, a Softmax function is used to output the probability distribution over the classes, and the index corresponding to the maximum probability is taken as the predicted label. The cross entropy loss function is adopted as the classification loss function, and its calculation formula is:
L_C = −(1/n_s) Σ_{i=1}^{n_s} Σ_{m=1}^{M} I(y_i^s = m) · log(a_{i,m}),
wherein L_C denotes the classification loss, ŷ_s the prediction result of the label classifier, y_s the real class labels of the source domain data, y_i^s the real label of the i-th source domain sample, M the total number of source domain fault types, m the current fault category, and a_{i,m} the predicted probability that the i-th sample belongs to class m; I(·) is the indicator function, equal to 1 if y_i^s = m and 0 otherwise.
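A short sketch of this step, computing the cross entropy classification loss on the labeled source batch and, on the unlabeled target batch, the Softmax class probabilities (used as the LMMD weighting matrix) together with the argmax pseudo-labels:

    import torch.nn.functional as F

    def classification_and_pseudolabels(src_logits, tgt_logits, src_label):
        cls_loss = F.cross_entropy(src_logits, src_label)   # classification loss L_C
        tgt_prob = F.softmax(tgt_logits, dim=1)             # output probability distribution of the target domain
        pseudo_label = tgt_prob.argmax(dim=1)               # predicted pseudo-labels
        return cls_loss, tgt_prob, pseudo_label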
S60, back-propagating gradients according to the classification loss L_C and the domain adaptation loss L_AJMMD so as to optimize and update the parameters of the feature extractor and the parameters of the tag classifier;
Specifically, the step of back-propagating gradients according to the classification loss L_C and the domain adaptation loss L_AJMMD to optimize and update the parameters of the feature extractor and the parameters of the tag classifier includes:
Step one, constructing an overall loss function according to the classification loss L_C, the domain adaptation loss L_AJMMD and the overall domain adaptation factor, and calculating the overall loss L. Specifically, the calculation formula of the overall loss function is:
L = L_C + λ · L_AJMMD,  (θ_F*, θ_C*) = argmin_{θ_F, θ_C} L,
wherein L denotes the total loss, θ_F* and θ_C* the ideal parameters of the feature extractor and of the tag classifier, θ_F and θ_C the parameters of the feature extractor and of the tag classifier to be optimized, and λ the overall domain adaptation factor.
Step two, performing model parameter optimization calculation according to the gradient descent algorithm and the total loss L, so as to optimize and update the parameters of the feature extractor and the parameters of the tag classifier.
Based on the current total loss and the parameters to be optimized of the feature extractor, the parameters of the feature extractor are updated according to the gradient descent algorithm; correspondingly, the parameters of the tag classifier are updated in the same way, so that the model parameters are updated through back-propagation.
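Written out explicitly, one plain gradient descent update on the total loss L = L_C + λ·L_AJMMD looks as follows; in practice torch.optim.SGD performs the same update:

    import torch

    def sgd_step(loss, parameters, lr=0.01):
        # theta <- theta - lr * grad(L) for every parameter of the feature extractor and classifier
        grads = torch.autograd.grad(loss, parameters)
        with torch.no_grad():
            for p, g in zip(parameters, grads):
                p -= lr * g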
And S70, outputting a trained bearing fault diagnosis training model when the current iteration calculation times are greater than or equal to the preset iteration times, otherwise, returning to the step S20 to continue training.
It should be noted that, when the number of iterative computations is greater than or equal to the preset number of iterations, the trained bearing fault diagnosis training model is output; otherwise, the currently updated parameters of the feature extractor and of the tag classifier are carried into step S20 for continued training until the preset number of iterations is reached, so as to obtain a bearing fault diagnosis training model with trained model parameters. The preset number of iterations may be set according to the actual situation and is not limited here.
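Putting the pieces together, a sketch of the training procedure of steps S10 to S70, reusing the helpers above; the momentum value is an assumption, while the learning rate, batch handling and the overall domain adaptation factor of 1.2 follow the experimental settings reported in the examples below:

    import torch

    def train(feature_extractor, label_classifier, src_loader, tgt_loader,
              epochs=150, lr=0.01, lam=1.2):
        params = list(feature_extractor.parameters()) + list(label_classifier.parameters())
        optimizer = torch.optim.SGD(params, lr=lr, momentum=0.9)
        for epoch in range(epochs):
            for (xs, ys), (xt,) in zip(src_loader, tgt_loader):
                fs, ft = feature_extractor(xs), feature_extractor(xt)            # S20: feature extraction
                src_logits, tgt_logits = label_classifier(fs), label_classifier(ft)
                cls_loss, tgt_prob, _ = classification_and_pseudolabels(
                    src_logits, tgt_logits, ys)                                  # S30, S50
                da_loss = ajmmd_loss(fs, ft, ys, tgt_prob)                       # S40: domain adaptation loss
                loss = cls_loss + lam * da_loss                                  # overall loss L
                optimizer.zero_grad()
                loss.backward()                                                  # S60: back-propagation
                optimizer.step()
        return feature_extractor, label_classifier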
In order to verify the effectiveness, feasibility and superiority of the method of the invention, the method is demonstrated below by way of specific examples.
Example 1
The dataset of this embodiment is the Case Western Reserve University (CWRU) bearing dataset. The dataset uses the original 12 kHz drive-end vibration signals, collected at four different load speeds (1730 rpm, 1750 rpm, 1772 rpm and 1797 rpm), which are regarded as datasets of four different working conditions (datasets A, B, C and D). The fault types are produced by single-point electrical discharge machining and cover four states in total: normal data (N), ball fault (BF), inner ring fault (IF) and outer ring fault (OF). Each fault type has fault diameters of different sizes (0.007 inch, 0.014 inch and 0.021 inch), so each working condition can be divided into 10 categories in total: 1 set of normal data and 9 sets of fault data. For each fault class the collection window size is 784 data points; the correlation between consecutive samples is maintained by sliding overlapping sampling with a sliding step of 80, and 150 samples are taken per fault class, giving 10 × 150 = 1500 samples in total. Specific information on the dataset is shown in Table 2 below.
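A small sketch of the sliding-window sampling described above (window of 784 points, sliding step 80, 150 samples per fault class); the function and argument names are chosen here purely for illustration:

    import numpy as np

    def sliding_window_samples(signal, win=784, step=80, per_class=150):
        # overlapping windows over the raw drive-end vibration signal of one fault class
        windows = [signal[i:i + win] for i in range(0, len(signal) - win + 1, step)]
        return np.stack(windows[:per_class])      # (per_class, win) array

Applying this to each of the 10 classes of one working condition gives the 10 × 150 = 1500 samples described above.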
table 2 information of four data sets under CWRU variable operating conditions
The effect of the convolution kernel width of the first convolution layer of the 1D-ResNet on the training process was verified using two working-condition datasets with a large difference (D→A). The optimizer uses mini-batch SGD with a learning rate of 0.01, the convolution kernel size of the first convolution layer is selected from {7, 14, 32, 64, 96}, and the accuracy obtained with the different kernel widths is shown in FIG. 3.
As can be seen from FIG. 3, when the convolution kernel width is set to 7, relatively few effective features are obtained, the training process is relatively unstable, and the highest accuracy converges at 97.40%. When the kernel width is increased to 14, the larger receptive field and model capacity allow more key information to be taken into account: the accuracy improves greatly, reaches 100% for the first time at the 42nd epoch and, after small oscillations, essentially converges to 100% by the 80th epoch. When the kernel width is set to 32, the accuracy reaches 100% for the first time at the 36th epoch and then converges very stably to this optimum. When the kernel width continues to increase to 64, the accuracy oscillates between 98% and 99% before the 100th epoch and only essentially converges to 100% after the 120th epoch; when the width increases further to 96, the accuracy drops to 98.47%. Therefore, the invention adopts the 32×1 convolution kernel, whose training stability, convergence speed and accuracy are better than those of the other kernel widths and which reaches a good balance.
As shown in FIG. 4, fixed (manually set) coefficients for the conditional distribution and the marginal distribution are compared with the adaptive weighting. The coefficient combination is selected from {[0.2,0.2], [0.5,0.5], [0.8,0.8], [0.1,0.9], [0.3,0.7], [0.7,0.3], [0.9,0.1]}, where the first entry of each element is the importance weight of LMMD and the second entry is the importance weight of MMD.
When the combination coefficients are set to [0.2,0.2], [0.5,0.5] or [0.8,0.8], the conditional distribution and the marginal distribution are placed at the same importance level; with [0.2,0.2] in particular, the classification accuracy of the model does not even reach 80%. With the marginal-distribution-dominated combination [0.1,0.9], the classification performance of the model is also poor. The method of the invention, by contrast, realizes adaptively weighted distribution alignment with high accuracy and stability. Setting the balance coefficient by manual experience therefore involves great uncertainty: the weights of the two distributions cannot be adjusted adaptively to the actual situation, and the resulting models often do not reach the decision-making capability of the adaptive weighting method of the invention.
Further, the method of the invention is used to perform the same migration tasks as other model methods in order to verify its effectiveness. Mutual migration is performed among the four different working conditions A, B, C and D, giving 12 migration tasks in total. The model parameters are set as follows: the overall domain adaptation factor is fixed at 1.2, the optimizer is mini-batch SGD with a learning rate of 0.01, the batch size is 128, and the total number of iterations is 150 epochs.
The accuracy of the migration tasks performed by the method of the invention and by the other model methods is shown in FIG. 5. As can be seen from FIG. 5, the backbone network 1D-ResNet of the model of the invention alone reaches an average accuracy of 90.75% over the 12 migration tasks, and even reaches 98.62% between two working conditions whose source and target domains differ little, which shows that the improved residual network can extract some key features of the same fault class; when the difference increases, however, the performance of the model without domain adaptation drops sharply. The other methods with domain adaptation generally achieve higher average accuracy over the 12 migration tasks than the method without adaptation. The CMMD method does not consider the confidence of the pseudo-labels, and its classification accuracy shows no notable advantage over the marginal-distribution networks DDC and D-CORAL. DSAN has relatively stable classification capability on each migration task, with an average accuracy of 98.61%, so the combination of 1D-ResNet and LMMD is even better than the joint distribution methods JAN and BDA, which shows how critical it is to consider the reliability of the pseudo-labels; apart from DSAN, the joint distribution network methods JAN and BDA are in general superior to the marginal-distribution and conditional-distribution networks. SC-1DCNN, which uses adaptive weighting factors, is better than JAN and BDA with manually set parameters, and its overall migration effect is close to that of the invention. The method of the invention obtains more than 99% accuracy on all migration tasks of the CWRU dataset and is very stable, with an average accuracy of 99.90% over the 12 migration tasks, higher than the 98.61%, 98.46% and 99.76% of DSAN, BDA and SC-1DCNN. The accuracy and stability of the method of the invention on these migration tasks are therefore the highest.
Example 2
The dataset of this example is the Jiangnan University (JNU) bearing dataset. The data acquisition frequency is 50 kHz and the sampling time is 20 s; the data are collected at three rotating speeds of 600 rpm, 800 rpm and 1000 rpm. Compared with the CWRU dataset, the differences between rotating speeds are larger, and the data are regarded as three different working conditions. Each working condition contains 1 healthy state and 3 fault states (ball fault, inner ring fault and outer ring fault), with 500 samples per class, i.e. 4 × 500 = 2000 samples per working condition; each sample contains 2048 data points and the sliding step is 240. Recording the working conditions at 600 rpm, 800 rpm and 1000 rpm as E, F and G respectively, the dataset is shown in Table 3 below:
TABLE 3 information of three datasets under JNU variable Condition
Mutual migration is carried out among the E, F and G working conditions, giving 6 migration tasks in total. The results of the method of the invention and of the other model methods on these 6 migration tasks are shown in Table 4 and FIG. 6 below.
Table 4 Test results on the Jiangnan University dataset (%)
As can be seen from Table 4 and FIG. 6, the average accuracy of the CMMD model on the 6 migration tasks of the Jiangnan University dataset is low: compared with 1D-ResNet without any domain adaptation method, its advantage is not prominent, the difference being only 1.34%. This is because CMMD is a conditional-distribution domain adaptation method that needs label information to align the subdomains; when the difference between the two domains increases, the output target domain pseudo-labels become unreliable, and forcing the alignment can wrongly bring different classes close together. The DSAN model using the LMMD method takes the confidence of the pseudo-labels into account through weighting and therefore obtains a more satisfactory effect on the migration tasks with smaller differences, but its accuracy on the migration task with the larger difference (G→E) only reaches 83.91%; even when the pseudo-label confidence is considered, the probability of obtaining erroneous pseudo-labels increases as the difference grows, which seriously affects methods that align the conditional distribution through label information. As can also be seen from Table 4 and FIG. 6, the average accuracy of the method of the invention on these 6 migration tasks is still better than that of the other compared models, reaching 95.26%, higher than the 94.98%, 93.07% and 93.25% of SC-1DCNN, DSAN and BDA.
The results of the above embodiments fully demonstrate the effectiveness, feasibility and superiority of the bearing fault diagnosis method based on the adaptive joint domain adaptive network.
The invention also provides a computer device comprising a processor and a memory, the memory being used for storing a computer-executable program; the processor reads part or all of the computer-executable program from the memory and executes it, and when the processor executes part or all of the computer-executable program, the above bearing fault diagnosis method can be realized.
The invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the above bearing fault diagnosis method can be realized.
In summary, the AJMMD adaptive joint domain adaptation discrepancy can measure the marginal distribution difference and the conditional distribution difference between the source domain and the target domain and adaptively adjust the importance degree between the two distribution differences; the balance coefficients of the two distributions do not need to be adjusted by manual experience, the influence of accumulated erroneous target domain pseudo-labels on the training trend is reduced, the joint distribution of the two domains is aligned and the training trend of the feature extractor is constrained, and the inter-domain and intra-domain differences are reduced, so that the unsupervised domain adaptation task is better realized. Model parameter optimization and updating are performed through the gradient descent method and the total loss function, so that a bearing fault diagnosis training model with trained model parameters is output.
By widening the first-layer convolution kernel in the one-dimensional residual feature extractor, the receptive field is enlarged, more data information can be taken into account, the influence of noisy data is reduced, and the training process is more stable.
The foregoing disclosure is merely illustrative of the preferred embodiments of the present invention and is not intended to limit the scope of the claims herein, as equivalent changes may be made in the claims herein without departing from the scope of the invention.

Claims (6)

1. The bearing fault diagnosis method based on the adaptive joint domain adaptive network is characterized by comprising the following steps of:
acquiring unlabeled target domain data;
inputting the target domain data into a pre-trained bearing fault diagnosis training model for detection and diagnosis so as to obtain diagnosis evaluation data;
determining a fault type according to the diagnosis evaluation data;
the bearing fault diagnosis training model comprises a feature extractor, a tag classifier and a domain adaptation module;
the domain adaptation module adopts an AJMMD-based adaptive joint domain adaptation discrepancy to measure the marginal distribution difference and the conditional distribution difference between a source domain and a target domain and to adaptively adjust the importance degree between the two distribution differences;
the step of adaptively adjusting the importance degree between the two distribution differences comprises the following steps:
calculating an adaptive weighting factor through the marginal distribution difference and the conditional distribution difference, and adopting the adaptive weighting factor to adaptively adjust the importance degree between the two distribution differences;
the training step of the bearing fault diagnosis training model comprises the following steps:
S1, obtaining n_s labeled source domain data and n_t unlabeled target domain data;
S2, inputting the source domain data and the target domain data into the feature extractor for feature extraction so as to obtain source domain feature data and target domain feature data;
S3, inputting the source domain feature data and the target domain feature data into the tag classifier for label classification so as to obtain a predicted pseudo-label of the target domain;
S4, inputting the source domain feature data, the target domain feature data, the predicted pseudo-label of the target domain and the real label of the source domain data into the domain adaptation module to perform domain adaptation loss calculation so as to obtain the domain adaptation loss L_AJMMD;
S5, calculating the classification loss L_C according to the real label of the source domain data and the cross entropy loss function;
S6, back-propagating gradients according to the classification loss L_C and the domain adaptation loss L_AJMMD so as to optimize and update the parameters of the feature extractor and the parameters of the tag classifier;
S7, outputting the trained bearing fault diagnosis training model when the current number of iterations is greater than or equal to the preset number of iterations, otherwise returning to step S2 to continue training;
the step of back-propagating gradients according to the classification loss L_C and the domain adaptation loss L_AJMMD to optimize and update the parameters of the feature extractor and the parameters of the tag classifier includes:
constructing an overall loss function according to the classification loss L_C, the domain adaptation loss L_AJMMD and the overall domain adaptation factor, and calculating the overall loss L;
performing model parameter optimization calculation according to a gradient descent algorithm and the total loss L so as to optimize and update parameters of a feature extractor and parameters of a tag classifier;
calculating the domain adaptation loss L_AJMMD through the AJMMD adaptive joint domain adaptation loss function in the domain adaptation module, the calculation formula of the function being:
L_AJMMD = w · L_MMD + (1 − w) · L_LMMD, with L_MMD = MMD(F(x_s), F(x_t)) and L_LMMD = LMMD(F(x_s), F(x_t), y_s, ŷ_t), wherein L_AJMMD denotes the domain adaptation loss, L_MMD the marginal distribution loss, L_LMMD the conditional distribution loss, w the adaptive weighting factor for each batch of data in training, MMD(·) the maximum mean discrepancy loss function, LMMD(·) the local maximum mean discrepancy loss function, F(x_s) the source domain feature data, F(x_t) the target domain feature data, y_s the real labels of the source domain data, and ŷ_t the predicted pseudo-labels of the target domain;
the calculation formula of the total loss function is as follows:
wherein,expressed as total loss->For classifying loss->Ideal parameters denoted as feature extractor +.>Ideal parameters of tag classifier, +.>Parameters to be optimized, denoted feature extractor, < ->Parameters to be optimized, denoted as label classifier, < ->Is the overall domain adaptation factor.
2. The bearing fault diagnosis method according to claim 1, wherein the feature extractor is a one-dimensional residual feature extractor comprising four blocks, the first block comprising a convolution layer that adopts a one-dimensional convolution kernel, and each of the remaining blocks comprising two Bottleneck blocks; the tag classifier comprises a fully connected output layer.
3. The bearing fault diagnosis method according to claim 1, wherein the calculation formula of the cross entropy loss function is:
L_C = −(1/n_s) Σ_{i=1}^{n_s} Σ_{m=1}^{M} I(y_i^s = m) · log(a_{i,m}), wherein L_C denotes the classification loss, ŷ_s the prediction result of the label classifier, y_s the real class labels of the source domain data, y_i^s the real label of the i-th source domain sample, M the total number of source domain fault types, I(·) the indicator function, m the current fault category, and a_{i,m} the predicted probability that the i-th sample belongs to class m.
4. The bearing fault diagnosis method according to claim 2, wherein the one-dimensional convolution kernel is a 32x1 convolution kernel.
5. A computer device comprising a processor and a memory for storing a computer executable program, the processor reading part or all of the computer executable program from the memory and executing, the processor executing part or all of the computer executable program to implement the bearing fault diagnosis method according to any one of claims 1 to 4.
6. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, enables the bearing fault diagnosis method according to any one of claims 1 to 4.
CN202311127040.8A 2023-09-04 2023-09-04 Bearing fault diagnosis method based on self-adaptive joint domain adaptive network Active CN116878885B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311127040.8A CN116878885B (en) 2023-09-04 2023-09-04 Bearing fault diagnosis method based on self-adaptive joint domain adaptive network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311127040.8A CN116878885B (en) 2023-09-04 2023-09-04 Bearing fault diagnosis method based on self-adaptive joint domain adaptive network

Publications (2)

Publication Number Publication Date
CN116878885A (en) 2023-10-13
CN116878885B (en) 2023-12-19

Family

ID=88263007

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311127040.8A Active CN116878885B (en) 2023-09-04 2023-09-04 Bearing fault diagnosis method based on self-adaptive joint domain adaptive network

Country Status (1)

Country Link
CN (1) CN116878885B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108344574A (en) * 2018-04-28 2018-07-31 湖南科技大学 A kind of Wind turbines Method for Bearing Fault Diagnosis for combining adaptation network based on depth
CN111144458A (en) * 2019-12-13 2020-05-12 重庆交通大学 Method for identifying mechanical faults under different working conditions of subspace embedded feature distribution alignment
CN111829782A (en) * 2020-07-16 2020-10-27 苏州大学 Fault diagnosis method based on adaptive manifold embedding dynamic distribution alignment
CN112629863A (en) * 2020-12-31 2021-04-09 苏州大学 Bearing fault diagnosis method for dynamic joint distribution alignment network under variable working conditions
KR102387663B1 (en) * 2021-08-30 2022-04-19 서울대학교산학협력단 Apparatus for fault diagnosis using domain adaptation with semantic clustering algorithm and method for fault diagnosis using the same
CN114429152A (en) * 2021-12-31 2022-05-03 苏州大学 Rolling bearing fault diagnosis method based on dynamic index antagonism self-adaption

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651937B (en) * 2020-06-03 2023-07-25 苏州大学 Method for diagnosing faults of in-class self-adaptive bearing under variable working conditions

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108344574A (en) * 2018-04-28 2018-07-31 湖南科技大学 A kind of Wind turbines Method for Bearing Fault Diagnosis for combining adaptation network based on depth
CN111144458A (en) * 2019-12-13 2020-05-12 重庆交通大学 Method for identifying mechanical faults under different working conditions of subspace embedded feature distribution alignment
CN111829782A (en) * 2020-07-16 2020-10-27 苏州大学 Fault diagnosis method based on adaptive manifold embedding dynamic distribution alignment
CN112629863A (en) * 2020-12-31 2021-04-09 苏州大学 Bearing fault diagnosis method for dynamic joint distribution alignment network under variable working conditions
KR102387663B1 (en) * 2021-08-30 2022-04-19 서울대학교산학협력단 Apparatus for fault diagnosis using domain adaptation with semantic clustering algorithm and method for fault diagnosis using the same
CN114429152A (en) * 2021-12-31 2022-05-03 苏州大学 Rolling bearing fault diagnosis method based on dynamic index antagonism self-adaption

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Cross Domain Feature Extraction Method for Bearing Fault diagnosis based on Balanced Distribution Adaptation;Jiawei GU et al;2019 Prognostics and System Health Management Conference (PHM-Qingdao);1-5 *
Cross-working-condition fault diagnosis of rotating machinery based on MAJDA unsupervised transfer; Huang Yao et al.; Modular Machine Tool & Automatic Manufacturing Technique (No. 10); 144-149 *

Also Published As

Publication number Publication date
CN116878885A (en) 2023-10-13

Similar Documents

Publication Publication Date Title
Lakshmanan et al. A Gaussian mixture model approach to forecast verification
JP6699753B2 (en) Analysis program, information processing apparatus, and analysis method
CN114936639A (en) Progressive confrontation training method and device
Yuan et al. Gradient-based active learning query strategy for end-to-end speech recognition
Luo et al. Efficient particle filtering through residual nudging
CN115270883A (en) Motor bearing fault diagnosis method based on twin neural network small sample learning
CN116878885B (en) Bearing fault diagnosis method based on self-adaptive joint domain adaptive network
WO2021244105A1 (en) Feature vector dimension compression method and apparatus, and device and medium
Haas et al. Statistical analysis of wasserstein gans with applications to time series forecasting
CN110887652B (en) Interactive multi-model detection method for vibration detection and displacement extraction of accelerometer
CN110705631B (en) SVM-based bulk cargo ship equipment state detection method
CN116430172A (en) Extra-high voltage direct current transmission line fault location method, equipment and storage medium
CN116543259A (en) Deep classification network noise label modeling and correcting method, system and storage medium
CN112560760B (en) Attention-assisted unsupervised video abstraction system
CN112329304B (en) Continuous structure dynamic load interval identification method
Leng et al. Damage detection of offshore jacket structures using structural vibration measurements: Application of a new hybrid machine learning method
CN113408371A (en) Early fault diagnosis method and device
CN113051809A (en) Virtual health factor construction method based on improved restricted Boltzmann machine
CN112865898A (en) Antagonistic wireless communication channel model estimation and prediction method
CN113435243A (en) Hyperspectral true downsampling fuzzy kernel estimation method
Kumari et al. Impact of class imbalance ratio on ensemble methods for imbalance problem: A new perspective
Zhao et al. Detection fusion of weak signal under chaotic noise based on distributed system
CN114036948B (en) Named entity identification method based on uncertainty quantification
Comber et al. Geographically Varying Coefficient Regression: GWR-Exit and GAM-On?(Short Paper)
Shen et al. Surprise sampling: Improving and extending the local case-control sampling

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant