CN116821762A

CN116821762A - Mechanical fault diagnosis method based on multi-scale graph attention fusion network

Info

Publication number: CN116821762A
Application number: CN202310793938.2A
Authority: CN
Inventors: 王金龙; 倪培浩; 张媛媛; 熊晓芸; 吉爱国; 董良成
Original assignee: Qingdao University of Technology
Current assignee: Qingdao University of Technology
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2023-09-29

Abstract

The invention discloses a mechanical fault diagnosis method based on a multi-scale graph attention fusion network, which comprises the following steps: collecting mechanical vibration data, constructing a sample data set, and dividing the data set into a training set and a test set; constructing the training set and the test set into a training-set family graph and a test-set family graph, respectively, according to the cosine similarity among samples; inputting the training-set family graph into the constructed multi-scale graph attention fusion network for training to obtain an optimal fault diagnosis model; inputting the test-set family graph into the optimal fault diagnosis model for testing, and evaluating the performance of the model; and inputting the mechanical vibration data to be diagnosed into the evaluated fault diagnosis model to perform fault diagnosis. The disclosed method can achieve higher fault diagnosis accuracy under unbalanced data and strong noise.

Description

Mechanical fault diagnosis method based on multi-scale graph attention fusion network

Technical Field

The invention relates to the technical field of mechanical fault diagnosis, in particular to a mechanical fault diagnosis method based on a multi-scale graph attention fusion network.

Background

Rotating machinery is a critical component of modern manufacturing equipment, and its safe operation directly affects the stability of industrial systems. In rotating machinery, bearings are subjected to large and constantly changing loads, which leads to failure. Statistics show that bearings account for as much as 40% of rotating machinery failures. To reduce equipment loss during production and to increase equipment stability, bearing condition monitoring and fault diagnosis have become central topics of interest to researchers and engineers.

Current fault diagnosis methods can be divided into three categories: model-based diagnostic methods, signal processing-based diagnostic methods, and data-driven diagnostic methods. The physical model-based approach requires the creation of detailed mathematical models to describe the physical characteristics and failure modes of the system. The fault diagnosis method based on signal processing mainly adopts the technologies of time-frequency domain analysis, wavelet packet transformation, envelope spectrum analysis, high-order statistic analysis and the like to analyze and process signals, so that fault diagnosis is carried out. Both of these methods are too dependent on high quality expertise. Furthermore, for complex mechanical systems, it becomes very difficult to build mathematical models, which requires significant computational and time costs while reducing the overall efficiency of fault diagnosis.

The data-driven fault diagnosis method does not need a complex mathematical model; instead, it builds a feature extraction model and a classifier and optimizes them with a large amount of historical data. This approach works well and is suitable for fault diagnosis of complex mechanical equipment. In recent years, data-driven methods have become popular in the field of fault diagnosis and have produced many research results.

Among them, methods based on graph neural networks have recently attracted much attention because they can introduce the geometric relationships between signals into the machine fault diagnosis model. A graph neural network can aggregate the features of a central node and its adjacent nodes into enhanced node features, automatically learn different weights among nodes through an attention mechanism, and aggregate information according to the learned weights, thereby amplifying useful information, suppressing interfering information, and realizing node classification. Zhang et al. converted collected acoustic signals into graphs with a geometric structure and applied a graph convolutional network to diagnose rolling bearing faults. Sun et al. used their proposed multi-channel residual network (MCRN) to extract weak features from the signal, generated graph data of different scales through an autoencoder (AE) graph generation layer, and finally realized fault diagnosis based on a graph convolutional network. Jiang et al. used dynamic time warping to convert raw vibration signals into graph data and applied GAT to the field of fault diagnosis for the first time. Although these methods have been successful, they still have major limitations: traditional fault diagnosis methods based on graph neural networks (GNNs) have difficulty capturing both local and global feature information of the data, most GNN models do not take into account the inherent differences between adjacent nodes, and they handle the limited and noisy fault vibration signals of industrial scenarios poorly.

Disclosure of Invention

In order to solve the technical problems, the invention provides a mechanical fault diagnosis method based on a multi-scale graph attention fusion network, so as to achieve the purpose of obtaining fault diagnosis with higher accuracy under the conditions of unbalanced data and strong noise.

In order to achieve the above purpose, the technical scheme of the invention is as follows:

A mechanical fault diagnosis method based on a multi-scale graph attention fusion network comprises the following steps:

step one, collecting mechanical vibration data, constructing a sample data set, and dividing the data set into a training set and a test set;

step two, constructing the training set and the test set into a training-set family graph and a test-set family graph, respectively, according to the cosine similarity among samples;

step three, inputting the training-set family graph into the constructed multi-scale graph attention fusion network for training to obtain an optimal fault diagnosis model;

step four, inputting the test-set family graph into the optimal fault diagnosis model for testing, and evaluating the performance of the model;

step five, inputting the mechanical vibration data to be diagnosed into the evaluated fault diagnosis model to perform fault diagnosis.

In the above scheme, in the first step, the collected sample signals are constructed into a sample data set together with the corresponding fault type labels.

In the above scheme, the specific method of the second step is as follows:

Step1: normalizing the acquired data;

Step2: converting the normalized time series data into frequency domain data;

Step3: selecting, according to the similarity among samples, the n nodes with the highest similarity to form the ancestor graph G_ancestor of the family graph; then randomly dividing the n nodes into two groups and obtaining a parent graph G_parent for each group by the cosine similarity method; finally, randomly dividing the n/2 nodes of each parent graph G_parent into two groups again and obtaining a child graph G_child for each group by the cosine similarity method; G_ancestor, G_parent and G_child together form the family graph G_clan, which is used as the input of the model.
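The three-level construction above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`build_graph`, `split_in_two`, `build_family_graph`) are hypothetical, the n most similar nodes are assumed to have been selected already, and an edge is drawn whenever the cosine similarity exceeds a threshold of 0, as the detailed description states later.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def build_graph(node_ids, features, threshold=0.0):
    """Connect two nodes when their cosine similarity exceeds the threshold."""
    edges = []
    for a in range(len(node_ids)):
        for b in range(a + 1, len(node_ids)):
            i, j = node_ids[a], node_ids[b]
            if cosine_similarity(features[i], features[j]) > threshold:
                edges.append((i, j))
    return {"nodes": list(node_ids), "edges": edges}


def split_in_two(node_ids, rng):
    """Randomly split a node list into two halves."""
    shuffled = list(node_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def build_family_graph(node_ids, features, seed=0):
    """Return one ancestor graph, two parent graphs and four child graphs."""
    rng = random.Random(seed)
    ancestor = build_graph(node_ids, features)
    parents, children = [], []
    for parent_ids in split_in_two(node_ids, rng):
        parents.append(build_graph(parent_ids, features))
        for child_ids in split_in_two(parent_ids, rng):
            children.append(build_graph(child_ids, features))
    return {"ancestor": ancestor, "parents": parents, "children": children}
```

With n = 20 nodes this yields one ancestor graph of 20 nodes, two parent graphs of 10 nodes and four child graphs of 5 nodes, which together play the role of G_clan.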

In the above scheme, in step three, the multi-scale graph attention fusion network includes an input layer, a multi-scale feature fusion layer and a node classification output layer, where the multi-scale feature fusion layer includes three parallel graph attention modules with different attention scales and one feature fusion module, and the node classification output layer includes two fully connected layers, a LeakyReLU activation layer and a Dropout layer.

In the above scheme, the graph attention module includes a graph attention layer, a BatchNormalization layer, a Dropout layer and a LeakyReLU activation layer.

In the above scheme, the loss function of the multi-scale graph attention fusion network is the cross-entropy loss function, with the formula:

Loss(p,q)＝-∑p(x)log q(x)

where p(x) is the label of the training set and q(x) is the label value predicted by the network.

In the above scheme, the output size of the input layer is 10×512×1, the output size of the multi-scale feature fusion layer is 10×1024×3, the output size of the node classification output layer is 10×512, and the Dropout ratio of the Dropout layers in the graph attention module and the node classification output layer is 0.6.

In the above scheme, the output result of the multi-scale feature fusion layer is expressed as:

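$$h_i' = \Big[\Big(\big\Vert_{h_1=1}^{H_1}\sigma\Big(\sum_{m\in\mathcal{N}_i}\alpha_{im}^{h_1}W^{h_1}x_m\Big)\Big)\,\Big\Vert\,\Big(\big\Vert_{h_2=1}^{H_2}\sigma\Big(\sum_{m\in\mathcal{N}_i}\alpha_{im}^{h_2}W^{h_2}x_m\Big)\Big)\,\Big\Vert\,\Big(\big\Vert_{h_3=1}^{H_3}\sigma\Big(\sum_{m\in\mathcal{N}_i}\alpha_{im}^{h_3}W^{h_3}x_m\Big)\Big)\Big]$$

where $[\,\cdot \Vert \cdot\,]$ denotes the concatenation operator; $\sigma$ is the Sigmoid activation function; $H_1$, $H_2$, $H_3$ indicate that the graph attention layers of the three graph attention modules have $H_1$, $H_2$ and $H_3$ independent attention mechanisms, respectively; $\alpha_{im}^{h_1}$, $\alpha_{im}^{h_2}$, $\alpha_{im}^{h_3}$ are the normalized attention coefficients between node $i$ and its neighbor node $m$ computed under the $h_1$-th, $h_2$-th and $h_3$-th attention mechanisms; $W^{h_1}$, $W^{h_2}$, $W^{h_3}$ are the linear transformation weight matrices corresponding to the normalized attention coefficients $\alpha_{im}^{h_1}$, $\alpha_{im}^{h_2}$, $\alpha_{im}^{h_3}$; $x_m$ is the node feature of the neighbor node $m$; each $(\cdot)$ is the feature representation from one scale; $h_1$, $h_2$, $h_3$ index the $h_1$-th, $h_2$-th and $h_3$-th attention mechanisms of the graph attention layers of the three graph attention modules; and $\mathcal{N}_i$ is the neighborhood of node $i$ in the graph.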
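The multi-scale fusion described above can be illustrated with a small pure-Python sketch. It is written under several assumptions that the text does not confirm: attention scores follow the common GAT form LeakyReLU(aᵀ[Wx_i ‖ Wx_m]) with a slope of 0.2, every neighbor list contains the node itself, weights are random rather than trained, and the BatchNormalization and Dropout layers are omitted. The names `attention_head` and `multi_scale_fusion` are hypothetical; the head counts (4, 8, 16) are the ones given for the embodiment.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    """LeakyReLU; the 0.2 slope is an assumption taken from common GAT practice."""
    return x if x > 0 else slope * x


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_head(features, neighbors, w, a):
    """One attention mechanism: sum over m of alpha_im * (W x_m) for every node i."""
    projected = [[sum(wk * xk for wk, xk in zip(row, x)) for row in w] for x in features]
    outputs = []
    for i, nbrs in enumerate(neighbors):
        # Unnormalised score of neighbour m: LeakyReLU(a^T [W x_i || W x_m]).
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, projected[i] + projected[m])))
            for m in nbrs
        ]
        # Softmax over the neighbourhood gives the normalised coefficients alpha_im.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        outputs.append([
            sum(alpha * projected[m][d] for alpha, m in zip(alphas, nbrs))
            for d in range(len(w))
        ])
    return outputs


def multi_scale_fusion(features, neighbors, head_counts=(4, 8, 16), out_dim=2, seed=0):
    """Concatenate the Sigmoid-activated outputs of all heads at all three scales."""
    rng = random.Random(seed)
    in_dim = len(features[0])
    fused = [[] for _ in features]
    for num_heads in head_counts:  # one graph attention module per scale
        for _ in range(num_heads):
            w = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
            a = [rng.uniform(-0.5, 0.5) for _ in range(2 * out_dim)]
            for i, vec in enumerate(attention_head(features, neighbors, w, a)):
                fused[i].extend(sigmoid(v) for v in vec)  # concatenation
    return fused
```

Each node ends up with (4 + 8 + 16) × `out_dim` fused features, which shows how the three attention scales contribute side by side to one enhanced representation.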

In the above scheme, the output Y of a graph with node features F in the multi-scale graph attention fusion network is expressed as:

Y = FCL_2(FCL_1(Leaky_ReLU(MSFFL_1(F)), Leaky_ReLU(MSFFL_2(F)), Leaky_ReLU(MSFFL_3(F))))

where FCL_1 and FCL_2 are the fully connected layers of the multi-scale graph attention fusion network, Leaky_ReLU is the activation function used by the network, and MSFFL_1, MSFFL_2 and MSFFL_3 are the multi-scale feature fusion layers built for the network.

Through the technical scheme, the mechanical fault diagnosis method based on the multi-scale graph attention fusion network has the following beneficial effects:

1. Existing data-driven graph neural network methods can only represent the geometric relationships between signals and neglect the local and global feature information of graph-structured data. To address this, the invention introduces a family graph: by converting the data samples into a family graph with several information scales, the local and global information of the graph-structured data is represented effectively. This overcomes the inability to express local and global feature information at the same time, better captures the representative features of different fault types, and improves feature extraction.

2. Most existing GNN models do not consider the differences between adjacent nodes. To address this, the invention designs a multi-scale graph attention fusion network (MSGAFN). The designed multi-scale feature fusion layer works together with the family graph and automatically learns the weights of adjacent nodes at several scales to represent their importance to the central node, reflecting the differences between adjacent nodes. The learned node embeddings are therefore more representative and discriminative, the connection between local information and the overall structure is exploited to the greatest extent, and the hidden connections in the network topology are fully mined. This improves the robustness of the model and the fault diagnosis accuracy under unbalanced data and strong noise.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

Fig. 1 is a schematic overall flow diagram of a mechanical fault diagnosis method based on a multi-scale graph attention fusion network according to an embodiment of the present invention;

Fig. 2 is a network structure diagram of the MSGAFN;

Fig. 3 is a flow chart of the family graph construction;

Fig. 4 is a schematic diagram of the graph attention module;

Fig. 5 shows the effect of combinations of attention head counts at different scales on the MSGAFN;

Fig. 6 shows the effect of single-scale and multi-scale graph construction on the model;

Fig. 7 shows the comparative test results on the bearing data;

Fig. 8 shows the comparative test results on the mixed bearing-gear data.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

The invention provides a mechanical fault diagnosis method based on a multiscale graph attention fusion network, which is mainly used for judging whether a bearing and a gear are faulty or not and the type of the fault by detecting collected vibration signals of the bearing and the gear. The overall flow chart of the method is shown in fig. 1.

In order to verify the effect of the MSGAFN-based mechanical fault diagnosis method on bearing and gear fault diagnosis when samples are unbalanced and strong noise is present, the experiments use two data sets: the gearbox data set collected by Southeast University (SEU) and the 12 kHz drive-end data collected by the Case Western Reserve University (CWRU) bearing data center.

The method specifically comprises the following steps:

S1: selecting the 12 kHz drive-end data set collected by the Case Western Reserve University bearing data center and part of the gearbox data set collected by Southeast University.

Specifically, the acquisition of the 12 kHz drive-end data of the Case Western Reserve University bearing data center used in Example 1 comprises the following steps:

Step1: the rolling bearing at the motor drive end is an SKF6205.

Step2: single-point damage is machined into the motor drive-end rolling bearing by electrical discharge machining, on the inner ring and the balls of the rolling bearing (damage diameters of 0.007 inches, 0.014 inches and 0.021 inches) and at the 12 o'clock position of the outer ring.

It should be noted that single-point damage machined by electrical discharge can imitate the damage a bearing suffers under actual working conditions. Common bearing damage points are on the outer race, inner race and balls of a rolling bearing, and the damage diameters typically differ; here the damage diameters of the inner race and balls are set to 0.007 inches, 0.014 inches and 0.021 inches. There are thus 9 different rolling bearing fault types in total. In addition, there is a normal class of rolling bearings without any damage, so there are 10 classes of rolling bearings in total.

Step3: a vibration sensor is mounted above the SKF6205 bearing seat at the motor drive end and connected to a 16-channel data recorder, forming the acquisition system for the rolling bearing vibration signals.

Step4: the same single load is applied to rolling bearings of the same type, and the vibration signal of the motor drive-end rolling bearing is collected by the 16-channel data recorder at a sampling frequency of 12 kHz. In this embodiment, in order to verify that the invention can be applied under different loads, two loads, 1 HP and 2 HP, are used. Under each load, the vibration signals of the 10 classes of rolling bearings are collected.

Step5: for 10 types of rolling bearing vibration signals collected under each load, the collected rolling bearing vibration signals are respectively divided into a series of sample signals according to the set sample signal length, and the sample signals and corresponding fault type labels can be constructed into a sample data set.
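The segmentation in Step5 can be sketched as below. This is an illustrative assumption rather than the authors' procedure: windows are taken without overlap, the default length of 1024 points is the value given for this embodiment, and the function name `segment_signal` and its `max_samples` parameter are hypothetical.

```python
def segment_signal(signal, label, sample_length=1024, max_samples=None):
    """Cut a 1-D vibration record into non-overlapping labelled samples."""
    samples = []
    # Step through the record one full window at a time; a trailing partial
    # window is dropped.
    for start in range(0, len(signal) - sample_length + 1, sample_length):
        if max_samples is not None and len(samples) >= max_samples:
            break
        samples.append((signal[start:start + sample_length], label))
    return samples
```

Calling `segment_signal(record, label, max_samples=80)` for a fault class and `max_samples=100` for the normal class would reproduce the unbalanced per-class counts described for this embodiment, provided each record is long enough.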

It should be noted that in this embodiment each sample signal consists of 1024 vibration points. After segmentation and arrangement, different numbers of samples are set for different classes in order to test an unbalanced-sample scenario: under each load, each of the 9 fault types has 80 sample signals, and the normal rolling bearing has 100 sample signals. To simulate an actual industrial scenario, the sample signals under the 2 loads are mixed, and the same fault type under different loads is treated as one label, forming the data set D_bear. To test the diagnostic performance of the model in a strong-noise environment, Gaussian white noise with signal-to-noise ratios of -5, -6, -7, -8, -9 and -10 dB is added to D_bear to obtain the data sets D_bear-5, D_bear-6, D_bear-7, D_bear-8, D_bear-9 and D_bear-10. To facilitate subsequent model training and testing, D_bear-5, D_bear-6, D_bear-7, D_bear-8, D_bear-9 and D_bear-10 are each divided into a training set and a test set in the same proportion.
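Adding white noise at a prescribed signal-to-noise ratio can be sketched as follows. The source does not say how the noise was generated, so this assumes the usual definition SNR_dB = 10·log10(P_signal / P_noise) with the signal power measured per sample; `add_gaussian_noise` is a hypothetical helper name.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, seed=None):
    """Add white Gaussian noise so that the signal-to-noise ratio is snr_db."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]
```

At -10 dB the noise power is ten times the signal power, which is why the lower end of the tested range is described as a strong-noise scenario.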

It should be noted that, under each load, the 9 fault types yield 720 sample signals in total and the normal rolling bearing yields 100 sample signals; mixing the two loads gives 1640 rolling bearing vibration samples in total, which form D_bear. The division of each data set into training and test sets is shown in Table 1.

TABLE 1

The acquisition of the part of the Southeast University gearbox data set used in Example 2 comprises the following steps:

Step1: steady-state signals are collected from a Drivetrain Dynamic Simulator (DDS). The bearing faults are caused by cracks at positions such as the inner ring, the outer ring and the rolling elements. Gear faults fall into four categories: chipped tooth, missing tooth, root fault and surface fault. The chipped-tooth and root faults are both caused by cracks, but at different positions; the missing-tooth fault is caused by a missing tooth; and the surface fault indicates wear on the gear surface.

The collected steady-state signals contain five health states each for the bearing data and the gear data: four fault states and one healthy state. The gear faults and the bearing faults are combined into a 10-class mixed data set comprising four gear faults, four bearing faults, and one healthy state each for the gears and the bearings.

Step2: the same single load is applied to the bearings and gears, and data are acquired using the Drivetrain Dynamic Simulator (DDS). In this embodiment, in order to verify that the invention can be applied under different loads, two operating conditions, 20HZ-0V and 30HZ-2V, are used. Under each condition, the 10 classes of mixed bearing and gear signals are collected.

Step3: for 10 types of bearing and gear vibration signals collected under each load, the bearing and gear vibration signals are respectively divided into a series of sample signals according to the set sample signal length, and the sample signals and corresponding fault type labels can be constructed into a sample data set.

Specifically, in this embodiment each sample signal consists of 1024 vibration points. After segmentation and arrangement, different numbers of samples are set for different classes in order to test an unbalanced-sample scenario: under each load, each of the 8 bearing and gear fault types has 80 sample signals, and each of the 2 normal classes (bearing and gear) has 100 sample signals. To simulate an actual industrial scenario, the sample signals under the 2 loads are mixed, and the same fault type under different loads is treated as one label, forming the data set D_mix. To test the diagnostic performance of the model in a strong-noise environment, Gaussian white noise with signal-to-noise ratios of 0, -1, -2, -3, -4 and -5 dB is added to D_mix to obtain the data sets D_mix-0, D_mix-1, D_mix-2, D_mix-3, D_mix-4 and D_mix-5. To facilitate subsequent model training and testing, D_mix-0, D_mix-1, D_mix-2, D_mix-3, D_mix-4 and D_mix-5 are each divided into a training set and a test set in the same proportion.

It should be noted that, under each load, the 8 bearing and gear fault types yield 640 sample signals in total and the 2 normal classes yield 200 sample signals; mixing the two loads gives 1680 bearing and gear vibration samples in total, which form D_mix. The division of each data set into training and test sets is shown in Table 2.

TABLE 2

S2: constructing the training set and the test set of each data set into family graphs G_clan-train and G_clan-test, respectively, according to the cosine similarity among samples;

Specifically, the flow chart for constructing the family graph is shown in Fig. 3. When constructing the family graph G_clan, D_bear is used to train the neural network that diagnoses bearings, and D_mix is used to train the neural network that diagnoses mixed gear and bearing faults.

The method of constructing G_clan comprises the following steps:

Step1: normalizing the acquired data;

It should be noted that the Min-Max normalization method is used to normalize the acquired data.

Step2: converting the acquired time series data into frequency domain data;

It should be noted that the time series data are converted into frequency domain data using the FFT (fast Fourier transform).

Step3: selecting, according to the similarity among samples, the 20 nodes with the highest similarity to form the ancestor graph G_ancestor of the family graph; then randomly dividing the 20 nodes into two groups and obtaining a parent graph G_parent for each group by the cosine similarity method. Finally, the nodes of each parent graph are randomly divided into two groups again, and a child graph G_child is obtained for each group by the cosine similarity method. G_ancestor, G_parent and G_child together form G_clan, which is used as the input of the model.

It should be noted that the distance between nodes is estimated by computing the cosine similarity between samples, with the formula $\mathrm{sim}(x_i, x_j) = \dfrac{x_i \cdot x_j}{\lVert x_i \rVert \, \lVert x_j \rVert}$, where $x_i$ and $x_j$ are the feature vectors of samples $i$ and $j$, and $\varepsilon$ is the selected radius length. The neighbor set of each node is obtained from these similarities, with the threshold defined as 0. If the similarity between two nodes is greater than the threshold, there is an edge $e_i$ between them.
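The preprocessing and edge rule described in Step1 to Step3 can be sketched end to end as below. This is a minimal standard-library illustration, not the authors' code: a naive O(N²) discrete Fourier transform stands in for the FFT (a numerical library would normally be used), keeping only the first N/2 magnitude bins is an assumption, and the helper names are hypothetical. The threshold of 0 is the value stated above.

```python
import cmath
import math


def min_max_normalize(x):
    """Scale a sequence linearly into [0, 1]."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0 for _ in x]  # assumption: a constant signal maps to zeros
    return [(v - lo) / (hi - lo) for v in x]


def magnitude_spectrum(x):
    """Magnitudes of the first len(x)//2 DFT bins (naive O(N^2) transform)."""
    n = len(x)
    spectrum = []
    for k in range(n // 2):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff))
    return spectrum


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def similarity_edges(features, threshold=0.0):
    """Undirected edges (i, j) whose cosine similarity exceeds the threshold."""
    edges = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine_similarity(features[i], features[j]) > threshold:
                edges.append((i, j))
    return edges
```

A node feature would then be `magnitude_spectrum(min_max_normalize(sample))`, and `similarity_edges` applied to the features of a node group gives the edge set of the corresponding ancestor, parent or child graph.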

S3: inputting G_clan-train into the MSGAFN network model for training, and obtaining the optimal rolling bearing fault diagnosis model once the accuracy requirement is met.

Specifically, the construction of the MSGAFN network model includes the following steps:

Step1: designing the multi-scale feature fusion layer, which takes the constructed G_clan as input and performs multi-scale feature extraction and feature fusion. The multi-scale feature fusion layer comprises three parallel graph attention modules with different attention scales and a feature fusion module. The layer performs multi-scale feature extraction on the input G_clan: the weights of the output features at the three scales are obtained through the graph attention mechanism and multiplied by the output features at each scale to obtain more representative features, denoted Fea_rep1, Fea_rep2 and Fea_rep3. The LeakyReLU activation function then applies a nonlinear feature transformation to Fea_rep1, Fea_rep2 and Fea_rep3, after which the feature fusion module fuses the representative features into an enhanced feature representation, denoted Fea_enhance, which serves as the input of the subsequent network. The output of the node multi-scale feature fusion is expressed as:

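$$h_i' = \Big[\Big(\big\Vert_{h_1=1}^{H_1}\sigma\Big(\sum_{m\in\mathcal{N}_i}\alpha_{im}^{h_1}W^{h_1}x_m\Big)\Big)\,\Big\Vert\,\Big(\big\Vert_{h_2=1}^{H_2}\sigma\Big(\sum_{m\in\mathcal{N}_i}\alpha_{im}^{h_2}W^{h_2}x_m\Big)\Big)\,\Big\Vert\,\Big(\big\Vert_{h_3=1}^{H_3}\sigma\Big(\sum_{m\in\mathcal{N}_i}\alpha_{im}^{h_3}W^{h_3}x_m\Big)\Big)\Big]$$

where $[\,\cdot \Vert \cdot\,]$ denotes the concatenation operator; $\sigma$ is the Sigmoid activation function; $H_1$, $H_2$, $H_3$ indicate that the graph attention layers of the three graph attention modules have $H_1$, $H_2$ and $H_3$ independent attention mechanisms, respectively; $\alpha_{im}^{h_1}$, $\alpha_{im}^{h_2}$, $\alpha_{im}^{h_3}$ are the normalized attention coefficients between node $i$ and its neighbor node $m$ computed under the $h_1$-th, $h_2$-th and $h_3$-th attention mechanisms; $W^{h_1}$, $W^{h_2}$, $W^{h_3}$ are the linear transformation weight matrices corresponding to the normalized attention coefficients $\alpha_{im}^{h_1}$, $\alpha_{im}^{h_2}$, $\alpha_{im}^{h_3}$; $x_m$ is the node feature of the neighbor node $m$; each $(\cdot)$ is the feature representation from one scale; $h_1$, $h_2$, $h_3$ index the $h_1$-th, $h_2$-th and $h_3$-th attention mechanisms of the graph attention layers of the three graph attention modules; and $\mathcal{N}_i$ is the neighborhood of node $i$ in the graph.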

It should be noted that the numbers of attention heads at the three scales of the multi-scale feature fusion layer, H_1, H_2 and H_3, are set to 4, 8 and 16, respectively. The graph attention module comprises a graph attention layer, a BatchNormalization layer, a Dropout layer and a LeakyReLU activation layer. As shown in Fig. 4, the graph attention layer computes the attention weights between nodes to determine the importance of different nodes; the BatchNormalization layer accelerates training convergence, suppresses overfitting and improves model performance; the Dropout layer reduces overfitting by randomly discarding neurons and improves the robustness of the model; and the LeakyReLU activation layer accelerates the convergence of the model and enhances its generalization ability. The feature fusion module concatenates the outputs of the three graph attention modules to aggregate the multi-scale features.

Step2: designing the node classification output layer, which comprises two fully connected layers, a LeakyReLU activation layer and a Dropout layer, and setting the loss function to complete the construction of the multi-scale graph attention fusion network.

It should be noted that, the loss function selects a cross entropy (Cross Entropy Loss) loss function, and the formula is:

Loss(p,q)＝-∑p(x)log q(x)

where p(x) is the label of the training set and q(x) is the label value predicted by the network; the Dropout ratio of the Dropout layer is 0.6.
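A short worked example of the loss formula above, under the assumptions that p is a one-hot label vector, q is a predicted probability vector (for instance the softmax of the network output), and the logarithm is the natural logarithm. The `eps` clamp is an added numerical safeguard and not part of the stated formula.

```python
import math


def softmax(logits):
    """Turn raw network outputs into a probability vector."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(p, q, eps=1e-12):
    """Loss(p, q) = -sum over x of p(x) * log q(x)."""
    return -sum(px * math.log(max(qx, eps)) for px, qx in zip(p, q))
```

For a three-class case with label p = [0, 1, 0] and prediction q = [0.2, 0.7, 0.1], only the true-class term survives, so the loss is -ln(0.7) ≈ 0.357; the loss shrinks toward 0 as the predicted probability of the true class approaches 1.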

It should be noted that, the MSGAFN network model provided by the invention can be used for carrying out fault diagnosis of the rolling bearing under the industrial scene of limited fault vibration signals and strong noise, can also be used for carrying out mixed fault diagnosis of the bearing and the gear, and has higher robustness and generalization. The structure of the network is described in detail below.

As shown in Fig. 2, the MSGAFN network structure of the invention consists, in sequence, of an input layer, a multi-scale feature fusion layer and a node classification output layer. The input of the model input layer is G_clan, and the output of the model output layer is the diagnosed fault type.

In the MSGAFN network model, specific network parameters of each layer can be determined through optimization, and the parameters of each layer are determined as follows through optimization in the invention:

The output size of the model input layer is 10×512×1, the output size of the multi-scale feature fusion layer is 10×1024×3, the output size of the node classification output layer is 10×512, and the Dropout ratio of the Dropout layers in the graph attention module and the node classification output layer is 0.6.

It should be noted that, when the pre-built MSGAFN network model is trained with the sample data set, the data set may be divided into a training set and a test set in the conventional manner; the training set is fed to the model to optimize its parameters, the result is then verified on the test set, and the optimal model parameters are obtained after repeated training and optimization.

Specifically, the model training specifically comprises the following steps:

Step1: adjusting the network structure and parameters of the MSGAFN network model according to the training-set results in S3;

Step2: repeating the previous step to obtain the optimal MSGAFN-based mechanical fault diagnosis model.

It should be noted that, during model training, the models for D_bear and D_mix are trained independently, and each yields its own optimal MSGAFN-based fault diagnosis model.

Here, after repeated experiments, the specific network parameters of each layer are obtained as follows: the output size of the model input layer is 10×512×1, the output size of the multi-scale feature fusion layer is 10×1024×3, the output size of the node classification output layer is 10×512, and the Dropout ratio of the Dropout layers in the graph attention module and the node classification output layer is 0.6. The output Y of a graph with node features F in the MSGAFN network structure can be expressed as:

Y = FCL_2(FCL_1(Leaky_ReLU(MSFFL_1(F)), Leaky_ReLU(MSFFL_2(F)), Leaky_ReLU(MSFFL_3(F))))

where FCL_1 and FCL_2 are the fully connected layers of the MSGAFN network model, Leaky_ReLU is the activation function used by the MSGAFN network model, and MSFFL_1, MSFFL_2 and MSFFL_3 are the multi-scale feature fusion layers built for the MSGAFN network model.

S4: the test set divided before is input into the optimal MSGAFN network model for testing, and the performance of the model is evaluated, wherein the evaluation is divided into two forms, and the specific steps are as follows:

Step1: for each of D_bear-5, D_bear-6, D_bear-7, D_bear-8, D_bear-9 and D_bear-10, training the MSGAFN network model with the G_clan-train constructed according to S2 and evaluating its performance with the corresponding G_clan-test;

Step2: for each of D_mix-0, D_mix-1, D_mix-2, D_mix-3, D_mix-4 and D_mix-5, training the MSGAFN network model with the G_clan-train constructed according to S2 and evaluating its performance with the corresponding G_clan-test;

Step3: comparing the experimental results obtained in Step1 and Step2 with those of other methods, which shows that the method provided by the invention performs better. The specific results are as follows:

(1) Network parameter selection

Different numbers of attention heads focus on different features. To study the influence of the attention head counts at different scales on the MSGAFN network model, comparative experiments were run on the D_mix-5 data. The resulting accuracy and training-time curves are shown in Fig. 5. As the attention head counts increase, the accuracy of MSGAFN gradually increases. However, too many heads cause excessive information fusion, which harms the accuracy and training efficiency of the model. To preserve training efficiency, 4-8-16 was finally selected as the combination of attention head counts.

(2) Performance comparison

In practical industrial applications, the numbers of faulty and normal samples of mechanical equipment are unbalanced, and there is a great deal of noise interference. Therefore, a family graph containing multi-scale information is constructed so that both the local and the global information of the graph data are fully expressed and the generalization ability of the network model is enhanced. The effect of single-scale versus multi-scale graph construction on the accuracy of the MSGAFN network model is shown in fig. 6.
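The multi-scale family graph (one ancestor graph, two parent graphs and four child graphs, each built from cosine similarity among samples) can be sketched as follows. This is a minimal sketch under stated assumptions: all given samples are taken as the n ancestor nodes, and each node is connected to its k most similar nodes within its group, an edge rule that is assumed here rather than taken from this description. The names `similarity_graph`, `split_in_two` and `build_family_graph` and the parameter `k` are hypothetical.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(features, node_ids, k):
    """Undirected edges linking each node to its k most similar nodes in node_ids.

    The top-k rule is an assumption made for this sketch.
    """
    edges = set()
    for i in node_ids:
        others = [j for j in node_ids if j != i]
        others.sort(key=lambda j: cosine_similarity(features[i], features[j]),
                    reverse=True)
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def split_in_two(node_ids, rng):
    """Randomly divide the nodes into two groups of (nearly) equal size."""
    shuffled = list(node_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def build_family_graph(features, k=2, seed=0):
    """Build the ancestor, parent and child graphs that make up G_clan."""
    rng = random.Random(seed)
    ancestor_nodes = list(range(len(features)))
    parents = split_in_two(ancestor_nodes, rng)
    children = [group for p in parents for group in split_in_two(p, rng)]
    return {
        "ancestor": [similarity_graph(features, ancestor_nodes, k)],
        "parent": [similarity_graph(features, p, k) for p in parents],
        "child": [similarity_graph(features, c, k) for c in children],
    }
```

Calling `build_family_graph` on the frequency-domain features of a batch of samples returns edge sets at three scales, which together play the role of G_clan as the model input.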

The MSGAFN network model was tested on the bearing data and on the mixed bearing-gear data under different signal-to-noise ratio scenarios, and compared with GCN, ChebyNet, MRF_GCN and MHGAT. The comparison results on the bearing data of the CWRU dataset are shown in Table 3 and fig. 7, and the comparison results on the mixed bearing-gear data of the SEU dataset are shown in Table 4 and fig. 8.

Table 3

Table 4

Experimental results show that the MSGAFN network structure provided by the invention achieves higher test accuracy than the comparison methods on both the CWRU dataset and the SEU dataset. Even in a strong-noise scenario with a signal-to-noise ratio of -10 dB, the accuracy still reaches 84.83%.
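To give a sense of what a signal-to-noise ratio of -10 dB means, the following minimal sketch corrupts a vibration signal at a chosen SNR. It assumes additive white Gaussian noise and the usual definition SNR_dB = 10·log10(P_signal / P_noise); the noise model used in the experiments is not restated in this passage, and the function names `add_gaussian_noise` and `measured_snr_db` are hypothetical.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng):
    """Add white Gaussian noise so that the result has roughly the given SNR (dB)."""
    signal_power = sum(v * v for v in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of a noisy signal relative to its clean version."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(20000)]
noisy = add_gaussian_noise(clean, -10.0, rng)
```

At -10 dB the noise power is ten times the signal power, so the fault signature is largely buried in noise.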

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A mechanical fault diagnosis method based on a multiscale graph attention fusion network is characterized by comprising the following steps:

2. The mechanical fault diagnosis method based on the multi-scale graph attention fusion network according to claim 1, wherein in step one, the collected sample signals, together with their corresponding fault type labels, are constructed into a sample dataset.

3. The mechanical fault diagnosis method based on the multi-scale graph attention fusion network according to claim 1, wherein the specific method of the second step is as follows:

step1: normalizing the acquired data;

step2: converting the normalized time series data into frequency domain data;

step3: first, according to the similarity among samples, selecting the n nodes with the highest similarity as the ancestor graph G_ancestor of the family graph; secondly, randomly dividing the n nodes into two groups and obtaining a parent graph G_parent for each group by using the cosine similarity method; finally, randomly dividing the n/2 nodes of each parent graph G_parent into two groups again and obtaining a child graph G_child for each group by using the cosine similarity method; G_ancestor, G_parent and G_child together form the family graph G_clan, which is used as the input to the model.

4. The mechanical fault diagnosis method based on the multi-scale graph attention fusion network according to claim 1, wherein in step three, the multi-scale graph attention fusion network comprises an input layer, a multi-scale feature fusion layer and a node classification output layer; the multi-scale feature fusion layer comprises three parallel graph attention modules with different attention scales and one feature fusion module; and the node classification output layer comprises two fully connected layers, a LeakyReLU activation layer and a Dropout layer.

5. The mechanical fault diagnosis method based on the multi-scale graph attention fusion network according to claim 4, wherein the graph attention module comprises a graph attention layer, a batch normalization layer, a Dropout layer and a LeakyReLU activation layer.

6. The mechanical fault diagnosis method based on the multi-scale graph attention fusion network according to claim 1 or 4, wherein the loss function of the multi-scale graph attention fusion network is a cross-entropy loss function given by the following formula:

$$\mathrm{Loss}(p, q) = -\sum_{x} p(x) \log q(x)$$

7. The mechanical fault diagnosis method based on the multi-scale graph attention fusion network according to claim 5, wherein the output size of the input layer is 10×512×1, the output size of the multi-scale feature fusion layer is 10×1024×3, the output size of the node classification output layer is 10×512, and the Dropout ratio in the graph attention module and the node classification output layer is 0.6.

8. The method for diagnosing a mechanical failure based on a multi-scale graph attention fusion network according to claim 5, wherein the output result of the multi-scale feature fusion layer is expressed as:

9. The method for diagnosing a mechanical failure based on a multi-scale graph attention fusion network according to claim 5, wherein the output Y of the multi-scale graph attention fusion network is represented as:

$$Y = \mathrm{FCL}_2\big(\mathrm{FCL}_1\big(\mathrm{Leaky\_ReLU}(\mathrm{MSFFL}_1(F)),\ \mathrm{Leaky\_ReLU}(\mathrm{MSFFL}_2(F)),\ \mathrm{Leaky\_ReLU}(\mathrm{MSFFL}_3(F))\big)\big)$$