CN117218508A

CN117218508A - Ball screw fault diagnosis method based on channel parallel fusion multi-attention mechanism

Info

Publication number: CN117218508A
Application number: CN202311080811.2A
Authority: CN
Inventors: 王�华; 赵函; 洪荣晶
Original assignee: Nanjing Tech University
Current assignee: Nanjing Tech University
Priority date: 2023-08-25
Filing date: 2023-08-25
Publication date: 2023-12-12

Abstract

A ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism comprises the following steps: step S1, preprocessing ball screw acceleration data with a Gramian angular field to convert it into two-dimensional data, thereby establishing a ball screw data set; step S2, inputting the preprocessed ball screw data set to an image blocking module, which divides each input image into small sub-images of equal size; step S3, establishing a ball screw feature extraction module based on the channel parallel fusion multi-attention mechanism and obtaining feature maps in four stages; and step S4, after the four stages of computation, applying a global average pooling layer to the output features and using a SoftMax classifier to classify ball screw faults, back-propagating gradients according to a classification loss function to update the parameters of the ball screw fault diagnosis model, and saving the optimized diagnosis model once training and testing are complete.

Description

Ball screw fault diagnosis method based on channel parallel fusion multi-attention mechanism

Technical Field

The invention relates to the field of mechanical equipment fault diagnosis, in particular to a ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism.

Background

The ball screw pair is a drive mechanism that converts rotary motion into linear motion. It offers high efficiency, high precision and long service life, and is widely used in numerical control machine tools, servo drives, precision instruments and similar fields. The ball screw pair is still the main transmission device in numerical control machine tools and holds an important position in servo transmission. However, when a ball screw pair works for long periods under severe conditions such as high speed, heavy load and cyclic start-stop, wear, pitting and similar damage inevitably develop at the raceway contact interface. This seriously threatens the machining quality and operating safety of the equipment; a fault can cause serious equipment accidents or even personal injury, together with large economic losses. The reliability and safety of the ball screw pair must therefore be considered, and fault diagnosis techniques for the ball screw pair need to be studied in order to reduce serious equipment accidents caused by ball screw pair faults.

A growing body of research builds on the Transformer model, which has achieved good results in computer vision and object detection. Its application to mechanical fault diagnosis is still relatively limited, however, and the Transformer model still suffers from poor classification and recognition capability and insufficient local feature extraction. The Swin Transformer addresses these problems and achieves good results on classification, detection and segmentation tasks. To obtain a higher fault recognition rate, deep neural network models have evolved toward larger receptive fields, but the importance of local information is sometimes neglected. Combining classical, efficient convolutional neural network models with models from the computer vision field therefore has high research value. The widely used channel shuffle operation can be used to fuse multiple attention mechanisms, so that local information extraction is added while the model attends to global information.

To solve the above problems, the invention combines the advantages of channel parallelism and multiple attention mechanisms to optimize the commonly used Swin Transformer module. A module based on the channel parallel fusion multi-attention mechanism is designed to strengthen the model's local perception and its ability to integrate global information, thereby improving classification accuracy.

Disclosure of Invention

In view of the shortcomings of the prior art and the need to improve on it, the invention provides a ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism, which addresses poor model classification and recognition capability and insufficient local feature extraction.

To achieve the above purpose, the invention provides the following technical solution, which comprises the following steps:

step S1, preprocessing ball screw acceleration data with a Gramian angular field to convert it into two-dimensional data, thereby establishing a ball screw data set;

step S2, inputting the preprocessed ball screw data set to an image blocking module, which divides each input image into small sub-images of equal size;

step S3, constructing a ball screw feature extraction module based on the channel parallel fusion multi-attention mechanism and obtaining feature maps in four stages;

step S4, after the four stages of computation, applying a global average pooling layer to the output features and using a SoftMax classifier to classify ball screw faults, back-propagating gradients according to a classification loss function to update the parameters of the ball screw fault diagnosis model, and saving the optimized diagnosis model once training and testing are complete.

In the step S1, the ball screw acceleration data is obtained by performing dynamic simulation on the ball screw through dynamic simulation software.

When the ball screw data set is preprocessed in step S1, the ball screw acceleration data is first transformed by the Gramian angular field to obtain two-dimensional image data.

Further, the Gramian angular field converts a one-dimensional time series from the Cartesian coordinate system to a polar coordinate representation and then uses trigonometric functions to generate a Gramian matrix. First, the time series in the Cartesian coordinate system is scaled to the interval [0,1] or [-1,1]. Next, the scaled Cartesian sequence is converted into a polar-coordinate time series using the coordinate conversion formula. Finally, a trigonometric function is applied to the sum or difference of angles: using the cosine of the sum of two angles gives the Gramian angular summation field matrix, and using the cosine of the difference of two angles gives the Gramian angular difference field matrix.
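
The three steps above can be written compactly as follows. This is a sketch based on the commonly cited Gramian angular field definitions; the symbols $x_i$, $\tilde{x}_i$, $\phi_i$, $r_i$, $t_i$ and $N$ are notation introduced here for illustration and do not appear in the original text. Note that the difference field is commonly defined with the sine of the angle difference, whereas the text above describes it through the cosine of the difference.

```latex
\begin{align}
\tilde{x}_i &= \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \in [-1, 1]
  && \text{(scaling)} \\
\phi_i &= \arccos(\tilde{x}_i), \qquad r_i = \frac{t_i}{N}
  && \text{(polar coordinates)} \\
\mathrm{GASF}_{ij} &= \cos(\phi_i + \phi_j)
  && \text{(angular summation field)} \\
\mathrm{GADF}_{ij} &= \sin(\phi_i - \phi_j)
  && \text{(angular difference field, common definition)}
\end{align}
```

Here $X = \{x_1, \dots, x_N\}$ is the one-dimensional acceleration sequence and $t_i$ is the time index of sample $x_i$.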

In step S2, the image blocking module divides the input image into non-overlapping small sub-images of size 4×4.

In step S3, the ball screw feature extraction module comprises four stages. The first stage consists of a linear embedding module and a channel parallel fusion multi-attention mechanism module; each of the following three stages consists of an image merging module and a channel parallel fusion multi-attention mechanism module. Within the channel parallel fusion multi-attention mechanism module, the data are fed into two branches by channel splitting: the left branch performs shallow convolution operations to extract local information efficiently, and the right branch attends to global information and cross-window information transfer through a Swin Transformer module, wherein:

Linear embedding: projects the original features to an arbitrary dimension to obtain the feature vector corresponding to each sub-image;

Channel parallel fusion multi-attention mechanism module: the channels of the input feature map are first split into two branches, each with 1/2 of the channels. One branch performs shallow convolution operations with the same number of input and output channels, using three convolutions with a stride of 1: the two 1×1 convolutions are ordinary convolutions, and the 3×3 convolution is the depthwise convolution of a depthwise separable convolution. The other branch extracts global information through a Swin Transformer module. After feature extraction, the two branches are concatenated so that the channel counts add up and the features are fused. Finally, channel shuffle exchanges information between the different groups so that the channels are fully mixed and local and global features are fused efficiently and accurately;

Swin Transformer module: a normalization layer first stabilizes model training, and a residual connection is added. The feature map is input to a window multi-head self-attention module, which partitions the feature map into windows of size 4×4 and computes self-attention separately within each window. The self-attention result is input to a layer normalization module, and the normalized result is input to a multi-layer perceptron module, which applies a nonlinear transformation. The resulting features are input to a shifted-window multi-head self-attention module, which shifts the windows and computes self-attention within the shifted windows. That result is again input to a layer normalization module and then to a multi-layer perceptron module for a nonlinear transformation, and the new features are output.

Multi-layer perceptron module: the multi-layer perceptron comprises an input layer, a hidden layer and an output layer, with adjacent layers fully connected. The input layer receives the input data, the hidden layer learns feature representations, and the output layer produces the final prediction. Each neuron of the hidden layer and the output layer has an activation function that introduces a nonlinear mapping.

Image merging: performs downsampling to obtain multi-scale feature information, reducing the resolution and adjusting the number of channels.

In step S4, after the four stages of computation, a global average pooling layer is applied to the output features, and ball screw fault classification is then performed by a SoftMax classifier.

Gradients are back-propagated according to the classification loss function to update the parameters of the ball screw fault diagnosis model, yielding an optimized ball screw fault diagnosis model. A ball screw image to be diagnosed is then input into the trained and optimized model, which outputs the fault diagnosis result for the ball screw.

Compared with the prior art, the invention has the following advantages and beneficial effects:

The invention provides a fault diagnosis method for a ball screw pair. Important features are extracted from dynamically simulated acceleration signals of a vibration sensor on the ball screw pair, the extracted features are mapped to fault categories by the algorithm, and the model is optimized by back-propagating gradients of the loss function, which improves the stability and accuracy of prediction.

The invention adds channel parallelism to each detection layer of the model, which strengthens the network's local perception, helps extract local features and integrate global information, and overcomes the limitation that the conventional Swin Transformer module is not good at extracting local features. Even with computation and parameter counts that are tens of times smaller, the final classification accuracy is almost unchanged and remains far higher than that of traditional classification models.

Drawings

FIG. 1 is a general flow chart of the ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism.

FIG. 2 is an architecture diagram of the channel parallel fusion multi-attention mechanism module.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

As shown in FIG. 1, a ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism includes the following steps:

Step S1: the ball screw acceleration data is preprocessed with a Gramian angular field, which converts the one-dimensional time series into a two-dimensional image, and a ball screw data set is established.

The invention takes ball screw acceleration data as input. The data are obtained by dynamic simulation of the ball screw in dynamic simulation software. Because the original signal is a time-domain signal, it must be converted into a two-dimensional image through the Gramian angular field.

Further, in the image obtained from the Gramian angular field, time runs from the upper left corner to the lower right corner. The time step and amplitude of the time series are first transformed to polar coordinates to obtain the radius and angle; the trigonometric function values and the correlation between each pair of points are then computed, and the trigonometric transformation forms the Gramian matrix.
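
A minimal sketch of this transformation in plain Python may help make the procedure concrete. The function names, the sample values and the choice of the [-1, 1] scaling interval are assumptions made for illustration. The summation field uses the cosine of the angle sum as described in the text; the difference field uses the sine of the angle difference, which is the commonly cited definition.

```python
import math


def rescale(series):
    """Scale a 1-D sequence to the interval [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [(2 * x - hi - lo) / (hi - lo) for x in series]


def gramian_angular_field(series, method="summation"):
    """Return the Gramian angular field of a 1-D sequence as a nested list."""
    # Clamping guards against rounding slightly outside [-1, 1] before acos.
    scaled = [max(-1.0, min(1.0, x)) for x in rescale(series)]
    phi = [math.acos(x) for x in scaled]  # polar angle of every sample
    if method == "summation":
        return [[math.cos(a + b) for b in phi] for a in phi]
    if method == "difference":
        return [[math.sin(a - b) for b in phi] for a in phi]
    raise ValueError("method must be 'summation' or 'difference'")


# Hypothetical acceleration samples, used only to show the call.
signal = [0.0, 0.8, -0.3, 1.2, 0.1, -0.9]
image = gramian_angular_field(signal)
print(len(image), len(image[0]))  # 6 6
```

A sequence of N samples yields an N×N matrix, so the main diagonal runs from the first sample in the upper left corner to the last sample in the lower right corner.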

Step S2: the preprocessed ball screw data set is input to an image blocking module, which divides each input image into small sub-images of equal size.

The image blocking module divides the w (width) × h (height) × 3 input image into non-overlapping small sub-images, whose features are set to the raw pixel values of the image. In the Swin Transformer, a sub-image size of 4×4 is used, so the feature dimension of each sub-image is 4×4×3=48 and the image is divided into (w/4)×(h/4) sub-images.
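
The arithmetic above can be checked with a short sketch. The function name and the nested-list image layout (rows, then columns, then channels) are assumptions chosen for illustration.

```python
def partition_patches(image, patch=4):
    """Split an H x W x C image (nested lists) into non-overlapping patches.

    Returns a list of flattened patch vectors in row-major patch order.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            vector = []
            for row in image[top:top + patch]:
                for pixel in row[left:left + patch]:
                    vector.extend(pixel)  # append the channel values
            patches.append(vector)
    return patches


# Hypothetical 8 x 8 three-channel image filled with zeros.
image = [[[0, 0, 0] for _ in range(8)] for _ in range(8)]
patches = partition_patches(image)
print(len(patches), len(patches[0]))  # 4 48
```

For the 8×8 example, (8/4)×(8/4) = 4 sub-images are produced, each with 4×4×3 = 48 features.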

Step S3: a ball screw feature extraction module based on the channel parallel fusion multi-attention mechanism is established, and feature maps are obtained in four stages.

Step S3.1: linear embedding is applied to the original features to project them to an arbitrary dimension, and the sub-images are then input to the channel parallel fusion multi-attention mechanism module for feature extraction. To produce a multi-scale feature representation, the total number of sub-images is reduced by image merging as the network deepens.
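
As an illustration of the linear embedding in step S3.1, the sketch below projects flattened 48-dimensional sub-image vectors to another dimension with a single matrix multiplication. The embedding size of 96, the random weights and the function names are assumptions; in a trained model the weights would be learned.

```python
import random


def make_projection(in_dim, out_dim, seed=0):
    """Create a hypothetical random weight matrix and a zero bias."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear_embedding(patches, weights, bias):
    """Project every flattened patch vector to the embedding dimension."""
    return [
        [sum(w * x for w, x in zip(row, patch)) + b
         for row, b in zip(weights, bias)]
        for patch in patches
    ]


patches = [[1.0] * 48 for _ in range(4)]   # four flattened 4 x 4 x 3 patches
weights, bias = make_projection(48, 96)    # 96 is an assumed embedding size
tokens = linear_embedding(patches, weights, bias)
print(len(tokens), len(tokens[0]))  # 4 96
```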

Step S3.2: the features of each group of 2×2 adjacent sub-images are fused and merged along the feature dimension, which reduces the total number of sub-images by a factor of 4. The result is then input to the channel parallel fusion multi-attention mechanism module for feature extraction, at a resolution of (w/8)×(h/8).
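
The merging in step S3.2 can be sketched as a concatenation of each 2×2 neighbourhood. The function name and the concatenation order are assumptions. The text also states that the number of channels is adjusted after merging; that projection is omitted here.

```python
def merge_patches(grid):
    """Fuse each 2 x 2 group of neighbouring feature vectors into one.

    `grid` is a nested list indexed as grid[row][col] -> feature vector.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid sides must be even")
    merged = []
    for r in range(0, rows, 2):
        merged_row = []
        for c in range(0, cols, 2):
            # Concatenation order is an assumption of this sketch.
            merged_row.append(grid[r][c] + grid[r + 1][c]
                              + grid[r][c + 1] + grid[r + 1][c + 1])
        merged.append(merged_row)
    return merged


# Hypothetical 4 x 4 grid of 2-dimensional features.
grid = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
merged = merge_patches(grid)
print(len(merged), len(merged[0]), len(merged[0][0]))  # 2 2 8
```

The 16 input sub-images become 4, a reduction by a factor of 4, while the feature length grows from 2 to 8.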

In the channel parallel fusion multi-attention mechanism, the data are fed into two branches by channel splitting: the left branch performs shallow convolution operations to extract local information efficiently, and the right branch attends to global information and cross-window information transfer through a Swin Transformer module, wherein:

Linear embedding: projects the original features to an arbitrary dimension to obtain the feature vector corresponding to each sub-image.

Channel parallel fusion multi-attention mechanism module: the channels of the input feature map are first split into two branches, each with 1/2 of the channels. One branch performs shallow convolution operations with the same number of input and output channels, using three convolutions with a stride of 1: the two 1×1 convolutions are ordinary convolutions, and the 3×3 convolution is the depthwise convolution of a depthwise separable convolution. The other branch extracts global information through a Swin Transformer module. After feature extraction, the two branches are concatenated so that the channel counts add up and the features are fused. Finally, channel shuffle exchanges information between the different groups so that the channels are fully mixed and local and global features are fused efficiently and accurately.

Swin Transformer: a normalization layer first stabilizes model training, and a residual connection is added. The feature map is input to a window multi-head self-attention module, which partitions the feature map into windows of size 4×4 and computes self-attention separately within each window. The self-attention result is input to a layer normalization module for normalization, and the normalized result is input to a multi-layer perceptron module, which applies a nonlinear transformation. The resulting features are input to a shifted-window multi-head self-attention module, which shifts the windows and computes self-attention within the shifted windows. That result is again input to a layer normalization module and then to a multi-layer perceptron module for a nonlinear transformation, and the new features are output.
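
The data flow of the channel parallel fusion multi-attention mechanism module described above (split, two branches, concatenation, channel shuffle) can be sketched as follows. Channels are represented by labels, and identity functions stand in for the convolution branch and the Swin Transformer branch; all function names are assumptions for illustration.

```python
def channel_split(channels):
    """Split the channel list into two halves."""
    half = len(channels) // 2
    return channels[:half], channels[half:]


def channel_shuffle(channels, groups=2):
    """Interleave channels so that every group contributes in turn."""
    if len(channels) % groups:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]


def parallel_fusion(channels, local_branch, global_branch):
    """Split, run both branches, concatenate, then shuffle."""
    left, right = channel_split(channels)
    fused = local_branch(left) + global_branch(right)  # channel concatenation
    return channel_shuffle(fused, groups=2)


# Channels are represented by labels; identity functions stand in for the
# convolution branch and the Swin Transformer branch.
labels = ["c0", "c1", "c2", "c3", "s0", "s1", "s2", "s3"]
print(parallel_fusion(labels, lambda x: x, lambda x: x))
# ['c0', 's0', 'c1', 's1', 'c2', 's2', 'c3', 's3']
```

After the shuffle, channels from the local branch and the global branch alternate, so any later grouped operation sees features from both branches.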

Step S3.3: a window multi-head self-attention module and a shifted-window multi-head self-attention module are used alternately in two consecutive Swin Transformer modules.
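
The alternation in step S3.3 can be sketched as two residual blocks applied in sequence. The skeleton below assumes the pre-normalization layout of the published Swin Transformer design (normalize, apply the sub-layer, add the residual). The attention and perceptron sub-layers are passed in as placeholder callables, the layer normalization has no learnable scale or shift, and a single perceptron is shared by both blocks for brevity. All names are hypothetical.

```python
import math


def layer_norm(vector, eps=1e-5):
    """Normalize one token vector to zero mean and unit variance."""
    mean = sum(vector) / len(vector)
    var = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(var + eps) for v in vector]


def residual_block(tokens, attention, mlp):
    """Apply norm -> attention -> residual, then norm -> MLP -> residual."""
    normed = [layer_norm(t) for t in tokens]
    tokens = [[a + b for a, b in zip(t, u)]
              for t, u in zip(tokens, attention(normed))]
    normed = [layer_norm(t) for t in tokens]
    return [[a + b for a, b in zip(t, u)]
            for t, u in zip(tokens, mlp(normed))]


def swin_pair(tokens, window_attention, shifted_attention, mlp):
    """Two consecutive blocks: regular windows first, shifted windows second."""
    tokens = residual_block(tokens, window_attention, mlp)
    return residual_block(tokens, shifted_attention, mlp)


# Placeholder sub-layers that return zeros, so only the residual path remains.
zeros = lambda batch: [[0.0] * len(t) for t in batch]
tokens = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(swin_pair(tokens, zeros, zeros, zeros) == tokens)  # True
```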

The 8×8 feature map in the Swin Transformer module of the previous layer is divided into 2×2 windows, each of size 4×4. The window positions are then shifted in the Swin Transformer module of the next layer, giving 3×3 non-aligned windows. This shifted-window partition connects adjacent non-overlapping windows of the previous layer and greatly enlarges the receptive field.
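
The 2×2 and 3×3 window counts for the 8×8 example can be reproduced by listing the window boundaries along each axis. The shift of half a window (2 positions) follows the published Swin Transformer design and is an assumption here, since the text only states that the windows are moved. The function names are hypothetical.

```python
def window_spans(size, window, shift=0):
    """Return (start, end) spans of the windows along one axis."""
    cuts = {0, size}
    cuts.update(c for c in range(shift, size, window) if 0 < c < size)
    ordered = sorted(cuts)
    return list(zip(ordered[:-1], ordered[1:]))


def window_grid(size, window, shift=0):
    """Return all 2-D windows as ((row_start, row_end), (col_start, col_end))."""
    spans = window_spans(size, window, shift)
    return [(rows, cols) for rows in spans for cols in spans]


regular = window_grid(8, 4)            # previous layer: aligned windows
shifted = window_grid(8, 4, shift=2)   # next layer: shift of window // 2 (assumed)
print(len(regular), len(shifted))      # 4 9
print(window_spans(8, 4, shift=2))     # [(0, 2), (2, 6), (6, 8)]
```

The shifted span (2, 6) straddles the boundary at position 4 between the two aligned windows, which is how information crosses between windows of the previous layer.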

The window multi-head self-attention module introduces windows, which reduces the growth of the computation to linear. The module divides the input image into non-overlapping windows and computes self-attention within each window separately. For an image containing w (width) × h (height) sub-images:

The computational complexity of the multi-head self-attention module is: $\Omega(\mathrm{MSA}) = 4hwC^{2} + 2(hw)^{2}C$;

The computational complexity of the window-based multi-head self-attention module is: $\Omega(\text{W-MSA}) = 4hwC^{2} + 2M^{2}hwC$.

Here h and w represent the height and width of the original image, respectively, C represents the number of channels, and M represents the size of each window, that is, the height and width of each window are both M.
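
A short numeric sketch of the two complexity expressions above shows the effect of the window size. The function names and the example sizes are assumptions chosen for illustration.

```python
def msa_cost(h, w, channels):
    """Cost of global multi-head self-attention: 4hwC^2 + 2(hw)^2 C."""
    return 4 * h * w * channels ** 2 + 2 * (h * w) ** 2 * channels


def window_msa_cost(h, w, channels, window):
    """Cost of window-based self-attention: 4hwC^2 + 2 M^2 hw C."""
    return 4 * h * w * channels ** 2 + 2 * window ** 2 * h * w * channels


# Hypothetical sizes: an 8 x 8 grid of sub-images, 4 channels, 4 x 4 windows.
print(msa_cost(8, 8, 4))            # 36864
print(window_msa_cost(8, 8, 4, 4))  # 12288
```

The first term is identical in both expressions. The second term is quadratic in hw for global attention but linear in hw for window attention when M is fixed, and the two costs coincide when a single window covers the whole image (M² = hw).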

In addition, the shifted-window multi-head self-attention module introduces cross-window connections while preserving the efficient computation of non-overlapping windows.

Step S4: after the four stages of computation, a global average pooling layer is applied to the output features, and a SoftMax classifier is then used to classify ball screw faults.

Gradients are back-propagated according to the classification loss function to update the parameters of the ball screw fault diagnosis model, and the optimized diagnosis model is saved once training and testing are complete.
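
The classification head of step S4 can be sketched as global average pooling followed by SoftMax and a classification loss. The linear layer between pooling and SoftMax, its weights, the three-class setup and the use of cross-entropy as the classification loss are assumptions of this sketch; the text names only the pooling layer, the SoftMax classifier and a classification loss function.

```python
import math


def global_average_pool(feature_map):
    """Average every channel (a 2-D nested list) down to one number."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probabilities[label])


# Hypothetical 2-channel feature map and a 3-class linear head.
feature_map = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
head = [[0.5, -0.5], [0.0, 1.0], [-0.5, 0.0]]  # assumed, untrained weights
pooled = global_average_pool(feature_map)
logits = [sum(w * x for w, x in zip(row, pooled)) for row in head]
probs = softmax(logits)
print(pooled)                   # [4.0, 1.0]
print(probs.index(max(probs)))  # 0
print(cross_entropy(probs, 0))
```

During training, the gradient of this loss with respect to the model parameters is what is back-propagated to update the diagnosis model.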

The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism is characterized by comprising the following steps:

2. The ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism according to claim 1, wherein in step S1, the ball screw acceleration data is obtained by performing a dynamic simulation of the ball screw.

3. The ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism according to claim 1, wherein in step S1, when the ball screw data set is preprocessed, the ball screw acceleration data is first converted into two-dimensional image data by the Gramian angular field.

4. The ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism according to claim 1, wherein in step S2, the image blocking module divides the input image into non-overlapping small sub-images of size 4×4.

5. The ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism according to claim 1, wherein in step S3, the ball screw feature extraction module comprises four stages; the first stage consists of a linear embedding module and a channel parallel fusion multi-attention mechanism module, and each of the following three stages consists of an image merging module and a channel parallel fusion multi-attention mechanism module; within the channel parallel fusion multi-attention mechanism module, the data are fed into two branches by channel splitting, the left branch performs shallow convolution operations to extract local information efficiently, and the right branch attends to global information and cross-window information transfer through a Swin Transformer module, wherein:

channel parallel fusion multi-attention mechanism module: the input feature channels are split into two branches; one branch performs shallow convolution operations using three convolutions with a stride of 1, of which the two 1×1 convolutions are ordinary convolutions and the 3×3 convolution is the depthwise convolution of a depthwise separable convolution; the other branch extracts global information through a Swin Transformer module; after feature extraction, the two branches are concatenated so that the channel counts add up and the features are fused; finally, channel shuffle exchanges information between the different groups so that the channels are fully mixed and local and global features are fused efficiently and accurately;

Swin Transformer: a normalization layer first stabilizes model training, and a residual connection is added; the feature map is input to a window multi-head self-attention module, which partitions the feature map into windows of size 4×4 and computes self-attention separately within each window; the self-attention result is input to a layer normalization module, and the normalized result is input to a multi-layer perceptron module, which applies a nonlinear transformation; the resulting features are input to a shifted-window multi-head self-attention module, which shifts the windows and computes self-attention within the shifted windows; that result is input to a layer normalization module and then to a multi-layer perceptron module for a nonlinear transformation, and the new features are output;

image merging: performs downsampling to obtain multi-scale feature information.

6. The ball screw fault diagnosis method based on a channel parallel fusion multi-attention mechanism according to claim 1, wherein in step S4, after the four stages of computation, a global average pooling layer is applied to the output features, and ball screw fault classification is then performed by a SoftMax classifier; gradients are back-propagated according to the classification loss function to update the parameters of the ball screw fault diagnosis model, yielding an optimized ball screw fault diagnosis model; and a ball screw image to be diagnosed is input into the trained and optimized model, which outputs the fault diagnosis result for the ball screw.