CN114386526A

CN114386526A - Combined convolution neural network diagnosis method for rotary machine fault

Info

Publication number: CN114386526A
Application number: CN202210054695.6A
Authority: CN
Inventors: 杜文辽; 王宏超; 李川; 胡鹏杰; 侯绪坤; 巩晓赟; 赵峰; 谢贵重; 孟凡念; 郭志强; 王良文
Original assignee: Zhengzhou University of Light Industry
Current assignee: Zhengzhou University of Light Industry
Priority date: 2022-01-18
Filing date: 2022-01-18
Publication date: 2022-04-22

Abstract

The invention provides a joint convolutional neural network diagnosis method for rotating machinery faults, comprising data acquisition, data preprocessing, construction of a 1D-2D JCNN model, model training, verification, and diagnosis. For vibration signals acquired in different states, the method uses one-dimensional convolution to adaptively obtain multi-scale feature vectors of the signals, assembles the feature vectors into a two-dimensional matrix, and uses that matrix as the input of a two-dimensional convolutional neural network. In constructing the 1D-2D JCNN model, the invention exploits both the ability of a one-dimensional convolutional neural network to adaptively build a two-dimensional representation of a signal and the strong feature-learning capacity of a two-dimensional convolutional neural network. The two differently structured convolutional neural networks are unified into a single framework, giving a joint convolutional neural network model for rotating machinery fault diagnosis.

Description

Combined convolution neural network diagnosis method for rotary machine fault

Technical Field

The invention relates to a combined convolution neural network diagnosis method for faults of rotary machinery, and belongs to the technical field of intelligent fault diagnosis of rotary machinery.

Background

Rotating machinery is widely used across industries and is developing toward greater precision and intelligence. A failure of mechanical equipment causes abnormal shutdown, which not only leads to large property losses but may also endanger human life. Fault diagnosis of rotating machinery has therefore long been a focus of research. Because equipment vibration signals carry a large amount of state information, data-driven fault diagnosis based on vibration signals has produced many research results. Such diagnosis generally requires several steps: information acquisition, data preprocessing, feature extraction, feature dimension reduction, and pattern recognition. However, feature extraction and feature dimension reduction usually demand considerable manual experience, and traditional pattern recognition techniques are shallow learning methods, so the resulting diagnostic accuracy is not ideal. In recent years, deep learning has attracted wide attention. The convolutional neural network (CNN), one of the most effective deep learning models, features weight sharing, local perception, multiple convolution kernels, and automatic feature extraction. It has achieved great success in pattern recognition, image recognition, and speech recognition, and many researchers now apply it to fault diagnosis.

CNNs were originally used mainly for two-dimensional image problems. A one-dimensional vibration signal must therefore first be converted into a two-dimensional representation through some transformation; common conversion methods include the short-time Fourier transform (STFT), the continuous wavelet transform (CWT), and the wavelet packet transform (WPT). These methods achieve good diagnosis results, in particular CNN models based on the wavelet transform, which make full use of its time-frequency analysis capability. However, they depend heavily on manual experience: choosing a different mother wavelet often yields a markedly different diagnosis result.

In recent years, deep learning has been applied successfully in many fields because it can learn the features of different states directly from the collected signals and thereby achieve good classification performance. However, when a one-dimensional convolutional neural network processes the original signal directly, its limited capacity to exploit the information means that it serves only as an intermediate step in building a diagnostic model or requires further optimization, so manual intervention cannot be avoided.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention aims to provide a joint convolutional neural network diagnosis method for rotating machinery faults.

In order to achieve the purpose, the invention adopts the technical scheme that:

a joint convolution neural network diagnosis method for rotary machine faults comprises the following steps:

step 1, data acquisition:

setting sampling frequency to complete the vibration signal acquisition of the diagnosis object in each state;

step 2, data preprocessing:

carrying out normalization processing on the acquired data, and dividing the data into a training set, a verification set and a test set;

step 3, constructing a 1D-2D JCNN model:

the 1D-2D JCNN model consists of a 1D convolutional layer, 2D convolutional layers, pooling layers and fully connected layers; first, a one-dimensional convolutional neural network with only one convolutional layer is constructed, and the kernel size, the stride and the number n of convolution kernels are set so that each generated feature map has length n; a one-dimensional vibration signal is input into the one-dimensional convolutional neural network and convolved to generate n feature maps, the n feature maps are stacked to form an n × n two-dimensional matrix, and the matrix is used as the input of the two-dimensional convolutional neural network; the connection weights and bias parameters of each layer of the model are initialized;

step 4, model training:

cross entropy is selected as the loss function; the training set and the verification set are input into the constructed 1D-2D JCNN model, and the network is trained with the back propagation algorithm: the weights are updated by gradient descent, and the gradient of each layer is computed by error back propagation according to the chain rule; when the termination condition for model training is met, training stops and the trained model is saved;

step 5, verification:

the model obtained in step 4 is verified with the verification set samples; if the diagnostic accuracy does not meet the requirement, steps 2 to 4 are executed again until it does, at which point training is complete and the final model parameters are obtained;

step 6, diagnosis:

and inputting the test set into the trained model to obtain a diagnosis accuracy test result of the model.

The specific method for constructing the 1D-2D JCNN model in the step 3 comprises the following steps:

a. constructing the 1D-2D joint convolutional neural network structure: the network consists of a 1D convolutional layer, 2D convolutional layers, pooling layers and fully connected layers; the 1D-2D JCNN has a 16-layer network structure comprising 1 input layer, 1 one-dimensional convolutional layer, 4 two-dimensional convolutional layers, 4 BN layers, 4 pooling layers and 2 Dense layers; the convolutional layers use the ReLU activation function, and each two-dimensional convolutional layer is immediately followed by a BN layer and a pooling layer; the Dense_1 layer is a fully connected layer, the Dense_2 layer is the output layer, and a Softmax classifier is used;

b. forward calculation: the input X (x⁰) of the joint convolutional network first undergoes one-dimensional convolution, whose output is a one-dimensional convolutional feature map:

$$x_i^{l(j)} = F\left(\sum_{N} \mathrm{Conv1D}\left(k_i^{l},\, x^{l-1(j)}\right) + b_i^{l}\right)$$

where $k_i^{l}$ is the i-th convolution kernel of the l-th layer, $x^{l-1(j)}$ is the j-th local input of layer l-1, $x_i^{l(j)}$ is the output, Conv1D(·) is the one-dimensional convolution operation, $b_i^{l}$ is the i-th bias of layer l, N is the convolution calculation region, and F(·) is the activation function; the ReLU function is chosen here as the activation function, expressed as:

$$F(a) = \max\{0, a\}$$

Assume that the one-dimensional convolutional layer has n convolution kernels in total. Performing one-dimensional convolution on the p-th sample of the original signal X yields x_ij, where x_ij is the result of the j-th local region of sample p passing through the i-th convolution kernel, and the feature map obtained from sample p with the i-th convolution kernel is f_i;

The n feature maps obtained are stacked to construct a two-dimensional feature map. Specifically, for each sample the n matrices of size 1 × n produced by the one-dimensional convolution are rearranged into one n × n two-dimensional matrix whose element in row i and column j is x_{i,j} = x_ij. The matrix F_p is expressed as:

$$F_p = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{bmatrix}$$
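The stacking of one-dimensional feature maps into a two-dimensional matrix can be illustrated with a minimal pure-Python sketch. The function names, the toy signal, the kernel values and the stride are illustrative assumptions, not values from the patent; the only property carried over from the text is that n kernels each produce a feature map of length n, which are then stacked row by row.

```python
def relu(a):
    """ReLU activation F(a) = max{0, a}."""
    return a if a > 0.0 else 0.0


def conv1d_feature_map(signal, kernel, bias, stride):
    """Slide one kernel over the signal and return its feature map f_i."""
    k = len(kernel)
    n_out = (len(signal) - k) // stride + 1
    return [
        relu(sum(kernel[t] * signal[j * stride + t] for t in range(k)) + bias)
        for j in range(n_out)
    ]


def build_feature_matrix(signal, kernels, biases, stride):
    """Stack the n feature maps row by row into the n x n matrix F_p."""
    matrix = [conv1d_feature_map(signal, w, b, stride)
              for w, b in zip(kernels, biases)]
    n = len(kernels)
    if any(len(row) != n for row in matrix):
        raise ValueError("kernel size and stride must give feature maps of length n")
    return matrix


# Toy example (assumed values): length-8 signal, n = 4 kernels of size 2, stride 2.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
biases = [0.0, 0.0, 0.0, 0.0]
F_p = build_feature_matrix(signal, kernels, biases, stride=2)
for row in F_p:
    print(row)
```

Row i of `F_p` is the feature map f_i, so element (i, j) is x_ij as defined in the text. The last kernel yields only negative pre-activations, so its row is clipped to zero by ReLU.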

The two-dimensional image constructed by the one-dimensional convolution is taken as the input of the two-dimensional convolutional network, and a two-dimensional convolution operation is performed on it. The output of the neuron is:

$$x^{l} = F\left(\sum_{N} \mathrm{Conv2D}\left(k,\, x^{l-1}\right) + b^{l}\right)$$

where k is the convolution kernel, N is the convolution calculation region, $x^{l-1}$ is the input of the l-th convolutional layer, $x^{l}$ is the convolution output, Conv2D(·) is the two-dimensional convolution operation, $b^{l}$ is the bias, and F(·) is the activation function, which is again chosen to be the ReLU function.
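As a hedged illustration of the two-dimensional convolution step, the sketch below applies one kernel to a small matrix without padding, adds a bias and applies ReLU. The helper name `conv2d_valid`, the absence of padding, the stride of 1 and all numeric values are assumptions made for the example; the patent does not state kernel sizes or padding.

```python
def relu(a):
    """ReLU activation F(a) = max{0, a}."""
    return a if a > 0.0 else 0.0


def conv2d_valid(image, kernel, bias, stride=1):
    """Cross-correlate image with kernel (no padding), add bias, apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = bias
            # Sum over the convolution calculation region N (a kh x kw window).
            for u in range(kh):
                for v in range(kw):
                    acc += kernel[u][v] * image[r * stride + u][c * stride + v]
            row.append(relu(acc))
        out.append(row)
    return out


# Toy 3 x 3 input standing in for the n x n matrix F_p (assumed values).
image = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
kernel = [[1.0, 1.0], [1.0, 1.0]]
feature = conv2d_valid(image, kernel, bias=-20.0)
print(feature)
```

With these values the two upper windows sum to 12 and 16, so after the bias of -20 they are clipped to zero, while the two lower windows (24 and 28) survive as 4 and 8.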

The specific process of model training in step 4 is as follows: during network training, the weights are updated by gradient descent, and the gradient of each layer is computed by back-propagating the error L according to the chain rule. In the two-dimensional convolution part, the error of the convolutional layer is taken as given, and from it the error of the preceding hidden layer and the parameters to be updated are derived. The quantities involved are the error of the convolutional layer, the error of the preceding hidden layer, the convolutional layer weights, the weights of the preceding hidden layer, the convolutional layer output, and the initial error L.

When the error is back-propagated to the connection between the one-dimensional and two-dimensional parts, the error passed down from the two-dimensional convolutional layer is used to formulate the gradient update of the one-dimensional convolution kernels. The quantities involved are the error of the convolution kernel, the original kernel value w_1+i, the error value passed down from the two-dimensional convolutional layer, and the initial error L.
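One way to picture how an error arriving from the two-dimensional part reaches a one-dimensional kernel is the chain-rule sketch below. It is not the patent's own update formula: it assumes that row i of the stacked matrix is the ReLU output of kernel i, and that the two-dimensional part has already delivered an error value dL/dx_ij for every element of that row. The names `forward_row`, `kernel_gradient`, `delta_row` and all numbers are hypothetical.

```python
def forward_row(signal, kernel, bias, stride):
    """Pre-activations z_ij and ReLU outputs x_ij for one 1D kernel."""
    k = len(kernel)
    n_out = (len(signal) - k) // stride + 1
    z = [sum(kernel[t] * signal[j * stride + t] for t in range(k)) + bias
         for j in range(n_out)]
    return z, [max(0.0, v) for v in z]


def kernel_gradient(signal, kernel, bias, stride, delta_row):
    """Chain rule: dL/dw[t] = sum_j delta_ij * F'(z_ij) * signal[j*stride + t]."""
    z, _ = forward_row(signal, kernel, bias, stride)
    grad_w = [0.0] * len(kernel)
    grad_b = 0.0
    for j, (z_j, d_j) in enumerate(zip(z, delta_row)):
        if z_j > 0.0:  # ReLU derivative is 1 where z > 0 and 0 elsewhere
            grad_b += d_j
            for t in range(len(kernel)):
                grad_w[t] += d_j * signal[j * stride + t]
    return grad_w, grad_b


signal = [0.5, -1.0, 2.0, 1.5, -0.5, 1.0]
kernel, bias, stride = [0.4, -0.3], 0.1, 2
delta_row = [1.0, -2.0, 0.5]  # hypothetical error dL/dx_ij from the 2D part
grad_w, grad_b = kernel_gradient(signal, kernel, bias, stride, delta_row)

learning_rate = 0.1  # assumed value
updated_kernel = [w - learning_rate * g for w, g in zip(kernel, grad_w)]
print(grad_w, grad_b, updated_kernel)
```

The third window has a negative pre-activation, so ReLU blocks its error and only the first two windows contribute to the kernel gradient.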

Abbreviations: one-dimensional convolutional neural network, 1DCNN; two-dimensional convolutional neural network, 2DCNN.

The invention has the beneficial effects that:

The invention provides a rotating machinery fault diagnosis method based on a joint one-dimensional and two-dimensional convolutional neural network. For vibration signals acquired in different states, the method uses one-dimensional convolution to adaptively obtain multi-scale feature vectors of the signals, assembles the feature vectors into a two-dimensional matrix, and uses that matrix as the input of the two-dimensional convolutional neural network. In constructing the 1D-2D JCNN model, the invention exploits both the ability of a one-dimensional convolutional neural network to adaptively build a two-dimensional representation of a signal and the strong feature-learning capacity of a two-dimensional convolutional neural network. The two differently structured convolutional neural networks are unified into a single framework, giving a joint convolutional neural network model for rotating machinery fault diagnosis.

The method takes the cross entropy error function as the loss function and optimizes the filter parameters of the joint one-dimensional and two-dimensional network model with the error back propagation algorithm to obtain the final fault diagnosis model, so that the model achieves better diagnostic performance and good data adaptivity and can be used to diagnose faults in a variety of rotating machinery. For model training, the invention establishes a training method for the parameters of the 1D-2D joint convolutional neural network: a mechanism for propagating the training error from the two-dimensional convolutional neural network to the one-dimensional convolutional neural network is derived, so that the two networks are connected seamlessly and the weights and bias parameters of every layer of the whole model are updated by gradient descent.

Drawings

Fig. 1 is a flow chart of the fault diagnosis of the present invention.

Fig. 2 shows vibration signal waveforms for the 10 classes of bearing states in the experimental data samples of the application example of the invention.

In Fig. 2: a, normal state; b, slight inner ring fault; c, slight rolling body fault; d, slight outer ring fault; e, medium inner ring fault; f, medium rolling body fault; g, medium outer ring fault; h, severe inner ring fault; i, severe rolling body fault; j, severe outer ring fault. In each panel the upper plot is the one-dimensional signal and the lower plot is the two-dimensional grayscale image.

Fig. 3 shows the fault diagnosis results of 10 training and testing runs on the bearing data in the application example of the invention.

In Fig. 3, Times denotes the run number and Accuracy denotes the diagnostic accuracy.

Detailed Description

The following examples are provided to further illustrate the embodiments of the present invention, and the embodiments and specific procedures of the present invention are given on the premise of the technical solution of the present invention, but the scope of the present invention is not limited to the following examples.

Step 1, data acquisition:

The vibration signals of the diagnosis object in each state are acquired at a preset sampling frequency.

Step 2, data preprocessing:

The acquired data are normalized and divided into a training set, a verification set and a test set according to a set proportion.
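The text does not say which normalization is used. As one possible reading, the sketch below scales each sample linearly to the range [0, 1] (min-max normalization); both the choice of method and the function name `min_max_normalize` are assumptions for illustration.

```python
def min_max_normalize(sample):
    """Scale one vibration sample linearly to the range [0, 1]."""
    lo, hi = min(sample), max(sample)
    if hi == lo:  # constant signal: avoid division by zero
        return [0.0 for _ in sample]
    return [(v - lo) / (hi - lo) for v in sample]


raw = [-2.0, 0.0, 1.0, 2.0]  # toy amplitudes, not measured data
normalized = min_max_normalize(raw)
print(normalized)
```

Other schemes, such as subtracting the mean and dividing by the standard deviation, would fit the same step equally well.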

Step 3, constructing a 1D-2D JCNN model:

The 1D-2D JCNN model consists of a 1D convolutional layer, 2D convolutional layers, pooling layers and fully connected layers. First, a one-dimensional convolutional neural network with only one convolutional layer is constructed, and the convolution kernel size, the stride, and the number n of convolution kernels are chosen so that each generated feature map has length n. A one-dimensional vibration signal is input into the one-dimensional convolutional neural network and convolved to generate n feature maps; the n feature maps are stacked to form an n × n two-dimensional matrix, and the matrix is used as the input of the two-dimensional convolutional neural network. The connection weights and bias parameters of each layer of the model are then initialized.
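Choosing the kernel size and stride so that a feature map has exactly n points is a small arithmetic exercise. The sketch below assumes an unpadded convolution, for which the output length is floor((L - k) / s) + 1, and searches for all admissible pairs. The sample length of 2048 matches the application example; the target n = 64, the search bound of 128 and the helper names are assumptions.

```python
def conv1d_output_length(signal_length, kernel_size, stride):
    """Length of an unpadded 1D convolution output."""
    if kernel_size > signal_length:
        return 0
    return (signal_length - kernel_size) // stride + 1


def find_kernel_stride_pairs(signal_length, n, max_kernel=None):
    """List every (kernel_size, stride) pair whose feature map has length n."""
    max_kernel = max_kernel or signal_length
    pairs = []
    for kernel_size in range(1, max_kernel + 1):
        for stride in range(1, signal_length + 1):
            if conv1d_output_length(signal_length, kernel_size, stride) == n:
                pairs.append((kernel_size, stride))
    return pairs


pairs = find_kernel_stride_pairs(signal_length=2048, n=64, max_kernel=128)
print(len(pairs), (32, 32) in pairs)
```

For example, a kernel of size 32 with stride 32 gives (2048 - 32) // 32 + 1 = 64 points, so 64 such kernels would produce a 64 × 64 matrix, whereas a kernel of size 64 with the same stride gives only 63 points.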

The specific method for constructing the 1D-2D JCNN model comprises the following steps:

a. Constructing the 1D-2D joint convolutional neural network structure. The network consists of a 1D convolutional layer, 2D convolutional layers, pooling layers and fully connected layers. The 1D-2D JCNN has a 16-layer network structure comprising 1 input layer, 1 one-dimensional convolutional layer, 4 two-dimensional convolutional layers, 4 BN layers, 4 pooling layers and 2 Dense layers. The convolutional layers use the ReLU activation function, and each two-dimensional convolutional layer is followed by a BN layer and a pooling layer. The Dense_1 layer is a fully connected layer, the Dense_2 layer is the output layer, and a Softmax classifier is used.
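The layer count can be checked by writing the sequence out. The sketch below lists the 16 layers in the order implied by the text (each 2D convolution followed by BN and pooling) and, as a separate illustration, tracks the side length of the feature map through the four blocks. The layer names, the 2 × 2 non-overlapping pooling, the size-preserving convolutions and the starting size n = 64 are assumptions; the patent does not state them.

```python
# Layer sequence of the 16-layer 1D-2D JCNN as enumerated in the text.
layers = ["Input", "Conv1D"]
for block in range(1, 5):
    layers += [f"Conv2D_{block}", f"BN_{block}", f"Pool_{block}"]
layers += ["Dense_1", "Dense_2"]


def spatial_size_after_blocks(n, blocks=4, pool=2):
    """Side length after each block, assuming size-preserving convolutions
    and non-overlapping pool x pool pooling (both are assumptions)."""
    sizes = []
    for _ in range(blocks):
        n = n // pool
        sizes.append(n)
    return sizes


print(len(layers), layers)
print(spatial_size_after_blocks(64))
```

Under these assumptions a 64 × 64 input shrinks to 4 × 4 before the Dense layers.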

b. Forward calculation. The input X (x⁰) of the joint convolutional network first undergoes one-dimensional convolution, whose output is a one-dimensional convolutional feature map:

$$x_i^{l(j)} = F\left(\sum_{N} \mathrm{Conv1D}\left(k_i^{l},\, x^{l-1(j)}\right) + b_i^{l}\right)$$

where $k_i^{l}$ is the i-th convolution kernel of the l-th layer, $x^{l-1(j)}$ is the j-th local input of layer l-1, $x_i^{l(j)}$ is the output, Conv1D(·) is the one-dimensional convolution operation, $b_i^{l}$ is the i-th bias of layer l, N is the convolution calculation region, and F(·) is the activation function. The ReLU function is chosen here as the activation function, expressed as:

$$F(a) = \max\{0, a\}$$

Assume that the one-dimensional convolutional layer has n convolution kernels in total. Performing one-dimensional convolution on the p-th sample of the original signal X yields x_ij, where x_ij is the result of the j-th local region of sample p passing through the i-th convolution kernel, and the feature map obtained from sample p with the i-th convolution kernel is f_i.

The n feature maps obtained are stacked to construct a two-dimensional feature map. Specifically, for each sample the n matrices of size 1 × n produced by the one-dimensional convolution are rearranged into one n × n two-dimensional matrix whose element in row i and column j is x_{i,j} = x_ij. The matrix F_p is expressed as:

$$F_p = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{bmatrix}$$

The two-dimensional image constructed by the one-dimensional convolution is taken as the input of the two-dimensional convolutional network, and a two-dimensional convolution operation is performed on it. The output of the neuron is:

$$x^{l} = F\left(\sum_{N} \mathrm{Conv2D}\left(k,\, x^{l-1}\right) + b^{l}\right)$$

where k is the convolution kernel, N is the convolution calculation region, $x^{l-1}$ is the input of the l-th convolutional layer, $x^{l}$ is the convolution output, Conv2D(·) is the two-dimensional convolution operation, $b^{l}$ is the bias, and F(·) is the activation function, which is again chosen to be the ReLU function.

Step 4, model training:

Cross entropy is selected as the loss function. The training set and the verification set are input into the constructed 1D-2D JCNN model, and the network is trained with the back propagation algorithm: the weights are updated by gradient descent, and the gradient of each layer is computed by error back propagation according to the chain rule. When the termination condition for model training is met, training stops and the trained model is saved.
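The loss and update rule named in this step can be shown on a single toy example. The sketch below computes a Softmax output, its cross entropy against a true class, the well-known gradient of that loss with respect to the logits (probabilities minus the one-hot label), and one gradient descent step. Updating the logits directly, the three-class example and the learning rate of 0.5 are simplifying assumptions; in the actual model the same gradient would be propagated further back to the layer weights.

```python
import math


def softmax(logits):
    """Numerically stable Softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross entropy loss for a single sample with integer class label."""
    return -math.log(probs[label])


def logits_gradient(probs, label):
    """dL/dz for Softmax followed by cross entropy: p - onehot(label)."""
    return [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]


def gradient_descent_step(params, grads, learning_rate):
    """Plain gradient descent update: w <- w - lr * dL/dw."""
    return [w - learning_rate * g for w, g in zip(params, grads)]


logits = [2.0, 1.0, 0.1]  # toy network outputs, assumed values
label = 0
probs = softmax(logits)
loss_before = cross_entropy(probs, label)
new_logits = gradient_descent_step(logits, logits_gradient(probs, label), 0.5)
loss_after = cross_entropy(softmax(new_logits), label)
print(loss_before, loss_after)
```

A single step in the negative gradient direction lowers the loss for this sample, which is the behaviour the training loop repeats until its termination condition is met.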

A mechanism for propagating the model training error from the 2D convolutional neural network to the 1D convolutional neural network is derived, so that the two networks are connected seamlessly and the weights and bias parameters of every layer of the whole model are updated by gradient descent.

The specific process is as follows: during network training, the weights are updated by gradient descent, and the gradient of each layer is computed by back-propagating the error L according to the chain rule. In the two-dimensional convolution part, the error of the convolutional layer is taken as given, and from it the error of the preceding hidden layer and the parameters to be updated are derived. The quantities involved are the error of the convolutional layer, the error of the preceding hidden layer, the convolutional layer weights, the weights of the preceding hidden layer, the convolutional layer output, and the initial error L.

When the error is back-propagated to the connection between the one-dimensional and two-dimensional parts, the error passed down from the two-dimensional convolutional layer is used to formulate the gradient update of the one-dimensional convolution kernels.

Step 5, verification:

The deep hybrid convolutional neural network model obtained in step 4 is verified with the verification set samples. If the diagnostic accuracy does not meet the requirement, steps 2 to 4 are executed again until it does, at which point training is complete and the final model parameters are obtained.

Step 6, diagnosis:

The test set is input into the trained model to obtain the diagnostic accuracy of the model.

Application example:

A rotating machinery fault diagnosis method based on a joint one-dimensional and two-dimensional convolutional neural network is applied; the fault diagnosis process is shown in Fig. 1.

The experimental data are taken from the public bearing data set of Case Western Reserve University (CWRU), USA. The test rig consists of a motor, a coupling and a load motor, and the bearing under test is the motor drive-end bearing, which supports the motor shaft. Bearing damage is single-point damage introduced by electrical discharge machining. The bearing state is classed as normal (N), inner ring fault (IF), rolling body fault (BF) or outer ring fault (OF), and each fault type is further divided by fault diameter into mild (0.18 mm), moderate (0.36 mm) and severe (0.53 mm). The motor runs at approximately 1800 rpm under four loads of 0, 1, 2 and 3 hp. Data were acquired at sampling frequencies of 12 kHz and 48 kHz; the data acquired at 48 kHz are used here as experimental data.
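One normal state plus three fault types at three severities gives ten classes. A small sketch of how such class labels might be enumerated follows; the label strings such as `IF_mild` and the integer ordering are hypothetical conventions chosen for the example, while the fault types and fault diameters are the ones listed above.

```python
FAULT_TYPES = ["IF", "BF", "OF"]  # inner ring, rolling body, outer ring
SEVERITY_DIAMETERS_MM = {"mild": 0.18, "moderate": 0.36, "severe": 0.53}


def build_label_map():
    """Map each health state to an integer class index; index 0 is normal (N)."""
    labels = {"N": 0}
    for severity in SEVERITY_DIAMETERS_MM:
        for fault in FAULT_TYPES:
            labels[f"{fault}_{severity}"] = len(labels)
    return labels


LABELS = build_label_map()
for name, index in LABELS.items():
    print(index, name)
```

The severity-major ordering used here follows the panel order given for Fig. 2 (normal first, then the three mild faults, the three moderate faults and the three severe faults).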

According to the load, the experimental data are organized into 4 data sets, A, B, C and D, corresponding to loads of 0 hp, 1 hp, 2 hp and 3 hp respectively, so that all 4 loads are covered. Each data set has 10 health states (a to j in Fig. 2) and 10000 data samples, and each sample contains 2048 data points. The data are processed in the following steps:

Step 1: the 10000 collected data samples are normalized and divided into a training set, a verification set and a test set in the ratio 6:1:3.
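A 6:1:3 split of 10000 samples gives 6000 training, 1000 verification and 3000 test samples. The sketch below performs such a split on sample indices; shuffling before splitting, the fixed seed and the function name `split_indices` are assumptions, since the text does not say how samples are assigned to the three sets.

```python
import random


def split_indices(num_samples, ratios=(6, 1, 3), seed=0):
    """Shuffle sample indices and split them by integer ratios."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment
    total = sum(ratios)
    n_train = num_samples * ratios[0] // total
    n_val = num_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(10000)
print(len(train_idx), len(val_idx), len(test_idx))
```

In practice one might also stratify the split so that each of the 10 health states keeps the same proportion in every set; that refinement is not shown here.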

Step 2: the JCNN model is constructed from a 1D convolutional layer, 2D convolutional layers, pooling layers and fully connected layers. The 1D-2D JCNN has a 16-layer network structure comprising 1 input layer, 1 one-dimensional convolutional layer, 4 two-dimensional convolutional layers, 4 BN layers, 4 pooling layers and 2 Dense layers. The convolutional layers use the ReLU activation function, and each two-dimensional convolutional layer is followed by a BN layer and a pooling layer. The Dense_1 layer is a fully connected layer, the Dense_2 layer is the output layer, and Softmax is used as the classifier. One-dimensional convolution adaptively obtains the multi-scale feature vectors of the signal, and the feature vectors obtained are assembled into a two-dimensional matrix.

The input X (x⁰) of the joint convolutional network first undergoes one-dimensional convolution, and the output is a one-dimensional convolutional feature map:

$$x_i^{l(j)} = F\left(\sum_{N} \mathrm{Conv1D}\left(k_i^{l},\, x^{l-1(j)}\right) + b_i^{l}\right)$$

where $k_i^{l}$ is the i-th convolution kernel of the l-th layer, $x^{l-1(j)}$ is the j-th local input of layer l-1, $x_i^{l(j)}$ is the output, Conv1D(·) is the one-dimensional convolution operation, $b_i^{l}$ is the i-th bias of layer l, N is the convolution calculation region, and F(·) is the activation function. The ReLU function is chosen here as the activation function, expressed as:

$$F(a) = \max\{0, a\}$$

Let the one-dimensional convolutional layer have n convolution kernels and the training set contain m samples. Performing one-dimensional convolution on the p-th sample of the original signal X yields x_ij, where x_ij is the result of the j-th local region of sample p passing through the i-th convolution kernel, and the feature map obtained from sample p with the i-th convolution kernel is f_i.

The n feature maps obtained are stacked to form a two-dimensional feature map whose element in row i and column j is x_{i,j} = x_ij. The matrix F_p is expressed as:

$$F_p = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{bmatrix}$$

This two-dimensional matrix is taken as the input of the two-dimensional convolutional network, and a two-dimensional convolution operation is performed on it. The output of the neuron is:

$$x^{l} = F\left(\sum_{N} \mathrm{Conv2D}\left(k,\, x^{l-1}\right) + b^{l}\right)$$

where k is the convolution kernel, N is the convolution calculation region, $x^{l-1}$ is the input of the l-th convolutional layer, $x^{l}$ is the convolution output, Conv2D(·) is the two-dimensional convolution operation, $b^{l}$ is the bias, and F(·) is the activation function, which is again chosen to be the ReLU function.

Step 3: the connection weights w and the bias parameters b of each layer of the model are initialized.

Step 4: the training set and the verification set are input into the constructed JCNN model, the network is trained with the back propagation algorithm, and when the termination condition for model training is met, training stops and the trained model is saved.

During network training, the weights are updated by gradient descent, and the gradient of each layer is computed by error back propagation according to the chain rule. In the two-dimensional convolution part, the error of the convolutional layer is taken as given, and from it the error of the preceding hidden layer and the parameters to be updated are derived. The quantities involved are the error of the convolutional layer, the error of the preceding hidden layer, the convolutional layer weights, the weights of the preceding hidden layer, the convolutional layer output, and the initial error L.

When the error is back-propagated to the connection between the one-dimensional and two-dimensional parts, the error passed down from the two-dimensional convolutional layer is used to formulate the gradient update of the one-dimensional convolution kernels.

Step 5: the trained model is tested with the test samples. Training and testing are repeated 10 times, and the resulting fault diagnosis accuracy is shown in Fig. 3.

Step 6: the model obtained is used to diagnose actual samples and produce a diagnosis result.

For model training, a training method for the parameters of the 1D-2D joint convolutional neural network is established. A mechanism for propagating the model training error from the 2D convolutional neural network to the 1D convolutional neural network is derived, so that the two networks are connected seamlessly and the weights and bias parameters of every layer of the whole model are updated by gradient descent.

For vibration signals acquired in different states, the method uses one-dimensional convolution to adaptively obtain multi-scale feature vectors of the signals, assembles the feature vectors into a two-dimensional matrix, and uses that matrix as the input of the two-dimensional convolutional neural network. Cross entropy is selected as the loss function, and the filter parameters of the joint one-dimensional and two-dimensional network model are optimized with the error back propagation algorithm to obtain the final fault diagnosis model, which achieves better diagnostic performance.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims

1. A joint convolution neural network diagnosis method for rotary machine faults is characterized by comprising the following steps:

step 1, data acquisition:

step 2, data preprocessing:

step 3, constructing a 1D-2D JCNN model:

step 4, model training:

step 5, verification:

step 6, diagnosis:

2. The method for diagnosing the joint convolutional neural network of the rotating machine fault as claimed in claim 1, wherein the specific method for constructing the 1D-2D JCNN model in the step 3 is as follows:

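the input of the joint convolutional network first undergoes one-dimensional convolution, whose output is a one-dimensional convolutional feature map:

$$x_i^{l(j)} = F\left(\sum_{N} \mathrm{Conv1D}\left(k_i^{l},\, x^{l-1(j)}\right) + b_i^{l}\right)$$

where $k_i^{l}$ is the i-th convolution kernel of the l-th layer, $x^{l-1(j)}$ is the j-th local input of layer l-1, $x_i^{l(j)}$ is the output, Conv1D(·) is the one-dimensional convolution operation, $b_i^{l}$ is the i-th bias of layer l, N is the convolution calculation region, and F(·) is the activation function, chosen to be the ReLU function:

$$F(a) = \max\{0, a\}$$

the two-dimensional matrix constructed by the one-dimensional convolution is taken as the input of the two-dimensional convolutional network, and the output of the neuron is:

$$x^{l} = F\left(\sum_{N} \mathrm{Conv2D}\left(k,\, x^{l-1}\right) + b^{l}\right)$$

where k is the convolution kernel, N is the convolution calculation region, $x^{l-1}$ is the input of the l-th convolutional layer, $x^{l}$ is the convolution output, Conv2D(·) is the two-dimensional convolution operation, $b^{l}$ is the bias, and F(·) is the activation function, again chosen to be the ReLU function.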

3. The joint convolutional neural network diagnosis method for rotating machinery faults as claimed in claim 1, wherein the specific process of model training in step 4 is as follows: during network training, the weights are updated by gradient descent, and the gradient of each layer is computed by back-propagating the error L according to the chain rule; in the two-dimensional convolution part, the error of the convolutional layer is assumed to be