CN114897138A

CN114897138A - System fault diagnosis method based on attention mechanism and deep residual network

Info

Publication number: CN114897138A
Application number: CN202210484425.9A
Authority: CN
Inventors: 王克璇; 朱小良; 刑天阳; 姜牧笛
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2022-05-06
Filing date: 2022-05-06
Publication date: 2022-08-12

Abstract

The invention discloses a system fault diagnosis method based on an attention mechanism and a deep residual network. An artificial bee colony algorithm is used to obtain the optimal number of network layers and number of convolution kernels for designing the network structure, so that the feature extraction capability of convolution on the time-series, multidimensional, noise-contaminated data of a large-scale industrial system is fully exploited. In addition, an attention mechanism combining channel attention and spatial attention is added between residual blocks, so that the network automatically learns the importance of different channels and of different pixels of the input data, which accelerates convergence of model training and improves fault diagnosis accuracy. State-variable time-series data of the steam-water separation reheating system are input into the model, and high fault diagnosis accuracy is obtained under noise interference.

Description

System fault diagnosis method based on attention mechanism and deep residual network

Technical Field

The invention relates to the technical field of fault diagnosis of a nuclear power steam-water separation reheating system, in particular to a fault diagnosis algorithm based on an attention mechanism and a deep residual network.

Background

A steam-water separation reheating system (MSR) is a main subsystem of the secondary loop of a nuclear power plant, and its safe and efficient operation is important for the safety and economy of the nuclear power unit.

However, as the number of nuclear power units under construction increases and large amounts of intermittent renewable energy are connected to the grid, nuclear power units need to participate deeply in peak shaving and frequency regulation, so that various typical faults are prone to occur in the steam-water separation reheating system under time-varying operating conditions.

Therefore, accurate diagnosis of the typical faults of the steam-water separation reheating system is important for the safe and stable operation of the nuclear power unit under time-varying operating conditions.

Current fault diagnosis techniques are broadly classified into analytical-model-based methods, empirical-knowledge-based methods, and data-driven methods.

Analytical-model-based methods analyze the process system and abstract a set of coupled mathematical formulas that describe the system's function, thereby obtaining the analytical redundancy of the process system. Resolving the redundancy yields the difference between the measured value and the model estimate, i.e., the physical residual. The key to fault detection is to analyze the residual characteristics to identify, locate and isolate the fault. However, model-based methods rely on the accuracy of the model, and faults can be identified effectively only when the model is accurate. In practice, a model built from analysis or experimental data cannot accurately reflect the steady-state and dynamic characteristics of the system, so these methods are severely limited.

Empirical-knowledge-based methods are suitable for complex industrial systems that are already well understood. In fault recognition in the nuclear power field, the classical empirical-knowledge-based method is the expert system. However, for a system as complex as a nuclear power plant, the knowledge bases established so far are still insufficient to reflect the main accident characteristics, so the development of expert systems depends heavily on the development of system knowledge.

With the rapid development of computers and the internet, the operating data of actual systems can be stored and mined more easily, and data-driven methods are therefore widely used. However, when applied to fault identification of complex industrial systems, data-driven methods have some drawbacks, such as a large computational load, poor real-time performance, and a high demand for data.

In recent years, data-driven methods represented by deep learning have developed rapidly thanks to improved computing power, big data and advances in computing models, and have broad prospects in fault diagnosis.

However, fault diagnosis methods applied to complex industrial systems such as the steam-water separation reheating system face the following limitation in practice: noise interference and the coupling between the time-series dimension and the variable dimension of the data make it difficult to deploy a fault diagnosis model for the steam-water separation reheating system.

Disclosure of Invention

The technical problem to be solved by the invention is as follows: data-driven fault diagnosis methods are difficult to apply to fault diagnosis of an actual steam-water separation reheating system because of noise interference and the coupling between the time-series dimension and the variable dimension of the data.

The invention provides the following technical solution: a system fault diagnosis method based on an attention mechanism and a deep residual network, comprising the following steps:

S1, optimizing the number of residual blocks and the number of convolution kernels of each layer in the network by using an artificial bee colony heuristic optimization algorithm, so as to obtain a Pareto-optimal network structure that maximizes fault diagnosis accuracy and minimizes the number of network parameters;

S2, adding a channel attention mechanism and a spatial attention mechanism between adjacent residual blocks, so as to improve network training convergence speed and fault diagnosis accuracy;

S3, preprocessing the data by sliding-window, edge zero-padding and standardization operations;

S4, inputting the two-dimensional image data into the network for training to obtain a trained network model, and classifying the state of the steam-water separation reheating system from the state data input into the network model, so as to diagnose the system state.

Further, in step S1, designing the network structure by using the artificial bee colony heuristic optimization algorithm comprises the following steps:

taking the number of residual blocks and the number of convolution kernels of each layer in the network as design variables;

taking the maximum fault diagnosis accuracy after the network is trained for 20 periods and the minimum total number of network parameters as target variables;

setting the training period to 100, and then optimizing automatically with the artificial bee colony algorithm to obtain a Pareto-optimal solution set of fault diagnosis accuracy and total number of network parameters, which corresponds to the optimal number of residual blocks and number of convolution kernels of each layer in the network, namely the optimal network structure.
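As an illustration only, the selection of a Pareto-optimal solution set from evaluated network structures can be sketched as follows. The candidate tuples, their accuracy values and their parameter counts are hypothetical numbers chosen for the example, and the helper names `dominates` and `pareto_front` are assumptions, not part of the disclosed method.

```python
from typing import List, Tuple

# Each candidate: (num_residual_blocks, kernels_per_layer, accuracy, parameter_count).
# All numbers are made-up illustrative values, not results from the invention.
Candidate = Tuple[int, int, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    acc_a, par_a = a[2], a[3]
    acc_b, par_b = b[2], b[3]
    no_worse = acc_a >= acc_b and par_a <= par_b
    strictly_better = acc_a > acc_b or par_a < par_b
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]


candidates = [
    (2, 16, 0.950, 120_000),
    (3, 32, 0.985, 450_000),
    (4, 32, 0.984, 610_000),    # dominated by (3, 32): lower accuracy, more parameters
    (4, 64, 0.991, 1_900_000),
    (2, 32, 0.940, 300_000),    # dominated by (2, 16)
]
front = pareto_front(candidates)
for blocks, kernels, acc, params in front:
    print(blocks, kernels, acc, params)
```

Every structure left in `front` represents a different trade-off between accuracy and parameter count; choosing one of them is a separate design decision.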

Further, in the step S2, the channel attention mechanism includes the steps of,

Sa1. compression, wherein the compression formula is:

$$z_c = \frac{1}{H \times W} \sum_{v=1}^{H} \sum_{w=1}^{W} u_c(v, w)$$

where $u_c(v, w)$ represents the pixel $(v, w)$ in the feature map of channel $c$; $z_c$ represents the $c$-th global feature point; and the input feature map size is $H \times W \times C$.

Sa2. activation, wherein the activation formula is:

$$s_c = \sigma(W_2 \, \delta(W_1 z_c))$$

where $\sigma$ represents the sigmoid activation function; $\delta$ represents the ReLU activation function; $W_1$ and $W_2$ represent the weights of the fully connected layers; and $s_c$ represents the activated output value;

Sa3. product, wherein the product formula is:

$$\tilde{x}_c = s_c \cdot u_c$$

where $\tilde{x}_c$ represents the weighted feature map.
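The three steps of compression, activation and product can be sketched with NumPy as follows. This is a minimal sketch rather than the implementation of the invention: the reduction ratio `r`, the feature map size and the randomly initialized matrices `w1` and `w2` (stand-ins for trained fully connected weights) are assumptions made for the example.

```python
import numpy as np


def channel_attention(u, w1, w2):
    """Apply channel attention to a feature map u of shape (H, W, C)."""
    # Compression: average over the H x W plane, one value z_c per channel.
    z = u.mean(axis=(0, 1))                       # shape (C,)
    # Activation: s = sigmoid(W2 @ relu(W1 @ z)).
    hidden = np.maximum(w1 @ z, 0.0)              # ReLU, shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid, shape (C,)
    # Product: scale every channel of u by its weight s_c.
    return u * s, s


rng = np.random.default_rng(0)
H, W, C, r = 8, 8, 16, 4                          # r (reduction ratio) is an assumption
u = rng.standard_normal((H, W, C))
w1 = rng.standard_normal((C // r, C)) * 0.1       # random stand-ins for learned weights
w2 = rng.standard_normal((C, C // r)) * 0.1
weighted, s = channel_attention(u, w1, w2)
print(weighted.shape, s.shape)                    # (8, 8, 16) (16,)
```

Because each $s_c$ lies in (0, 1), the product step can only attenuate a channel relative to the others; it never changes the spatial layout within a channel.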

Further, in the step S2, the spatial attention mechanism includes the steps of,

Sb1. pooling operation: channel-wise global max pooling and global average pooling are applied to the input feature map of size H × W × C to obtain two feature planes of size H × W × 1, which are then concatenated along the channel axis to give a feature map of size H × W × 2;

Sb2. dimension reduction operation: the H × W × 2 feature map is reduced to a feature map of size H × W × 1 by a 7 × 7 convolution kernel;

Sb3. activation operation: a sigmoid activation function generates weights for the different positions, so that the model can distinguish the importance of different positions;

Sb4. multiplication operation: the position weights learned in the previous step are applied to the initial feature map, so that the model can discriminate between features at different positions;

Sb5. the spatial attention mechanism can be described as:

$$M_s(F) = \sigma\left(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\right)$$

where $\sigma$ represents the sigmoid activation function; $f^{7 \times 7}$ represents a 7 × 7 convolution operation; and AvgPool(F) and MaxPool(F) denote global average pooling and global max pooling, respectively.
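The pooling, dimension reduction, activation and multiplication steps can be sketched with NumPy as follows. The sketch assumes zero padding of 3 pixels so that the 7 × 7 convolution preserves the H × W size, computes the convolution as cross-correlation (as deep learning frameworks usually do), and uses a random `kernel` as a stand-in for trained weights; none of these details are stated in the text above.

```python
import numpy as np


def spatial_attention(f, kernel):
    """f: feature map of shape (H, W, C); kernel: weights of shape (7, 7, 2)."""
    # Pooling along the channel axis gives two H x W planes, stacked to H x W x 2
    # in the order [AvgPool; MaxPool].
    pooled = np.stack([f.mean(axis=2), f.max(axis=2)], axis=2)
    # Dimension reduction: 7 x 7 convolution with zero padding keeps H x W and
    # merges the two planes into one.
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(pooled, ((pad, pad), (pad, pad), (0, 0)))
    h, w = f.shape[:2]
    conv = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            conv[i, j] = np.sum(padded[i:i + k, j:j + k, :] * kernel)
    # Activation: sigmoid turns the response into a weight in (0, 1) per position.
    m = 1.0 / (1.0 + np.exp(-conv))
    # Multiplication: apply the position weights to every channel of the input.
    return f * m[:, :, None], m


rng = np.random.default_rng(0)
f = rng.standard_normal((10, 12, 4))
kernel = rng.standard_normal((7, 7, 2)) * 0.1     # random stand-in for learned weights
weighted, m = spatial_attention(f, kernel)
print(weighted.shape, m.shape)                    # (10, 12, 4) (10, 12)
```

The explicit double loop is kept for readability; a framework convolution layer would replace it in practice.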

Further, two-dimensional image data of a fixed time length are acquired by a sliding window method: a sliding window is moved along the time axis of the steam-water separation reheating system state data, so that the time-series data are converted into time-by-variable two-dimensional image data on which convolution can be performed.
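The sliding window operation can be illustrated with a short NumPy sketch. The window length, the stride and the toy data below are assumptions chosen for the example; the text does not specify them.

```python
import numpy as np


def sliding_windows(series, window, stride):
    """Cut a (T, V) series of T time steps and V state variables into
    an array of shape (num_windows, window, V)."""
    starts = range(0, series.shape[0] - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])


# Toy data: 10 time steps of 3 state variables.
series = np.arange(10 * 3, dtype=float).reshape(10, 3)
images = sliding_windows(series, window=4, stride=2)
print(images.shape)   # (4, 4, 3)
```

Each slice `images[i]` is one time-by-variable image; a stride smaller than the window length makes consecutive images overlap.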

Further, the edge zero padding comprises padding the edges of the time-by-variable two-dimensional image data with pixels of value 0, so as to obtain time-by-variable two-dimensional image data with a fixed variable length, which are finally stored as two-dimensional image data with a resolution of 600 × 600; the data are thus stored as input data in a uniform format, which facilitates the fault diagnosis operation.
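A minimal NumPy sketch of edge zero padding to 600 × 600 follows. Centring the original data inside the padded image is an assumption of this sketch; the text only states that pixels of value 0 are added at the edges. The function name `pad_to_square` and the 600 × 40 example image are likewise hypothetical.

```python
import numpy as np


def pad_to_square(image, size=600):
    """Zero-pad a (time, variable) image to size x size, centring the data."""
    rows, cols = image.shape
    if rows > size or cols > size:
        raise ValueError("image is larger than the target size")
    top = (size - rows) // 2
    left = (size - cols) // 2
    # np.pad fills with zeros by default (mode="constant").
    return np.pad(image, ((top, size - rows - top), (left, size - cols - left)))


image = np.ones((600, 40))             # e.g. 600 time steps of 40 variables
padded = pad_to_square(image)
print(padded.shape)                    # (600, 600)
```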

Further, the standardization processing comprises standardizing the two-dimensional image data obtained by the edge zero-padding operation, so as to remove the influence of the different dimensions (units) of the variables;

the standardization comprises unfolding the input image data along the variable dimension, calculating the mean μ and standard deviation σ of each variable over all time points, subtracting the mean from the original input data X, and dividing by the standard deviation to obtain the standardized data X*:

$$X^* = \frac{X - \mu}{\sigma}$$

where μ is the mean of all sample data and σ is the standard deviation of all sample data.
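The standardization can be sketched per variable with NumPy as follows. The small constant `eps`, which guards against division by zero for a constant variable, and the toy data are assumptions of this sketch and are not mentioned in the text.

```python
import numpy as np


def standardize(x, eps=1e-8):
    """Standardize each variable (column) of a (T, V) array over all time points."""
    mu = x.mean(axis=0)          # mean of each variable
    sigma = x.std(axis=0)        # standard deviation of each variable
    return (x - mu) / (sigma + eps)


rng = np.random.default_rng(0)
# Three variables with very different scales, e.g. pressure, temperature, flow.
x = rng.normal(loc=[5.0, 300.0, 0.1], scale=[1.0, 20.0, 0.01], size=(1000, 3))
x_std = standardize(x)
print(x_std.mean(axis=0).round(6), x_std.std(axis=0).round(6))
```

After this step every variable has approximately zero mean and unit standard deviation, so no variable dominates the convolution purely because of its unit.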

Advantageous effects: compared with the prior art, the invention has the following advantages:

1. The method can effectively diagnose faults from input data in which time and variables are coupled and which contain noise interference, while maintaining high accuracy, and can be applied to fault diagnosis of a nuclear power steam-water separation reheating system. The network structure is optimized with the artificial bee colony heuristic algorithm, which yields network structures in the Pareto-optimal solution set with higher fault diagnosis accuracy and fewer network parameters, so that the network performance is exploited as fully as possible and the highest possible fault diagnosis accuracy is obtained with limited computing resources.

2. A channel attention mechanism and a spatial attention mechanism are added between the residual blocks: the channel attention mechanism accelerates the extraction of effective channel features, and the spatial attention mechanism accelerates the extraction of effective spatial features of the input image data.

3. The data preprocessing of sliding window, edge zero padding and standardization produces two-dimensional image input data in a uniform format and makes it easy to convert actual steam-water separation reheating system data into model input.

Drawings

FIG. 1 is an overall flow chart of the present invention;

FIG. 2 is a schematic flow chart of an artificial bee colony heuristic optimization algorithm in the present invention;

FIG. 3 is a schematic diagram of a channel attention mechanism of the present invention;

FIG. 4 is a schematic diagram of the spatial attention mechanism of the present invention;

FIG. 5 is a schematic view of a sliding window according to the present invention;

FIG. 6 is a schematic diagram of edge zero padding according to the present invention;

FIG. 7 is a diagram of the results of model fault diagnosis in the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be more clearly and completely described below with reference to the accompanying drawings in the examples of the present invention.

Fig. 1 shows the flow chart of the fault diagnosis method provided herein based on an attention mechanism and a deep residual network, referred to as A-DRSN for short. The algorithm design mainly includes the following steps:

1. Designing the network structure using the artificial bee colony heuristic optimization algorithm

The training period is set to 100, and the artificial bee colony algorithm then optimizes automatically to obtain a Pareto-optimal solution set of fault diagnosis accuracy and total number of network parameters, which corresponds to the optimal number of residual blocks and number of convolution kernels of each layer in the network, namely the optimal network structure.
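For orientation, a simplified artificial bee colony loop with employed, onlooker and scout phases can be sketched as follows. It is not the optimizer of the invention: it minimizes a single scalar cost instead of producing a Pareto set, and `toy_cost` is a hypothetical stand-in for training a network and measuring its accuracy and parameter count. The bounds, colony size and abandonment `limit` are likewise assumptions.

```python
import random


def abc_minimize(cost, bounds, n_sources=10, limit=5, iterations=50, seed=0):
    """Simplified artificial bee colony search over integer design variables."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.randint(lo, hi) for lo, hi in bounds]

    def neighbour(i):
        # Move one coordinate of source i relative to another random source k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        step = rng.uniform(-1, 1) * (sources[i][d] - sources[k][d])
        lo, hi = bounds[d]
        cand[d] = min(hi, max(lo, round(cand[d] + step)))
        return cand

    def try_improve(i):
        # Greedy selection: keep the neighbour only if it lowers the cost.
        cand = neighbour(i)
        c = cost(cand)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources
    best, best_cost = min(zip(sources, costs), key=lambda p: p[1])
    best = best[:]

    for _ in range(iterations):
        for i in range(n_sources):                     # employed bees
            try_improve(i)
        fitness = [1.0 / (1.0 + c) for c in costs]     # requires cost >= 0
        for _ in range(n_sources):                     # onlooker bees
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_improve(i)
        for i in range(n_sources):                     # scout bees
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = cost(sources[i])
                trials[i] = 0
        i = min(range(n_sources), key=costs.__getitem__)
        if costs[i] < best_cost:
            best, best_cost = sources[i][:], costs[i]
    return best, best_cost


def toy_cost(x):
    # Hypothetical surrogate with its minimum at 3 residual blocks, 32 kernels.
    return (x[0] - 3) ** 2 + ((x[1] - 32) / 8) ** 2


bounds = [(1, 8), (8, 128)]    # (residual blocks, kernels per layer), assumed ranges
best, best_cost = abc_minimize(toy_cost, bounds)
print(best, best_cost)
```

In the setting described above, each cost evaluation would correspond to training a candidate network, which is why the number of food sources and iterations has to stay small.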

2. Adding a channel attention mechanism and a spatial attention mechanism between two residual blocks and connecting them in series. The channel attention mechanism identifies the most important of the different convolution kernels through three steps of compression, activation and multiplication, which speeds up training and improves model accuracy; the spatial attention mechanism identifies the most important pixels of the feature map produced by a given convolution kernel through four steps of pooling, dimensionality reduction, activation and multiplication, which accelerates the extraction of important spatial features, speeds up network training and improves model accuracy.

2.1 Channel attention mechanism

The channel attention mechanism is shown in fig. 3 and is divided into three steps of compression, activation and multiplication. The compression operation can be described as:

$$z_c = \frac{1}{H \times W} \sum_{v=1}^{H} \sum_{w=1}^{W} u_c(v, w)$$

The activation operation can be described as:

$$s_c = \sigma(W_2 \, \delta(W_1 z_c))$$

where $\sigma$ represents the sigmoid activation function; $\delta$ represents the ReLU activation function; $W_1$ and $W_2$ represent the weights of the fully connected layers; and $s_c$ represents the activated output value.

The product operation can be described as:

$$\tilde{x}_c = s_c \cdot u_c$$

where $\tilde{x}_c$ represents the weighted feature map.

2.2 The spatial attention mechanism is shown in fig. 4 and is divided into four steps of pooling, dimensionality reduction, activation and multiplication.

The first step is the pooling operation: channel-wise global max pooling and global average pooling are applied to the input feature map of size H × W × C to obtain two feature planes of size H × W × 1, which are then concatenated along the channel axis to give a feature map of size H × W × 2.

The second step is the dimensionality reduction operation: the H × W × 2 feature map is reduced to a feature map of size H × W × 1 by a 7 × 7 convolution kernel.

The third step is the activation operation: a sigmoid activation function generates weights for the different positions, so that the model can distinguish the importance of different positions.

The fourth step is the multiplication operation: the position weights learned in the previous step are applied to the initial feature map, so that the model can discriminate between features at different positions.

Finally, the spatial attention mechanism can be described as:

$$M_s(F) = \sigma\left(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\right)$$

3. Preprocessing the data by sliding-window, edge zero-padding and standardization operations.

3.1 Two-dimensional image data of a fixed time length are acquired by the sliding window method shown in fig. 5: a sliding window is moved along the time axis of the steam-water separation reheating system state data, so that the time-series data are converted into time-by-variable two-dimensional image data on which convolution can be performed.

3.2 The edge zero-padding operation changes the input data size to a unified 600 × 600 format, as shown in fig. 6: the edges of the time-by-variable two-dimensional image data are padded with pixels of value 0 to obtain time-by-variable two-dimensional image data with a fixed variable length, which are finally stored as two-dimensional image data with a resolution of 600 × 600, so that the data are stored as input data in a uniform format, which facilitates the fault diagnosis operation.

3.3 The data are standardized to eliminate the influence of dimensions (units) and accelerate network convergence. The standardization operation is:

$$X^* = \frac{X - \mu}{\sigma}$$

where μ is the mean of all sample data and σ is the standard deviation of all sample data.

4. The preprocessed data are used as input training data for the deep residual network with the attention mechanism. The network model, with the structure obtained by artificial bee colony optimization and the added attention mechanism, is built using PyTorch; the preprocessed training data are input into the network model and its parameters are trained. The fault diagnosis accuracy and the loss value are observed during training to prevent overfitting caused by too many training iterations, and training is stopped when the accuracy on the training set and on the test set meets the requirements.

During training, the network parameters are updated with the Adam optimization algorithm, and a softmax classifier outputs the fault diagnosis result.
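What the softmax classifier and a single Adam update compute can be sketched with NumPy as follows. The hyperparameters `lr`, `beta1`, `beta2` and `eps` are the commonly used Adam defaults and are an assumption here, as are the example logits and gradient; the text does not give the values used.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of parameters theta; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v


# Network outputs (logits) for two samples and three system states.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
probs = softmax(logits)
predicted_state = probs.argmax(axis=1)
print(predicted_state)                            # [0 2]

theta = np.array([1.0, -2.0])
grad = np.array([0.5, -0.25])
theta_new, m, v = adam_step(theta, grad, np.zeros(2), np.zeros(2), t=1)
print(theta_new)
```

On the first step the bias-corrected update has magnitude close to `lr` in every coordinate, regardless of the gradient scale, which is one reason Adam is robust to differently scaled parameters.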

5. Actual state data of the steam-water separation reheating system are input into the trained network model to check the performance of the algorithm.

The attention-based deep network fault diagnosis model achieves high accuracy on faults of the steam-water separation reheating system: accuracy exceeds 99% on normal noise-free data, and the method also identifies faults well in data containing noise signals.

As can be seen from fig. 7, under low noise, the deep residual network DRSN without an attention mechanism, the conventional convolutional neural network VGG and the multilayer perceptron MLP all achieve good accuracy; however, as the noise in the data increases, the accuracy of these other methods drops significantly, whereas the method provided by the invention maintains an accuracy of 90% or more. It can therefore be concluded that the method provided by the invention has very good diagnostic accuracy for high-noise data.
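One possible way to produce test data with increasing noise is sketched below. The text does not state how the noise was generated, so additive white Gaussian noise at a target signal-to-noise ratio is purely an assumption of this sketch, as are the function name `add_noise` and the sine test signal.

```python
import numpy as np


def add_noise(x, snr_db, rng):
    """Add white Gaussian noise so that the signal-to-noise ratio is snr_db (in dB)."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)


rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 200 * np.pi, 100_000))
noisy = add_noise(clean, snr_db=10.0, rng=rng)

# Check the ratio actually obtained on this sample.
measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print(round(measured, 2))
```

Sweeping `snr_db` from high to low values would give a series of test sets on which the accuracy of different models can be compared.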

The above embodiments merely illustrate the technical concept and structural features of the present invention and are intended to enable those skilled in the art to implement it; the present invention is not limited thereto, and any equivalent changes or modifications made according to the spirit of the present invention shall fall within the scope of the present invention.

Claims

1. A system fault diagnosis method based on an attention mechanism and a deep residual network, characterized by comprising the following steps:

S2, adding a channel attention mechanism and a spatial attention mechanism between adjacent residual blocks, so as to improve network training convergence speed and fault diagnosis accuracy;

2. The system fault diagnosis method based on an attention mechanism and a deep residual network according to claim 1, characterized in that: in step S1, designing the network structure by using the artificial bee colony heuristic optimization algorithm comprises the following steps,

3. The system fault diagnosis method based on an attention mechanism and a deep residual network according to claim 1 or 2, characterized in that: in step S2, the channel attention mechanism comprises the following steps:

Sa1. compression, wherein the compression formula is:

$$z_c = \frac{1}{H \times W} \sum_{v=1}^{H} \sum_{w=1}^{W} u_c(v, w)$$

Sa2. activation, wherein the activation formula is:

$$s_c = \sigma(W_2 \, \delta(W_1 z_c))$$

Sa3. product, wherein the product formula is:

$$\tilde{x}_c = s_c \cdot u_c$$

where $\tilde{x}_c$ represents the weighted feature map.

4. The system fault diagnosis method based on an attention mechanism and a deep residual network according to claim 3, characterized in that: in step S2, the spatial attention mechanism comprises the following steps:

Sb5. the spatial attention mechanism can be described as:

$$M_s(F) = \sigma\left(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\right)$$

5. The system fault diagnosis method based on an attention mechanism and a deep residual network according to claim 4, characterized in that: two-dimensional image data of a fixed time length are acquired by a sliding window method, and a sliding window is moved along the time axis of the steam-water separation reheating system state data, so that the time-series data are converted into time-by-variable two-dimensional image data on which convolution can be performed.

6. The system fault diagnosis method based on an attention mechanism and a deep residual network according to claim 5, characterized in that: the edge zero padding comprises padding the edges of the time-by-variable two-dimensional image data with pixels of value 0, so as to obtain time-by-variable two-dimensional image data with a fixed variable length, which are finally stored as two-dimensional image data with a resolution of 600 × 600, so that the data are stored as input data in a uniform format, which facilitates the fault diagnosis operation.

7. The system fault diagnosis method based on an attention mechanism and a deep residual network according to claim 6, characterized in that: the standardization processing comprises standardizing the two-dimensional image data obtained by the edge zero-padding operation, so as to remove the influence of the different dimensions (units) of the variables;