CN112418005A

CN112418005A - Smoke multi-classification identification method based on backward radiation attention pyramid network

Info

Publication number: CN112418005A
Application number: CN202011226816.8A
Authority: CN
Inventors: 顾锞; 张永慧; 乔俊飞; 李泽东
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2020-11-06
Filing date: 2020-11-06
Publication date: 2021-02-26
Anticipated expiration: 2040-11-06
Also published as: CN112418005B

Abstract

The invention discloses a smoke multi-classification identification method based on a backward radiation attention pyramid network, used to judge the working state of a vent flare. First, the network is built from three pyramid blocks connected in series, consisting of 3, 4 and 5 basic convolution modules respectively. Then, an attention mechanism is introduced into each pyramid block for feature filtering. Finally, the feedforward outputs of all pyramid blocks are connected through backward radiation, so that low-, middle- and high-level features are fused comprehensively. The method belongs to the fields of atmospheric environment protection and machine learning.

Description

Smoke multi-classification identification method based on backward radiation attention pyramid network

Technical Field

The invention relates to a method for identifying the working state of a thermal power generation system, in particular to a smoke multi-classification identification method based on a backward radiation attention pyramid network, and belongs to the field of atmospheric environment protection and the field of machine learning.

Background

The discovery and widespread use of electricity resulted from the second and third technological revolutions and brought great convenience to human life. Common power generation modes include thermal, solar, nuclear and wind power generation. However, solar power generation is too costly for large-scale use; nuclear power generation can easily cause irreversible pollution to the environment; and wind power generation is limited by wind conditions and terrain. Thermal power therefore remains the mainstream mode of power generation.

To better maintain the safety of life and production and to protect the ecological environment, the waste gas generated by a thermal power system is fully combusted and then discharged into the atmosphere through a vent flare. Under normal operating conditions, the vent flare of a thermal power generation system should emit white smoke (i.e., water vapor). However, when the vent flare discharges black smoke (i.e., carbon black) or colorless smoke (i.e., toxic gas) beyond the limits specified for the thermal power generation industry, the thermal power system is operating abnormally, which causes serious safety risks and environmental problems. In particular, black smoke is caused by incomplete combustion of exhaust gas and is liable to cause severe damage to public health and the ecological environment. Colorless smoke most likely means that toxic exhaust gas is being emitted directly into the air without treatment, which often results in serious air pollution. A well-designed smoke recognition model is therefore needed to monitor the working state of the thermal power generation system and thereby protect human health and the ecological environment. In recent years, deep learning has developed rapidly and has been applied successfully to many important tasks. Researchers widely accept that features extracted by deep convolutional networks have better representational capability than traditional hand-crafted features.

CN201910319875.0 discloses an intelligent smoke and fire recognition and monitoring method for transformer substations based on deep learning. The method uses an optimized YOLO v3 framework as its basis and builds and trains an image recognition model on an image data set. It also trains a pseudo three-dimensional convolutional residual network on video data to build a video recognition model. The extracted video stream and images are preprocessed and sent to the image recognition model; when smoke is detected, the video recognition model is called automatically to recheck the detection result.

CN201611116917.3 discloses a flame/smoke recognition method based on RGB reconstruction after forest image cropping. It uses the differences among the RGB component values of the flame, smoke and background areas to eliminate the background and progressively shrink the area to be examined. After processing with a modified grayscale formula that enhances flame characteristics, followed by filtering, cropping, erosion and dilation, suspected smoke or flame areas are judged according to the proportion of non-negative elements. The method is applied to flame/smoke identification in forest fire prevention.

Although deep learning has achieved significant success in recognition tasks, research on multi-class smoke recognition remains quite limited. A new model is therefore urgently needed to solve the smoke multi-classification identification problem. With this in mind, the invention studies the multi-classification smoke identification problem for the first time and develops a new smoke multi-classification network based on the backward radiation attention pyramid, which makes full use of the visual features of smoke at different levels and thereby improves classification accuracy.

Disclosure of Invention

The invention studies the working state of a thermal power generation system and designs a smoke multi-classification identification method based on a backward radiation attention pyramid network (IRAP-Net). It addresses the serious problem that, when a thermal power generation system operates abnormally, waste gas is incompletely combusted or left untreated, so that large amounts of black smoke or colorless toxic gas are discharged into the atmosphere, causing severe explosions and atmospheric pollution. The method takes real-time monitoring images of vent flare smoke, acquired by the monitoring camera of the thermal power plant, as the data to be detected. A deep learning framework based on a pyramid network is built and trained on an image data set, and the trained network classifies the smoke in each real-time monitoring image, thereby completing real-time smoke monitoring of the thermal power plant.

The invention adopts the technical scheme that the smoke multi-classification identification method based on the backward radiation attention pyramid network is mainly established through the following steps:

Step 1, obtaining an image to be detected from the real-time smoke monitoring video.

For a video clip obtained by an industrial camera, image data is obtained by extracting video frames and cropping them, and is used as the input of the model.

Step 1.1, extracting video frames: the current frame is extracted every 1.5 s, and the rest of the video is discarded;

Step 1.2, cropping the smoke image: each frame extracted from the video is cropped at a predetermined position into a 48 × 48 smoke image block, and the other image areas are discarded;
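The frame extraction and cropping of step 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the frame-rate argument and the crop origin are hypothetical (the source says only that the cropping position is specified in advance), and a frame is modelled as a nested list of pixel rows rather than a decoded video frame.

```python
def sample_frame_indices(total_frames, fps, interval_s=1.5):
    """Return indices of the frames kept when one frame is taken every interval_s seconds."""
    step = max(1, round(fps * interval_s))  # number of frames between two kept frames
    return list(range(0, total_frames, step))


def crop_block(frame, top, left, size=48):
    """Cut a size x size block out of a frame given as a list of pixel rows."""
    if top < 0 or left < 0 or top + size > len(frame) or left + size > len(frame[0]):
        raise ValueError("crop window falls outside the frame")
    return [row[left:left + size] for row in frame[top:top + size]]


# Example: a 10 s clip at 30 fps keeps every 45th frame.
indices = sample_frame_indices(total_frames=300, fps=30)
frame = [[(0, 0, 0)] * 640 for _ in range(480)]  # dummy 480 x 640 RGB frame
block = crop_block(frame, top=100, left=200)      # hypothetical crop origin
```

All frames whose index is not returned by `sample_frame_indices` are simply skipped, which corresponds to discarding the rest of the video.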

Step 2, building a pyramid network based on backward radiation attention.

A pyramid learning network based on backward radiation attention is constructed. The network consists of three pyramid blocks connected in series, composed of 3, 4 and 5 basic convolution modules respectively. An attention mechanism module is introduced in each pyramid block for feature filtering. Finally, the feedforward outputs of all pyramid blocks are connected through backward radiation, so that low-, middle- and high-level features are fused comprehensively. The input of the network is the image block obtained in step 1, and its output is the smoke monitoring data.

Step 2.1, a basic convolution module consisting of two convolution layers, a normalization layer and an activation layer is used for network preprocessing. The input smoke image of dimension (48 × 48 × 3) passes through the preprocessing stage, which consists of the basic convolution module and a max pooling layer, and the output has dimension (24 × 24 × 16), so that the features are pre-adjusted;

Step 2.2, 3 groups of complex convolution modules consisting of 2 unit convolution modules are used as the first-level pyramid block to extract low-level features. The input is the (24 × 24 × 16) result of network preprocessing, and the output, after feature stacking, is the (24 × 24 × 96) low-level features;

Step 2.3, enhancing the relevant features with an attention mechanism. First, considering the signal of each feature map output by the pyramid block, the weights are initialized by global average pooling:

$$\omega_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} f_c^n(i, j)$$

where $\omega_c$ is the initialization result of the global average pooling weight; $f_c^n$ denotes the feature map of the n-th smoke image in $f_c$; $f_c$ denotes the feature maps of the c-th group of smoke images; $H$ and $W$ are the height and width of the feature map $f_c^n$; and $i$ and $j$ are the height and width coordinates in $f_c^n$. Then, a two-layer neural network consisting of two fully connected layers and two activation layers is introduced to learn and update the initialized weights, giving the optimal weights:

$$\hat{\omega}_c = \sigma_{Sigmoid}\left(W_2 \, \sigma_{ReLU}\left(W_1 \, \omega_c\right)\right)$$

where $\hat{\omega}_c$ is the optimal weight of the smoke image multi-classification network; $W_1$ and $W_2$ are the weights of the two fully connected layers; $\sigma_{ReLU}$ and $\sigma_{Sigmoid}$ are the ReLU and Sigmoid functions used in the activation layers; and $\omega_c$ is the initialization weight corresponding to the feature maps of the c-th group of smoke images. Finally, the original feature map is added to the associated enhanced feature map to generate a new feature map. The input is the (24 × 24 × 96) low-level features, and the output is the weighted (24 × 24 × 96) low-level features;

Step 2.4, 4 groups of complex convolution modules consisting of 3 unit convolution modules are used as the second-level pyramid block to extract middle-level features. The input is the weighted (24 × 24 × 96) low-level features, and the output, after feature stacking, is the (24 × 24 × 160) middle-level features;

Step 2.5, same as step 2.3: the input is the (24 × 24 × 160) middle-level features, and the output is the weighted (24 × 24 × 160) middle-level features;

Step 2.6, 5 groups of complex convolution modules consisting of 4 unit convolution modules are used as the third-level pyramid block to extract high-level features. The input is the weighted (24 × 24 × 160) middle-level features, and the output, after feature stacking, is the (24 × 24 × 240) high-level features;

Step 2.7, same as step 2.3: the input is the (24 × 24 × 240) high-level features, and the output is the weighted (24 × 24 × 240) high-level features;

Step 2.8, the backbone of the pyramid network and the feedforward feature maps output by the three pyramid blocks are combined to form the backward radiation connection. First, a unit convolution module superposes the four groups of output feature maps, extracted from the backbone and the three pyramid blocks, to obtain a new group of feature maps. A global average pooling operation then reduces the new feature maps to a vector of size 128 × 1. Finally, a fully connected operation processes the vector to obtain the multi-classification smoke recognition result;
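The attention step (step 2.3, reused in steps 2.5 and 2.7) can be illustrated with a small pure-Python sketch. It assumes that the two-layer network computes Sigmoid(W2 · ReLU(W1 · ω)) from the pooled values, as the symbol definitions suggest, and that the final "addition" is the original map plus its reweighted copy. Biases are omitted, and the function name, the weights and the tiny two-channel input are made up for illustration.

```python
import math


def channel_attention(fmap, w1, w2):
    """fmap: list of C channels, each an H x W list of rows.
    w1: hidden x C matrix, w2: C x hidden matrix (fully connected weights)."""
    # Global average pooling: one initial weight per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    # First fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    # Second fully connected layer followed by Sigmoid: one gate in (0, 1) per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Add the original map to its reweighted copy (assumed reading of the text).
    out = [[[v + g * v for v in row] for row in ch] for ch, g in zip(fmap, gates)]
    return out, gates


fmap = [[[1.0, 1.0], [1.0, 1.0]],   # channel 0, mean 1.0
        [[0.0, 2.0], [2.0, 4.0]]]   # channel 1, mean 2.0
w1 = [[1.0, 1.0]]                   # 2 channels -> 1 hidden unit
w2 = [[0.0], [0.0]]                 # zero weights give gates of exactly 0.5
weighted, gates = channel_attention(fmap, w1, w2)
```

The output has the same shape as the input, which matches the source's statement that a (24 × 24 × 96) input yields a weighted (24 × 24 × 96) output.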

Step 3, training the pyramid network based on backward radiation attention.

Step 3.1, using the smoke image blocks obtained in step 1 as the training set, the backward radiation attention pyramid network built in step 2 is trained for smoke multi-classification. The momentum coefficient is set to 0.9, the initial learning rate to 0.001, and the learning rate decay coefficient to 0.00001. Training runs for 300 epochs with a mini-batch size of 16.
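How the training hyperparameters of step 3.1 interact can be sketched as follows. The source gives only the numeric values; this sketch assumes that the optimizer is stochastic gradient descent with momentum and that the decay coefficient is applied as a time-based decay, lr = lr0 / (1 + decay × step). Both are assumptions, and the function names and the gradient values are hypothetical.

```python
def decayed_lr(step, base_lr=0.001, decay=0.00001):
    """Time-based decay (assumed form): lr = base_lr / (1 + decay * step)."""
    return base_lr / (1.0 + decay * step)


def momentum_step(weights, grads, velocity, lr, momentum=0.9):
    """One SGD-with-momentum update; returns the new weights and velocity."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


EPOCHS, BATCH_SIZE = 300, 16  # values stated in the source

weights, velocity = [1.0, -2.0], [0.0, 0.0]
grads = [0.5, -0.5]  # made-up gradient for illustration
weights, velocity = momentum_step(weights, grads, velocity, lr=decayed_lr(0))
```

Under this assumed schedule the learning rate falls to half its initial value after 100,000 update steps.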

Step 4, performing multi-classification monitoring on the smoke images.

Video frames are extracted from the video obtained by the industrial camera, the images are cropped, features are extracted, and the smoke is classified.

Step 4.1, video frame extraction and image cropping are performed on the video to be tested according to step 1;

Step 4.2, the backward radiation attention pyramid network trained in step 3 is used to perform multi-classification evaluation of the smoke.

Compared with the prior art, the invention has the following advantages:

(1) Addressing the problem that abnormal working conditions of a thermal power system can easily cause severe damage to public health and the ecological environment, including explosions and atmospheric pollution, the invention proposes, for the first time, a carefully designed deep network for smoke multi-classification identification.

(2) By combining the pyramid body with an attention mechanism and the backward radiation connection, the backward radiation attention pyramid network effectively fuses features at various levels, from low to high, to realize visual smoke identification.

(3) The invention is thoroughly tested and compared with other algorithms on a large smoke image database. The results show that IRAP-Net achieves superior performance, clearly better than other advanced deep models.

Drawings

FIG. 1 is a schematic diagram of a backward radiation attention pyramid network design;

FIG. 2 is a table of the network parameters of the present invention.

Detailed Description

The present invention is described in detail below, and the embodiments and specific operations of the present invention are given in the present examples on the premise of the technical solution of the present invention, but the scope of the present invention is not limited to the following processes.

The implementation mode is shown in figures 1 and 2 and comprises the following steps:

Step S10, extracting frames from the video and cropping them to obtain image data;

Step S20, building a pyramid network based on backward radiation attention;

Step S30, training the pyramid network based on backward radiation attention;

Step S40, performing multi-classification monitoring on the smoke images.

Step S10 of the embodiment, obtaining image data by video extraction and cropping, further includes the following steps:

Step S100, extracting video frames: the current frame is extracted at a constant interval of 1.5 s, and the other video segments are discarded;

Step S110, cropping the smoke image: each frame is cropped into a 48 × 48 smoke image block, and the other image areas are discarded;

Step S20 of the embodiment, building the attention pyramid network, further includes the following steps:

Step S200, the network is preprocessed by a basic convolution module and a max pooling layer, where the basic convolution module consists of two groups of convolution layers, a normalization layer and an activation layer; this step pre-adjusts the features;

Step S210, 3 groups of complex convolution modules consisting of 2 unit convolution modules are used as the first-level pyramid block to extract low-level features;

Step S220, the weights are initialized by global average pooling, and a two-layer neural network consisting of two groups of fully connected layers and activation layers is introduced to learn and update the initialized weights; using these weights, the original feature map is added to the associated enhanced feature map to generate a new feature map;

Step S230, 4 groups of complex convolution modules consisting of 3 unit convolution modules are used as the second-level pyramid block to extract middle-level features;

Step S240, the weighted middle-level features are extracted in the same way as step 2.3;

Step S250, 5 groups of complex convolution modules consisting of 4 unit convolution modules are used as the third-level pyramid block to extract high-level features;

Step S260, the weighted high-level features are extracted in the same way as step 2.3;

Step S270, the backbone of the pyramid network and the feedforward feature maps output by the pyramid blocks are combined by the backward radiation connection. First, a unit convolution module superposes the four groups of output feature maps, extracted from the backbone and the three pyramid blocks, to obtain a new group of feature maps. A global average pooling operation then reduces the new feature maps to a vector of size 128 × 1. Finally, a fully connected operation processes the vector to obtain the multi-classification smoke identification result;
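The backward radiation connection of step S270 can be read as the following sketch. The source does not spell out how the unit convolution "superposes" the four groups of maps, so this sketch assumes channel-wise concatenation followed by a 1 × 1 convolution. The three-class output, the softmax, the function names and all weights are likewise illustrative assumptions, and the demo uses tiny channel counts instead of the 16, 96, 160 and 240 channels of the described network.

```python
import math


def fuse_and_classify(feature_groups, conv_w, fc_w):
    """feature_groups: list of maps, each a list of channels (H x W lists of rows).
    conv_w: out_ch x in_ch weights of a 1 x 1 convolution; fc_w: classes x out_ch."""
    channels = [ch for group in feature_groups for ch in group]  # concatenate channels
    h, w = len(channels[0]), len(channels[0][0])
    # 1 x 1 convolution: each output channel is a per-pixel mix of all input channels.
    fused = [[[sum(k * ch[i][j] for k, ch in zip(kernel, channels)) for j in range(w)]
              for i in range(h)] for kernel in conv_w]
    # Global average pooling turns each fused channel into one number.
    vector = [sum(map(sum, ch)) / (h * w) for ch in fused]
    # Fully connected layer plus softmax gives class probabilities.
    logits = [sum(a * v for a, v in zip(row, vector)) for row in fc_w]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    return vector, [e / sum(exps) for e in exps]


def const_map(value, channels, size=4):
    """Demo helper: `channels` maps of size x size filled with `value`."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


# Backbone output plus three pyramid-block outputs, with made-up values.
groups = [const_map(1.0, 1), const_map(2.0, 2), const_map(3.0, 1), const_map(4.0, 1)]
conv_w = [[1.0] * 5, [0.5] * 5]              # 5 input channels -> 2 fused channels
fc_w = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # 2 fused channels -> 3 classes
vector, probs = fuse_and_classify(groups, conv_w, fc_w)
```

In the described network the pooled vector would have 128 entries; the index of the largest probability would then be taken as the predicted smoke class.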

Step S30, training the pyramid network based on backward radiation attention, further includes the following steps:

Step S300, using the smoke image blocks obtained in step 1 as the training set, the backward radiation attention pyramid network built in step 2 is trained for smoke multi-classification. The momentum coefficient is set to 0.9, the initial learning rate to 0.001, and the learning rate decay coefficient to 0.00001. Training runs for 300 epochs with a mini-batch size of 16.

Step S40 of the embodiment, multi-classification monitoring, further includes the following steps:

Step S400, video frame extraction and image cropping are performed on the video to be tested according to step 1;

Step S410, the backward radiation attention pyramid network trained in step 3 is used to perform multi-classification evaluation of the smoke;

Step S420, three common evaluation criteria, namely accuracy, detection rate and false alarm rate, are used as indexes of network performance. The first criterion is Global Accuracy (GAR):

$$\mathrm{GAR} = \frac{T_+ + T_0 + T_-}{N_+ + N_0 + N_-}$$

where $N_+$, $N_0$ and $N_-$ are the numbers of positive, neutral and negative samples, and $T_+$, $T_0$ and $T_-$ are the numbers of correctly identified positive, neutral and negative samples. The second criterion is the Global Detection Rate (GDR), which is computed from $DR_+$, $DR_0$ and $DR_-$, the ratios of the number of correct results in the positive, neutral and negative samples to the total numbers of positive, neutral and negative samples, respectively:

$$DR_+ = \frac{T_+}{N_+}, \quad DR_0 = \frac{T_0}{N_0}, \quad DR_- = \frac{T_-}{N_-}$$

The third criterion is the global false alarm rate (GFAR), which is computed from $F_+$, $F_0$ and $F_-$: $F_+$ is the number of positive samples misclassified as neutral or negative samples; $F_0$ is the number of neutral samples misclassified as positive or negative samples; and $F_-$ is the number of negative samples misclassified as positive or neutral samples.
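The quantities behind these criteria can be computed from a three-class confusion matrix, as in the following minimal sketch. It covers only what the surviving definitions fix: the global accuracy, the per-class detection rates, and the per-class misclassification counts. The exact formulas that combine them into GDR and GFAR are not recoverable from the text, so they are not computed here. The function name, the class order and the counts are hypothetical.

```python
def evaluate(confusion):
    """confusion[i][j] = number of class-i samples predicted as class j,
    with classes ordered positive, neutral, negative."""
    totals = [sum(row) for row in confusion]                    # N+, N0, N-
    correct = [confusion[i][i] for i in range(len(confusion))]  # T+, T0, T-
    wrong = [n - t for n, t in zip(totals, correct)]            # F+, F0, F-
    gar = sum(correct) / sum(totals)                            # global accuracy
    rates = [t / n for t, n in zip(correct, totals)]            # DR+, DR0, DR-
    return {"GAR": gar, "DR": rates, "F": wrong}


# Made-up counts for illustration only, not results from the patent.
confusion = [[90, 6, 4],
             [5, 80, 15],
             [2, 8, 40]]
scores = evaluate(confusion)
```

Each misclassification count is obtained as the class total minus the correctly identified samples, which follows directly from the definitions of the F counts given above.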

The invention compares the performance with the existing Alex-Net [1], ZF-Net [2], VGG-Net [3], Google-Net [4], Res-Net [5], Xception [6], Dense-Net [7], DNCNN [8] and DCNN [9] networks, and gives the experimental results of the application of the invention.

Table 1 comparative test results of the invention

References

[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Proc. Adv. Neural Inf. Process. Syst. (NIPS), vol. 25, pp. 1097-1105, Dec. 2012.

[2] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," Proc. Eur. Conf. Comp. Vis. (ECCV), pp. 818-833, Sep. 2014.

[3] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, Sep. 2014.

[4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," IEEE Conf. Computer Vision & Pattern Recognition (CVPR), pp. 1-9, 2015.

[5] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," IEEE Conf. Computer Vision & Pattern Recognition (CVPR), pp. 770-778, Jun. 2016.

[6] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," IEEE Conf. Computer Vision & Pattern Recognition (CVPR), pp. 1800-1807, Nov. 2017.

[7] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," IEEE Conf. Computer Vision & Pattern Recognition (CVPR), pp. 4700-4708, Jul. 2017.

[8] Z. Yin, B. Wan, F. Yuan, X. Xia, and J. Shi, "A deep normalization and convolutional neural network for image smoke detection," IEEE Access, vol. 5, pp. 18429-18438, Aug. 2017.

[9] K. Gu, Z. Xia, J. Qiao, and W. Lin, "Deep dual-channel neural network for image-based smoke detection," IEEE Trans. Multimedia, vol. 22, no. 2, pp. 311-323, Feb. 2020.

Claims

1. A multi-classification smoke identification method based on a backward radiation attention pyramid network is characterized by comprising the following steps: the method is established by the following steps:

step 1, obtaining an image to be detected from a smoke real-time monitoring video;

for a video clip obtained by an industrial camera, image data is obtained by extracting a video frame and cutting an image and is used as the input of a model;

step 2, building a pyramid network based on the backward radiation attention;

building a pyramid learning network based on the backward radiation attention, wherein the network is composed of three pyramid modules connected in series, and the three pyramid layers are respectively composed of 3, 4 and 5 basic convolution modules; an attention mechanism module is introduced into each pyramid block for feature filtering; finally, all feedforward outputs of all pyramid modules are connected through backward radiation, and low, medium and high-level features are comprehensively fused; inputting the image block obtained in the step (1) and outputting smoke monitoring data;

step 3, training a pyramid network based on the backward radiation attention;

step 4, carrying out multi-classification monitoring on the smoke image;

2. The smoke multi-classification recognition method based on the backward radiation attention pyramid network as claimed in claim 1, wherein: in step 1, video frames are extracted, the current frame being extracted every 1.5 s and the rest of the video being directly discarded;

step 1.2, cropping the smoke image: each frame extracted from the video is cropped at a predetermined position into a 48 × 48 smoke image block, and the other image areas are discarded.

3. The smoke multi-classification recognition method based on the backward radiation attention pyramid network as claimed in claim 1, wherein: in step 2, step 2.1, a basic convolution module consisting of two sets of convolution layers, a normalization layer and an activation layer is used for network preprocessing; the input smoke image of dimension (48 × 48 × 3) passes through the preprocessing stage, which consists of the basic convolution module and a max pooling layer, and the output has dimension (24 × 24 × 16), so that the features are pre-adjusted;

step 2.2, 3 groups of complex convolution modules consisting of 2 unit convolution modules are used as the first-level pyramid block to extract low-level features; the input is the (24 × 24 × 16) result of network preprocessing, and the output, after feature stacking, is the (24 × 24 × 96) low-level features;

step 2.3, enhancing the relevant features with an attention mechanism; first, considering the signal of each feature map output by the pyramid block, the weights are initialized by global average pooling:

$$\omega_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} f_c^n(i, j)$$

where $\omega_c$ is the initialization result of the global average pooling weight; $H$ and $W$ are the height and width of the feature map $f_c^n$ of the smoke image; $i$ and $j$ are the height and width coordinates in $f_c^n$; then, a two-layer neural network consisting of two fully connected layers and two activation layers is introduced to learn and update the initialized weights, giving the optimal weights:

$$\hat{\omega}_c = \sigma_{Sigmoid}\left(W_2 \, \sigma_{ReLU}\left(W_1 \, \omega_c\right)\right)$$

where $\hat{\omega}_c$ is the optimal weight of the smoke image multi-classification network; $W_1$ and $W_2$ are the weights of the two fully connected layers; $\sigma_{ReLU}$ and $\sigma_{Sigmoid}$ are the ReLU and Sigmoid functions used in the activation layers; $\omega_c$ is the initialization weight corresponding to the feature maps of the c-th group of smoke images; finally, the original feature map is added to the associated enhanced feature map to generate a new feature map; the input is the (24 × 24 × 96) low-level features, and the output is the weighted (24 × 24 × 96) low-level features;

step 2.8, the backbone of the pyramid network and the feedforward feature maps output by the three pyramid blocks are combined to form the backward radiation connection; first, a unit convolution module superposes the four groups of output feature maps, extracted from the backbone and the three pyramid blocks, to obtain a new group of feature maps; then, a global average pooling operation reduces the new feature maps to a vector of size 128 × 1; finally, a fully connected operation processes the vector to obtain the multi-classification smoke identification result.

4. The smoke multi-classification recognition method based on the backward radiation attention pyramid network as claimed in claim 1, wherein: in step 3, step 3.1, the smoke image blocks obtained in step 1 are used as a training set, and smoke multi-classification training is carried out on the basis of the back radiation attention pyramid network formed in step 2 and step 3.

5. The smoke multi-classification recognition method based on the backward radiation attention pyramid network as claimed in claim 1, wherein: step 4, step 4.1, video frame extraction and image cutting are carried out on the video to be tested according to the step 1;