CN114092458B - Automatic detection method for thick and light smoke of engine based on improved NanoDet depth network


Info

Publication number
CN114092458B
CN114092458B (application CN202111428973.1A)
Authority
CN
China
Prior art keywords
smoke
nanodet
engine
improved
detection
Prior art date
Legal status
Active
Application number
CN202111428973.1A
Other languages
Chinese (zh)
Other versions
CN114092458A (en)
Inventor
王静静
张聪
张新曼
罗智元
陈冕
程昭晖
赵红超
贾士凡
王书琴
毛乙舒
陆罩
Current Assignee
Xian Jiaotong University
AECC Sichuan Gas Turbine Research Institute
Original Assignee
Xian Jiaotong University
AECC Sichuan Gas Turbine Research Institute
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University, AECC Sichuan Gas Turbine Research Institute filed Critical Xian Jiaotong University
Priority to CN202111428973.1A priority Critical patent/CN114092458B/en
Publication of CN114092458A publication Critical patent/CN114092458A/en
Application granted granted Critical
Publication of CN114092458B publication Critical patent/CN114092458B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06T 7/0002 — Image analysis; inspection of images, e.g. flaw detection
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods
    • G06T 2207/10004 — Still image; photographic image
    • G06T 2207/10016 — Video; image sequence
    • G06T 2207/20021 — Dividing image into blocks, subimages or windows
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/20132 — Image segmentation details; image cropping


Abstract

An automatic detection method for dense and light engine smoke based on an improved NanoDet depth network collects images of engine smoke to form a data set. The improved NanoDet depth network uses only the C5 feature layer while preserving the same receptive field, which reduces the number of network parameters; bounding-box prediction is performed for every pixel of the feature layer; positive and negative samples are screened with the adaptive training sample selection (ATSS) algorithm; and the detection head consists of a classification branch, a box regression branch and an implicit unsupervised objectness prediction sub-branch, which improves detection accuracy. Whether the engine produces smoke, and the type of smoke, is judged from the network output. If smoke is detected, its type is determined from the chromaticity difference between the smoke region and the background region: for light smoke an alarm is raised immediately, preventing it from developing into dense smoke; for dense smoke, emergency measures are started automatically in addition to the alarm. If no smoke is detected, detection continues. The method can detect dense and light smoke from both automobile and aircraft engines.

Description

Automatic detection method for thick and light smoke of engine based on improved NanoDet depth network
Technical Field
The invention belongs to the technical field of automatic control, relates to automatic detection of engine smoke, and in particular relates to an automatic detection method for dense and light engine smoke based on an improved NanoDet depth network.
Background
To meet demands such as high efficiency and small volume, engine structures have become increasingly complex, and engines operate for long periods under severe conditions of high temperature, high pressure and high rotating speed, so the probability of faults increases. On the one hand, smoke is one of the important visual features appearing in the early stage of abnormalities such as fire and explosion, so accurate, real-time smoke detection on engine test videos in various environments is an important means of reducing economic loss and guaranteeing user safety. On the other hand, with the birth and development of deep learning, video understanding and intelligent analysis have been studied widely and deeply, providing an effective implementation means for smoke detection and effectively solving the problem of smoke detection in complex environments with different angles, illumination and backgrounds.
At present, smoke detection falls mainly into three directions: smoke sensors, traditional hand-crafted feature extraction, and deep learning. Sensors are cheap and easy to install, but a smoke sensor requires an airtight space and is oxidized by large-area contact with air, lowering its sensitivity; traditional algorithms can become unstable as the color, texture and shape characteristics of smoke images change with factors such as lighting conditions.
Disclosure of Invention
To overcome the defects of the prior art, the invention aims to provide an automatic detection method for dense and light engine smoke based on an improved NanoDet depth network. It intelligently analyzes video with the improved NanoDet depth network, effectively detects smoke generated by the engine and raises an alarm, providing automatic smoke-detection support for engine tests, realizing smoke detection and event alarms on test monitoring video, and freeing human resources. The method has a wide application range, accurate and reliable detection results, and high robustness.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
an automatic detection method for dense and light engine smoke based on an improved NanoDet depth network comprises the following steps:
step 1), acquiring an engine smoke picture to form a data set;
step 2), labeling dense smoke and light smoke in the data-set pictures, dividing them into a training set and a testing set, and inputting them into the improved NanoDet depth network for training: the training set is fed into the ShuffleNetV2 backbone network for convolution calculation, generating feature layers of different sizes; the C5 feature layer, obtained at a 32:1 downsampling rate, is input into an expansion encoder, namely a projection layer and four expansion residual blocks; bounding-box prediction of the target is then carried out for each pixel on the C5 feature layer processed by the expansion encoder; positive and negative samples are screened with the adaptive training sample selection (ATSS) algorithm; finally, the detection head computes the position regression loss and the classification loss;
and 3) taking the trained improved NanoDet depth network as a smoke detector to judge the thick smoke and the light smoke of the engine, wherein the method comprises the following steps:
collecting video frames and inputting them into the trained improved NanoDet depth network; after feature enhancement, feature conversion and normalization, judging from the output smoke-class probability and smoke-box position whether the engine produces smoke and over what range; calculating the mean chromaticity of the smoke region and of the surrounding background region and taking their difference: if the difference exceeds a set threshold the smoke is judged to be dense, otherwise light; if the probability of detecting engine smoke exceeds a preset threshold, alarming and taking different actions according to whether the smoke is dense or light; otherwise, continuing to monitor.
Preferably, in step 1), the acquired engine smoke picture is subjected to data enhancement to amplify the data set.
The data enhancement method can be as follows:
performing Mosaic data enhancement on the original data set, and randomly splicing four pictures to form one picture;
one or more operations of rotation, brightness adjustment, scaling, random occlusion, random clipping are performed on the original dataset.
The Mosaic data enhancement method can be as follows: take a batch of pictures from the data set and, each time, randomly extract four of them; after cropping, splice the four pictures into one output picture in left-to-right, top-to-bottom order, the output picture keeping the original size. The cropping method is: randomly generate cropping position parameters cut_x and cut_y, which give the width and height of the crop taken from the top-left of picture 1; since the output size is fixed, the widths and heights to be cropped from picture 2 (top-right), picture 3 (bottom-left) and picture 4 (bottom-right) are thereby determined. After the four pictures are cropped, the resulting sub-pictures 1, 2, 3, 4 are spliced in top-left, top-right, bottom-left, bottom-right order to form a new picture, and the bounding boxes of the original pictures are retained. The Mosaic enhancement process is repeated batch-size times to obtain a new batch of data.
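The cropping-and-splicing step just described can be sketched in NumPy. This is a simplified illustration, not the patent's exact procedure: each source picture is assumed to contribute the region at its own corner coordinates, and the bounding-box bookkeeping is omitted; `mosaic` is an illustrative name.

```python
import numpy as np

def mosaic(imgs, cut_x, cut_y):
    """Stitch four equally sized images into one output of the same size.
    The top-left crop (cut_y rows x cut_x cols) comes from imgs[0]; the
    remaining top-right, bottom-left and bottom-right regions come from
    imgs[1], imgs[2] and imgs[3] respectively."""
    out = np.empty_like(imgs[0])
    out[:cut_y, :cut_x] = imgs[0][:cut_y, :cut_x]   # top-left
    out[:cut_y, cut_x:] = imgs[1][:cut_y, cut_x:]   # top-right
    out[cut_y:, :cut_x] = imgs[2][cut_y:, :cut_x]   # bottom-left
    out[cut_y:, cut_x:] = imgs[3][cut_y:, cut_x:]   # bottom-right
    return out
```

Repeating this batch-size times with freshly drawn cut_x, cut_y yields a new batch, as the text describes.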
Preferably, in step 2), the smoke pictures in the data set are kept at a uniform aspect ratio, and dense and light smoke are labeled with rectangular boxes.
Preferably, the improved NanoDet depth network is made up of three parts: a ShuffleNetV2 backbone network, an expansion encoder, and a detection head. The basic unit of the ShuffleNetV2 backbone first splits the input channels; one part passes through a 1×1 convolution, a 3×3 depthwise convolution and a 1×1 convolution, the other part is left untouched, and finally the two parts are concatenated and the channels are shuffled. The ShuffleNetV2 backbone performs convolution on the input and downsamples to obtain feature layers of different scales; the C5 feature layer is passed to the expansion encoder, which consists of a projection layer and four cascaded expansion residual blocks. The projection layer consists of 1×1 and 3×3 convolution kernels, and the four expansion residual blocks have different dilation factors so as to enlarge the receptive field and compensate for the small receptive field caused by using only the C5 layer. Multiple bounding-box predictions are made for each pixel on the C5 feature layer processed by the expansion encoder, i.e. multiple predicted values of the distances between the pixel coordinates and the four edges of the bounding box, giving multiple candidate boxes; positive and negative samples are screened with the adaptive training sample selection (ATSS) algorithm. The detection head consists of two branches responsible for computing the classification loss and performing box regression, respectively; a sub-branch led out of the box regression branch computes, in an unsupervised way, the probability that a target (smoke) is present inside the regressed box, and this probability is multiplied with the classification score to obtain the final classification probability, which filters out the background.
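The channel split and shuffle of the ShuffleNetV2 basic unit can be illustrated with a minimal NumPy sketch. Note that the channel shuffle in ShuffleNetV2 is a fixed interleaving rather than a literally random mixing; the convolutions of the transformed branch are stood in for by a caller-supplied function, and `channel_shuffle`/`shuffle_unit` are illustrative names, not from the patent.

```python
import numpy as np

def channel_shuffle(x, groups=2):
    """ShuffleNetV2 channel shuffle: reshape (N, C, H, W) to
    (N, groups, C//groups, H, W), swap the two group axes, flatten back.
    For C=8, groups=2 the channel order becomes [0,4,1,5,2,6,3,7]."""
    n, c, h, w = x.shape
    assert c % groups == 0
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

def shuffle_unit(x, transform):
    """Basic-unit skeleton: split channels in half, transform one half
    (identity stand-in for the 1x1 -> 3x3 depthwise -> 1x1 convolutions),
    concatenate, then shuffle so the branches exchange information."""
    half = x.shape[1] // 2
    branch_a, branch_b = x[:, :half], x[:, half:]
    out = np.concatenate([branch_a, transform(branch_b)], axis=1)
    return channel_shuffle(out, groups=2)
```

The shuffle is what lets the untouched half of the channels mix with the convolved half in the next unit.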
Preferably, in step 2), the C5 feature layer is input into a 1×1 convolution, then a 3×3 convolution, and then into the four expansion residual blocks.
Preferably, the detection head uses focal loss for the classification loss calculation.
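As an illustration of focal loss for a binary smoke/background classification, here is a minimal NumPy sketch. It assumes the standard form of Lin et al. with the common defaults α = 0.25, γ = 2; the patent does not state its exact parameters, and NanoDet in practice uses a generalized variant, so this is purely illustrative.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-9):
    """Binary focal loss: down-weights easy examples by (1 - pt)^gamma so
    the abundant easy background samples do not dominate training.
    p: predicted probability in (0, 1); y: label in {0, 1}."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)           # probability of the true class
    at = np.where(y == 1, alpha, 1 - alpha)   # class-balance weight
    return -at * (1 - pt) ** gamma * np.log(pt)
```

With γ = 0 and α = 0.5 this reduces to a scaled cross-entropy; increasing γ suppresses well-classified (mostly background) samples, matching the suppression effect described in the text.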
Preferably, in step 3), it is first judged from the output smoke-class probability whether the engine produces smoke; if the probability is greater than the set threshold, the engine is producing smoke, and whether the smoke is dense or light is then judged by calculating the mean chromaticity of the smoke region and the surrounding background region; if the probability is smaller than the set threshold, detection continues.
Compared with the prior art, the invention has the following beneficial effects:
during network training, Mosaic data enhancement is applied to the automobile and aircraft engine smoke data set collected by monitoring, so each new training sample contains information from four original training samples and the trained network is more robust. Meanwhile, data enhancement methods such as rotation, brightness adjustment, scaling, random occlusion and random cropping are applied, which helps detect smoke at different scales, against different backgrounds and under different illumination.
When detecting engine smoke images, detection by the improved NanoDet depth network and output of the result enable real-time, efficient, high-frame-rate engine smoke detection and alarms, improving detection precision and practicality to a certain extent. Pixel-by-pixel box prediction with the improved NanoDet depth network alleviates the drop in prediction accuracy caused by smoke diffusion, so the range of smoke diffusion can be predicted more accurately. The original path aggregation network (PAN) structure is replaced by a single-feature-layer input; the receptive field is enlarged using a projection layer and expansion residual blocks, improving network detection efficiency while preserving accuracy, and improving the neural network's engine smoke detection to a certain extent. The detection head consists of a classification branch and a box regression branch; an implicit unsupervised objectness prediction sub-branch is split off from the box regression branch to generate a target score, which filters the background better. Focal loss is used as the loss function, suppressing training on background samples without affecting training on foreground samples, which improves detection accuracy and speed. The invention can accurately detect engine smoke, with accurate and reliable results, strong robustness and a wide application range, and has broad application prospects in intelligent security fields such as social industry, public safety, intelligent analysis and industrial control safety.
Furthermore, the input engine smoke grayscale image is detected by the improved NanoDet depth network, which is optimized on the basis of the original NanoDet network, so engine smoke risks can be detected better and faster, improving safety of use. In particular, compared with some classical target detection networks, the improved NanoDet depth network used in the invention has fewer parameters and a lightweight model, saving training cost and making it easy to port to mobile terminals. The video stream to be detected is input into the trained improved NanoDet depth network, which performs the corresponding convolution processing, feature extraction and feature enhancement, finally outputting the detected category and target position. If the probability of detecting smoke exceeds a preset threshold, the engine is emitting smoke, and the smoke category is judged further: for light smoke an alarm is raised; for dense smoke emergency measures are started and an alarm is raised. Otherwise detection continues.
Deep learning algorithms have made great breakthroughs in time-series data analysis and effectively solve detection problems caused by video frame distortion, blurring, scaling, illumination change, occlusion and the like. Deep learning networks can learn from large amounts of data simultaneously, and their detection performance has surpassed many traditional algorithms. Among them, convolutional neural networks have been widely used in fields such as target detection, person detection and action recognition.
The NanoDet deep neural network is a lightweight target detection network that adopts an anchor-free detection framework and predicts a bounding box for every pixel of the feature layer, which suits engine smoke detection. The diffusion of smoke lowers the accuracy of anchor-based detection networks that regress from anchor boxes, whereas pixel-by-pixel prediction improves both smoke detection accuracy and box-position regression accuracy. Meanwhile, the lightweight structure of NanoDet is not only suitable for porting and deployment on mobile terminals, but is also easy to train and fast at detection. These characteristics fit the timeliness requirement of engine smoke detection very well.
Based on the original NanoDet deep neural network, the invention removes the path aggregation network (PAN) structure, uses only a single feature layer for regression and prediction, and uses dilated convolution layers to expand the receptive field of that single layer, further improving detection speed without losing accuracy. The improved NanoDet detection head consists of a classification branch and a box regression branch; to reduce background interference with smoke detection, an unsupervised objectness prediction branch screens out the background, improving detection precision. Finally, focal loss is used for classification and box regression, suppressing learning on background samples. Compared with anchor-based detection networks, the improved NanoDet neural network has low training cost, easy porting, high detection speed and high precision; compared with the original NanoDet neural network, it is lighter and faster, and it performs well in engine smoke detection experiments.
Drawings
FIG. 1 is a schematic block diagram of the engine dense and light smoke detection method based on the improved NanoDet depth network.
Fig. 2 is a schematic diagram of a Mosaic image enhancement method.
Fig. 3 is a diagram of the original NanoDet network structure.
FIG. 4 is a diagram illustrating the single-input single-output configuration.
fig. 5 is a diagram of a modified NanoDet deep network architecture.
Fig. 6 shows engine smoke detection results, where (a) is a real automobile engine smoke detection image, (b) is a simulated aircraft engine smoke detection image, and (c) and (d) are smoke detection results on aircraft engine test videos from the internet.
Fig. 7 shows experimental results of laboratory simulation of light aircraft engine smoke, where (a)-(d) represent simulation experiments with different locations and smoke types.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings and examples.
The invention provides an automatic detection method for thick smoke and light smoke of an engine based on an improved NanoDet depth network, which is used for judging whether the engine generates smoke or not in a complex environment, and mainly comprises a training stage and a detection stage, referring to FIG. 1.
As shown in fig. 1, in the network training stage, engine smoke pictures are collected to form a data set, then dense smoke and light smoke are marked, and the data set is divided into a training set and a testing set, the improved NanoDet depth network is trained, and the trained improved NanoDet depth network can be used as a smoke detector.
In particular, the engine of the invention can be an automobile engine or an aircraft engine.
In the invention, the collected engine smoke picture can be subjected to data enhancement so as to amplify the data set. By way of example, the method of data enhancement may be:
Mosaic data enhancement is performed on the original data set, randomly splicing and fusing four pictures into a new picture; and one or more of rotation, brightness adjustment, scaling, random occlusion and random cropping are performed on pictures in the original data set. These two processed data sets constitute a new data set. In the new data set, the pictures are adjusted to uniform dimensions (e.g., 320 × 320), and rectangular-box labeling is performed for smoke and non-smoke.
To solve the problems of the many parameters and high training cost of prediction from multiple feature layers, the invention adopts an improved NanoDet depth network, replacing the path aggregation network (PAN) structure in the traditional NanoDet with a single-feature-layer-input, single-layer-output structure. Meanwhile, to compensate for the small receptive field of a single feature layer, the receptive field is enlarged through a projection layer and four expansion residual blocks. The improved NanoDet depth network directly predicts a bounding box for each pixel of the feature layer and screens positive and negative samples with the adaptive training sample selection (ATSS) algorithm, improving the effect and the speed of the neural network to a certain extent. The detection head consists of a classification branch and a box regression branch, and an implicit unsupervised objectness prediction sub-branch is applied to filter the background.
The training set is input into the ShuffleNetV2 backbone network for convolution calculation, generating feature layers of different scales; the C5 feature layer is then input into the expansion encoder, namely the projection layer and four expansion residual blocks; next, multiple target boxes are predicted for each pixel on the C5 feature layer processed by the expansion encoder, while positive and negative samples are screened with the adaptive training sample selection (ATSS) algorithm; finally, the detection head computes the position regression loss and the classification loss, adopting focal loss for classification and box-position regression, which suppresses learning on negative samples and improves detection accuracy.
Video frames are collected and input into the trained improved NanoDet depth network; after feature enhancement, feature conversion and normalization, whether the engine produces smoke, and over what range, is judged from the output smoke-class probability and smoke-box position; the mean chromaticity of the smoke region and of the surrounding background region are calculated and their difference taken: if the difference exceeds a set threshold the smoke is judged to be dense, otherwise light; if the probability of detecting engine smoke exceeds a preset threshold, an alarm is raised and different actions are taken according to whether the smoke is dense or light; otherwise, monitoring continues.
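The dense/light decision described above can be sketched as follows. This assumes chromaticity is measured as the mean pixel intensity on a single channel and uses an illustrative threshold and margin; the patent does not specify its threshold value, and `classify_smoke` is an illustrative name.

```python
import numpy as np

def classify_smoke(frame, box, margin=10, thresh=30.0):
    """Decide dense vs light smoke from the gap between the mean intensity
    of the detected smoke box and of a surrounding background ring of
    `margin` pixels. frame: 2-D array (H, W); box: (x0, y0, x1, y1)."""
    h, w = frame.shape
    x0, y0, x1, y1 = box
    smoke_mean = frame[y0:y1, x0:x1].mean()
    # Background ring: the expanded box minus the smoke box itself.
    X0, Y0 = max(0, x0 - margin), max(0, y0 - margin)
    X1, Y1 = min(w, x1 + margin), min(h, y1 + margin)
    ring = frame[Y0:Y1, X0:X1].astype(float)
    mask = np.ones(ring.shape, dtype=bool)
    mask[y0 - Y0:y1 - Y0, x0 - X0:x1 - X0] = False
    bg_mean = ring[mask].mean()
    return "dense" if abs(smoke_mean - bg_mean) > thresh else "light"
```

A large intensity gap between the box and its surroundings indicates opaque (dense) smoke; a small gap indicates translucent (light) smoke.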
In summary, the method carries out corresponding convolution processing, feature extraction and feature enhancement by inputting the video of the engine to be detected into an improved NanoDet depth network, outputs detection category probability and frame coordinates, and judges whether the engine generates smoke, dense smoke or light smoke according to the detected category and position. The method comprises the following specific steps:
1. Engine smoke image enhancement
Referring to fig. 2, the specific approach to Mosaic data enhancement is:
take a batch of pictures from the data set and, each time, randomly extract four of them; after cropping, splice the four pictures into one output picture in left-to-right, top-to-bottom order, the output picture keeping the original size. The cropping method is: randomly generate cropping position parameters cut_x and cut_y, which give the width and height of the crop taken from the top-left of picture 1; since the output size is fixed, the widths and heights to be cropped from picture 2 (top-right), picture 3 (bottom-left) and picture 4 (bottom-right) are thereby determined. After the four pictures are cropped, the resulting sub-pictures 1, 2, 3, 4 are spliced in top-left, top-right, bottom-left, bottom-right order to form a new picture, and the bounding boxes of the original pictures are retained. The Mosaic enhancement process is repeated batch-size times, and the resulting batch of data is recorded as data set 1.
To enhance the robustness of the network to changes in background, illumination, occlusion and other external conditions, rotation, brightness adjustment, scaling, random cropping, random occlusion and similar operations are applied to each smoke picture in the original data set, giving data set 2.
The new data set is composed of data set 1 and data set 2; the smoke pictures in it are kept at a uniform aspect ratio, dense and light smoke are labeled with rectangular boxes, and the set is divided into a training set and a testing set.
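The uniform-size step can be implemented, for instance, as an aspect-ratio-preserving "letterbox" resize; a sketch follows, using nearest-neighbor resizing to stay dependency-free. The padding value and the function name `letterbox` are assumptions for illustration, and 320 × 320 is the example size mentioned earlier in the description.

```python
import numpy as np

def letterbox(img, size=320, pad_value=114):
    """Resize keeping aspect ratio (nearest-neighbor), then pad to a square
    size x size canvas so the picture is not distorted."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]                      # nearest-neighbor lookup
    out = np.full((size, size) + img.shape[2:], pad_value, dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    out[top:top + nh, left:left + nw] = resized   # center the content
    return out
```

Box labels would be rescaled by the same `scale` and shifted by `(left, top)`; that bookkeeping is omitted here.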
2. Improved NanoDet deep network training
Fig. 3 is a diagram of the original NanoDet deep network architecture. NanoDet is an anchor-free one-stage detection network using ShuffleNetV2 as the backbone. The original NanoDet depth network performs bounding-box prediction for each pixel on the C5 feature layer, without generating a large number of candidate anchor boxes on the feature layer. Let $F_i$ denote the $i$-th feature layer and $s$ the accumulated convolution stride. A ground-truth bounding box of the input image is represented as $B = (x_0, y_0, x_1, y_1, c)$, where the first four terms are the coordinates of the top-left and bottom-right corners of the box and the last term is the box's category; $C$ is the total number of categories.

For each pixel location $(x, y)$ on $F_i$, the mapping onto the input image is

$$\left( \left\lfloor \tfrac{s}{2} \right\rfloor + x s,\; \left\lfloor \tfrac{s}{2} \right\rfloor + y s \right).$$

For each pixel $(x, y)$, the network learns four parameters $(l^*, t^*, r^*, b^*)$ that define the bounding box corresponding to that location:

$$l^* = x - x_0, \quad t^* = y - y_0, \quad r^* = x_1 - x, \quad b^* = y_1 - y,$$

where $(x, y)$ here denotes the coordinates after mapping onto the original image.

The final output of the network is a $C$-dimensional vector $p_{x,y}$ representing the class probabilities and a 4-dimensional vector $t_{x,y} = (l, t, r, b)$ representing the bounding box predicted by that pixel.
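The pixel mapping and the four regression targets described above can be written out directly as a small sketch (function names are illustrative):

```python
def map_to_image(x, y, s):
    """Map feature-map pixel (x, y) at accumulated stride s back onto the
    input image: (floor(s/2) + x*s, floor(s/2) + y*s)."""
    return (s // 2 + x * s, s // 2 + y * s)

def regression_targets(px, py, box):
    """Distances from the mapped pixel (px, py) to the four box edges,
    i.e. (l*, t*, r*, b*). box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return (px - x0, py - y0, x1 - px, y1 - py)
```

All four targets are positive exactly when the mapped pixel lies inside the ground-truth box, which is what makes the pixel a candidate positive sample.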
Fig. 4 is a diagram of a single input single output structure using only a C5 feature layer, reducing network parameters and improving detection speed. Since only one feature layer is used, the receptive field cannot cover all scale targets, and in order not to lose detection accuracy, the receptive field is enlarged by using a hole convolution layer. Specifically, the projection layer and four cascaded expansion residual blocks are accessed after the feature layer. The projection layer is composed of a 1×1 convolution kernel and a 3×3 convolution kernel. The 1×1 convolution kernel is used to reduce the channel dimension and then the 3×3 convolution kernel is used to extract the semantic information. The overall structure of the expanded residual block is a residual structure, and the main channel is a convolution kernel of 1×1, a hole convolution layer of 3×3 and a convolution kernel of 1×1 in sequence. The cavity convolution layer can enable the receptive field to expand exponentially without losing resolution. Order theIs a discrete function, let ∈ ->And let->As a discrete filter, then the discrete convolution operator can be defined as:
Generalizing this operator, let l be the expansion factor; then the hole convolution operator *_l is defined as: (F *_l k)(p) = Σ_{s+lt=p} F(s) k(t).
the four expansion residual blocks adopt different expansion factors l, so that the receptive field covers all targets to be detected in the picture as much as possible.
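A minimal one-dimensional sketch of the hole-convolution operator defined above, (F *_l k)(p) = Σ_{s+lt=p} F(s) k(t), purely for illustration (the network itself uses 2-D convolutions):

```python
# 1-D dilated ("hole") convolution per the discrete operator above.
def dilated_conv1d(F, k, l):
    r = len(k) // 2                      # filter radius; k is indexed -r..r
    out = []
    for p in range(len(F)):
        acc = 0.0
        for t in range(-r, r + 1):       # s = p - l*t must lie inside F
            s = p - l * t
            if 0 <= s < len(F):
                acc += F[s] * k[t + r]
        out.append(acc)
    return out

# With l = 1 this reduces to ordinary convolution; larger l samples the
# input with gaps, enlarging the receptive field at no parameter cost.
signal = [0, 0, 1, 0, 0, 0, 0]
print(dilated_conv1d(signal, [1, 1, 1], 2))   # taps at offsets -2, 0, +2
```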
Referring to fig. 5, the improved NanoDet deep network likewise consists of three parts: a ShuffleNetV2 backbone network, an expansion encoder, and a detection head. The basic unit of the ShuffleNetV2 backbone first splits the input channels: one part passes through a 1×1 convolution, a 3×3 depthwise convolution, and a 1×1 convolution, while the other part is left unchanged; the two parts are then concatenated and a random channel-shuffle operation is applied. The ShuffleNetV2 backbone performs convolution on the input, downsamples it to obtain feature layers at different scales, and delivers the C5 feature layer to the expansion encoder for processing. The expansion encoder consists of a projection layer and four cascaded expansion residual blocks; the projection layer consists of 1×1 and 3×3 convolution kernels, so the C5 feature layer first enters the 1×1 convolution, then the 3×3 convolution, and finally the four expansion residual blocks. The four blocks use different expansion factors, enlarging the receptive field to overcome the small receptive field caused by using only the C5 feature layer.
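The growth of the receptive field along the encoder's main path can be illustrated with a short calculation. The dilation factors (1, 2, 4, 8) for the four blocks are assumed here for illustration only; the patent does not state the exact values:

```python
# Receptive field of a 3x3 projection conv followed by four 3x3 hole
# convolutions on the C5 grid. Dilation factors are assumed, not from
# the patent. Each 3x3 layer adds (3 - 1) * dilation pixels of context.
def receptive_field(kernel=3, dilations=(1, 1, 2, 4, 8)):
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

print(receptive_field())   # -> 33 pixels on the C5 grid
```

Since C5 is typically downsampled 32× from the input, even a modest receptive field on C5 covers a large region of the original image, which is the point of stacking dilated layers instead of extra feature pyramid levels.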
On the C5 feature layer processed by the expansion encoder, multiple box predictions are made for each pixel; that is, multiple predicted values of the distances from the pixel coordinates to the four sides of the bounding box — multiple candidate boxes — are generated, and positive and negative samples are screened with the adaptive training sample selection (ATSS) algorithm. The detection head consists of two branches responsible for computing the classification loss and performing box regression, respectively. A sub-branch is additionally drawn from the box regression branch; using the box position obtained by the regression, this sub-branch computes, in an unsupervised way, the probability that a target (i.e., smoke) is present in the region. This probability is multiplied by the classification score to obtain the final classification probability, which filters out the background.
The improved NanoDet deep network uses the adaptive training sample selection algorithm (ATSS) to screen positive and negative samples. The algorithm is nearly hyperparameter-free and robust, and improves the speed and stability of smoke detection. The algorithm traverses each ground-truth box on the input image; for one ground-truth box it creates an empty candidate set, selects from the feature layer the k candidate boxes whose centers are closest to the center of the ground-truth box, and adds them to the candidate set. Next, the intersection over union (IoU) between each candidate box in the set and the ground-truth box is computed, along with the set's IoU mean m_g and standard deviation v_g. A threshold t_g = m_g + v_g is defined; this definition accounts for the influence of the candidate set's average quality on the threshold — if the average quality is high, the threshold should be high. Traversing the candidate set, candidate boxes whose IoU is greater than or equal to the threshold and whose centers lie inside the ground-truth box are taken as positive samples; all others are negative samples. The same operation is applied to every ground-truth box on the input image. The algorithm has only one hyperparameter k, whose optimal value is robust; experiments show the optimum is around k = 9. The training sample selection algorithm therefore has almost no hyperparameters and simplifies tuning of the deep network.
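A hedged numpy sketch of the ATSS selection described above, for a single ground-truth box; all helper names are illustrative:

```python
import numpy as np

def iou(boxes, gt):
    """IoU between an (N, 4) array of boxes and one gt box (x0, y0, x1, y1)."""
    ix0 = np.maximum(boxes[:, 0], gt[0]); iy0 = np.maximum(boxes[:, 1], gt[1])
    ix1 = np.minimum(boxes[:, 2], gt[2]); iy1 = np.minimum(boxes[:, 3], gt[3])
    inter = np.clip(ix1 - ix0, 0, None) * np.clip(iy1 - iy0, 0, None)
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area_b + area_g - inter + 1e-9)

def atss_select(boxes, gt, k=9):
    """Boolean mask of positive samples for one ground-truth box."""
    centers = (boxes[:, :2] + boxes[:, 2:]) / 2
    gt_center = np.array([(gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2])
    dists = np.linalg.norm(centers - gt_center, axis=1)
    cand = np.argsort(dists)[:min(k, len(boxes))]    # k closest candidates
    ious = iou(boxes[cand], gt)
    t_g = ious.mean() + ious.std()                   # adaptive threshold m_g + v_g
    inside = ((centers[cand] >= gt[:2]) & (centers[cand] <= gt[2:])).all(axis=1)
    pos = np.zeros(len(boxes), dtype=bool)
    pos[cand[(ious >= t_g) & inside]] = True
    return pos

boxes = np.array([[0., 0., 10., 10.], [20., 20., 30., 30.], [1., 1., 9., 9.]])
gt = np.array([0., 0., 10., 10.])
print(atss_select(boxes, gt))   # only the box matching the ground truth is positive
```

Because the threshold adapts to the mean and spread of each candidate set, no fixed IoU cutoff needs tuning, which is the property the patent relies on.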
The improved NanoDet deep network uses focal loss as its loss function. Focal loss addresses the imbalance between the many negative samples generated by the background and the few positive samples generated by the foreground: because of this difference in numbers, the negatives dominate the loss function and the gradient descent computation, and focal loss shifts computational resources toward the foreground rather than the background. The cross-entropy loss typically used by deep learning networks is defined as: CE(p, y) = −log(p) if y = 1, and −log(1 − p) otherwise.
where y ∈ {±1} is the true class and p ∈ [0, 1] is the probability the network predicts for the class y = 1. Defining p_t as p_t = p if y = 1, and 1 − p otherwise,
the cross-entropy loss can then be written as CE(p, y) = CE(p_t) = −log(p_t). The drawback of cross-entropy loss is that the loss of the easy-to-learn samples, i.e., the negative samples, readily takes up a large share of the total loss, leading to meaningless computation during gradient descent. To make the network concentrate on learning the positive samples, the focal loss function judges how difficult a sample is from its classification probability and suppresses the easy samples; its basic definition (omitting the parameter α_t) is:
FL(p_t) = −(1 − p_t)^γ log(p_t)
where y is the true class, p ∈ (0, 1) is the predicted probability, and γ is the adjustable focusing parameter. Negative samples are easy to learn and have a small probability of being misclassified, i.e., p_t is large, so the focusing factor (1 − p_t)^γ tends to 0, the loss tends to 0, and learning of these samples is suppressed; positive samples are hard to learn and p_t is small compared with the negatives, so the focusing factor approaches 1 and their loss computation is not suppressed. Adding a hyperparameter weight coefficient α_t further improves detection accuracy: for positive samples the weight is α_t = α ∈ [0, 1], and for negative samples α_t = 1 − α. The complete focal loss is then defined as:
FL(p_t) = −α_t (1 − p_t)^γ log(p_t)
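A minimal numpy sketch of the complete focal loss above; the values α = 0.25 and γ = 2 are the commonly used defaults and are assumed here, since the patent does not specify them:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """p: predicted probability of the positive class; y: true label in {0, 1}."""
    p_t = np.where(y == 1, p, 1.0 - p)               # p_t per the definition above
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# An easy, well-classified background sample (p = 0.1 of being smoke) is
# strongly down-weighted relative to a hard positive (p = 0.3):
easy_neg = focal_loss(np.array([0.1]), np.array([0]))[0]
hard_pos = focal_loss(np.array([0.3]), np.array([1]))[0]
print(easy_neg, hard_pos)
```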
The improved NanoDet deep network uses the classification and box regression branches as its detection head, whose structure is shown in fig. 5. The classification head consists of two 3×3 convolution kernels, a batch normalization layer, and a ReLU activation function; the box regression head consists of four 3×3 convolution kernels, a batch normalization layer, and a ReLU activation function. Meanwhile, an implicit unsupervised target prediction sub-branch is attached to each predicted box in the regression branch. Its structure matches the center-ness branch of FCOS, and it computes objectness indices over the pixels inside the currently detected box, so that the degree to which the box contains a target is obtained from these indices. The objectness can be measured with the following cues. Scale saliency: the degree of saliency of the pixels inside the box relative to the whole picture, accumulated from the saliency value of each pixel p in the box; the larger the value, the more likely the box contains a target. Color contrast: the contrast between the colors inside the box and its surrounding colors; the higher the contrast, the more likely the box is a target. The contrast between a box w and its surrounding ring Surr(w, θ_CC) can be computed as the chi-square distance between their color histograms, CC(w, θ_CC) = χ²(h(w), h(Surr(w, θ_CC))). Superpixel straddling: defining a superpixel as a block of pixels with nearly the same color, few superpixels straddle the box boundary when the box mostly contains a target, while many straddle it when it mostly contains background; the degree to which superpixels s straddle the box w can be computed as SS(w, θ_SS) = 1 − Σ_s min(|s \ w|, |s ∩ w|) / |w|, and the higher the value, the more likely the interior of the box is a target.
By computing the above indices for the box produced by the regression branch, the objectness of the image inside the box — that is, the likelihood that it contains a target — is obtained. Multiplying the objectness by the classification score achieves the background filtering effect.
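As a hedged sketch of one of the cues above — the color-contrast cue CC(w) = χ²(h(w), h(Surr(w))) — the following computes the chi-square distance between the histogram inside a box and that of its surrounding ring on a single-channel image. The ring width and bin count are illustrative assumptions:

```python
import numpy as np

def chi2(h1, h2):
    """Chi-square distance between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + 1e-9))

def color_contrast(img, box, ring=4, bins=8):
    """img: (H, W) single-channel image in [0, 255]; box: (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    inner = img[y0:y1, x0:x1]
    X0, Y0 = max(x0 - ring, 0), max(y0 - ring, 0)
    outer = img[Y0:y1 + ring, X0:x1 + ring].astype(float).copy()
    outer[y0 - Y0:y1 - Y0, x0 - X0:x1 - X0] = np.nan     # mask out the box itself
    surr = outer[~np.isnan(outer)]
    h_in, _ = np.histogram(inner, bins=bins, range=(0, 256), density=True)
    h_out, _ = np.histogram(surr, bins=bins, range=(0, 256), density=True)
    return chi2(h_in, h_out)

# A bright patch on a dark background scores high; a uniform region scores 0.
img = np.zeros((20, 20)); img[5:15, 5:15] = 200.0
print(color_contrast(img, (5, 5, 15, 15)))
```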
3. Detecting the smoke of the engine, and judging the thick smoke and the light smoke of the detected smoke according to the chromaticity difference
The trained improved NanoDet deep network is used as the smoke detector to distinguish dense smoke from light smoke. First, whether the engine is producing smoke is judged from the probability of the output smoke class: if the probability is greater than the set threshold, the engine is producing smoke, and whether the detected smoke is dense or light is then judged by computing the average chromaticity of the smoke region and of the surrounding background region; if the probability is smaller than the set threshold, detection continues.
In the YUV color space, Y represents luminance and U and V both represent chrominance. As the smoke becomes denser, the gray level of the image increases and the chromaticity decreases. The U and V values are calculated pixel by pixel for the detected smoke region:
Y_S = 0.299 R'_S + 0.587 G'_S + 0.114 B'_S
U_S = 0.492 (B'_S − Y_S)
V_S = 0.877 (R'_S − Y_S)
where the subscript S denotes smoke. Averaging the chromaticity over all pixels of the smoke region yields the average chromaticity index C_S; performing the same operation on the ring of pixels just outside the smoke bounding box yields the average chromaticity index C_B of the background.
the two indices are differenced.
T=|C B -C S |
T and a preset threshold T 0 Comparing if T is greater than T 0 And judging that the smoke is light, otherwise, judging that the smoke is light.
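A hedged numpy sketch of the chromaticity test above. The exact averaging is not given by the patent, so the per-pixel mean of |U| + |V| is assumed here as the "average chromaticity index", and the threshold T_0 is an illustrative value:

```python
import numpy as np

def avg_chroma(rgb):
    """rgb: (N, 3) float array of pixels (R', G', B') in [0, 1]."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b     # luminance, per the formulas above
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return np.mean(np.abs(u) + np.abs(v))     # assumed averaging scheme

def classify_smoke(smoke_px, bg_px, t0=0.05):
    """Return 'dense' if T = |C_B - C_S| exceeds the preset threshold T_0."""
    t = abs(avg_chroma(bg_px) - avg_chroma(smoke_px))
    return "dense" if t > t0 else "light"

# Gray (low-chroma) smoke against a blue-sky background reads as dense:
smoke = np.full((100, 3), 0.5)                       # pure gray: U = V = 0
sky = np.tile(np.array([[0.3, 0.5, 0.9]]), (100, 1))
print(classify_smoke(smoke, sky))
```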
The above embodiments are merely preferred examples of the present invention and are not intended to limit the present invention, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principles of the present invention should be included in the scope of the present invention.
Fig. 6 shows experimental results of engine dense smoke detection: graph (a) is a detection result on real automobile engine smoke, graph (b) on simulated aero-engine smoke, and graphs (c) and (d) on aero-engine test videos from the network. Graphs (a), (b), (c), and (d) of fig. 7 show light smoke detection results on the simulated aero-engine. These confirm that the invention can detect both engine dense smoke and light smoke.

Claims (9)

1. The automatic detection method for the thick smoke and the light smoke of the engine based on the improved NanoDet depth network is characterized by comprising the following steps of:
step 1), acquiring an engine smoke picture to form a data set;
step 2), labeling dense smoke and light smoke in the smoke pictures of the data set, dividing them into a training set and a testing set, and inputting them into the improved NanoDet depth network for training: the training set is input into a ShuffleNetV2 backbone network for convolution calculation to generate feature layers at different scales; the C5 feature layer is input into an expansion encoder, namely a projection layer and four expansion residual blocks; multiple boxes are predicted for each pixel on the C5 feature layer processed by the expansion encoder; positive and negative samples are screened by the adaptive training sample selection (ATSS) algorithm; and finally the detection head calculates the position regression loss and the classification loss;
and 3) taking the trained improved NanoDet depth network as a smoke detector to judge the thick smoke and the light smoke of the engine, wherein the method comprises the following steps:
collecting video frames and inputting them into the trained improved NanoDet depth network; after feature enhancement, feature conversion and normalization, judging from the probability of the output smoke class and the position of the smoke box whether the engine produces smoke and over what range; calculating the average chromaticity of the smoke region and of the surrounding background region and taking the difference between the two; if the difference exceeds a set threshold, judging the produced smoke to be dense, otherwise light; if the probability of detecting engine smoke exceeds a preset threshold, raising an alarm and taking different actions for dense and light smoke; otherwise, continuing to monitor.
2. The method for automatically detecting engine smoke and light smoke based on the improved NanoDet depth network according to claim 1, wherein in the step 1), the collected image of engine smoke is subjected to data enhancement to amplify the data set.
3. The method for automatically detecting the engine dense smoke and the light smoke based on the improved NanoDet depth network according to claim 2, wherein the data enhancement method is as follows:
performing Mosaic data enhancement on the original data set, and randomly splicing four pictures to form one picture;
one or more operations of rotation, brightness adjustment, scaling, random occlusion, random clipping are performed on the original dataset.
4. The automatic detection method for engine dense smoke and light smoke based on the improved NanoDet depth network according to claim 3, wherein the method for Mosaic data enhancement is as follows: taking batch-size pictures from the data set and randomly extracting four pictures each time; after cropping, splicing the four pictures into one output picture in order from left to right and from top to bottom, the output picture keeping the original size; the cropping method is: randomly generating cropping position parameters cut_x and cut_y, which represent the width and height cropped from picture 1 at the upper left corner; since the size of the output picture is fixed, the widths and heights to be cropped from picture 2 at the upper right corner, picture 3 at the lower left corner, and picture 4 at the lower right corner are thereby determined; after the four pictures are cropped, the resulting sub-pictures 1, 2, 3, 4 are spliced in the order upper left, upper right, lower left, lower right to form a new picture, and the bounding boxes of the original pictures are retained; the Mosaic data enhancement process is repeated batch-size times to obtain a new batch of data.
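As a hedged sketch of the cropping-and-splicing step described in claim 4 — the random parameter generation and bounding-box bookkeeping are omitted, and all names are illustrative — the tiling can be written as:

```python
import numpy as np

def mosaic(imgs, cut_x, cut_y):
    """Tile four equally sized (H, W, 3) images into one output of the same
    size, split at the cropping position (cut_x, cut_y)."""
    h, w = imgs[0].shape[:2]
    out = np.empty_like(imgs[0])
    out[:cut_y, :cut_x] = imgs[0][:cut_y, :cut_x]        # picture 1: upper left
    out[:cut_y, cut_x:] = imgs[1][:cut_y, cut_x:]        # picture 2: upper right
    out[cut_y:, :cut_x] = imgs[2][cut_y:, :cut_x]        # picture 3: lower left
    out[cut_y:, cut_x:] = imgs[3][cut_y:, cut_x:]        # picture 4: lower right
    return out

rng = np.random.default_rng(0)
imgs = [rng.integers(0, 255, (416, 416, 3), dtype=np.uint8) for _ in range(4)]
mixed = mosaic(imgs, cut_x=200, cut_y=150)
```

In a full implementation, cut_x and cut_y would be drawn at random per output picture and each source picture's bounding boxes clipped to its retained quadrant.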
5. The method for automatically detecting engine dense smoke and light smoke based on the improved NanoDet depth network according to claim 1, wherein in step 2), the smoke pictures in the data set are kept at a uniform aspect ratio, and the dense smoke and light smoke are labeled with rectangular boxes.
6. The method for automatically detecting engine dense smoke and light smoke based on the improved NanoDet depth network according to claim 1, wherein the improved NanoDet depth network consists of three parts: a ShuffleNetV2 backbone network, an expansion encoder, and a detection head; the basic unit of the ShuffleNetV2 backbone first splits the input channels, one part passing through a 1×1 convolution, a 3×3 depthwise convolution, and a 1×1 convolution while the other part is left unchanged, after which the two parts are concatenated and a random channel-shuffle operation is applied; the ShuffleNetV2 backbone performs convolution on the input, downsamples it to obtain feature layers at different scales, and delivers the C5 feature layer to the expansion encoder for processing, the expansion encoder consisting of a projection layer and four cascaded expansion residual blocks; the projection layer consists of 1×1 and 3×3 convolution kernels, and the four expansion residual blocks have different expansion factors so as to enlarge the receptive field and overcome the small receptive field caused by using only the C5 feature layer; multiple box predictions are made for each pixel on the C5 feature layer processed by the expansion encoder, i.e., multiple predicted values of the distances from the pixel coordinates to the four sides of the bounding box, namely multiple candidate boxes, are generated, and positive and negative samples are screened by the adaptive training sample selection (ATSS) algorithm; the detection head consists of two branches responsible for computing the classification loss and performing box regression, respectively; meanwhile, a sub-branch is drawn from the box regression branch, which computes, in an unsupervised way from the box position obtained by the regression, the probability that a target, i.e., smoke, is present in the region, and this probability is multiplied by the classification score to obtain the final classification probability, thereby filtering the background.
7. The method for automatically detecting engine smoke and light smoke based on the improved NanoDet depth network according to claim 6, wherein in the step 2), the C5 feature layer is input into a convolution of 1×1, then into a convolution of 3×3, and then into 4 expansion residual blocks.
8. The method for automatically detecting engine smoke and light smoke based on the improved NanoDet depth network according to claim 6, wherein the detection head calculates the classification loss using the focus loss.
9. The automatic detection method for engine dense smoke and light smoke based on the improved NanoDet depth network according to claim 1, wherein in step 3), whether the engine produces smoke is judged from the probability of the output smoke class; if the probability is greater than the set threshold, the engine is producing smoke, and whether dense or light smoke is detected is then judged by calculating the average chromaticity of the smoke region and the surrounding background region; if it is smaller than the set threshold, detection continues.

Publications (2)

Publication Number Publication Date
CN114092458A CN114092458A (en) 2022-02-25
CN114092458B true CN114092458B (en) 2024-02-27

Family

ID=80305264


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
CN112767645A (en) * 2021-02-02 2021-05-07 南京恩博科技有限公司 Smoke identification method and device and electronic equipment
CN112861635A (en) * 2021-01-11 2021-05-28 西北工业大学 Fire and smoke real-time detection method based on deep learning
WO2021212443A1 (en) * 2020-04-20 2021-10-28 南京邮电大学 Smoke video detection method and system based on lightweight 3d-rdnet model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11740335B2 (en) * 2019-03-27 2023-08-29 Zoox, Inc. Identifying and/or removing false positive detections from LIDAR sensor output


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Smoke detection based on convolutional neural network; Yuan Mei, Quan Taifeng, Huang Jun, Huang Yang, Hu Jiahao; Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition); 2020-08-15 (04); full text *
Research on fire early-warning algorithm based on deep convolutional neural network; Lin Zuoyong, Chen Yao; Information & Communications; 2018-05-15 (05); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant