CN112215122B - Fire detection method, system, terminal and storage medium based on video image target detection - Google Patents


Info

Publication number
CN112215122B
CN112215122B (application CN202011069784.5A)
Authority
CN
China
Prior art keywords
model
image
feature
fire
feature extraction
Prior art date
Legal status
Active
Application number
CN202011069784.5A
Other languages
Chinese (zh)
Other versions
CN112215122A (en
Inventor
胡金星
王传胜
Current Assignee
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202011069784.5A
Publication of CN112215122A
Application granted
Publication of CN112215122B
Legal status: Active


Classifications

    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/44: Event detection
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/25: Fusion techniques
    • G06N3/045: Combinations of networks (neural network architectures)
    • G06N3/08: Learning methods (neural networks)
    • Y02T10/40: Engine management systems


Abstract

The application relates to a fire detection method, system, terminal and storage medium based on video image target detection, comprising the following steps: converting an original natural image into dust-haze and sand-dust images by adopting a data enhancement algorithm based on an atmospheric scattering model, generating a data set for training the model; constructing a convolutional neural network model LFNet and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters. The convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model: the skeleton feature extraction model extracts the main features of an input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; and the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs the detection result. The method improves the robustness of the model in abnormal weather such as sand-dust and dust-haze, enabling the model to obtain better detection results.

Description

Fire detection method, system, terminal and storage medium based on video image target detection
Technical Field
The application belongs to the technical field of fire detection, and particularly relates to a fire detection method, system, terminal and storage medium based on video image target detection.
Background
Fire detection plays a vital role in safety monitoring. At present, traditional fire detection methods are based on image priors, detecting fire from the color and shape of the image. However, the robustness and error rate of color and motion features are strongly affected by preset parameters, so such methods cannot be applied in complex environments, and their localization accuracy is easily affected by the region.
Monitoring is a tedious and time-consuming task, especially in an uncertain monitoring environment with large uncertainty in time, space and even scale. Sensor-based detectors have limited performance in terms of error rate and sensing range, and therefore cannot detect remote or small fires. In recent years, with the rapid development of deep learning, convolutional neural networks (CNNs) have been applied to fire detection. However, existing deep-learning-based fire detection methods have the following disadvantages:
1. Deep-learning-based methods require a large amount of remote sensing images as training data, and training a model is very challenging given the scarcity of real remote sensing images.
2. Deep-learning-based fire detection models are too large in scale to be suitable for resource-constrained devices.
3. The complexity of existing algorithms is too high for real-time detection.
4. Anti-interference capability is weak, and the models are easily affected by severe monitoring environments such as dust-haze and sand-dust.
5. Most fire detection algorithms focus on only a single environment, and therefore exhibit high error rates in uncertain environments.
In summary, existing fire detection methods leave great room for improvement in algorithm complexity, range of application scenes, model size and other respects.
Disclosure of Invention
The application provides a fire detection method, system, terminal and storage medium based on video image target detection, aiming to solve, at least to a certain extent, one of the above technical problems in the prior art.
In order to solve the problems, the application provides the following technical scheme:
a fire detection method based on video image object detection, comprising:
converting an original natural image into a dust haze image and a sand dust image by adopting a data enhancement algorithm based on an atmospheric scattering model, and generating a data set for training the model;
constructing a convolutional neural network model LFNet, and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters; the skeleton feature extraction model extracts features of the input image with convolutions at the 3×3, 5×5 and 7×7 scales to obtain feature maps of sizes 13×13, 26×26 and 52×52; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps of sizes 52×52, 26×26 and 13×13; the variable-scale feature fusion model maps the three groups of feature maps to different convolution kernels and strides for convolution, splices all convolutions of the same size to obtain three groups of feature maps, and operates on the three groups of feature maps with a channel-based attention mechanism to obtain feature maps of sizes 13×13, 26×26 and 52×52, used respectively for detecting small, medium and large objects; inputting the data set into the LFNet model for iterative training further comprises: selecting mean square error and cross entropy respectively as loss functions for model optimization;
the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of an input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs a detection result;
and inputting the fire image to be detected into the trained LFNet model, the LFNet model outputting the fire localization area and fire type of the fire image to be detected.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the method for converting the original natural image into the dust haze image and the sand dust image by adopting the data enhancement algorithm based on the atmospheric scattering model comprises the following steps:
acquiring an original natural image; the original natural images include non-alarm images without fire alarm areas and real fire alarm images.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the method for converting the original natural image into the dust haze image by adopting the data enhancement algorithm based on the atmospheric scattering model comprises the following steps:
the atmospheric scattering model adopts at least two transmission rates to simulate dust-haze images of different concentrations; the dust-haze imaging formula is:
I(x) = J(x)t(x) + α(1 - t(x))
in the above formula, I(x) is the simulated haze image, J(x) is the input haze-free image, α is the atmospheric light value, and t(x) is the scene transmission rate.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the method for converting the original natural image into the dust image by adopting the data enhancement algorithm based on the atmospheric scattering model comprises the following steps:
the atmospheric scattering model adopts a fixed transmission rate and atmospheric light value, combined with three colors, to simulate different sand-dust images; the sand-dust image simulation formula is:
D(x) = J(x)t(x) + α(C(x)·(1 - t(x)))
in the above formula, D(x) is the simulated dust image, J(x) is the input haze-free image, and C(x) is the color value.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the loss function is specifically:
counting the brightness, dark channel value and R-channel data of the fire region, regarding these statistics as a combustion histogram prior (CHP) and writing them as the CHP formula, in which R(x) denotes the R channel of the image, SCP(x) is the difference between the brightness of the image and its dark channel, w is the width of the histogram, and h is the height of the histogram;
SCP(x) = ||v(x) - DCP(x)||
in the above formula, v(x) is the brightness of the image, and DCP(x) is the value of the dark channel of the image;
L_CHP = ||CHP(I) - CHP(R)||^2
in the above formula, CHP denotes the combustion histogram prior, and CHP(I) and CHP(R) denote the CHP values of the region selected by the target detection algorithm and of the labeled region, respectively;
the final loss function is a weighted sum of three different loss functions:
L_total = β·L_CE + γ·L_MSE + δ·L_CHP
in the above formula, L_total is the final loss function, L_CE is the cross-entropy loss function, L_MSE is the mean square error loss function, and L_CHP is the combustion histogram prior loss.
The embodiment of the application adopts another technical scheme that: a fire detection system based on video image object detection, comprising:
the data set construction module: for converting an original natural image into dust-haze and sand-dust images by adopting a data enhancement algorithm based on an atmospheric scattering model, and generating a data set for training the model;
LFNet model training module: for constructing a convolutional neural network model LFNet and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters; the skeleton feature extraction model extracts features of the input image with convolutions at the 3×3, 5×5 and 7×7 scales to obtain feature maps of sizes 13×13, 26×26 and 52×52; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps of sizes 52×52, 26×26 and 13×13; the variable-scale feature fusion model maps the three groups of feature maps to different convolution kernels and strides for convolution, splices all convolutions of the same size to obtain three groups of feature maps, and operates on them with a channel-based attention mechanism to obtain feature maps of sizes 13×13, 26×26 and 52×52, used respectively for detecting small, medium and large objects; inputting the data set into the LFNet model for iterative training further comprises: selecting mean square error and cross entropy respectively as loss functions for model optimization;
the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of an input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs a detection result; the detection result comprises the fire localization area and fire type of the fire image.
The embodiment of the application adopts the following technical scheme: a terminal comprising a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the above fire detection method based on video image target detection;
the processor is configured to execute the program instructions stored in the memory to control fire detection based on video image target detection.
The embodiment of the application adopts a further technical scheme: a storage medium storing program instructions executable by a processor, the program instructions being used to perform the above fire detection method based on video image target detection.
Compared with the prior art, the embodiments of the application have the following beneficial effects: the fire detection method, system, terminal and storage medium based on video image target detection convert the original image into dust-haze or sand-dust images of different degrees using a data enhancement algorithm based on the atmospheric scattering model, generate a data set for training the model, and construct a convolutional neural network model LFNet suited to fire and smoke detection in uncertain environments, improving the robustness of the model in abnormal weather such as sand-dust and dust-haze and yielding better detection results. Meanwhile, the LFNet model of the embodiment is small, reducing computational cost and facilitating deployment on resource-constrained devices.
Drawings
FIG. 1 is a flow chart of a fire detection method based on video image object detection in an embodiment of the present application;
FIG. 2 is a schematic diagram of the simulation effect of dust haze and sand images based on an atmospheric scattering model according to an embodiment of the present application;
FIG. 3 is a block diagram of a convolutional neural network model of an embodiment of the present application;
FIG. 4 is a block diagram of a variable scale feature fusion model of an embodiment of the present application;
FIG. 5 is a block diagram of a channel-based attention mechanism of an embodiment of the present application;
FIG. 6 is a schematic diagram of a fire detection system based on video image object detection according to an embodiment of the present application;
fig. 7 is a schematic diagram of a terminal structure according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, a flow chart of a fire detection method based on video image object detection according to an embodiment of the application is shown. The fire detection method based on video image target detection provided by the embodiment of the application comprises the following steps of:
s10: acquiring an original natural image;
In this step, the raw natural images comprise 293 non-alarm images without fire alarm areas and 5073 real fire alarm images. Using the non-alarm images improves the robustness of the training algorithm to non-alarm targets and reduces the detector's error rate. Using the real fire alarm images improves the detection capability of the target detection model.
S20: converting an original natural image into a new synthetic image influenced by different types and different degrees of abnormal weather by adopting a data enhancement algorithm based on an atmospheric scattering model, and generating a data set for training the model;
In this step: existing intelligent monitoring algorithms usually ignore the influence of abnormal weather such as dust-haze or sand-dust on performance, so their robustness under uncertain weather conditions is poor. To address this shortcoming, the embodiment considers the influence of abnormal weather on the fire detection algorithm and uses the atmospheric-scattering-model-based data enhancement method to simulate dust-haze and sand-dust images of different degrees, converting the original natural images into new synthetic images affected by dust-haze or sand-dust weather of different degrees and constructing a large-scale reference data set for training and testing the fire detection model, thereby improving the robustness of the target detection model in abnormal weather such as sand-dust and dust-haze.
Further, please refer to fig. 2, a schematic diagram of the simulation effect of dust-haze and sand-dust images based on the atmospheric scattering model according to the embodiment of the application, where (a) is the original image, (b), (c) and (d) are dust-haze images synthesized by the atmospheric scattering model with different transmission rates, and (e), (f) and (g) are sand-dust images simulated with fixed transmission and atmospheric light values combined with three different colors. The dust-haze imaging formula is:
I(x) = J(x)t(x) + α(1 - t(x))    (1)
In equation (1), I(x) is the simulated haze image, J(x) is the input haze-free image, α is the atmospheric light value, and t(x) is the scene transmission rate, which describes the portion of the light that reaches the camera sensor without being scattered. To simulate haze weather of different concentrations, the embodiment sets the atmospheric light value α to 0.8 and the transmission rate to 0.8, 0.6 and 0.4 respectively.
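With a spatially constant transmission rate, equation (1) reduces to a per-pixel blend between the clean image and the atmospheric light. A minimal sketch (the function name and the random test frame are illustrative, not from the patent):

```python
import numpy as np

def simulate_haze(clean, t, airlight=0.8):
    # I(x) = J(x)*t + A*(1 - t): blend the clean image J toward the
    # atmospheric light A; a lower transmission t means denser haze.
    return clean * t + airlight * (1.0 - t)

# Three haze levels, matching the transmission rates 0.8, 0.6, 0.4 above.
img = np.random.rand(64, 64, 3)          # stand-in for a clean frame in [0, 1]
hazy_levels = [simulate_haze(img, t) for t in (0.8, 0.6, 0.4)]
```

With t = 0.4 a pure-black pixel is lifted to 0.8 × 0.6 = 0.48, i.e. the frame is washed out toward the atmospheric light, which is the intended degradation.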
Since depth information does not play a major role in the image dust-simulation task, the transmission is assumed not to change with the depth of the image. Through prior statistics, the embodiment selects three colors suitable for simulating sand-dust images; the sand-dust image simulation formula is:
D(x) = J(x)t(x) + α(C(x)·(1 - t(x)))    (2)
In equation (2), D(x) is the simulated dust image, J(x) is the input haze-free image, and C(x) is the selected color value.
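Equation (2) differs from equation (1) only in that the scattered light is tinted by a color C(x). A minimal sketch under the same assumptions as the haze model; the sandy tint value is a plausible choice for illustration, since the patent does not list its three colors here:

```python
import numpy as np

def simulate_dust(clean, color, t=0.6, airlight=0.8):
    # D(x) = J(x)*t + A*(C(x)*(1 - t)): as the haze model, but the
    # airlight term is tinted by the RGB color C.
    color = np.asarray(color, dtype=float).reshape(1, 1, 3)
    return clean * t + airlight * (color * (1.0 - t))

img = np.random.rand(64, 64, 3)
dusty = simulate_dust(img, color=(0.9, 0.8, 0.5))   # illustrative sandy tint
```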
S30: constructing a convolutional neural network model LFNet;
In an embodiment of the application, the framework of the convolutional neural network model is shown in fig. 3. LFNet is composed of common convolution layers, bottleneck building blocks, parametric rectified linear units, group normalization, etc., and comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model, whose specific functions are as follows:
Skeleton feature extraction model: used to extract the main features of the input image. To extract richer image features, the features of the input image are first extracted by convolutions at the 3×3, 5×5 and 7×7 scales respectively, enlarging the receptive field and extracting more image features. After convolution at the three different scales, feature maps of 13×13, 26×26 and 52×52 respectively are obtained. Extracting feature maps with multi-scale convolution in this way captures feature information of different sizes around each pixel, which is particularly important for fire images.
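The three grid sizes follow from the usual convolution output-size arithmetic. A sketch; the 416×416 input resolution is an assumption for illustration, as the patent does not state the input size:

```python
def conv_out(size, kernel, stride, pad):
    # Spatial size after a convolution: floor((size + 2*pad - kernel)/stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

# With a hypothetical 416x416 input, successive stride-2 3x3 convolutions
# pass through the 52x52, 26x26 and 13x13 grids named in the text.
sizes, size = [], 416
for _ in range(5):
    size = conv_out(size, kernel=3, stride=2, pad=1)
    sizes.append(size)
# sizes is [208, 104, 52, 26, 13]
```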
Main feature extraction model: used for further feature extraction on the main features extracted by the skeleton feature extraction model, generating three groups of feature maps of sizes 52×52, 26×26 and 13×13, where each smaller feature map is extracted from the larger feature map of the layer above, and each convolution block consists of a one-layer convolution structure and a five-layer residual structure.
Variable-scale feature fusion model: the features extracted by the main feature extraction model are concatenated using variable-scale feature fusion (VSFF), then further extracted by convolution and adaptively fused. The structure of the variable-scale feature fusion model is shown in fig. 4. To fuse the convolution-extracted feature maps of different scales, the three groups of feature maps are fused, extending the 13×13 and 26×26 maps to 52×52. The three inputs are feature maps of sizes 13×13, 26×26 and 52×52 respectively; the three feature maps of different sizes are mapped to different convolution kernels and strides for convolution, to be upsampled or downsampled into the two other sizes. Finally, all convolutions of the same size are spliced to obtain three groups of feature maps. Because the spliced feature maps contain richer image features, model localization becomes more accurate.
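The resample-then-splice step can be sketched with plain array operations standing in for the learned strided convolutions; the channel count and the nearest-neighbour resampling below are illustrative assumptions, not the patent's layers:

```python
import numpy as np

def to_size(fmap, target):
    # Bring a square (C, H, H) feature map to (C, target, target).
    # Repeat-upsampling / strided subsampling stand in for the strided
    # convolutions a real VSFF block would learn.
    h = fmap.shape[1]
    if target > h:
        r = target // h
        return fmap.repeat(r, axis=1).repeat(r, axis=2)
    return fmap[:, ::h // target, ::h // target]

maps = [np.random.rand(8, s, s) for s in (13, 26, 52)]
# Fuse at the 26x26 scale: resample every map to 26x26, splice on channels.
fused_26 = np.concatenate([to_size(m, 26) for m in maps], axis=0)
```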
Further, the embodiment operates on the three groups of feature maps extracted in VSFF using a channel-based attention mechanism, which can be seen as a process of weighting the feature maps according to their importance. For example, in a group of 24 feature maps of size 13×13, the channel-based attention mechanism determines which maps in the group have a more pronounced effect on the prediction result, and then increases the weight of that portion. Through the attention mechanism, three fusions yield feature maps of sizes 13×13, 26×26 and 52×52, used respectively for detecting small, medium and large objects. The detailed structure of the channel-based attention mechanism is shown in fig. 5.
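The channel weighting can be sketched as a squeeze-and-excitation style gate. The patent's exact attention block is not reproduced here; the sigmoid-of-mean gate below is an illustrative stand-in for the learned weighting:

```python
import numpy as np

def channel_attention(fmap):
    # Squeeze: one summary value per channel of the (C, H, W) map.
    squeeze = fmap.mean(axis=(1, 2))
    # Excite: a sigmoid gate in (0, 1); a learned MLP would normally sit here.
    weights = 1.0 / (1.0 + np.exp(-squeeze))
    # Reweight each channel by its importance score.
    return fmap * weights[:, None, None]

x = np.random.rand(24, 13, 13)     # e.g. 24 channels of a 13x13 map
y = channel_attention(x)
```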
With this structure, the LFNet model of the embodiment has a very small size (22.5 MB) yet leads in both quantitative and qualitative evaluation, reducing computational cost and facilitating the application of LFNet to resource-constrained devices.
S40: inputting the data set into the LFNet model for iterative training to obtain optimal model parameters;
In this step, during model training the LFNet model has two tasks: first, accurately localize the alarm area in the image; second, classify the disaster type of the alarm area. To enable the model to better complete these two tasks, the embodiment selects mean square error (MSE) and cross entropy (CE) respectively as loss functions to guide network optimization; the loss function is also based on extensive statistics over different fire images and videos, which helps LFNet effectively detect fire areas.
Specifically, extensive experiments on various fire images show that in smoke areas the absolute difference between the brightness and the dark channel value is higher than in other areas, and that the R channel of a fire area is higher than that of a non-fire area; that is, the brightness, dark channel value and R channel vary with the fire area, the absolute difference between brightness and dark channel grows with smoke concentration, and the visual characteristics of fire are closely related to the R-channel pixel values. Based on these features, the embodiment regards these statistics as a combustion histogram prior (CHP) and writes them as the CHP formula (3), in which R(x) denotes the R channel of the image, SCP(x) is the difference between the brightness of the image and its dark channel, w is the width of the histogram, and h is the height of the histogram; SCP can be written as:
SCP(x) = ||v(x) - DCP(x)||    (4)
in formula (4), v (x) is the brightness of the image, and DCP (x) refers to the value of the dark channel of the image.
L_CHP = ||CHP(I) - CHP(R)||^2    (5)
In equation (5), CHP denotes the combustion histogram prior, and CHP(I) and CHP(R) denote the CHP values of the region selected by the target detection algorithm and of the region labeled in the ground truth, respectively.
The final loss function is a weighted sum of three different loss functions: the cross-entropy loss, the mean square error loss and the combustion histogram prior loss. The formula is:
L_total = β·L_CE + γ·L_MSE + δ·L_CHP    (6)
In equation (6), L_total is the final loss function, L_CE is the cross-entropy loss function, L_MSE is the mean square error loss function, L_CHP is the combustion histogram prior loss, and β, γ and δ are set to 0.25, 0.25 and 0.5 respectively.
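The weighted sum in equation (6) is then a three-term blend. A minimal sketch with the reported weights; the sample loss values are illustrative:

```python
def total_loss(l_ce, l_mse, l_chp, beta=0.25, gamma=0.25, delta=0.5):
    # Equation (6): L_total = beta*L_CE + gamma*L_MSE + delta*L_CHP,
    # with beta = gamma = 0.25 and delta = 0.5 as reported above.
    return beta * l_ce + gamma * l_mse + delta * l_chp

# Illustrative values for the three component losses.
loss = total_loss(l_ce=0.7, l_mse=0.25, l_chp=0.1)
# loss == 0.2875
```

Note that the weights sum to 1, so the combined loss stays on the same scale as its components.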
S50: inputting the fire image to be detected into the trained LFNet model, the LFNet model outputting the fire localization area and fire type of the fire image to be detected.
Referring to fig. 6, a schematic diagram of a fire detection system based on video image object detection according to an embodiment of the application is shown. The fire detection system 40 based on video image object detection according to an embodiment of the present application includes:
The data set construction module 41: for converting an original natural image into dust-haze and sand-dust images by adopting a data enhancement algorithm based on an atmospheric scattering model, and generating a data set for training the model;
LFNet model training module 42: the method comprises the steps of constructing a convolutional neural network model LFNT, inputting the data set into the LFNT model for iterative training, and obtaining optimal model parameters; the skeleton feature extraction model adopts convolution of 3, 5 and 7 to 7 scales to extract the features of the input image to obtain feature images with the sizes of 13, 26 and 52; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature graphs with the sizes of 52, 26 and 13; the variable scale feature fusion model maps three groups of feature images to different convolution kernels and step sizes for convolution, and splices all convolutions with the same size to obtain three groups of feature images, and the three groups of feature images are operated by using a channel-based attention mechanism to obtain feature images with the sizes of 13, 26 and 52, which are respectively used for detecting small, medium and large objects; the inputting the data set into the LFNet model for iterative training further comprises: respectively selecting a mean square error and a cross entropy as a loss function to perform model optimization;
the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of an input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs a detection result;
model optimization module 43: configured to select the mean square error and the cross entropy, respectively, as loss functions for model optimization.
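The variable-scale fusion step described by the modules above — splicing same-sized feature maps along the channel axis and weighting them with a channel-based attention mechanism — can be sketched as follows. This is a minimal illustration, not the patented implementation: the attention gate here is a parameter-free sigmoid over each channel's global average (the patent only states that a channel-based attention mechanism is applied), and the function names are our own.

```python
import numpy as np

def channel_attention(feats):
    """Apply a channel-wise attention gate to a feature map of
    shape (C, H, W). The parameter-free squeeze-and-sigmoid gate
    used here is a simplifying assumption."""
    pooled = feats.mean(axis=(1, 2))         # (C,) global average pool
    weights = 1.0 / (1.0 + np.exp(-pooled))  # sigmoid gate per channel
    return feats * weights[:, None, None]    # reweight each channel

def fuse_same_size(maps):
    """Splice same-sized feature maps along the channel axis and
    gate the result, mirroring the 'splice all convolutions of the
    same size, then apply channel attention' step of the fusion
    model."""
    stacked = np.concatenate(maps, axis=0)   # channel-wise splice
    return channel_attention(stacked)
```

For example, splicing a 4-channel and a 2-channel 13×13 map yields a gated 6-channel 13×13 map, one of the three output groups used for small-object detection.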
Fig. 7 is a schematic diagram of a terminal structure according to an embodiment of the application. The terminal 50 includes a processor 51, a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the fire detection method based on video image object detection described above.
The processor 51 is operative to execute the program instructions stored in the memory 52 to control fire detection based on video image object detection.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capabilities. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Fig. 8 is a schematic structural diagram of a storage medium according to an embodiment of the application. The storage medium of the embodiment of the present application stores a program file 61 capable of implementing all of the methods described above. The program file 61 may be stored in the storage medium in the form of a software product, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code, or a terminal device such as a computer, server, mobile phone, or tablet.
According to the fire detection method, system, terminal and storage medium based on video image target detection of the embodiments of the application, the original image is converted into dust-haze or sand-dust images of different densities using a data enhancement algorithm based on the atmospheric scattering model, a data set for training the model is generated, and a convolutional neural network model LFNet suited to fire smoke detection in uncertain environments is constructed, so that the robustness of the model in abnormal weather such as sand dust and haze can be improved and better detection results obtained. Meanwhile, the LFNet model of the embodiments of the application has a smaller size, which reduces computation cost and facilitates deployment on resource-limited devices.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A fire detection method based on video image target detection, comprising:
converting an original natural image into a dust-haze image and a sand-dust image using a data enhancement algorithm based on the atmospheric scattering model, and generating a data set for training the model;
constructing a convolutional neural network model LFNet, and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters; the skeleton feature extraction model applies 3×3, 5×5 and 7×7 convolutions to extract features from the input image, obtaining feature maps of sizes 13×13, 26×26 and 52×52; the main feature extraction model performs further feature extraction on the main features, generating three groups of feature maps of sizes 52×52, 26×26 and 13×13; the variable-scale feature fusion model maps the three groups of feature maps through convolutions with different kernel sizes and strides, splices all convolutions of the same size into three groups of feature maps, and operates on them with a channel-based attention mechanism to obtain feature maps of sizes 13×13, 26×26 and 52×52, used respectively for detecting small, medium and large objects; inputting the data set into the LFNet model for iterative training further comprises: selecting the mean square error and the cross entropy, respectively, as loss functions for model optimization;
the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of an input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs a detection result;
and inputting the fire image to be detected into the trained LFNet model, and outputting, through the LFNet model, the fire localization area and the fire type of the fire image to be detected.
2. The fire detection method based on video image object detection according to claim 1, wherein, before converting the original natural image into the dust-haze image and the sand-dust image using the data enhancement algorithm based on the atmospheric scattering model, the method comprises:
acquiring an original natural image; the original natural image includes a non-alarm image without a fire alarm area and a real fire alarm image.
3. The fire detection method based on video image object detection according to claim 1 or 2, wherein the converting the original natural image into the dust-haze image using the data enhancement algorithm based on the atmospheric scattering model comprises:
the atmospheric scattering model adopts at least two transmission rates to simulate and generate dust-haze images of different densities; the dust-haze image imaging formula is as follows:
I(x) = J(x)t(x) + a(1 - t(x))
in the above formula, I(x) is the simulated haze image, J(x) is the input haze-free image, a is the atmospheric light value, and t(x) is the scene transmission rate.
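The haze imaging formula above can be applied directly as a data-augmentation step. The following sketch assumes images normalized to [0, 1]; the function name and the default atmospheric light value are illustrative assumptions, not values disclosed in the patent.

```python
import numpy as np

def synthesize_haze(clear, t, a=0.8):
    """Simulate a hazy image via the atmospheric scattering model
    I(x) = J(x) * t(x) + a * (1 - t(x)).

    clear : float array in [0, 1], the haze-free image J(x)
    t     : scalar or per-pixel transmission rate t(x) in (0, 1]
    a     : atmospheric light value (scalar); 0.8 is illustrative
    """
    clear = np.asarray(clear, dtype=np.float64)
    return clear * t + a * (1.0 - t)
```

Choosing several transmission rates, as the claim describes, yields haze of different densities: the lower t is, the more the output drifts toward the atmospheric light value a.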
4. A fire detection method based on video image object detection as defined in claim 3, wherein said converting the original natural image into a dust image using a data enhancement algorithm based on an atmospheric scattering model comprises:
the atmospheric scattering model adopts a fixed transmission rate and atmospheric light value, combined with three color values, to simulate and generate sand-dust images of different densities; the sand-dust image simulation formula is as follows:
D(x) = J(x)t(x) + a(C(x)(1 - t(x)))
in the above formula, D(x) is the simulated sand-dust image, J(x) is the input haze-free image, and C(x) is the color value.
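The sand-dust variant differs from the haze formula only in that a per-channel color C(x) tints the airlight term. A sketch under the same [0, 1] assumptions; the yellowish default color triple is illustrative and not taken from the patent.

```python
import numpy as np

def synthesize_dust(clear, t, a=1.0, color=(0.8, 0.6, 0.3)):
    """Simulate a sand-dust image via the modified scattering model
    D(x) = J(x) * t(x) + a * (C(x) * (1 - t(x))).

    clear : float array (H, W, 3) in [0, 1], the haze-free image J(x)
    t     : transmission rate t(x)
    a     : atmospheric light value
    color : per-channel tint C(x); the default is an assumed
            sand-like triple, not a value from the patent
    """
    clear = np.asarray(clear, dtype=np.float64)
    c = np.asarray(color, dtype=np.float64)  # broadcasts over (H, W, 3)
    return clear * t + a * (c * (1.0 - t))
```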
5. The fire detection method based on video image object detection according to claim 1, wherein the loss function is specifically:
counting the brightness, dark channel values and R channel data of the fire region, treating these statistics as the combustion histogram prior (CHP), and writing the formula of the CHP:
in the above formula, R(x) represents the R channel of the image, SCP(x) is the difference between the brightness and the dark channel of the image, w is the width of the histogram, and h is the height of the histogram;
SCP(x) = ||v(x) - DCP(x)||
in the above formula, v (x) is the brightness of the image, and DCP (x) is the value of the dark channel of the image;
L_CHP = ||CHP(I) - CHP(R)||^2
in the above formula, CHP denotes the combustion histogram prior, and CHP(I) and CHP(R) denote the CHP values of the region selected by the target detection algorithm and of the labeled region, respectively;
the loss function is a weighted summation of three different loss functions:
L = βL_CE + γL_MSE + δL_CHP
in the above formula, L is the final loss function, L_CE is the cross entropy loss function, L_MSE is the mean square error loss function, and L_CHP is the combustion histogram prior loss.
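The SCP gap, the squared CHP discrepancy and the weighted loss combination above can be sketched as follows. The weight values β, γ and δ are illustrative assumptions (the patent does not disclose them), as are the function names.

```python
import numpy as np

def scp(v, dcp):
    """SCP(x) = ||v(x) - DCP(x)||: the gap between image brightness
    v(x) and the dark channel value DCP(x)."""
    return np.abs(v - dcp)

def chp_loss(chp_pred, chp_ref):
    """L_CHP = ||CHP(I) - CHP(R)||^2, comparing the combustion
    histogram prior of the detected region (I) against that of
    the labeled region (R)."""
    return float(np.sum((chp_pred - chp_ref) ** 2))

def total_loss(l_ce, l_mse, l_chp, beta=1.0, gamma=1.0, delta=0.1):
    """Final loss L = beta*L_CE + gamma*L_MSE + delta*L_CHP.
    The default weights are assumed values for illustration."""
    return beta * l_ce + gamma * l_mse + delta * l_chp
```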
6. A fire detection system based on video image object detection, comprising:
the data set construction module: configured to convert an original natural image into a dust-haze image and a sand-dust image using a data enhancement algorithm based on the atmospheric scattering model, generating a data set for training the model;
LFNet model training module: configured to construct the convolutional neural network model LFNet and input the data set into the LFNet model for iterative training, obtaining optimal model parameters; the skeleton feature extraction model applies 3×3, 5×5 and 7×7 convolutions to extract features from the input image, obtaining feature maps of sizes 13×13, 26×26 and 52×52; the main feature extraction model performs further feature extraction on the main features, generating three groups of feature maps of sizes 52×52, 26×26 and 13×13; the variable-scale feature fusion model maps the three groups of feature maps through convolutions with different kernel sizes and strides, splices all convolutions of the same size into three groups of feature maps, and operates on them with a channel-based attention mechanism to obtain feature maps of sizes 13×13, 26×26 and 52×52, used respectively for detecting small, medium and large objects; inputting the data set into the LFNet model for iterative training further comprises: selecting the mean square error and the cross entropy, respectively, as loss functions for model optimization;
the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of an input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs a detection result; the detection result comprises a fire localization area and a fire type of the fire image.
7. A terminal, comprising a processor and a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the video image object detection-based fire detection method of any one of claims 1 to 5;
the processor is configured to execute the program instructions stored by the memory to control fire detection based on video image object detection.
8. A storage medium storing program instructions executable by a processor for performing the fire detection method based on video image object detection according to any one of claims 1 to 5.
CN202011069784.5A 2020-09-30 2020-09-30 Fire detection method, system, terminal and storage medium based on video image target detection Active CN112215122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011069784.5A CN112215122B (en) 2020-09-30 2020-09-30 Fire detection method, system, terminal and storage medium based on video image target detection


Publications (2)

Publication Number Publication Date
CN112215122A CN112215122A (en) 2021-01-12
CN112215122B true CN112215122B (en) 2023-10-24

Family

ID=74052827


Country Status (1)

Country Link
CN (1) CN112215122B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113506293B (en) * 2021-09-08 2021-12-07 成都数联云算科技有限公司 Image processing method, device, equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107749067A (en) * 2017-09-13 2018-03-02 华侨大学 Fire hazard smoke detecting method based on kinetic characteristic and convolutional neural networks
KR20180050832A (en) * 2016-11-07 2018-05-16 한국과학기술원 Method and system for dehazing image using convolutional neural network
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
CN108256496A (en) * 2018-02-01 2018-07-06 江南大学 A kind of stockyard smog detection method based on video
CN108428324A (en) * 2018-04-28 2018-08-21 温州大学激光与光电智能制造研究院 The detection device of smog in a kind of fire scenario based on convolutional network
CN108764264A (en) * 2018-03-16 2018-11-06 深圳中兴网信科技有限公司 Smog detection method, smoke detection system and computer installation
CN109063728A (en) * 2018-06-20 2018-12-21 燕山大学 A kind of fire image deep learning mode identification method
CN109522819A (en) * 2018-10-29 2019-03-26 西安交通大学 A kind of fire image recognition methods based on deep learning
CN110490043A (en) * 2019-06-10 2019-11-22 东南大学 A kind of forest rocket detection method based on region division and feature extraction
CN111046827A (en) * 2019-12-20 2020-04-21 哈尔滨理工大学 Video smoke detection method based on convolutional neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7280696B2 (en) * 2002-05-20 2007-10-09 Simmonds Precision Products, Inc. Video detection/verification system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FireNet: A specialized lightweight fire and smoke detection model for real time IoT application; Arpit Jadon et al.; arXiv:1905.11922; 1-6 *
Implementation method of multi-perspective 3D simulated city maps; Ren Peng et al.; Geography and Geo-Information Science; Vol. 27, No. 3; 34-37 *


Similar Documents

Publication Publication Date Title
WO2022067668A1 (en) Fire detection method and system based on video image target detection, and terminal and storage medium
Hu et al. Fast forest fire smoke detection using MVMNet
Kim et al. High-speed drone detection based on yolo-v8
CN112689843B (en) Closed loop automatic data set creation system and method
CN113807276B (en) Smoking behavior identification method based on optimized YOLOv4 model
CN113076809A (en) High-altitude falling object detection method based on visual Transformer
CN110598558A (en) Crowd density estimation method, device, electronic equipment and medium
TWI667621B (en) Face recognition method
CN115457428A (en) Improved YOLOv5 fire detection method and device integrating adjustable coordinate residual attention
Jiang et al. A self-attention network for smoke detection
CN111738054A (en) Behavior anomaly detection method based on space-time self-encoder network and space-time CNN
CN111582074A (en) Monitoring video leaf occlusion detection method based on scene depth information perception
CN111611889A (en) Miniature insect pest recognition device in farmland based on improved convolutional neural network
CN113627504B (en) Multi-mode multi-scale feature fusion target detection method based on generation of countermeasure network
CN111881915A (en) Satellite video target intelligent detection method based on multiple prior information constraints
CN112215122B (en) Fire detection method, system, terminal and storage medium based on video image target detection
CN113158963B (en) Method and device for detecting high-altitude parabolic objects
CN114399734A (en) Forest fire early warning method based on visual information
Bhise et al. Plant disease detection using machine learning
CN113505643A (en) Violation target detection method and related device
CN115205793B (en) Electric power machine room smoke detection method and device based on deep learning secondary confirmation
CN116977256A (en) Training method, device, equipment and storage medium for defect detection model
CN114170269B (en) Multi-target tracking method, equipment and storage medium based on space-time correlation
CN108804981B (en) Moving object detection method based on long-time video sequence background modeling frame
Shen et al. Lfnet: Lightweight fire smoke detection for uncertain surveillance environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant