CN112949453A - Training method of smoke and fire detection model, smoke and fire detection method and smoke and fire detection equipment - Google Patents


Info

Publication number
CN112949453A
CN112949453A (application CN202110215838.2A)
Authority
CN
China
Prior art keywords
firework
smoke
motion
weak
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110215838.2A
Other languages
Chinese (zh)
Other versions
CN112949453B (en)
Inventor
曹毅超
孙飞
施燕平
李溯
陈斌锋
封晓强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANJING ENBO TECHNOLOGY CO LTD
Original Assignee
NANJING ENBO TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NANJING ENBO TECHNOLOGY CO LTD filed Critical NANJING ENBO TECHNOLOGY CO LTD
Priority to CN202110215838.2A priority Critical patent/CN112949453B/en
Publication of CN112949453A publication Critical patent/CN112949453A/en
Application granted granted Critical
Publication of CN112949453B publication Critical patent/CN112949453B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection


Abstract

The invention discloses a training method for a smoke and fire detection model, a smoke and fire detection method, and smoke and fire detection equipment, belonging to the technical field of smoke and fire detection. The training method comprises the following steps: constructing a video firework sample data set; inputting an input image into a texture branch for feature extraction to obtain a multi-scale feature representation, and fusing the multi-scale representation into texture features through a feature pyramid; calculating a frame difference image between the input image and a reference image, and inputting the frame difference image into a motion branch to calculate a motion attention weight map; performing motion-perception enhancement on the texture features; generating a weak firework target mask with a weak guiding attention module; and obtaining a firework feature map from the motion-perception-enhanced texture features and the weak firework target mask, and detecting the firework target. A smoke and fire detection method using the trained model, and equipment for performing the detection method, are also proposed. The method effectively improves firework detection accuracy, has low computational cost, and is convenient to deploy.

Description

Training method of smoke and fire detection model, smoke and fire detection method and smoke and fire detection equipment
Technical Field
The invention belongs to the technical field of smoke and fire detection, and particularly relates to a training method of a smoke and fire detection model, a smoke and fire detection method and smoke and fire detection equipment.
Background
Fire not only causes property loss but also seriously endangers people's lives. Once a major or extremely large fire occurs, the direct economic loss is often enormous; when fire breaks out in leadership organs, communication hubs, foreign-related units, ancient buildings, scenic spots and similar areas, it often causes serious political impact, affecting the whole country or even drawing worldwide attention. With the development of deep learning, computer vision technology has advanced greatly; deep learning has succeeded in fields such as target detection, behavior recognition and super-resolution, and using computer vision to detect fire and smoke has therefore attracted extensive attention in both academia and industry.
However, firework targets differ from general rigid-body detection targets: their edges are blurred and semi-transparent, and they belong to a special class of fluid targets. In addition, their color and texture may vary greatly under different lighting conditions. Existing firework detection methods can be divided, by the dimensionality of the input data, into image-based and video-based methods. Image-based methods typically focus on static information of the firework target such as texture, edges and contours, while video-based methods focus more on dynamic characteristics of the firework target such as diffusion speed and frequency variation. Lacking dynamic information, image-based detection algorithms generally suffer higher false-negative and false-positive rates than video-based ones, so many existing firework detection methods are video-based. For a video-based smoke detection method, both the accuracy of detection and, in most application scenarios, the ease of deployment must be considered.
Disclosure of Invention
The technical problem is as follows: aiming at the low accuracy of existing video-based firework detection methods, the invention first provides a training method for a firework detection model, so that a model with higher recognition accuracy can be trained; then, based on the trained smoke and fire detection model, a smoke and fire detection method capable of accurately identifying smoke and fire is provided; further, equipment for implementing the detection method is proposed, enabling deployment and accurate detection of smoke and fire. In addition, the invention has low computational cost and is convenient to deploy.
The technical scheme is as follows: in one aspect, the invention provides a method for training a smoke and fire detection model, the smoke and fire detection model comprising a texture branch and a motion branch, the method comprising:
constructing a video firework sample data set, wherein the video firework sample data set comprises a plurality of samples, and each sample comprises an input image and a reference image;
performing feature extraction on an input image input texture branch to obtain multi-scale feature representation, and fusing the multi-scale feature representation into texture features through a feature pyramid;
calculating a frame difference image of the input image and the reference image, and inputting the frame difference image into a motion branch to calculate a motion attention weight map;
performing motion perception enhancement on the texture features;
generating a weak firework target mask with a weak attention module;
and obtaining a firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception is enhanced, and detecting the firework target.
Further, the method for inputting the frame difference image into the motion branch to calculate the motion attention weight map comprises the following steps:
down-sampling the frame difference image;
graying the down-sampled frame difference image;
and inputting the grayed image into a standard residual block, and calculating to obtain a motion attention weight map.
Further, the method for enhancing motion perception of the texture features comprises the following steps:
F_m = F_a + F_a * A_m
where F_m is the texture feature after motion-perception enhancement, F_a is the texture feature before enhancement, and A_m is the motion attention weight map; the multiplication is performed channel by channel, with each single element of A_m multiplying the channel elements of F_a at that position.
Further, the method of generating a weak pyrotechnic target mask with a weak attention module includes:
randomly sampling pixel points of a plurality of smoke and fire targets and non-smoke and fire targets in a data labeling frame of an input image to construct a smoke and fire pixel data set;
training a random forest model by taking RGB (red, green and blue) channels as features;
classifying the pixels in the labeling boxes of the input image by using the trained random forest model to obtain the masks inside the labeling boxes;
and representing firework-region pixels by 1 and non-firework-region pixels by 0, pasting each in-box mask onto an all-zero mask at its labeling-box position to obtain the complete weak firework target mask.
Further, the method for obtaining the firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception enhancement comprises the following steps:
F_w = F_m * A_w + F_r
where F_w denotes the firework feature map, A_w denotes the weak firework target mask, F_m denotes the texture feature after motion-perception enhancement, and F_r denotes the compensation residual.
Further, the samples in the video pyrotechnic sample data set comprise positive and negative samples, wherein:
the positive samples are frames with firework targets in the video, any frame is selected as an input image for each positive sample, one frame is randomly extracted in the range of 200 ms-5 s before and after the frame to serve as a corresponding reference image, and firework target boundary box labeling is carried out on the input image;
and the negative samples are frames without firework targets in the video, any frame is selected as an input image for each negative sample, and one frame is randomly extracted as a corresponding reference image in the range of 200 ms-5 s before and after the frame.
In another aspect, the invention provides a smoke and fire detection method, wherein a smoke and fire detection model is obtained by training by using the training method, and comprises a texture branch and a motion branch; the method comprises the following steps:
acquiring a firework video image, randomly extracting one frame in a preset time range before and after the frame as a corresponding reference image for any frame of video input image, and calculating a frame difference image between the input image and the reference image;
inputting an input image into a texture branch to obtain multi-scale feature representation, and fusing the multi-scale feature representation into texture features through a feature pyramid;
inputting the frame difference image into the motion branch to calculate a motion attention weight map;
performing motion perception enhancement on the texture features;
predicting a weak firework target mask according to the texture features after the motion perception is enhanced;
and obtaining a firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception is enhanced, and detecting a firework target.
Further, the method for calculating the motion attention weight map according to the frame difference image input motion branch comprises the following steps:
down-sampling the frame difference image;
graying the down-sampled frame difference image;
and inputting the grayed image into a standard residual block, and calculating to obtain a motion attention weight map.
Further, the method for enhancing motion perception of the texture features comprises the following steps:
F_m = F_a + F_a * A_m
where F_m is the texture feature after motion-perception enhancement, F_a is the texture feature before enhancement, and A_m is the motion attention weight map; the multiplication is performed channel by channel, with each single element of A_m multiplying the channel elements of F_a at that position.
Further, the method for predicting the weak firework target mask according to the texture features after the motion perception enhancement comprises the following steps:
performing optimization with a standard Focal Loss.
Further, the method for obtaining the smoke and fire feature map according to the texture features and the weak smoke and fire target masks after the motion perception enhancement comprises the following steps:
F_w = F_m * A_w + F_r
where F_w denotes the firework feature map, A_w denotes the weak firework target mask, F_m denotes the texture feature after motion-perception enhancement, and F_r denotes the compensation residual.
In yet another aspect, the present invention provides a smoke and fire detection apparatus comprising:
the image acquisition device is used for acquiring firework video images;
a processor; and
a memory having stored therein computer program instructions which, when executed by the processor, cause the processor to perform the pyrotechnic detection method of any of claims 7-10.
Advantageous effects: compared with the prior art, the invention has the following advantages:
(1) The model trained by this training method for a smoke and fire detection model has a two-branch network, which avoids the dynamic-information loss of traditional single-frame detection methods and highlights the dynamic characteristics of smoke and fire regions, giving the model strong recognition of inconspicuous firework targets and strong discrimination against interference factors. The trained model therefore recognizes smoke and fire with high accuracy, and can identify them accurately when applied to smoke and fire detection. Moreover, the trained model needs only two input frames to detect a firework target accurately, which keeps its computational cost low and facilitates engineering deployment.
In addition, a weak guiding attention module is introduced during model training, which generates masks for semi-transparent smoke regions and avoids the inconsistency that manual labeling of semi-transparent smoke targets may produce. Using the weak guiding attention module together with a multi-task learning strategy raises the model's pixel-level attention to firework targets, so the model uses the training data more fully and detects firework targets more accurately.
(2) The firework detection method uses the firework detection model trained by the training method of the invention. The trained model's two-branch network avoids the dynamic-information loss of traditional single-frame detection, highlights the dynamic characteristics of firework regions, recognizes inconspicuous firework targets strongly and resists interference factors, so firework targets are detected more accurately. The model's high pixel-level attention to firework targets makes fuller use of the training data, further improving detection accuracy, and its low computational cost facilitates application and deployment of the method.
(3) The smoke and fire detection equipment provided by the invention can detect smoke and fire targets accurately; when deployed in a specific application scenario, it enables fires to be discovered in time, reducing the casualties and property loss they cause.
Drawings
FIG. 1 is a network architecture framework diagram in an embodiment of the invention;
FIG. 2 is a flow chart of a pyrotechnic detection model training method in an embodiment of the invention;
FIG. 3 is a flow diagram of a weak lead attention module generating a weak pyrotechnic target mask during model training in an embodiment of the invention;
FIG. 4 is a graph of the results of the visualization of five samples at different stages during model training according to an embodiment of the present invention;
FIG. 5 is a flow chart of a method of smoke and fire detection in an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following examples and the accompanying drawings.
First, the structure of the smoke and fire detection model of an embodiment of the invention is described. As shown in fig. 1, the model is a two-branch network, called a two-frame motion-aware backbone network, comprising a texture branch that processes the input image and a motion branch that processes the frame difference image. In one embodiment of the invention, the texture branch adopts the network structure of MobileNetV3, and the motion branch comprises a down-sampling module, a graying module and a convolutional-neural-network residual block; one embodiment adopts a residual block of MobileNetV2. It should be noted that in other embodiments, those skilled in the art can replace the networks in the texture branch and the motion branch with other existing networks.
With reference to fig. 1 and 2, an embodiment of the training method of the smoke detection model of the present invention is described, the training method comprising:
s100: and constructing a video firework sample data set. In the embodiment of the invention, the video firework sample data set comprises a plurality of positive samples and negative samples, wherein the positive samples are frames with firework targets in a video, any frame is selected as an input image for each positive sample, one frame is randomly extracted in the range of 200 ms-5 s before and after the frame as a corresponding reference image, and firework target boundary box labeling is carried out on the input image; the negative sample is a frame without a firework target in the video, and the input image and the reference image of the negative sample are extracted by the same method without marking the input image of the negative sample.
S110: Inputting the input image into the texture branch for feature extraction to obtain a multi-scale feature representation, and fusing the multi-scale representation into texture features through a feature pyramid. In one embodiment of the present invention, the texture branch adopts the network structure of MobileNetV3 and produces feature representations at four scales, formulated as:
{F_1, F_2, F_3, F_4} = f_TB(IF)
where f_TB denotes the computation of the texture branch and IF is the input image, with dimensions of 3 × H × W; the number of color image channels is 3, and H and W are the height and width of the input image. In this embodiment, after the texture branch performs feature extraction on the input image, feature representations at four scales are obtained. The four-scale representations are then fused by the feature pyramid FPN into the texture feature F_a:
F_a = FPN(F_1, F_2, F_3, F_4)
In the embodiment of the invention a standard feature pyramid is adopted: the multi-scale representation extracted by the texture branch is fused by the feature pyramid, yielding the texture feature F_a of the input image with size C × (H/4) × (W/4), where C is the number of feature channels.
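As a minimal sketch of this coarse-to-fine fusion, the following NumPy code repeatedly upsamples the coarser map and adds it to the next finer one; the lateral 1 × 1 convolutions of a standard FPN are omitted and all scales are assumed to already share one channel count, so `fuse_pyramid` and `upsample2x` are illustrative simplifications, not the patent's exact network:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x spatial upsampling of a (C, H, W) map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_pyramid(features):
    """Fuse multi-scale maps (ordered coarse -> fine, equal channel
    count assumed) into one texture feature by repeated 2x upsampling
    and elementwise addition, FPN-style."""
    fused = features[0]
    for finer in features[1:]:
        fused = upsample2x(fused) + finer
    return fused
```

The fused output inherits the spatial size of the finest input map, matching the texture feature F_a described above.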
S120: and calculating a frame difference image of the input image and the reference image, and inputting the frame difference image into the motion branch to calculate a motion attention weight map.
In one embodiment of the invention, the frame difference image is obtained by subtracting the reference image from the input image, and it is calculated in order to highlight the changed regions in the firework image. Because the information in the frame difference image is sparse, processing it at full resolution would incur a large computation load; therefore, in one embodiment of the invention, after the frame difference image is input into the motion branch, the 3 × H × W frame difference image is first down-sampled to 3 × (H/4) × (W/4) and then grayed to 1 × (H/4) × (W/4). The grayed image is input into a standard residual block, and the motion attention weight map A_m is obtained by calculation. The calculation flow of the motion branch can be formulated as follows:
A_m = f_MC{grayscale[downscale(IF - RF)]}
where A_m denotes the motion attention weight map, downscale denotes the fourfold down-sampling in length and width, grayscale denotes the graying of the color frame-difference image, f_MC denotes the calculation of the standard residual block of the motion branch, IF denotes the input image, RF the reference image, and IF - RF the frame difference image. In this embodiment, the computed motion attention weight map A_m has size 1 × (H/4) × (W/4).
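The pre-processing part of this pipeline (frame difference, fourfold down-sampling, graying) can be sketched in NumPy as below; the standard residual block f_MC is omitted, and the function name, the average-pooling downscale and the luminance weights are illustrative assumptions:

```python
import numpy as np

def motion_branch_preprocess(input_frame, reference_frame):
    """Frame difference -> fourfold spatial down-sampling -> graying,
    i.e. grayscale[downscale(IF - RF)]; the residual block f_MC that
    turns this into the motion attention weight map A_m is omitted."""
    diff = input_frame.astype(np.float32) - reference_frame.astype(np.float32)
    h, w = diff.shape[:2]
    # 4 x 4 average pooling as the length/width fourfold down-sampling
    down = diff[:h - h % 4, :w - w % 4].reshape(h // 4, 4, w // 4, 4, 3).mean(axis=(1, 3))
    # luminance-weighted graying of the colour frame-difference image
    return down @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
```

A 3-channel H × W frame pair thus becomes a single-channel (H/4) × (W/4) map, matching the 1 × (H/4) × (W/4) input expected by the residual block.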
It should be noted that, in other embodiments of the present invention, steps S110 and S120 may be performed synchronously, or step S120 may be performed first, and then step S110 may be performed.
S130: Performing motion-perception enhancement on the texture features. After the attention weight map A_m is obtained, it is used to perform motion-perception enhancement on the texture feature F_a of the input image. In one embodiment of the invention, the enhancement method is:
F_m = F_a + F_a * A_m
where F_m is the texture feature after motion-perception enhancement, F_a is the texture feature before enhancement, and A_m is the motion attention weight map; the multiplication is performed channel by channel, with each single element of A_m multiplying the channel elements of F_a at that position. F_m has the same size as F_a. Through this process, the features corresponding to the moving regions of the image are enhanced.
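Assuming F_a is stored as a (C, H, W) array and A_m as an (H, W) array, the channel-by-channel enhancement is a single broadcast operation (`motion_enhance` is an illustrative name):

```python
import numpy as np

def motion_enhance(F_a, A_m):
    """F_m = F_a + F_a * A_m: broadcast the (H, W) motion attention
    weight map across the C channels of the (C, H, W) texture feature,
    so each element of A_m multiplies the channel elements at its
    spatial position."""
    return F_a + F_a * A_m[np.newaxis, :, :]
```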
S140: a weak pyrotechnic target mask is generated with a weak attention module. In an embodiment of the invention, in order to raise the attention of the model to the pyrotechnic region, a weak guidance attention module is introduced for generating a weak pyrotechnic target mask to guide the identification of the model to the pyrotechnic region. Specifically, in one embodiment of the present invention, the method of generating the weak pyrotechnic target mask in the weak attention module is, as shown in fig. 3:
s1401: randomly sampling pixel points of a plurality of smoke and fire targets and non-smoke and fire targets in a data labeling frame of an input image to construct a smoke and fire pixel data set;
s1402: training a random forest model by taking RGB (red, green and blue) channels as features;
s1403: classifying pixels in a label box in the input image by using the trained random forest model to obtain a mask in the label box, for example, in one embodiment, in this way, the mask in the label box shown in fig. 1 is obtained
Figure BDA0002953065040000071
S1404: and (3) representing the pixel value of the firework area by 1, representing the pixel value of the non-firework area by 0, and pasting the mask in the marking frame to a mask with the total number of 0 according to the position of the marking frame to obtain a complete weak firework target mask. In one embodiment of the invention, the size of the weak pyrotechnic object mask corresponds to the size of the sports attention weight map, e.g. when a sports betThe gravity graph has the size of
Figure BDA0002953065040000072
The size of the weak pyrotechnic target mask is also
Figure BDA0002953065040000073
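Steps S1403 and S1404 can be sketched as follows; here `classify_pixels` stands in for the trained random-forest model, and the function name and the (x0, y0, x1, y1) box format are illustrative assumptions, not from the patent:

```python
import numpy as np

def build_weak_mask(image, boxes, classify_pixels):
    """Classify the pixels inside each labeling box (e.g. with a random
    forest trained on R, G, B values) and paste the per-box masks into
    an all-zero full-image mask. `classify_pixels` maps an (N, 3) array
    of RGB values to 0/1 labels."""
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)   # 0 = non-firework everywhere
    for (x0, y0, x1, y1) in boxes:
        patch = image[y0:y1, x0:x1].reshape(-1, 3)
        labels = classify_pixels(patch).reshape(y1 - y0, x1 - x0)
        mask[y0:y1, x0:x1] = labels           # 1 = firework pixel
    return mask
```

In practice the image would be the feature-map-resolution view, so the resulting mask matches the size of the motion attention weight map.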
S150: and obtaining a firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception is enhanced, and detecting the firework target. Specifically, in the embodiment of the present invention, the specific method is:
F_w = F_m * A_w + F_r
where F_w denotes the firework feature map, A_w denotes the weak firework target mask, F_m denotes the texture feature after motion-perception enhancement, and F_r denotes the compensation residual.
As this method shows, the weak firework target mask A_w is used once more to adjust the feature map F_m. The compensation residual F_r is present because F_m * A_w may lose part of the detail information, so a compensation component is added to preserve the detail features. The resulting weakly guided firework feature map F_w still has the same size as A_m.
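With F_m and F_r as (C, H, W) arrays and A_w as an (H, W) mask, the weak-guidance step is again a single broadcast expression (`weak_guided_features` is an illustrative name):

```python
import numpy as np

def weak_guided_features(F_m, A_w, F_r):
    """F_w = F_m * A_w + F_r: the weak firework target mask re-weights
    the motion-enhanced texture feature, and the compensation residual
    F_r restores detail the masking may suppress."""
    return F_m * A_w[np.newaxis, :, :] + F_r
```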
During model training, A_w and F_r can be predicted from the motion-perception-enhanced texture features; the prediction can be treated like a multi-task supervision process and optimized with a standard Focal Loss to obtain A_w and F_r. During training, however, A_w is produced by the weak guiding attention module rather than predicted, while F_r is obtained by prediction.
In the embodiment of the invention, firework target detection can be performed using an anchor-free CenterNet detection head.
FIG. 4 shows five samples at different stages of model training: row (a) shows the input images, row (b) the residual (frame-difference) images, row (c) the motion attention weight maps A_m, row (d) the weak firework target masks A_w, and row (e) the firework feature maps F_w. Of the five sample columns, the first three are positive samples and the last two are negative samples.
Based on the training method, a firework detection model with high detection accuracy can be trained, and by using the trained firework detection model, the invention provides a firework detection method, as shown in fig. 5, the method comprises the following steps:
s200: acquiring a firework video image, randomly extracting one frame in a set time range before and after the frame image as a corresponding reference image for any frame of video input image, and calculating a frame difference image between the input image and the reference image. In the implementation process of the specific method, video acquisition equipment such as a camera and the like can be adopted to acquire a firework video image, and in a general situation, a firework detection system deployed in a specific scene works in real time and needs to continuously detect surrounding scenes, so that each frame of the acquired firework video image is input into a firework detection model for detection, and therefore, one frame can be randomly extracted as a corresponding reference image within a range of 200ms to 5s before and after the frame of image.
S210: inputting an input image into a texture branch to obtain multi-scale feature representation, and fusing the multi-scale feature representation into texture features through a feature pyramid; specifically, the specific operation of this step is the same as the operation of step S110 in the model training method, and is not described here again.
S220: inputting the frame difference image into a motion branch circuit to calculate a motion attention weight map; the specific operation of this step is the same as the operation mode of step S220 in the model training method, and the frame difference image is first down-sampled, then the down-sampled frame difference image is grayed, and finally the grayed image is input into the standard residual block, and the motion attention weight map is obtained by calculation, and more specifically, this is not repeated here.
As in the training process of the model, in other embodiments steps S210 and S220 may be performed simultaneously, or S220 may be performed before S210.
S230: performing motion perception enhancement on the texture features; the specific operation of this step is the same as the operation of step S220 in the model training method, and is not described herein again.
S240: predicting a weak firework target mask according to the texture features after the motion perception is enhanced; unlike the training method of the firework detection model, when firework target detection is performed, a weak firework target mask needs to be predicted through the texture features after motion perception enhancement, and a weak guiding attention module is not used for generating the weak firework attention mask. In one embodiment of the invention, the weak Firework attention mask A is predicted using a standard Focal local optimization, similar to the process of multitask supervisionw. While predicting the weak firework attention mask, the compensation residual F is predicted at the same timer
S250: and obtaining a firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception is enhanced, and detecting a firework target. Specifically, the specific operation of this step is the same as the operation of step S110 in the model training method, and is not described here again. In the embodiment of the invention, the smoke and fire target detection is carried out by adopting a CenterNet detection head of Anchor-free.
The firework detection method uses the firework detection model trained by the training method provided by the invention. Because the trained model has a two-way network, it avoids the loss of dynamic information of traditional single-frame detection methods and highlights the dynamic characteristics of the firework region; it thus has strong recognition capability for inconspicuous firework targets, strong discrimination capability against interference factors, and high recognition accuracy for smoke and fire, so firework targets can be detected more accurately. In addition, the model pays a high degree of pixel-level attention to the firework target, making fuller use of the training data and further improving detection accuracy. Finally, the model has low computational cost, which facilitates application and deployment of the method.
Further, the invention provides smoke and fire detection equipment comprising an image acquisition device, a processor and a memory. The image acquisition device is used for acquiring smoke and fire video images; for example, a camera can be used. The memory stores computer program instructions which, when executed by the processor, cause the processor to perform the smoke and fire detection method of an embodiment of the invention.
There may be one or more processors; each may be a Central Processing Unit (CPU) or another form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
There may likewise be one or more memories, and the memory may be any form of computer-readable storage medium, such as volatile and/or non-volatile memory. Volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. Non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor to implement the smoke and fire detection method of the embodiments of the present application described above. The memory may also store images generated at various stages of the smoke and fire detection process.
The smoke and fire detection equipment provided by the invention can detect smoke and fire targets accurately. When deployed in a specific application scene, it enables fires to be discovered in time, reducing the casualties and property loss caused by fire.
The above examples are only preferred embodiments of the present invention. It should be noted that various modifications and equivalents apparent to those skilled in the art may be made without departing from the spirit of the invention, and all such modifications and equivalents are intended to fall within the scope of the invention as defined in the claims.

Claims (12)

1. A method of training a smoke detection model, the smoke detection model comprising a texture branch and a motion branch, the method comprising:
constructing a video firework sample data set, wherein the video firework sample data set comprises a plurality of samples, and each sample comprises an input image and a reference image;
inputting the input image into the texture branch for feature extraction to obtain a multi-scale feature representation, and fusing the multi-scale feature representation into texture features through a feature pyramid;
calculating a frame difference image of the input image and the reference image, and inputting the frame difference image into a motion branch to calculate a motion attention weight map;
performing motion perception enhancement on the texture features;
generating a weak firework target mask with a weak attention module;
and obtaining a firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception is enhanced, and detecting the firework target.
2. The training method of claim 1, wherein the method of inputting the frame difference image into the motion branch to calculate the motion attention weight map comprises:
down-sampling the frame difference image;
graying the down-sampled frame difference image;
and inputting the grayed image into a standard residual block, and calculating to obtain a motion attention weight map.
3. The training method according to claim 1, wherein the method for enhancing the texture features by motion perception is as follows:
Fm=Fa+Fa*Am
wherein Fm is the texture feature after motion perception enhancement, Fa is the texture feature before motion perception enhancement, and Am is the motion attention weight map; the multiplication is channel-by-channel, a single element of Am being multiplied with the channel elements at that location in Fa.
4. The training method according to claim 1, wherein the method of generating a weak firework target mask using the weak guide attention module comprises:
randomly sampling pixel points of a plurality of smoke and fire targets and non-smoke and fire targets in a data labeling frame of an input image to construct a smoke and fire pixel data set;
training a random forest model by taking RGB (red, green and blue) channels as features;
classifying pixels in a labeling frame in the input image by using a trained random forest model to obtain a mask code in the labeling frame;
and representing firework-region pixels by a value of 1 and non-firework-region pixels by a value of 0, and pasting the mask within the labeling frame onto an all-zero mask according to the position of the labeling frame to obtain the complete weak firework target mask.
5. The training method according to claim 1, wherein the method of obtaining a firework feature map from the motion-perception-enhanced texture features and the weak firework target mask is:
Fw=Fm*Aw+Fr
wherein Fw represents the firework feature map, Aw represents the weak firework target mask, Fm represents the texture feature after motion perception enhancement, and Fr represents the compensation residual.
6. A training method as claimed in any one of claims 1-5, wherein the samples in the video pyrotechnic sample data set comprise positive and negative samples, wherein:
the positive samples are frames with firework targets in the video, any frame is selected as an input image for each positive sample, one frame is randomly extracted in the range of 200 ms-5 s before and after the frame to serve as a corresponding reference image, and firework target boundary box labeling is carried out on the input image;
and the negative samples are frames without firework targets in the video, any frame is selected as an input image for each negative sample, and one frame is randomly extracted as a corresponding reference image in the range of 200 ms-5 s before and after the frame.
7. A smoke and fire detection method, characterized in that a smoke and fire detection model is obtained by training according to the training method of any one of claims 1 to 6, wherein the smoke and fire detection model comprises a texture branch and a motion branch; the method comprises the following steps:
acquiring a firework video image, randomly extracting one frame in a preset time range before and after the frame as a corresponding reference image for any frame of video input image, and calculating a frame difference image between the input image and the reference image;
inputting an input image into a texture branch to obtain multi-scale feature representation, and fusing the multi-scale feature representation into texture features through a feature pyramid;
inputting the frame difference image into a motion branch circuit to calculate a motion attention weight map;
performing motion perception enhancement on the texture features;
predicting a weak firework target mask according to the texture features after the motion perception is enhanced;
and obtaining a firework characteristic diagram according to the texture characteristics and the weak firework target mask after the motion perception is enhanced, and detecting a firework target.
8. The method of claim 7, wherein the method of computing the motion attention weight map from the frame difference image input motion branch comprises:
down-sampling the frame difference image;
graying the down-sampled frame difference image;
and inputting the grayed image into a standard residual block, and calculating to obtain a motion attention weight map.
9. The method of claim 7, wherein the method for enhancing motion perception of the texture feature comprises:
Fm=Fa+Fa*Am
wherein Fm is the texture feature after motion perception enhancement, Fa is the texture feature before motion perception enhancement, and Am is the motion attention weight map; the multiplication is channel-by-channel, a single element of Am being multiplied with the channel elements at that location in Fa.
10. The method according to claim 7, wherein the method of predicting a weak smoke target mask from motion perception enhanced texture features comprises:
performing optimization using a standard Focal Loss.
11. The method according to claim 10, wherein the method for obtaining the smoke feature map according to the texture feature and the weak smoke target mask after the motion perception enhancement comprises:
Fw=Fm*Aw+Fr
wherein Fw represents the firework feature map, Aw represents the weak firework target mask, Fm represents the texture feature after motion perception enhancement, and Fr represents the compensation residual.
12. A smoke detection device, comprising:
the image acquisition device is used for acquiring firework video images;
a processor; and
a memory having stored therein computer program instructions which, when executed by the processor, cause the processor to perform the pyrotechnic detection method of any of claims 7-11.
CN202110215838.2A 2021-02-26 2021-02-26 Training method of smoke and fire detection model, smoke and fire detection method and equipment Active CN112949453B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110215838.2A CN112949453B (en) 2021-02-26 2021-02-26 Training method of smoke and fire detection model, smoke and fire detection method and equipment

Publications (2)

Publication Number Publication Date
CN112949453A true CN112949453A (en) 2021-06-11
CN112949453B CN112949453B (en) 2023-12-26

Family

ID=76246365


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469050A (en) * 2021-07-01 2021-10-01 安徽大学 Flame detection method based on image subdivision classification
CN113870254A (en) * 2021-11-30 2021-12-31 中国科学院自动化研究所 Target object detection method and device, electronic equipment and storage medium
CN116468974A (en) * 2023-06-14 2023-07-21 华南理工大学 Smoke detection method, device and storage medium based on image generation

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130094699A1 (en) * 2011-10-12 2013-04-18 Industry Academic Cooperation Foundation Keimyung University Forest fire smoke detection method using random forest classification
CN105139429A (en) * 2015-08-14 2015-12-09 大连理工大学 Fire detecting method based on flame salient picture and spatial pyramid histogram
CN110874592A (en) * 2019-10-21 2020-03-10 南京信息职业技术学院 Forest fire smoke image detection method based on total bounded variation
CN111145222A (en) * 2019-12-30 2020-05-12 浙江中创天成科技有限公司 Fire detection method combining smoke movement trend and textural features
CN111464814A (en) * 2020-03-12 2020-07-28 天津大学 Virtual reference frame generation method based on parallax guide fusion
CN111523410A (en) * 2020-04-09 2020-08-11 哈尔滨工业大学 Video saliency target detection method based on attention mechanism
CN111860398A (en) * 2020-07-28 2020-10-30 河北师范大学 Remote sensing image target detection method and system and terminal equipment
CN111860504A (en) * 2020-07-20 2020-10-30 青岛科技大学 Visual multi-target tracking method and device based on deep learning
DE102020118241A1 (en) * 2019-07-22 2021-01-28 Samsung Electronics Co., Ltd. VIDEO DEPTH ESTIMATION BASED ON TEMPORAL ATTENTION





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant