CN114387557A - Deep learning-based method and system for detecting smoking and calling of gas station - Google Patents

Info

Publication number
CN114387557A
Authority
CN
China
Prior art keywords
human head
image
smoking
deep learning
state machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210070796.2A
Other languages
Chinese (zh)
Inventor
李平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yi Tai Fei Liu Information Technology LLC
Original Assignee
Yi Tai Fei Liu Information Technology LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yi Tai Fei Liu Information Technology LLC
Priority to CN202210070796.2A
Publication of CN114387557A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a deep learning-based method and system for detecting smoking and phone calls at a gas station. The method first obtains a monitoring video stream of the gas station and frames it to obtain a plurality of video frame images; defines a target region of interest, inputs the video frame images into a deep learning target detection model to detect human head regions within the target region of interest, and crops the corresponding human head images from the video frame images according to the human head regions; and feeds the cropped human head images into an image classification model for classification to identify whether smoking and calling behaviors are present in the human head images. The invention combines human head detection, a behavior classification model and a finite state machine so that the models complement one another, and can therefore maintain high robustness and high accuracy in complex open environments and under varying occlusion and illumination. By combining human head target detection with image classification, the smoking and calling behavior recognition method can judge images quickly and effectively.

Description

Deep learning-based method and system for detecting smoking and calling of gas station
Technical Field
The invention relates to the technical field of machine learning, in particular to a method and a system for detecting smoking and calling of a gas station based on deep learning.
Background
With the rapid development of China's transportation network, more and more gas stations are being built. However, dangerous behaviors such as smoking and making phone calls frequently occur at gas stations, so considerable manpower and material resources must be invested to regulate behavior at the station and protect people and property, and detection of dangerous behaviors that relies on human observers is prone to oversights caused by fatigue and other factors. Manual monitoring of abnormal behavior therefore cannot meet the safety requirements of gas stations. Intelligent analysis, as a more accurate and effective technology, is proving its worth in more and more fields, and real-time intelligent analysis of gas station monitoring video by artificial intelligence algorithms is one of its important application scenarios.
Existing methods for detecting smoking and phone calls are mainly based on human body pose estimation, which estimates the human body pose in each video frame image of a video stream and analyzes the pose to judge behaviors such as smoking and calling. Alternatively, dangerous behaviors such as smoking and calling at the gas station are monitored manually, but this approach is inefficient and prone to oversights. In addition, detecting and computing human body key points for pose estimation is complex, so the algorithm consumes substantial computing resources and it is difficult to meet real-time requirements. Furthermore, when too many people appear in the video stream, human poses are missed, and when people are far from the camera, the detected key point information becomes confused.
Disclosure of Invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a deep learning-based method and system for detecting smoking and phone calls at a gas station, in order to solve the problems of poor robustness, tedious processing and poor intuitiveness in existing methods for detecting smoking and phone calls.
In order to achieve the above objects and other related objects, the present invention provides a method for detecting smoking and phone call of a gas station based on deep learning, comprising the following steps:
acquiring a monitoring video stream of a gas station, framing the monitoring video stream, and acquiring a plurality of video frame images;
dividing an interested target area, inputting the video frame image into a deep learning target detection model to detect a human head area in the interested target area, and cutting out a corresponding human head image from the video frame image according to the human head area;
and sending the cut human head image into an image classification model for classification, and identifying whether the human head image has smoking and calling behaviors.
Optionally, after recognizing that smoking and calling behaviors exist in the human head image, the method further includes:
taking the smoking and calling behaviors as dangerous behaviors, and triggering a Finite State Machine (FSM) to carry out verification analysis based on the dangerous behaviors;
and obtaining a verification analysis result, and judging whether the monitoring video stream of the gas station has smoking and calling behaviors or not by combining the classification result of the image classification model.
Optionally, the process of taking the smoking and calling behaviors as dangerous behaviors and triggering a finite state machine FSM to perform verification analysis based on the dangerous behaviors includes:
if the image classification model identifies that smoking and calling behaviors exist in the human head image, treating the smoking and calling behaviors as dangerous behaviors, triggering the finite state machine (FSM) check based on the dangerous behaviors, and judging whether a state machine similar to the human head image exists among all current state machines;
if yes, adding 1 to the count of the existing state machine;
if not, creating a new state machine with an initial count of 1;
after all human head images in the current frame have been judged, deleting the state machine information for heads that do not exist in the current frame;
judging whether the count of the state machine for the current head reaches a preset threshold T; if the count reaches the threshold T, the verification result is true, and whether smoking and calling behaviors exist in the monitoring video stream of the gas station is judged in combination with the classification result of the image classification model; if the count does not reach the threshold T, the verification result is false.
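The per-head counting logic described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the text does not say how a head is judged "similar" to an existing state machine, so bounding-box overlap (IoU) is assumed here, and the names `HeadFSMTracker`, `HeadState`, `min_iou` and `threshold_t` are hypothetical. The sketch also assumes that only heads classified as dangerous in the current frame are passed in, so a state machine is dropped as soon as its head is no longer flagged.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass(eq=False)  # identity-based equality: each state is a distinct object
class HeadState:
    box: Box
    count: int = 1


class HeadFSMTracker:
    """Keeps one counter per head and reports heads whose count reaches T."""

    def __init__(self, threshold_t: int, min_iou: float = 0.3):
        self.threshold_t = threshold_t  # preset threshold T from the text
        self.min_iou = min_iou          # assumed similarity criterion
        self.states: List[HeadState] = []

    def update(self, dangerous_boxes: List[Box]) -> List[Box]:
        """Process the dangerous heads of one frame; return confirmed boxes."""
        remaining = list(self.states)
        kept: List[HeadState] = []
        for box in dangerous_boxes:
            match = None
            for state in remaining:
                if iou(state.box, box) >= self.min_iou:
                    match = state
                    break
            if match is not None:
                remaining.remove(match)  # each state matches at most one head
                match.box = box
                match.count += 1         # existing state machine: count + 1
            else:
                match = HeadState(box=box)  # new state machine, count = 1
            kept.append(match)
        # State machines for heads absent from this frame are deleted.
        self.states = kept
        return [s.box for s in kept if s.count >= self.threshold_t]
```

With `threshold_t=3`, a head flagged in three successive frames is reported on the third frame, while a single misclassified frame never raises a result on its own.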
Optionally, the process of inputting the video frame image into a deep learning target detection model to detect a human head region in the target region of interest, and cutting out a corresponding human head image from the video frame image according to the human head region includes:
using a human head detection model based on the YOLOv3 model in the Darknet framework as the deep learning target detection model, and performing human head detection on the acquired video frame images;
and enlarging each detected human head bounding box, and cropping the corresponding region within the preset target region of interest to obtain the human head image.
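The enlargement and cropping step can be illustrated with a short sketch. The patent does not give an enlargement factor or a coordinate convention, so the `scale` value of 1.5, the `(x1, y1, x2, y2)` pixel box layout, the rectangular region of interest and the helper names `enlarge_and_clip` and `crop` are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2/y2 exclusive


def enlarge_and_clip(head: Box, roi: Box, scale: float = 1.5) -> Optional[Box]:
    """Enlarge a head box about its center, then clip it to the ROI.

    Returns None when the enlarged box lies entirely outside the ROI.
    """
    x1, y1, x2, y2 = head
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    rx1, ry1, rx2, ry2 = roi
    nx1 = max(rx1, int(round(cx - half_w)))
    ny1 = max(ry1, int(round(cy - half_h)))
    nx2 = min(rx2, int(round(cx + half_w)))
    ny2 = min(ry2, int(round(cy + half_h)))
    if nx2 <= nx1 or ny2 <= ny1:
        return None
    return (nx1, ny1, nx2, ny2)


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the box out of an image stored as a list of pixel rows."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]
```

Enlarging the box before cropping keeps context around the head (a hand holding a cigarette or a phone) in the image passed to the classifier, which is a plausible reason for the step, although the source does not state the motivation.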
Optionally, before sending the cut-out human head image into an image classification model for classification, the method further includes:
carrying out sample classification on the human head image, and dividing the human head image into a normal sample, a smoking sample and a calling sample;
taking the classified samples as training data of an image classification model;
inputting the training data into a ResNet18 network for training to obtain a corresponding image classification model; during training, a triplet loss function is used as a regularization term, with samples of the same class used as positive samples and samples of different classes used as negative samples.
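As a rough illustration of how a triplet loss can act as a regularization term, the sketch below adds a weighted triplet term to a classification loss. The source names only the triplet loss; the use of cross-entropy as the main loss, the Euclidean distance, and the `margin` and `weight` values are assumptions. Embeddings and logits are plain Python lists so the example runs without a deep learning framework.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 0.2) -> float:
    """max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The positive shares the anchor's class (e.g. both "smoking");
    the negative comes from a different class (e.g. "normal").
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def total_loss(logits: Sequence[float], label: int,
               anchor: Sequence[float], positive: Sequence[float],
               negative: Sequence[float],
               weight: float = 0.1, margin: float = 0.2) -> float:
    """Classification loss plus the weighted triplet regularization term."""
    return (cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

The triplet term is zero once same-class embeddings are closer to the anchor than different-class embeddings by at least the margin, so it only penalizes the network while the three classes remain poorly separated in feature space.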
Optionally, the method further comprises alerting when smoking and calling activities are present.
The invention also provides a deep learning-based gas station smoking and phone call detection system, which comprises:
a video image acquisition module, used for acquiring a monitoring video stream of a gas station, framing the monitoring video stream and acquiring a plurality of video frame images;
a region-of-interest setting module, used for dividing an interested target area;
the human head target detection module is used for inputting the video frame image into a deep learning target detection model to detect a human head area in the interested target area and cutting out a corresponding human head image from the video frame image according to the human head area;
the image classification module is used for sending the cut human head images into an image classification model for classification and identifying whether smoking and calling behaviors exist in the human head images;
the finite state machine module is used for triggering a finite state machine FSM to carry out verification analysis according to smoking and calling behaviors;
and the event alarm module is used for acquiring a verification analysis result and early warning by combining a classification result of the image classification model.
Optionally, the process that the finite-state machine module triggers the finite-state machine FSM to perform verification analysis according to smoking and calling behaviors includes:
if the image classification model identifies that smoking and calling behaviors exist in the human head image, treating the smoking and calling behaviors as dangerous behaviors, triggering the finite state machine (FSM) check based on the dangerous behaviors, and judging whether a state machine similar to the human head image exists among all current state machines;
if yes, adding 1 to the count of the existing state machine;
if not, creating a new state machine with an initial count of 1;
after all human head images in the current frame have been judged, deleting the state machine information for heads that do not exist in the current frame;
judging whether the count of the state machine for the current head reaches a preset threshold T; if the count reaches the threshold T, the verification result is true, and whether smoking and calling behaviors exist in the monitoring video stream of the gas station is judged in combination with the classification result of the image classification model; if the count does not reach the threshold T, the verification result is false.
Optionally, the process of inputting the video frame image into a deep learning target detection model by the human head target detection module to detect a human head region in the target region of interest, and cutting out a corresponding human head image from the video frame image according to the human head region includes:
using a human head detection model based on the YOLOv3 model in the Darknet framework as the deep learning target detection model, and performing human head detection on the acquired video frame images;
and enlarging each detected human head bounding box, and cropping the corresponding region within the preset target region of interest to obtain the human head image.
Optionally, before the image classification module sends the cut-out human head image to an image classification model for classification, the method further includes:
carrying out sample classification on the human head image, and dividing the human head image into a normal sample, a smoking sample and a calling sample;
taking the classified samples as training data of an image classification model;
inputting the training data into a ResNet18 network for training to obtain a corresponding image classification model; during training, a triplet loss function is used as a regularization term, with samples of the same class used as positive samples and samples of different classes used as negative samples.
As described above, the present invention provides a deep learning-based method and system for detecting smoking and phone calls at a gas station, which have the following advantages: a monitoring video stream of the gas station is first acquired and framed to obtain a plurality of video frame images; a target region of interest is defined, the video frame images are input into a deep learning target detection model to detect human head regions within the target region of interest, and the corresponding human head images are cropped from the video frame images according to the human head regions; and the cropped human head images are fed into an image classification model for classification to identify whether smoking and calling behaviors are present in the human head images. The invention combines human head detection, a behavior classification model and a finite state machine so that the models complement one another, and can therefore maintain high robustness and high accuracy in complex open environments and under varying occlusion and illumination. By combining human head target detection with image classification, the smoking and calling behavior recognition method can judge images quickly and effectively. The finite state machine-based judgment module further improves the accuracy and precision of behavior recognition.
Drawings
FIG. 1 is a schematic flow chart of a deep learning-based method for detecting smoking and phone calls at a gas station according to an embodiment;
FIG. 2 is a schematic flow chart of a deep learning-based method for detecting smoking and phone calls at a gas station according to another embodiment;
FIG. 3 is a schematic hardware configuration diagram of a deep learning-based system for detecting smoking and phone calls at a gas station according to an embodiment.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
Referring to fig. 1, the present invention provides a method for detecting a smoking and phone call in a gas station based on deep learning, comprising the following steps:
S100, acquiring a monitoring video stream of a gas station, framing the monitoring video stream, and acquiring a plurality of video frame images. As an example, the monitoring video stream of the gas station is acquired by an analog camera or a digital camera.
S200, defining a target region of interest, inputting the video frame images into a deep learning target detection model to detect human head regions within the target region of interest, and cropping the corresponding human head images from the video frame images according to the human head regions. As an example, a human head detection model based on the YOLOv3 model in the Darknet framework is used as the deep learning target detection model to perform human head detection on the acquired video frame images; each detected human head bounding box is then enlarged, and the corresponding region within the preset target region of interest is cropped to obtain the human head image.
S300, sending the cut human head images into an image classification model for classification, and identifying whether smoking and calling behaviors exist in the human head images. When smoking and calling behaviors exist, the embodiment can also give an alarm.
According to the above description, in an exemplary embodiment, after recognizing that smoking and calling behaviors exist in the human head image, the method further includes: taking the smoking and calling behaviors as dangerous behaviors, and triggering a finite state machine (FSM) to carry out verification analysis based on the dangerous behaviors; and obtaining a verification analysis result, and judging whether smoking and calling behaviors exist in the monitoring video stream of the gas station by combining the verification analysis result with the classification result of the image classification model. Specifically, if the image classification model identifies that smoking and calling behaviors exist in the human head image, the behaviors are treated as dangerous behaviors and the FSM check is triggered: it is judged whether a state machine similar to the human head image exists among all current state machines; if yes, the count of the existing state machine is incremented by 1; if not, a new state machine is created with an initial count of 1; after all human head images in the current frame have been judged, the state machine information for heads that do not exist in the current frame is deleted; it is then judged whether the count of the state machine for the current head reaches a preset threshold T; if the count reaches the threshold T, the verification result is true, and whether smoking and calling behaviors exist in the monitoring video stream of the gas station is judged in combination with the classification result of the image classification model; if the count does not reach the threshold T, the verification result is false.
According to the above description, in an exemplary embodiment, before sending the cropped human head image to the image classification model for classification, the method further includes: carrying out sample classification on the human head images, and dividing them into normal samples, smoking samples and calling samples; taking the classified samples as training data of an image classification model; and inputting the training data into a ResNet18 network for training to obtain a corresponding image classification model. During training, a triplet loss function is used as a regularization term, with samples of the same class used as positive samples and samples of different classes used as negative samples.
In a specific exemplary embodiment, as shown in fig. 2, the embodiment further provides a deep learning-based smoking and phone call detection method for a gas station, which includes the following steps:
Step 201, detecting, by a deep learning target detection method, whether a human head exists in the target area under the gas station camera; if a human head is detected, inputting each detected human head image into step 202; otherwise, ending.
Step 202, receiving the detection result of step 201 and judging whether the detected human head image shows the dangerous behavior of smoking or making a call; if so, sending the judgment information to the state machine in step 203; otherwise, ending.
Step 203, judging whether a state machine similar to the human head image exists among all current state machines; if so, adding one to the count of the existing state machine; if not, creating a new state machine and initializing its count to 1. After all human head images in the current frame have been judged, deleting the state machine information for heads that do not exist in the current frame.
Step 204, judging whether the count of the state machine for the current head reaches a preset threshold T; if the count reaches the threshold T, the judgment is true and the judgment information is output to step 205; otherwise, ending.
Step 205, receiving the judgment result of step 204, and giving a dangerous behavior judgment result after comprehensive analysis.
In an exemplary embodiment, the embodiment further provides a method for detecting a smoking and calling behavior of a gas station based on deep learning, which includes the specific steps of:
(1) Acquiring images from the monitoring video stream of the gas station.
(2) Defining a target region of interest, sending the image into a deep learning target detection model to detect human heads in the target region of interest, and cropping the human head images.
(3) Sending the cropped human head images into an image classification module for classification, and identifying smoking, calling and normal behaviors.
(4) If the classification model identifies a dangerous behavior in step (3), triggering the finite state machine (FSM) to perform further verification analysis.
(5) Comprehensively analyzing the results of steps (3) and (4), judging whether smoking and calling exist in the monitoring video of the gas station, and giving an alarm.
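Steps (1) to (5) can be tied together in a short orchestration sketch. It is only an outline under stated assumptions: `detect_heads` and `classify_head` are hypothetical callables standing in for the YOLOv3 head detector and the ResNet18 classifier, the label strings are invented for the example, and the per-head state machines are collapsed into a single streak counter over consecutive frames for brevity.

```python
from typing import Any, Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]
DANGEROUS = {"smoking", "calling"}  # assumed label names


def run_pipeline(frames: Iterable[Any],
                 detect_heads: Callable[[Any], List[Box]],
                 classify_head: Callable[[Any, Box], str],
                 threshold_t: int) -> List[int]:
    """Return the indices of frames at which an alarm would be raised."""
    alarms: List[int] = []
    streak = 0  # consecutive frames containing a dangerous head
    for index, frame in enumerate(frames):          # step (1)
        boxes = detect_heads(frame)                 # step (2)
        labels = [classify_head(frame, box) for box in boxes]  # step (3)
        if any(label in DANGEROUS for label in labels):
            streak += 1                             # step (4), simplified
        else:
            streak = 0
        if streak >= threshold_t:                   # step (5)
            alarms.append(index)
    return alarms
```

Passing the detector and classifier in as callables keeps the control flow testable with simple stubs before any trained model is available.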
In summary, the present invention provides a deep learning-based method for detecting smoking and phone calls at a gas station. The method first obtains a monitoring video stream of the gas station and frames it to obtain a plurality of video frame images; defines a target region of interest, inputs the video frame images into a deep learning target detection model to detect human head regions within the target region of interest, and crops the corresponding human head images from the video frame images according to the human head regions; and feeds the cropped human head images into an image classification model for classification to identify whether smoking and calling behaviors are present in the human head images. The method combines human head detection, a behavior classification model and a finite state machine so that the models complement one another, and can therefore maintain high robustness and high accuracy in complex open environments and under varying occlusion and illumination. By combining human head target detection with image classification, the method can judge images quickly and effectively. The finite state machine-based judgment module further improves the accuracy and precision of behavior recognition.
As shown in fig. 3, the present invention further provides a deep learning-based smoking and phone call detection system for a gas station, comprising:
a video image acquisition module 101, used for acquiring a monitoring video stream of a gas station, framing the monitoring video stream and acquiring a plurality of video frame images;
a region-of-interest setting module 102, used for defining a target region of interest;
a human head target detection module 103, used for inputting the video frame images into a deep learning target detection model to detect human head regions within the target region of interest, and cropping the corresponding human head images from the video frame images according to the human head regions. As an example, this process includes: using a human head detection model based on the YOLOv3 model in the Darknet framework as the deep learning target detection model, and performing human head detection on the acquired video frame images; and enlarging each detected human head bounding box, and cropping the corresponding region within the preset target region of interest to obtain the human head image.
an image classification module 104, used for feeding the cropped human head images into an image classification model for classification, and identifying whether smoking and calling behaviors are present in the human head images. As an example, before the image classification module sends the cropped human head image to the image classification model for classification, the process further includes: carrying out sample classification on the human head images, and dividing them into normal samples, smoking samples and calling samples; taking the classified samples as training data of an image classification model; and inputting the training data into a ResNet18 network for training to obtain a corresponding image classification model. During training, a triplet loss function is used as a regularization term, with samples of the same class used as positive samples and samples of different classes used as negative samples.
a finite state machine module 105, used for triggering the finite state machine (FSM) to carry out verification analysis according to smoking and calling behaviors. As an example, this process includes: if the image classification model identifies that smoking and calling behaviors exist in the human head image, treating the behaviors as dangerous behaviors, triggering the FSM check, and judging whether a state machine similar to the human head image exists among all current state machines; if yes, adding 1 to the count of the existing state machine; if not, creating a new state machine with an initial count of 1; after all human head images in the current frame have been judged, deleting the state machine information for heads that do not exist in the current frame; judging whether the count of the state machine for the current head reaches a preset threshold T; if the count reaches the threshold T, the verification result is true, and whether smoking and calling behaviors exist in the monitoring video stream of the gas station is judged in combination with the classification result of the image classification model; if the count does not reach the threshold T, the verification result is false.
And the event alarm module 106 is used for acquiring a verification analysis result and early warning by combining a classification result of the image classification model.
As shown in fig. 3, the present invention further provides a deep learning-based smoking and phone call detection system for a gas station, comprising: the system comprises a video image acquisition module 101, a region of interest setting module 102, a target detection module 103, an image classification module 104, a finite state machine module 105 and a result alarm module 106.
The video image acquisition module 101 is used for acquiring a gas station real-time scene image from a monitoring camera (including an analog camera, a digital camera, and the like).
The region of interest 102 is set, and in order to reduce the complexity of calculation and consider the head quality problem of the detection region, the invention firstly needs to divide an obvious detection region in the image.
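One way to restrict detection to a delimited region is to keep only the head boxes whose center falls inside it. The sketch below assumes the region is given as a polygon of pixel vertices and uses the standard ray-casting test; the polygon representation and the helper names `point_in_polygon` and `heads_in_roi` are illustrative choices, since the source does not specify how the region is represented.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def heads_in_roi(boxes: Sequence[Box], polygon: Sequence[Point]) -> List[Box]:
    """Keep the head boxes whose center lies inside the region of interest."""
    kept: List[Box] = []
    for x1, y1, x2, y2 in boxes:
        if point_in_polygon((x1 + x2) / 2.0, (y1 + y2) / 2.0, polygon):
            kept.append((x1, y1, x2, y2))
    return kept
```

Filtering before classification means heads outside the fueling area never reach the classifier, which is consistent with the stated goal of reducing computation.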
The target detection module 103 detects a human head model by using a YOLOv3 model in Darknet to perform human head detection on the video image acquired by the video image acquisition module 101, performs appropriate amplification according to the detected human head frame, and finally cuts out the human head of the region of interest set by the region of interest 102 according to the target detection frame.
The image classification module 104 uses ResNet18 as a model for image classification, and divides the head image into normal samples, smoking samples and calling samples as training data of the classification model. Because the division of behavior actions such as smoking, calling and the like in the face area of the human head range is small, the design scheme adopts the triple loss function as a regularization item during training, samples of the same class are used as positive samples, samples of different classes are used as negative samples for training, and the improvement of the recognition degree of the model on dangerous actions is facilitated.
The finite state machine module 105, the output result of the image classification at the image classification module 104, serves as a trigger module of the finite state machine. When the image classification module 104 outputs the dangerous action information, the state machine starts and stores the head image, and then records the continuous frame number of the dangerous action of different heads. And if the dangerous action lasts for a period of time, judging that the dangerous action occurs in the monitoring video.
The result alarm module 106 receives the dangerous-action signal output by the finite state machine module 105 and outputs alarm information.
In summary, the present invention provides a deep learning-based system for detecting smoking and phone calls at a gas station. The system first acquires a surveillance video stream of the gas station and splits it into frames to obtain a plurality of video frame images. It then defines a target region of interest, inputs the video frame images into a deep learning target detection model to detect human head regions within that region, and crops the corresponding human head images from the video frame images. The cropped human head images are sent to an image classification model, which identifies whether smoking and calling behaviors are present. By combining human head detection, a behavior classification model and a finite state machine, the system lets the models complement one another, and can therefore remain robust and accurate in complex open environments and under different shading and illumination conditions. Combining human head target detection with image classification for smoking and calling recognition allows images to be judged quickly and effectively, and the finite-state-machine-based judgment module further improves the accuracy and precision of behavior recognition.
The foregoing embodiments merely illustrate the principles and utility of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical ideas disclosed by the present invention shall be covered by the claims of the present invention.

Claims (10)

1. A deep learning-based gas station smoking and calling detection method, characterized by comprising the following steps:
acquiring a monitoring video stream of a gas station, and splitting the monitoring video stream into frames to obtain a plurality of video frame images;
defining a target region of interest, inputting the video frame image into a deep learning target detection model to detect a human head region within the target region of interest, and cropping a corresponding human head image from the video frame image according to the human head region;
and sending the cropped human head image into an image classification model for classification, and identifying whether smoking and calling behaviors are present in the human head image.
2. The deep learning-based gas station smoking and calling detection method according to claim 1, further comprising, after identifying that smoking and calling behaviors are present in the human head image:
taking the smoking and calling behaviors as dangerous behaviors, and triggering a finite state machine (FSM) to perform verification analysis based on the dangerous behaviors;
and obtaining a verification analysis result, and judging, in combination with the classification result of the image classification model, whether smoking and calling behaviors exist in the monitoring video stream of the gas station.
3. The deep learning-based gas station smoking and calling detection method according to claim 2, wherein the process of taking the smoking and calling behaviors as dangerous behaviors and triggering a finite state machine (FSM) to perform verification analysis based on the dangerous behaviors comprises:
if the image classification model identifies that smoking and calling behaviors are present in the human head image, taking the smoking and calling behaviors as dangerous behaviors, triggering all finite state machines (FSMs) based on the dangerous behaviors, and judging whether, among all current state machines, there is a state machine whose human head image is similar to the human head image;
if yes, adding 1 to the count of the existing state machine;
if not, creating a new state machine with an initial count of 1;
after all human head images in the current frame have been judged, deleting the state machine information of any human head that does not appear in the current frame;
judging whether the count of the current human head state machine reaches a preset threshold T; if the count reaches the threshold T, the judgment result is true, and whether smoking and calling behaviors exist in the monitoring video stream of the gas station is judged in combination with the classification result of the image classification model; and if the count does not reach the threshold T, the judgment result is false.
4. The deep learning-based gas station smoking and calling detection method according to claim 1, wherein the process of inputting the video frame image into a deep learning target detection model to detect a human head region within the target region of interest, and cropping a corresponding human head image from the video frame image according to the human head region comprises:
taking a human head detection model based on the YOLOv3 model in Darknet as the deep learning target detection model, and performing human head detection on the acquired video frame image;
and enlarging the detected human head box, and cropping the corresponding human head box from the preset target region of interest as the human head image.
5. The deep learning-based gas station smoking and calling detection method according to claim 4, wherein before the cropped human head image is sent into the image classification model for classification, the method further comprises:
classifying the human head images into normal samples, smoking samples and calling samples;
taking the classified samples as training data of the image classification model;
inputting the training data into a ResNet18 network for training to obtain the corresponding image classification model; during training, a triplet loss function is used as a regularization term, with samples of the same class as positive samples and samples of different classes as negative samples.
6. The deep learning-based gas station smoking and calling detection method according to any one of claims 1 to 5, characterized in that the method further comprises issuing an alarm when smoking and calling behaviors exist.
7. A deep learning-based gas station smoking and calling detection system, characterized by comprising:
a video image acquisition module, configured to acquire a monitoring video stream of a gas station and split the monitoring video stream into frames to obtain a plurality of video frame images;
a region-of-interest setting module, configured to define a target region of interest;
a human head target detection module, configured to input the video frame image into a deep learning target detection model to detect a human head region within the target region of interest, and to crop a corresponding human head image from the video frame image according to the human head region;
an image classification module, configured to send the cropped human head image into an image classification model for classification and to identify whether smoking and calling behaviors are present in the human head image;
a finite state machine module, configured to trigger a finite state machine (FSM) to perform verification analysis according to the smoking and calling behaviors;
and an event alarm module, configured to obtain the verification analysis result and issue an early warning in combination with the classification result of the image classification model.
8. The deep learning-based gas station smoking and calling detection system according to claim 7, wherein the process in which the finite state machine module triggers the finite state machine (FSM) to perform verification analysis according to the smoking and calling behaviors comprises:
if the image classification model identifies that smoking and calling behaviors are present in the human head image, taking the smoking and calling behaviors as dangerous behaviors, triggering all finite state machines (FSMs) based on the dangerous behaviors, and judging whether, among all current state machines, there is a state machine whose human head image is similar to the human head image;
if yes, adding 1 to the count of the existing state machine;
if not, creating a new state machine with an initial count of 1;
after all human head images in the current frame have been judged, deleting the state machine information of any human head that does not appear in the current frame;
judging whether the count of the current human head state machine reaches a preset threshold T; if the count reaches the threshold T, the judgment result is true, and whether smoking and calling behaviors exist in the monitoring video stream of the gas station is judged in combination with the classification result of the image classification model; and if the count does not reach the threshold T, the judgment result is false.
9. The deep learning-based gas station smoking and calling detection system according to claim 7, wherein the process in which the human head target detection module inputs the video frame image into a deep learning target detection model to detect a human head region within the target region of interest, and crops a corresponding human head image from the video frame image according to the human head region comprises:
taking a human head detection model based on the YOLOv3 model in Darknet as the deep learning target detection model, and performing human head detection on the acquired video frame image;
and enlarging the detected human head box, and cropping the corresponding human head box from the preset target region of interest as the human head image.
10. The deep learning-based gas station smoking and calling detection system according to claim 9, wherein before the image classification module sends the cropped human head image into the image classification model for classification, the process further comprises:
classifying the human head images into normal samples, smoking samples and calling samples;
taking the classified samples as training data of the image classification model;
inputting the training data into a ResNet18 network for training to obtain the corresponding image classification model; during training, a triplet loss function is used as a regularization term, with samples of the same class as positive samples and samples of different classes as negative samples.
CN202210070796.2A 2022-01-21 2022-01-21 Deep learning-based method and system for detecting smoking and calling of gas station Pending CN114387557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210070796.2A CN114387557A (en) 2022-01-21 2022-01-21 Deep learning-based method and system for detecting smoking and calling of gas station

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210070796.2A CN114387557A (en) 2022-01-21 2022-01-21 Deep learning-based method and system for detecting smoking and calling of gas station

Publications (1)

Publication Number Publication Date
CN114387557A true CN114387557A (en) 2022-04-22

Family

ID=81202864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210070796.2A Pending CN114387557A (en) 2022-01-21 2022-01-21 Deep learning-based method and system for detecting smoking and calling of gas station

Country Status (1)

Country Link
CN (1) CN114387557A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115713715A (en) * 2022-11-22 2023-02-24 天津安捷物联科技股份有限公司 Human behavior recognition method and system based on deep learning

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115713715A (en) * 2022-11-22 2023-02-24 天津安捷物联科技股份有限公司 Human behavior recognition method and system based on deep learning
CN115713715B (en) * 2022-11-22 2023-10-31 天津安捷物联科技股份有限公司 Human behavior recognition method and recognition system based on deep learning

Similar Documents

Publication Publication Date Title
CN109117827B (en) Video-based method for automatically identifying wearing state of work clothes and work cap and alarm system
CN111191576B (en) Personnel behavior target detection model construction method, intelligent analysis method and system
CN109299703B (en) Method and device for carrying out statistics on mouse conditions and image acquisition equipment
CN108319926A (en) A kind of the safety cap wearing detecting system and detection method of building-site
CN111783744A (en) Operation site safety protection detection method and device
CN112819068B (en) Ship operation violation behavior real-time detection method based on deep learning
CN112232211A (en) Intelligent video monitoring system based on deep learning
WO2022041484A1 (en) Human body fall detection method, apparatus and device, and storage medium
CN101316371B (en) Flame detecting method and device
CN109543607A (en) Object abnormal state detection method, system, monitor system and storage medium
CN111401310B (en) Kitchen sanitation safety supervision and management method based on artificial intelligence
Alzughaibi et al. Review of human motion detection based on background subtraction techniques
CN111079694A (en) Counter assistant job function monitoring device and method
JP6616906B1 (en) Detection device and detection system for defective photographing data
CN112464797A (en) Smoking behavior detection method and device, storage medium and electronic equipment
CN114387557A (en) Deep learning-based method and system for detecting smoking and calling of gas station
CN115690496A (en) Real-time regional intrusion detection method based on YOLOv5
CN117576632B (en) Multi-mode AI large model-based power grid monitoring fire early warning system and method
CN113920585A (en) Behavior recognition method and device, equipment and storage medium
CN114463779A (en) Smoking identification method, device, equipment and storage medium
CN105427303B (en) A kind of vision measurement and method of estimation of substation's legacy
CN112104838A (en) Image distinguishing method, monitoring camera and monitoring camera system thereof
CN113947795B (en) Mask wearing detection method, device, equipment and storage medium
CN114973135A (en) Head-shoulder-based sequential video sleep post identification method and system and electronic equipment
CN110855932A (en) Alarm method and device based on video data, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination