CN112183532A - Safety helmet identification method based on weak supervision collaborative learning algorithm and storage medium - Google Patents


Info

Publication number: CN112183532A
Application number: CN202011042403.4A
Authority: CN (China)
Legal status: Pending
Original language: Chinese (zh)
Inventors: 吴衍, 马碧芳, 郭永宁
Assignee: Fujian Normal University
Prior art keywords: sample, image, instance, supervised, training
Application filed by Fujian Normal University
Priority to CN202011042403.4A
Publication of CN112183532A

Classifications

    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods (neural networks)
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06V20/00 Scenes; Scene-specific elements
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06V2201/07 Target detection


Abstract

The invention belongs to the technical field of deep learning and target identification, and particularly relates to a safety helmet identification method based on a weakly supervised collaborative learning algorithm, and a storage medium. The method comprises the following steps: training a weakly supervised collaborative learning network with image-level label images; inputting the image to be detected into the trained network for target detection to obtain probability vectors; and judging from the probability vectors whether the person in the image to be detected correctly wears a safety helmet. The technical scheme adopts a novel collaborative learning framework: the weakly supervised detection sub-network and the supervised detection sub-network are connected into a unified whole during weakly supervised learning, and a prediction consistency loss enforces agreement between the instance predictions of the two sub-networks, so that the weakly supervised collaborative learning algorithm combines the efficient training of weak supervision with the accurate detection precision of a supervised algorithm.

Description

Safety helmet identification method based on weak supervision collaborative learning algorithm and storage medium
Technical Field
The invention belongs to the technical field of deep learning and target recognition, and particularly relates to a safety helmet recognition method based on a weakly supervised collaborative learning algorithm and a storage medium.
Background
Safety regulations on construction sites require every person entering the site to wear a safety helmet, which is important as the last line of safety defense. However, people entering the site area often go without a helmet out of laziness, forgetfulness, or a trust to luck, creating a serious risk of injury. Helmet detection, identification and reminding are therefore particularly important for site safety precaution: detecting and warning whether on-duty workers wear their helmets as required, and taking safety precautions accordingly, makes it possible to truly achieve informatized safety production management, with prevention beforehand, routine monitoring during work, and standardized management afterwards.
Detection of helmet wearing by site workers is usually carried out at the entrance, but inside the construction area it is difficult to verify whether workers keep their helmets on. The following detection modes are currently common:
(1) Manual inspection: a dedicated person is stationed at each site entrance for inspection, and patrol personnel inspect the working area. This mode consumes human resources, and inspections may be missed.
(2) Reminders based on sensors, chips or tags: sensing devices are embedded in the helmet; when a worker or visitor wearing a helmet enters the site, the access-control reader identifies the sensor, chip or tag, and if no helmet is detected the system triggers a reminder. Although this method detects wearing at the entrance without omission, it cannot tell whether the helmet is worn correctly: a helmet that is merely carried rather than worn passes undetected. Moreover, detection is only possible at the entrance; once inside the site area the method fails, or again requires manpower.
(3) Helmet-wearing behavior detection based on computer video image recognition: cameras are installed at each entrance and important place of the site, and a computer video image recognition algorithm detects the head and helmet positions of workers in the camera footage, then judges whether the helmet is worn, enabling real-time detection and early warning of helmet-wearing behavior. This approach uses a computer to process and identify images, saves the cost of manpower and sensors, and improves detection precision and efficiency; it is the current mainstream method, mostly based on deep learning algorithms, which roughly fall into two types. The first type, two-stage algorithms, first generate candidate regions (region proposals) and then classify them with convolutional neural networks (CNN); typical representatives are R-CNN, Fast R-CNN and the like. The second type, one-stage algorithms, directly predict the object's class and position coordinates; typical representatives are YOLO and SSD.
Existing deep-learning helmet detection algorithms for construction sites are supervised, for example the helmet detection method and system in a dynamic background of Chinese patent CN201810913181, the helmet identification method and device of CN201811570198, the deep-learning helmet wearing identification method and device of CN201910300441, and the deep-learning helmet positioning and color identification method and system of CN201910484745. A supervised algorithm trains a deep-learning network on labeled training samples (i.e. labeled helmet pictures) to obtain an optimal model, which is then used to identify and judge new construction-site helmet image data. The biggest problem in this process is that producing training samples is too costly: supervised training demands bounding-box-level labels, giving not only the class of every training object in the picture but also its position coordinates, and these labels are usually marked manually, which is time-consuming and laborious.
In addition, although the industry is also studying unsupervised algorithms, i.e. training a deep-learning network on unlabeled samples, the learning process is too complicated and difficult, and progress has been slow with no breakthrough. Weakly supervised algorithms arise as a compromise between the two: training samples carry only image-level labels (i.e. only the classes of objects present in the picture) rather than time-consuming bounding-box-level labels, which keeps the learning and training process simple and improves the efficiency of the algorithm. This makes weak supervision an ideal choice for construction-site helmet detection, yet no related research results have appeared so far.
In view of these problems, the invention provides a construction-site safety helmet identification method based on a weakly supervised collaborative learning algorithm, aiming to warn in real time of the dangerous behavior of not wearing a helmet through real-time video analysis and early warning, and to store alarm screenshots and video in a database to form reports.
Disclosure of Invention
One purpose of the invention is to overcome the above defects and provide a safety helmet identification method based on a weakly supervised collaborative learning algorithm, which can analyze, identify, track and raise alarms in real time on whether persons in the production area of a construction site wear safety helmets.
In order to solve the technical problem, the invention provides a safety helmet identification method based on a weak supervision collaborative learning algorithm, which comprises the following steps:
step 1, training a weak supervised collaborative learning algorithm network by adopting an image level label image;
step 2, inputting the image to be detected into the trained weak supervised cooperative learning algorithm network for target detection to obtain a probability vector;
and 3, judging whether the person in the image to be detected correctly wears the safety helmet or not according to the probability vector.
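The three steps above can be sketched as the following hypothetical pipeline. The class and method names are illustrative placeholders, not an implementation from the patent, and the returned values are made-up examples of the probability vectors described in step 2:

```python
# Illustrative skeleton of the three-step method; names and values are
# hypothetical placeholders, not taken from the patent.

class WeaklySupervisedCoLearningNet:
    def __init__(self):
        self.trained = False

    def train(self, image_level_labeled_images):
        # Step 1: train only from image-level labels (each training image
        # is labeled "contains person/helmet", with no box coordinates).
        self.trained = True

    def detect(self, image):
        # Step 2: return one probability vector per detected object: the
        # probability of "person", of "helmet", and position coordinates.
        assert self.trained, "network must be trained first"
        return [
            {"p_person": 0.95, "p_helmet": 0.03, "box": (40, 10, 120, 220)},
            {"p_person": 0.04, "p_helmet": 0.92, "box": (55, 0, 105, 40)},
        ]

net = WeaklySupervisedCoLearningNet()
net.train(["site_001.jpg", "site_002.jpg"])
detections = net.detect("camera_frame.jpg")
# Step 3 (judging correct wearing) consumes these probability vectors.
```
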
Further, the training of the weakly supervised cooperative learning algorithm network by using the image-level label samples comprises the following steps:
step 11, performing feature extraction on the image level label image through the convolution layer and the ROI pooling layer to obtain a training picture packet containing a sample instance;
step 12, training by a weak supervision learning module according to the training picture packet and generating a first example subset of a bounding box level label image;
step 13, the supervised learning module trains according to the training picture packet and the first example subset of the bounding box level label images and generates a second example subset of the bounding box level label images;
and 14, the supervised learning module calculates consistency loss for the first example subset of the boundary box level label image and the second example subset of the boundary box level label image, and updates the network parameters of the supervised learning module according to the calculation result.
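One training round of steps 11-14 can be sketched as follows. This is a minimal sketch: the module functions are stand-ins for the CNN sub-networks, the boxes are made up, and the loss shown is only an IoU-based consistency term assumed for illustration:

```python
# Minimal sketch of one training round (steps 11-14); the module
# functions are stand-ins, and the box values are invented.

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def extract_bag(image):
    # Step 11: conv + ROI pooling -> training picture packet (bag)
    # of candidate-region sample instances.
    return [(0, 0, 50, 50), (10, 10, 60, 60), (200, 200, 250, 250)]

def weak_module(bag):
    # Step 12: weakly supervised module proposes pseudo box-level instances.
    return [bag[0]]

def supervised_module(bag, pseudo_boxes):
    # Step 13: supervised module trains on the pseudo boxes and re-predicts.
    return [bag[1]]

def consistency_loss(ws_boxes, s_boxes):
    # Step 14: penalize disagreement between the two sub-networks.
    return sum(1 - iou(w, s) for w, s in zip(ws_boxes, s_boxes)) / len(ws_boxes)

bag = extract_bag("image_level_labeled.jpg")
loss = consistency_loss(weak_module(bag), supervised_module(bag, weak_module(bag)))
```
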
Further, the weak supervised learning module performs training according to the training picture packet and generates a first result set of bounding box level label images, comprising the following steps:
step 121, a continuous instance selector performs subset division on sample instances in the training picture packets;
step 122, the continuous evaluator evaluates the sample examples in the training picture packets according to the sample example subset division result of the training picture packets;
and step 123, taking the evaluated training picture package as a first result set of the bounding box level label image.
Further, the consecutive instance selector performs subset partitioning on the sample instances in the training picture packet, including the steps of:
step 1211, calculating a score of each sample instance in the training picture packet by the continuous instance selector, wherein the score is calculated by the formula: S(t_{i,j}; σ_s) = σ_s^T (FV_b − FV_f), wherein t_{i,j} represents the jth sample instance in the ith packet, σ_s is a network parameter of the instance selector, FV_b is the Fisher vector of the background in sample instance t_{i,j}, and FV_f is the Fisher vector of the foreground in sample instance t_{i,j};
step 1212, selecting a sample instance which is highest in score and does not belong to any bounding box level label image instance subset from all sample instances of the training picture package, and establishing a new bounding box level label image instance subset;
step 1213, classifying those sample instances whose overlap (intersection ratio) with the highest-scoring sample instance selected in step 1212 is greater than or equal to τ, and which do not yet belong to any bounding box level label image instance subset, into the same bounding box level label image instance subset, wherein τ represents a continuation parameter with a value range of [0,1];
step 1214, calculating an objective function value of the continuous instance selector according to the established bounding box level label image instance subsets, wherein the objective function of the continuous instance selector is:

F(σ_s) = Σ_i l_i · max_{β(τ)} S(t_{i,β(τ)}; σ_s), with S(t_{i,β(τ)}; σ_s) = (1/N_{β(τ)}) Σ_{t_{i,j} ∈ t_{i,β(τ)}} S(t_{i,j}; σ_s)

wherein t_i represents the ith training picture packet, t_{i,β(τ)} represents the β(τ)th instance subset in the ith packet, σ_s is a network parameter of the instance selector, l_i is the positive/negative mark of the ith training picture packet, N_{β(τ)} is the number of instances in subset t_{i,β(τ)}, and t_{i,j} ∈ t_{i,β(τ)};
Step 1215 of determining whether the objective function value of the continuous instance selector is greater than a preset threshold value;
step 1216, if the objective function value of the continuous instance selector is less than or equal to a preset threshold, skipping to step 1212 to continue execution;
step 1217, if the objective function value of the continuous instance selector is greater than the preset threshold, the subset division is completed.
Further, the continuous evaluator evaluates the sample instances in the training picture packets according to the result of dividing the sample instance subsets, using the evaluation formula:

l_{i,j} = { +1, if IOU(t_{i,j}, t_{i,j*}) ≥ (1+τ)/2;  −1, if IOU(t_{i,j}, t_{i,j*}) < (1−τ)/2 }

where the IOU function represents the intersection-over-union between two sample instances, t_{i,j} represents the jth sample instance in the ith packet, l_{i,j} represents the positive/negative mark of sample instance t_{i,j}, t_{i,j*} is the highest-scoring sample instance in the highest-scoring sample instance subset t_{i,β(τ)*}, and τ represents a continuation parameter with a value range of [0,1].
Further, the prediction consistency loss is calculated according to the following formula:

L_c = 1 − IOU(x_ws, x_s)

wherein L_c is the consistency loss, x_ws represents a bounding box level label image sample instance obtained by the weakly supervised learning module, x_s represents the corresponding sample instance obtained by the supervised learning module, and the IOU function represents the intersection-over-union between the two sample instances.
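A minimal numeric sketch, assuming the consistency loss takes the simple IoU-complement form L_c = 1 − IOU(x_ws, x_s), with boxes written as (x1, y1, x2, y2):

```python
# Sketch of an IoU-complement prediction consistency loss (assumed form).

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def prediction_consistency_loss(x_ws, x_s):
    # Zero when the weakly supervised prediction x_ws and the supervised
    # prediction x_s agree exactly; grows as their overlap shrinks.
    return 1.0 - iou(x_ws, x_s)

loss_same = prediction_consistency_loss((0, 0, 100, 100), (0, 0, 100, 100))
loss_off = prediction_consistency_loss((0, 0, 100, 100), (50, 0, 150, 100))
```
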
Further, the probability vector includes: the probability value of identifying the object in the image to be detected as the safety helmet, the probability value of identifying the object in the image to be detected as the person and the position coordinate information of the object in the image to be detected.
Further, the step of judging whether the person in the image to be detected correctly wears the safety helmet according to the probability vector comprises the following steps:
if the probability value of an object in the image being a person and the probability value of another object being a safety helmet both exceed a fixed threshold, and the position of the safety helmet is on top of the person's head, it is judged that the person in the image to be detected correctly wears the safety helmet; otherwise, it is judged that the person does not correctly wear the safety helmet.
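The decision rule can be sketched as follows. The threshold value of 0.5 and the geometric "on top of the head" test (helmet center in the top quarter of the person box) are illustrative assumptions, not values specified by the patent:

```python
# Illustrative decision rule; threshold and geometry test are assumptions.

def correctly_wearing(person, helmet, threshold=0.5):
    # person/helmet: dicts with a class probability and an (x1, y1, x2, y2)
    # box.  Both detections must exceed the fixed probability threshold, and
    # the helmet must sit over the top of the person box (simplified test).
    if person["prob"] <= threshold or helmet["prob"] <= threshold:
        return False
    px1, py1, px2, py2 = person["box"]
    hx1, hy1, hx2, hy2 = helmet["box"]
    overlaps_horizontally = hx1 < px2 and hx2 > px1
    helmet_center_y = (hy1 + hy2) / 2
    near_head_top = helmet_center_y <= py1 + 0.25 * (py2 - py1)
    return overlaps_horizontally and near_head_top

person = {"prob": 0.9, "box": (40, 10, 120, 220)}
helmet_on = {"prob": 0.8, "box": (55, 0, 105, 40)}       # on the head
helmet_carried = {"prob": 0.8, "box": (100, 150, 150, 190)}  # held in hand
```
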
Accordingly, the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, which are executable by one or more processors to implement the steps of the weak supervised collaborative learning algorithm based helmet identification method according to any of the preceding claims.
The technical scheme of the invention has the beneficial effects that:
1. The technical scheme adopts a weakly supervised deep learning algorithm that needs only image-level label sample pictures for training, which greatly lowers the requirement on training sample pictures, reduces the time spent manually processing them, and improves working efficiency.
2. The weakly supervised deep learning algorithm adopts an improved continuation multiple-instance learning method: by smoothing the objective function of multiple-instance learning and converting it into several easier sub-problems whose objective functions jointly approach the final objective function, the non-convexity of the weakly supervised objective function is better handled, the model is prevented from falling into a local minimum too early, and a better optimization result is obtained.
3. The weakly supervised collaborative learning algorithm adopts a novel collaborative learning framework: the weakly supervised detection sub-network and the supervised detection sub-network are connected into a unified whole during weakly supervised learning, and a prediction consistency loss enforces agreement between the instance predictions of the two sub-networks, so that the algorithm combines the efficient training of weak supervision with the accurate detection precision of a supervised algorithm.
Drawings
Fig. 1 is a flow chart of steps of a safety helmet identification method based on a weak supervision collaborative learning algorithm according to the invention.
FIG. 2 is a flowchart of training a weakly supervised cooperative learning algorithm network with image level label samples according to the present invention.
FIG. 3 is a flowchart illustrating steps of a weakly supervised learning module performing training according to the training picture package and generating a first result set of bounding box level label images according to the present invention.
FIG. 4 is a flowchart illustrating the steps of the continuous instance selector performing subset partitioning on the sample instances in the training picture packets according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, the flowchart of the steps of the method for identifying a safety helmet based on a weak supervised collaborative learning algorithm of the present invention includes the following steps:
step 1, training a weak supervised collaborative learning algorithm network by adopting an image level label image;
in the technical field of deep learning and object recognition, images for training can be divided into image-level (image level) label images and boundary-box-level (bounding-box level) label images, wherein the image-level (image level) label images refer to images for training, which only know that they contain people or helmets and do not know specific positions of the people or helmets in the images, and the boundary-box-level (bounding-box level) label images not only know that they contain people or helmets but also know specific positions of everyone or helmets in the images, so that boundary-box-level label image training samples usually mark positions of people or helmets by manual work, which is time-consuming, labor-consuming and inefficient. As shown in fig. 2, the step diagram of training the weakly supervised cooperative learning algorithm network by using image-level label samples in the present invention includes the following steps:
step 11, performing feature extraction on the image level label image through the convolution layer and the ROI pooling layer to obtain a training picture packet containing a sample instance;
the image level label images for training are sent to the convolutional layer, the convolutional layer extracts candidate regions containing important features of different targets (people or safety helmets) according to the input images, then the results are sent to the pooling layer, and the pooling layer converts the candidate regions with different sizes into output with fixed sizes so as to perform post-training. The result output by the pooling layer is the training picture packet containing the sample instance. For example, each input image-level label image for training generates a corresponding training image packet, and a sample instance in each training image packet corresponds to a candidate region of a person or a helmet in the image, which is also referred to as a sample instance.
Step 12, training by a weak supervision learning module according to the training picture packet and generating a first example subset of a bounding box level label image; as shown in fig. 3, it is a flowchart of the step of the weakly supervised learning module training according to the training picture packet and generating the first result set of the bounding box level label images according to the present invention, and includes the following steps:
step 121, a continuous instance selector performs subset division on sample instances in the training picture packets;
let it be assumed that T is defined to represent all packets of training pictures, TiE T represents the ith training picture packet, liPositive and negative marks representing the ith training panel packet, and when it is 1, it represents tiFor positive sample packets, i.e. tiContains at least one positive sample instance, e.g., at least one sample instance of a person or a helmet, which when it is-1, indicates tiIs a negative sample packet, i.e. tiThe sample instances included in (a) are all negative sample instances, e.g., sample instances that do not include people and hard hats at all. t is ti,jRepresented is the jth sample instance, l, in the ith packeti,jDenotes ti,jA sample instance with a positive or negative flag of 1 indicates that the sample instance is a positive sample instance, and a sample instance with a negative flag of-1 indicates that the sample instance is a negative sample instance. The technical solution of the present application requires dividing the sample instances in the training picture package into a plurality of subsets, each subset representing a set of similar sample instances, for example, one subset is a face region set, and another subset is a face region setThe subset is a set of headgear regions. As shown in fig. 4, it is a flowchart of the step of the continuous instance selector performing subset partitioning on the sample instances in the training picture packet according to the present invention, and the method includes the following steps:
step 1211, calculating, by the continuous instance selector, a score for each sample instance in the training picture packet, i.e. an objectness score indicating that the sample instance belongs to a specific object (a face or a helmet), with the calculation formula: S(t_{i,j}; σ_s) = σ_s^T (FV_b − FV_f), wherein t_{i,j} represents the jth sample instance in the ith packet, σ_s is a network parameter of the instance selector, FV_b is the Fisher vector of the background in t_{i,j}, and FV_f is the Fisher vector of the foreground in t_{i,j};
step 1212, selecting a sample instance which is highest in score and does not belong to any bounding box level label image instance subset from all sample instances of the training picture package, and establishing a new bounding box level label image instance subset;
step 1213, classifying those sample instances whose overlap (intersection ratio) with the highest-scoring sample instance selected in step 1212 is greater than or equal to τ, and which do not yet belong to any bounding box level label image instance subset, into the same bounding box level label image instance subset, wherein τ represents a continuation parameter with a value range of [0,1];
step 1214, calculating an objective function value of the continuous instance selector according to the established bounding box level label image instance subsets, wherein the objective function of the continuous instance selector is:

F(σ_s) = Σ_i l_i · max_{β(τ)} S(t_{i,β(τ)}; σ_s), with S(t_{i,β(τ)}; σ_s) = (1/N_{β(τ)}) Σ_{t_{i,j} ∈ t_{i,β(τ)}} S(t_{i,j}; σ_s)

wherein t_i represents the ith training picture packet, t_{i,β(τ)} represents the β(τ)th instance subset in the ith packet, σ_s is a network parameter of the instance selector, l_i is the positive/negative mark of the ith packet, and N_{β(τ)} is the number of instances in subset t_{i,β(τ)}; the continuous instance selector S(t_{i,β(τ)}; σ_s) computes the score of subset t_{i,β(τ)}. When τ = 0, packet t_i is divided into a single subset containing all instances; when τ = 1, packet t_i is divided into many subsets, each containing only one instance; when 0 < τ < 1, each packet has several subsets. This objective function is smoother than that of the conventional instance selector in multiple-instance learning; the smoothness greatly relieves the non-convexity problem, so optimizing it can find a better solution, finally yielding the highest-scoring instance subset t_{i,β(τ)*}.
Step 1215 of determining whether the objective function value of the continuous instance selector is greater than a preset threshold value;
step 1216, if the objective function value of the continuous instance selector is less than or equal to a preset threshold, skipping to step 1212 to continue execution;
step 1217, if the objective function value of the continuous instance selector is greater than the preset threshold, the subset division is completed.
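The greedy, overlap-driven subset construction of steps 1212-1213 can be sketched as follows. The boxes and scores are invented, τ is treated as a plain IoU threshold, and the objective-threshold termination of steps 1215-1217 is omitted for brevity:

```python
# Greedy subset division (sketch of steps 1212-1213); data is invented.

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def partition_into_subsets(instances, scores, tau):
    # Repeatedly seed a new subset with the highest-scoring unassigned
    # instance, then pull in unassigned instances overlapping it by >= tau.
    order = sorted(range(len(instances)), key=lambda k: -scores[k])
    assigned, subsets = set(), []
    for seed in order:
        if seed in assigned:
            continue
        subset = [seed]
        assigned.add(seed)
        for k in order:
            if k not in assigned and iou(instances[seed], instances[k]) >= tau:
                subset.append(k)
                assigned.add(k)
        subsets.append(subset)
    return subsets

boxes = [(0, 0, 50, 50), (5, 5, 55, 55), (200, 200, 260, 260)]
scores = [0.9, 0.6, 0.8]
subsets = partition_into_subsets(boxes, scores, 0.5)
```

Note how the continuation behavior described above falls out of the threshold: τ = 0 puts every instance into one subset, τ = 1 produces one subset per instance.
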
Traditional weakly supervised deep learning algorithms mostly adopt multiple-instance learning: each image is regarded as a bag, objects in the image are regarded as instances, and during training the highest-scoring candidate region (proposal) is iteratively selected from the bag for network training. Because the weak supervision information only indicates whether the image contains objects of the corresponding categories (image-level information), instance localization is ambiguous; the objective function describing candidate-region selection is prone to non-convexity in the early learning stage, and its optimization can fall into a local minimum, so that only a partial region is detected and the complete extent is missed.
The weakly supervised deep learning algorithm of this technical scheme adopts an improved continuation multiple-instance learning method, introducing a continuation optimization strategy on top of traditional multiple-instance learning to better handle the non-convexity of weakly supervised target detection. Instance candidate regions in a picture are effectively divided: spatially related and category-related instances are placed into one subset, a corresponding smooth objective function is defined on each subset, and the objective functions of the different subsets jointly approximate the original objective function. By smoothing the objective function of multiple-instance learning and converting it into several easier sub-problems whose objectives jointly approach the final objective, the non-convexity is better handled, the model is prevented from falling into a local minimum too early, and a better optimization result is obtained.
Step 122, the continuous evaluator evaluates the sample instances in the training picture packet according to the sample instance subset division result of the training picture packet. Based on the sample instance set selected by the continuous instance selector, the continuous evaluator further evaluates the positive/negative label of each sample instance, eliminating erroneous or interfering backgrounds and further improving training accuracy. The continuous evaluator assigns labels as follows: a sample instance t_ij is labeled positive (l_ij = 1) when its intersection-over-union with t_i,j* is greater than or equal to a positive threshold, labeled negative (l_ij = -1) when that intersection-over-union is less than a negative threshold, and ignored otherwise. Here the IOU function represents the intersection-over-union between two sample instances, t_ij represents the j-th sample instance in the i-th packet, l_ij represents the positive/negative label of t_ij, t_i,j* represents the highest-scoring sample instance within the highest-scoring sample instance subset t_i,β(τ), and τ represents a continuation parameter with value range [0, 1]; both thresholds are determined by τ.
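A sketch of the evaluator's labeling rule, assuming axis-aligned [x1, y1, x2, y2] boxes; the concrete thresholds pos_thr and neg_thr are illustrative stand-ins for the τ-dependent thresholds of the formula:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def evaluate_instances(boxes, best_box, pos_thr, neg_thr):
    """Continuous-evaluator labeling: +1 when overlap with the top
    instance t_{i,j*} reaches pos_thr, -1 (background) when it falls
    below neg_thr, and None (ignored) in between."""
    labels = []
    for b in boxes:
        ov = iou(b, best_box)
        if ov >= pos_thr:
            labels.append(1)
        elif ov < neg_thr:
            labels.append(-1)
        else:
            labels.append(None)
    return labels
```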
Step 123, the training picture packets evaluated by the continuous evaluator are used as the first instance subset of the bounding box level label images.
Step 13, the supervised learning module trains according to the training picture packet and the first instance subset of the bounding box level label images and generates a second instance subset of the bounding box level label images. The supervised learning module adopts the conventional Faster R-CNN algorithm: it takes the first instance subset of the bounding box level label images as input, trains, and outputs the second instance subset of the bounding box level label images.
Step 14, the supervised learning module calculates the consistency loss between the first instance subset of the bounding box level label images and the second instance subset of the bounding box level label images, and updates the network parameters of the supervised learning module according to the calculation result. The prediction consistency loss is calculated by the following formula:

L_c = 1 - IOU(x_ws, x_s)

wherein L_c is the consistency loss, x_ws represents a bounding box level label image sample instance obtained by the weakly supervised learning module, x_s represents the corresponding sample instance obtained by the supervised learning module, and the IOU function represents the intersection-over-union between the two sample instances. Consistency loss judgment is performed between the first instance subset obtained by the weakly supervised learning module and the second instance subset obtained by the supervised learning module; the network parameters of the supervised learning module are updated backward according to the calculation result, and the supervised module learns by keeping prediction consistency with the weakly supervised module, thereby achieving the aim of training the supervised algorithm.
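A minimal sketch of the consistency-loss computation, assuming the loss takes the common form 1 - IoU between the boxes predicted by the two branches, so that it vanishes when the branches agree exactly:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def consistency_loss(box_ws, box_s):
    """Prediction-consistency loss between the weakly supervised box
    x_ws and the supervised box x_s: zero for identical boxes, one for
    disjoint boxes, pushing the supervised branch toward agreement."""
    return 1.0 - iou(box_ws, box_s)
```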
In addition, the supervised learning module and the weakly supervised learning module have a certain structural similarity, so network parameters can be shared during training: the convolutional layers and part of the fully connected layers share image feature representations, which keeps the perception of the two algorithms consistent and improves the training efficiency of both branches.
Step 2, the image to be detected is input into the trained weakly supervised collaborative learning algorithm network for target detection to obtain a probability vector. Specifically, the supervised learning module performs target detection on the image to be detected using the Faster R-CNN algorithm, and the resulting probability vector includes: the probability that an object in the image to be detected is identified as a safety helmet, the probability that an object is identified as a person, and the position coordinate information of each object in the image.
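The probability vector can be represented as a plain list of per-object records; the field names below are illustrative choices, not taken from the patent:

```python
def to_probability_vector(labels, scores, boxes):
    """Assemble the probability vector from raw detector outputs,
    keeping only the two classes relevant to the decision in step 3:
    safety helmets and persons, each with its score and coordinates."""
    return [
        {"label": lab, "score": float(sc), "box": tuple(bx)}
        for lab, sc, bx in zip(labels, scores, boxes)
        if lab in ("helmet", "person")
    ]
```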
Step 3, whether the person in the image to be detected is correctly wearing a safety helmet is judged from the probability vector, as follows: if the probability that an object is identified as a person and the probability that an object is identified as a safety helmet are both greater than a fixed threshold, and the helmet is positioned on top of the person's head, the person in the image is judged to be wearing the helmet correctly; otherwise, the person is judged not to be wearing it correctly. Specifically, image recognition through the weakly supervised collaborative learning algorithm network yields the detection result, and a joint head-helmet detection model is then applied: if no person appears, the site is judged unmanned; if only a person appears, or a helmet and a person both appear but the helmet is below the top of the person's head, the helmet is judged not worn; the helmet is judged worn only when the helmet and the person appear simultaneously and the helmet is on top of the person's head.
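The joint head-helmet decision rule of step 3 can be sketched as follows; the 0.5 score threshold and the "upper third of the person box" test for "on top of the head" are illustrative assumptions (image y coordinates grow downward):

```python
def helmet_worn_correctly(detections, thr=0.5):
    """Apply the joint decision rule: a person must be detected, a helmet
    must be detected, and the helmet centre must lie horizontally inside
    and vertically within the top third of the person's bounding box."""
    persons = [d for d in detections if d["label"] == "person" and d["score"] > thr]
    helmets = [d for d in detections if d["label"] == "helmet" and d["score"] > thr]
    if not persons:
        return False  # "no person on site" case

    def on_head(person, helmet):
        px1, py1, px2, py2 = person["box"]
        hx1, hy1, hx2, hy2 = helmet["box"]
        cx, cy = (hx1 + hx2) / 2, (hy1 + hy2) / 2
        return px1 <= cx <= px2 and py1 <= cy <= py1 + (py2 - py1) / 3

    # every detected person must have some helmet on their head
    return all(any(on_head(p, h) for h in helmets) for p in persons)
```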
Preferably, the technical solution of the present application may further provide a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the safety helmet identification method based on the weak supervision collaborative learning algorithm as described above.
According to the technical scheme, a weakly supervised collaborative learning framework is provided in which a weakly supervised algorithm and the supervised Faster R-CNN algorithm learn collaboratively, and the task similarity (i.e., prediction consistency) between the two algorithms is used to jointly train the two algorithm sub-networks. First, the weakly supervised algorithm trains using the continuous multiple-instance learning method; this algorithm needs only image level label pictures for training, which greatly reduces the requirements on training sample pictures. The bounding box level label images output by the weakly supervised algorithm are then input into the supervised Faster R-CNN algorithm for training, while the prediction consistency loss keeps its results similar to those of the weakly supervised algorithm, improving the accuracy of training and learning.
The above embodiments are merely illustrative of the technical solutions of the present invention, and the present invention is not limited to the above embodiments, and any modifications or alterations according to the principles of the present invention should be within the protection scope of the present invention.

Claims (9)

1. A safety helmet identification method based on a weak supervision collaborative learning algorithm is characterized by comprising the following steps:
step 1, training a weak supervised collaborative learning algorithm network by adopting an image level label image;
step 2, inputting the image to be detected into the trained weak supervised cooperative learning algorithm network for target detection to obtain a probability vector;
and 3, judging whether the person in the image to be detected correctly wears the safety helmet or not according to the probability vector.
2. The method for identifying a safety helmet based on a weakly supervised cooperative learning algorithm as recited in claim 1, wherein the training of the weakly supervised cooperative learning algorithm network by using image level label samples comprises the following steps:
step 11, performing feature extraction on the image level label image through the convolution layer and the ROI pooling layer to obtain a training picture packet containing a sample instance;
step 12, training by a weak supervision learning module according to the training picture packet and generating a first example subset of a bounding box level label image;
step 13, the supervised learning module trains according to the training picture packet and the first example subset of the bounding box level label images and generates a second example subset of the bounding box level label images;
and 14, the supervised learning module calculates consistency loss for the first example subset of the boundary box level label image and the second example subset of the boundary box level label image, and updates the network parameters of the supervised learning module according to the calculation result.
3. The method for helmet identification based on weakly supervised cooperative learning algorithm as recited in claim 2, wherein the weakly supervised learning module performs training according to the training picture packets and generates a first result set of bounding box level label images, comprising the steps of:
step 121, a continuous instance selector performs subset division on sample instances in the training picture packets;
step 122, the continuous evaluator evaluates the sample examples in the training picture packets according to the sample example subset division result of the training picture packets;
and step 123, taking the evaluated training picture package as a first result set of the bounding box level label image.
4. The weak supervised cooperative learning algorithm-based helmet identification method according to claim 3, wherein the continuous instance selector performs subset division on the sample instances in the training picture packets, and comprises the following steps:
step 1211, calculating a score of each sample instance in the training picture packet by the continuous instance selector, wherein the score is calculated by the formula: S(t_ij, σ_s) = σ_s^T (FV_b - FV_f), wherein t_ij represents the j-th sample instance in the i-th packet, σ_s is a network parameter of the instance selector, FV_b is the Fisher vector of the background in sample instance t_ij, and FV_f is the Fisher vector of the foreground in sample instance t_ij;
step 1212, selecting a sample instance which is highest in score and does not belong to any bounding box level label image instance subset from all sample instances of the training picture package, and establishing a new bounding box level label image instance subset;
step 1213, classifying sample examples which have an overlapping area greater than or equal to τ and do not belong to any of the subset of the sample example images of the bounding box level label image with the highest score in step 1211 into the same set of sample examples of the bounding box level label image, wherein τ represents a continuous parameter and has a value range of [0,1 ];
step 1214, calculating an objective function value of a continuous instance selector according to the established bounding box level label image instance subset, wherein the objective function of the continuous instance selector is as follows:
F(σ_s) = Σ_i l_i · (1/N_β(τ)) · Σ_{t_ij ∈ t_i,β(τ)} S(t_ij, σ_s)
wherein t_i represents the i-th training picture packet, t_i,β(τ) represents the β(τ)-th instance subset in the i-th packet, σ_s is a network parameter of the instance selector, l_i represents the positive/negative label of the i-th training picture packet, N_β(τ) is the number of instances in the instance subset t_i,β(τ), and t_ij ∈ t_i,β(τ);
Step 1215 of determining whether the objective function value of the continuous instance selector is greater than a preset threshold value;
step 1216, if the objective function value of the continuous instance selector is less than or equal to a preset threshold, skipping to step 1212 to continue execution;
step 1217, if the objective function value of the continuous instance selector is greater than the preset threshold, the subset division is completed.
5. The method according to claim 3, wherein the continuous evaluator evaluates the sample instances in the training picture packet according to the sample instance subset division result of the training picture packet, assigning each sample instance a label as follows: l_ij = 1 when IOU(t_ij, t_i,j*) is greater than or equal to a positive threshold, l_ij = -1 when IOU(t_ij, t_i,j*) is less than a negative threshold, and the sample instance is ignored otherwise, wherein the IOU function represents the intersection-over-union between two sample instances, t_ij represents the j-th sample instance in the i-th packet, l_ij represents the positive/negative label of sample instance t_ij, t_i,j* represents the highest-scoring sample instance within the highest-scoring sample instance subset t_i,β(τ), and both thresholds are determined by the continuation parameter τ, whose value range is [0, 1].
6. The weak supervised cooperative learning algorithm-based helmet identification method according to claim 3, wherein the prediction consistency loss is calculated by the following formula: L_c = 1 - IOU(x_ws, x_s), wherein L_c is the consistency loss, x_ws represents a bounding box level label image sample instance obtained by the weakly supervised learning module, x_s represents the corresponding sample instance obtained by the supervised learning module, and the IOU function represents the intersection-over-union between the two sample instances.
7. The weak supervised cooperative learning algorithm-based hard hat identification method according to claim 1, wherein the probability vector comprises: the probability value of identifying the object in the image to be detected as the safety helmet, the probability value of identifying the object in the image to be detected as the person and the position coordinate information of the object in the image to be detected.
8. The method for identifying safety helmets based on the weak supervised collaborative learning algorithm as claimed in claim 7, wherein the step of determining whether the person in the image to be detected correctly wears the safety helmets according to the probability vector comprises the following steps:
if the probability value of the person identified by the object in the image to be detected and the probability value of the safety helmet identified by the object are greater than the fixed threshold value and the position of the safety helmet is on the top of the head of the person, judging that the person in the image to be detected correctly wears the safety helmet, otherwise, judging that the person in the image does not correctly wear the safety helmet.
9. A computer readable storage medium storing one or more programs, wherein the one or more programs are executable by one or more processors to implement the steps of the safety helmet identification method based on the weak supervision collaborative learning algorithm according to any one of claims 1-8.
CN202011042403.4A 2020-09-28 2020-09-28 Safety helmet identification method based on weak supervision collaborative learning algorithm and storage medium Pending CN112183532A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011042403.4A CN112183532A (en) 2020-09-28 2020-09-28 Safety helmet identification method based on weak supervision collaborative learning algorithm and storage medium


Publications (1)

Publication Number Publication Date
CN112183532A true CN112183532A (en) 2021-01-05

Family

ID=73945397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011042403.4A Pending CN112183532A (en) 2020-09-28 2020-09-28 Safety helmet identification method based on weak supervision collaborative learning algorithm and storage medium

Country Status (1)

Country Link
CN (1) CN112183532A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110263686A (en) * 2019-06-06 2019-09-20 温州大学 A kind of construction site safety of image cap detection method based on deep learning
CN110728223A (en) * 2019-10-08 2020-01-24 济南东朔微电子有限公司 Helmet wearing identification method based on deep learning
CN110738127A (en) * 2019-09-19 2020-01-31 福建师范大学福清分校 Helmet identification method based on unsupervised deep learning neural network algorithm


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QING CHEN; YU JING; XIAO CHUANGBAI; DUAN JUAN: "Research Progress on Image Semantic Segmentation Based on Deep Convolutional Neural Networks", Journal of Image and Graphics, no. 06, 16 June 2020 (2020-06-16), pages 1069-1090 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610120A (en) * 2021-07-21 2021-11-05 燕山大学 App image content safety detection method based on weak supervised learning
CN113610120B (en) * 2021-07-21 2023-09-29 燕山大学 App image content safety detection method based on weak supervision learning
CN117278696A (en) * 2023-11-17 2023-12-22 西南交通大学 Method for editing illegal video of real-time personal protective equipment on construction site
CN117278696B (en) * 2023-11-17 2024-01-26 西南交通大学 Method for editing illegal video of real-time personal protective equipment on construction site

Similar Documents

Publication Publication Date Title
CN110738127B (en) Helmet identification method based on unsupervised deep learning neural network algorithm
Fang et al. Detecting non-hardhat-use by a deep learning method from far-field surveillance videos
Nath et al. Deep learning for site safety: Real-time detection of personal protective equipment
US10614310B2 (en) Behavior recognition
CN108053427B (en) Improved multi-target tracking method, system and device based on KCF and Kalman
CN108009473B (en) Video structuralization processing method, system and storage device based on target behavior attribute
CN111898514B (en) Multi-target visual supervision method based on target detection and action recognition
CN108052859B (en) Abnormal behavior detection method, system and device based on clustering optical flow characteristics
CN111598040B (en) Construction worker identity recognition and safety helmet wearing detection method and system
CN104063722B (en) A kind of detection of fusion HOG human body targets and the safety cap recognition methods of SVM classifier
CN111488804A (en) Labor insurance product wearing condition detection and identity identification method based on deep learning
Elhamod et al. Automated real-time detection of potentially suspicious behavior in public transport areas
US8416296B2 (en) Mapper component for multiple art networks in a video analysis system
CN109165685B (en) Expression and action-based method and system for monitoring potential risks of prisoners
CN110414400B (en) Automatic detection method and system for wearing of safety helmet on construction site
CN110490171B (en) Dangerous posture recognition method and device, computer equipment and storage medium
CN113989944B (en) Operation action recognition method, device and storage medium
WO2021114765A1 (en) Depth image-based method and system for anti-trailing detection of self-service channel
CN110728252A (en) Face detection method applied to regional personnel motion trail monitoring
CN112183532A (en) Safety helmet identification method based on weak supervision collaborative learning algorithm and storage medium
CN112184773A (en) Helmet wearing detection method and system based on deep learning
CA3196344A1 (en) Rail feature identification system
CN113743256A (en) Construction site safety intelligent early warning method and device
CN111931573A (en) Helmet detection and early warning method based on YOLO evolution deep learning model
Laptev et al. Visualization system for fire detection in the video sequences

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination