CN111160440A - Helmet wearing detection method and device based on deep learning - Google Patents
- Publication number
- CN111160440A (application number CN201911349221.9A)
- Authority
- CN
- China
- Prior art keywords
- deep learning
- data set
- training
- learning network
- safety helmet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Abstract
The invention discloses a safety helmet wearing detection method and device based on deep learning. The method comprises the following steps: obtaining more than a preset number of construction-site worker operation pictures, and labeling them to obtain a data set; preprocessing the data in the data set to obtain 6 clustering template boxes of different sizes; building a deep learning network for helmet-wearing detection based on the Yolov3 network, and inputting the training set with its 6 clustering template boxes into the network for training; importing the test set with its 6 clustering template boxes into the trained network to judge whether training has converged; and, if converged, inputting images to be recognized into the converged deep learning network model for helmet-wearing detection. In the embodiment of the invention, the trained, converged deep learning network can detect in real time whether a human target wears a safety helmet and whether it is worn correctly.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a safety helmet wearing detection method and device based on deep learning.
Background
The safety helmet is a basic safety guarantee for construction workers: wearing one greatly reduces safety accidents, and helmets are mandatory on construction sites, in production workshops, during outdoor high-altitude emergency repair, and in similar environments. However, a great number of safety accidents still occur every year because helmets are missing or worn improperly, causing loss of life and property. Detecting whether safety helmets are worn is therefore of great significance.
The earliest helmet detection methods were based on image-processing technology, for example recognition from the image, shape and color of the helmet; such methods are easily disturbed by noise, have a narrow application range and recognize poorly. As computing power improved, helmet recognition methods based on machine-learning algorithms were proposed; compared with image-processing recognition, they greatly improve the recognition rate and robustness. However, traditional machine-learning algorithms suffer from difficult feature engineering and, in addition, are not good at extracting image features.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, and provides a safety helmet wearing detection method and device based on deep learning.
In order to solve the technical problem, an embodiment of the present invention provides a method for detecting wearing of a safety helmet based on deep learning, where the method includes:
obtaining more than a preset number of construction-site worker operation pictures, and labeling them with the labelImg annotation tool to obtain a data set, the data set comprising a training set and a test set;
preprocessing the data in the data set by K-means clustering of the helmet data set's target boxes to obtain 6 clustering template boxes of different sizes;
building a deep learning network for helmet-wearing detection based on the Yolov3 network, and inputting the training set with its 6 clustering template boxes into the deep learning network for training;
after training is finished, importing the test set with its 6 clustering template boxes into the trained deep learning network to judge whether training has converged;
and, if converged, inputting images to be recognized into the converged deep learning network model for helmet-wearing detection.
Optionally, labeling the construction-site worker operation pictures with the labelImg tool to obtain the data set includes:
using the labelImg tool to label target boxes in the construction-site worker operation pictures, obtaining the data set;
the sizes of the pictures are not fixed, and each target-box label comprises the box coordinates and the category of the object in the box; there are three categories, marked person, hat and head respectively.
Optionally, the number ratio of the training set to the test set is 7:3.
Optionally, preprocessing the data in the data set by K-means clustering of the helmet data set's target boxes to obtain the 6 clustering template boxes of different sizes includes:
performing scale normalization on the data in the data set, normalizing to the 416 × 416 scale, to obtain a normalized data set;
and performing K-means clustering on the target boxes of the normalized helmet data set, initializing the cluster centers with an extended binary ordering tree, to obtain the 6 clustering template boxes of different sizes.
Optionally, K in the K-means clustering is 6;
the 6 clustering template boxes of different sizes are: [[w1, h1], [w2, h2], [w3, h3], [w4, h4], [w5, h5], [w6, h6]], sorted from large to small by height-width values.
Optionally, building the deep learning network for helmet-wearing detection based on the Yolov3 network includes:
changing the three groups of output tensors of the Yolov3 network into two groups by removing the 13 × 13-scale output, obtaining a deep learning network for helmet-wearing detection whose output feature layers are 26 × 26 × 21 and 52 × 52 × 21 respectively;
where 21 means that each of the three target boxes carries two categories, four coordinate-box values and one confidence C.
Optionally, importing the test set with its 6 clustering template boxes into the trained deep learning network to judge whether training has converged includes:
importing the test set with its 6 clustering template boxes into the trained deep learning network, and outputting test results;
and judging convergence according to whether the accuracy in the test results exceeds a preset probability.
Optionally, the method further includes:
if not converged, updating the trained deep learning network's node parameters with the back-propagation algorithm, and, after updating is completed, retraining with the training set of the data set.
Optionally, inputting the image to be recognized into the converged deep learning network model for helmet-wearing detection includes:
inputting the image to be recognized into the converged deep learning network model and outputting the recognized image;
and calculating the overlap rate of the head target box and the helmet target box in the image, and judging from this overlap rate whether the worker is wearing a safety helmet.
In addition, an embodiment of the invention also provides a safety helmet wearing detection device based on deep learning, the device comprising:
a data acquisition module: for obtaining more than a preset number of construction-site worker operation pictures, and labeling them with the labelImg annotation tool to obtain a data set comprising a training set and a test set;
a data preprocessing module: for preprocessing the data by K-means clustering of the helmet data set's target boxes to obtain 6 clustering template boxes of different sizes;
a network construction and training module: for building a deep learning network for helmet-wearing detection based on the Yolov3 network, and inputting the training set with its 6 clustering template boxes into the deep learning network for training;
a convergence judgment module: for importing, after training, the test set with its 6 clustering template boxes into the trained deep learning network to judge whether training has converged;
a recognition and detection module: for inputting, if converged, images to be recognized into the converged deep learning network model for helmet-wearing detection.
In the embodiment of the invention, a deep learning network is constructed on the basis of the original Yolov3 network and is then trained and tested; after convergence, the converged network is used to detect helmet wearing. The wearing condition of safety helmets can thus be detected in real time, greatly reducing the accident rate on construction sites and safeguarding workers' lives.
Drawings
To illustrate the embodiments of the present invention or the prior-art solutions more clearly, the drawings needed for describing them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a method for detecting wearing of a safety helmet based on deep learning in an embodiment of the invention;
fig. 2 is a schematic structural composition diagram of a helmet wearing detection device based on deep learning in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the invention.
Examples
Referring to fig. 1, fig. 1 is a schematic flow chart of a helmet wearing detection method based on deep learning according to an embodiment of the present invention.
As shown in fig. 1, a method for detecting wearing of a safety helmet based on deep learning includes:
s11: the method comprises the steps of obtaining construction site worker operation pictures with the number more than a preset number, and marking the construction site worker operation pictures by using a labelImg marking tool to obtain a data set, wherein the data set comprises a training set and a testing set;
In a specific implementation of the invention, labeling the construction-site worker operation pictures with the labelImg tool to obtain the data set includes: using the labelImg tool to label target boxes in the pictures, obtaining the data set; the sizes of the pictures are not fixed, and each target-box label comprises the box coordinates and the category of the object in the box; there are three categories, marked person, hat and head respectively.
Further, the number ratio of the training set to the test set is 7:3.
Specifically, more than 5000 original worker operation pictures are obtained by downloading from picture websites, each required to be more than 300 pixels in both length and width. After the pictures are obtained, target boxes are labeled with the labelImg tool to obtain the data set: rectangular boxes are mainly drawn around the safety helmets and human heads in the pictures and label categories are assigned. The sizes of the pictures are not fixed, and each target-box label comprises the box coordinates and the category of the object in the box; there are three categories, marked person, hat and head respectively.
The data set also needs to be divided into a training set and a test set, with a number ratio of 7:3.
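As a minimal sketch of the split described above (the 7:3 ratio is from the text; the shuffling, the fixed seed and the helper name are illustrative assumptions):

```python
import random

def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle the labeled samples and split them 7:3 into train/test sets.

    The 7:3 ratio follows the text; shuffling and the fixed seed are
    illustrative assumptions added for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```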
S12: preprocessing the data in the data set by K-means clustering of the helmet data set's target boxes to obtain 6 clustering template boxes of different sizes;
In a specific implementation of the invention, this preprocessing includes: performing scale normalization on the data in the data set, normalizing to the 416 × 416 scale, to obtain a normalized data set; and performing K-means clustering on the target boxes of the normalized helmet data set, initializing the cluster centers with an extended binary ordering tree, to obtain the 6 clustering template boxes of different sizes.
Further, K in the K-means clustering is 6; the 6 clustering template boxes of different sizes are: [[w1, h1], [w2, h2], [w3, h3], [w4, h4], [w5, h5], [w6, h6]], sorted from large to small by height-width values.
Specifically, the data in the data set are first scale-normalized to 416 × 416, giving a normalized data set; since resizing changes the pictures, the target-box coordinate labels must be adjusted accordingly when the training set is made. Because the self-made data set is small relative to data sets such as COCO, it is augmented by color and brightness adjustment, picture rotation and random cropping, and then mean-subtraction preprocessing is applied to reduce the influence of noise. The normalized data set is then clustered: K-means clustering of the helmet data set's target boxes, with cluster centers initialized via an extended binary ordering tree, yields the 6 clustering template boxes of different sizes.
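A minimal sketch of the coordinate-label adjustment that accompanies the 416 × 416 scale normalization (the helper name and pure-Python form are assumptions; the text does not prescribe an implementation):

```python
def normalize_box(box, orig_w, orig_h, target=416):
    """Rescale a [xmin, ymin, xmax, ymax] target-box label when its picture
    is resized to target x target (416 x 416 in the text)."""
    sx = target / orig_w
    sy = target / orig_h
    xmin, ymin, xmax, ymax = box
    return [xmin * sx, ymin * sy, xmax * sx, ymax * sy]
```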
The 6 clustering template boxes of different scales are obtained from the data set's target boxes by a K-means clustering algorithm with K set to 6, using an extended binary ordering tree to initialize the cluster centers. The algorithm is as follows:
Input: the number of clusters k and a data set containing n objects.
Output: 6 initial cluster center points.
Step 1: create an extended binary ordering tree for the data set.
Step 2: compute the density and median of each partition.
Step 3: select the maximum-density median point as the first initial point.
Step 4: select the 2nd to 6th initial points:
for t = 2 to 6:
for j = 1 to p: compute d_j = [min_k e(C_k, m_j)] · ρ_j;
select the m_j with the largest d_j as C_t.
The target-box sizes in the training set are clustered with the K-means algorithm, K = 6, representing 6 classes of size boxes: [[w1, h1], [w2, h2], [w3, h3], [w4, h4], [w5, h5], [w6, h6]]; the array is sorted from small to large by height-width values.
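The clustering step can be sketched as plain K-means on the (width, height) pairs of the target boxes. Note this is a simplification: it uses a deterministic first-k initialization and Euclidean distance, whereas the patent initializes centers via an extended binary ordering tree (and YOLO-style variants often use an IOU-based distance):

```python
def kmeans_anchors(boxes, k=6, iters=50):
    """Cluster (w, h) target-box sizes into k template boxes.

    Simplified sketch: first-k initialization and Euclidean distance;
    the patent's extended-binary-ordering-tree initialization is not
    reproduced here.
    """
    centers = [tuple(b) for b in boxes[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            # assign each box to its nearest current center
            i = min(range(k),
                    key=lambda c: (w - centers[c][0]) ** 2 + (h - centers[c][1]) ** 2)
            clusters[i].append((w, h))
        # recompute each center as the mean of its cluster
        centers = [
            (sum(b[0] for b in cl) / len(cl), sum(b[1] for b in cl) / len(cl))
            if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return sorted(centers)  # sorted by size, as in the text
```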
S13: building a deep learning network for helmet-wearing detection based on the Yolov3 network, and inputting the training set with its 6 clustering template boxes into the deep learning network for training;
In the specific implementation of the invention, building the deep learning network for helmet-wearing detection based on the Yolov3 network includes: changing the three groups of output tensors of the Yolov3 network into two groups by removing the 13 × 13-scale output, obtaining a deep learning network for helmet-wearing detection whose output feature layers are 26 × 26 × 21 and 52 × 52 × 21 respectively; where 21 means that each of the three target boxes carries two categories, four coordinate-box values and one confidence C.
Specifically, the deep learning network consists of the Yolov3 components with the fully connected layer removed: a Darknet-53 backbone, DBL blocks (darknetconv2d_bn_leaky), upsampling layers and concatenation layers. The input is a picture of size 416 × 416 × 3; since the data-set picture sizes are not fixed, size normalization is required. Darknet-53 without the fully connected layer consists of DBL, res1, res2, res8, res8 and res4 blocks; in resn, n is the number of residual units, each residual unit contains two DBL structures, and each DBL structure comprises a convolutional layer (conv2d), a batch-normalization layer (bn) and an activation function (leaky ReLU). Yolov3 has each grid cell predict three boxes, each needing five basic parameters (x, y, w, h, confidence); to improve detection accuracy over different target scales, Yolov3 obtains anchor boxes of six different scales from the data set's target boxes through the K-means clustering algorithm. The output of Yolov3 comprises two groups, the predicted values at the two target scales (the shaded part of the figure marks what was removed relative to the original Yolov3 network); the output feature layers are 26 × 26 × 21 and 52 × 52 × 21, where 21 means that each of the three target boxes has two categories, four coordinate-box values and a confidence C.
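The 21-channel output depth follows directly from the quantities named above; a one-line check (the function name is illustrative):

```python
def output_depth(boxes_per_cell=3, coords=4, confidences=1, classes=2):
    """Depth of each output feature layer: every grid cell predicts three
    boxes, each carrying four coordinate values, one confidence C and two
    class scores, giving 3 * (4 + 1 + 2) = 21 channels."""
    return boxes_per_cell * (coords + confidences + classes)
```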
The resulting size boxes are the manually obtained template boxes. Suppose there is a target box in the original image whose category is helmet, labeled [xmin, ymin, xmax, ymax, label]. The original image is divided into 26 × 26 and 52 × 52 grids, corresponding to the first two dimensions of the output feature-map arrays, and the anchor-box size boxes represent the 6 rectangular boxes. Each rectangular box is placed so that it coincides with the center point of the target box, and the overlap rate (IOU) of each anchor box with the target box is computed as IOU = S(overlap) / [S(anchor box) + S(target box) − S(overlap)]. The anchor box_max with the largest overlap rate is selected; the 6 boxes can be written as two groups [boxes_set01] and [boxes_set02], and if anchor box_max belongs to [boxes_set01] the 26 × 26 × 21 output feature map is labeled, otherwise the 52 × 52 × 21 one. Assuming the latter case, let i be the index of anchor box_max in [boxes_set02], and reshape the 52 × 52 × 21 output feature map to dimension 52 × 52 × 3 × 7. If the center point of the target box lies in the kth of the 52 × 52 grid cells, only the 1 × 7 array of the kth cell corresponding to the ith anchor box_max needs to be labeled, and all other values of the output feature map are 0. The first four values of the array's last dimension are (xmin + xmax)/2/416, (ymin + ymax)/2/416, (xmax − xmin)/416 and (ymax − ymin)/416; the fifth value is 1, representing the cell's confidence for the best-matching template; and if the target box belongs to class c (the classes being numbered 0 and 1), the (c + 6)th value is 1.
If the original image contains several target boxes, each of them is labeled on the output feature maps by the same process.
Mapping the original-image target box into the output feature map, taking the example above, only modifies the first four coordinate values in the labeled 1 × 7 array: b_x = (xmin + xmax)/2/416 × 52, b_y = (ymin + ymax)/2/416 × 52, b_w = (xmax − xmin)/416 × 52 and b_h = (ymax − ymin)/416 × 52, where (b_x, b_y, b_w, b_h) are the center-point coordinates and the width and height of the original-image target box mapped into the output feature map. The center point lies in a grid cell S at two-dimensional position (c_x, c_y) within the 52 × 52 grid; b_x − c_x is the distance t_x of the center point from the cell's top-left corner along the horizontal axis, and b_y − c_y is the distance t_y along the vertical axis. Because the ratios of the target box's width and height to those of its matched anchor box_max are the same in the original image as in the output feature map, those ratios k_w and k_h are computed, and taking their logarithms to base e gives t_w and t_h. Thus t_x, t_y, t_w and t_h are the values the model must predict, and they are the final labels of the target box's four coordinate values: in the kth cell's 1 × 7 array for the ith anchor box_max, the first four values are t_x, t_y, t_w and t_h, while the remaining values, the confidence C and the classes c, are not changed. The loss function contains three parts: coordinate loss, confidence loss and category loss.
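The mapping above can be sketched as follows (the helper name is an assumption, and the anchor width and height are taken here in feature-map units, matching the k_w, k_h ratio description):

```python
import math

def encode_box(box, anchor, grid=52, img=416):
    """Encode an original-image box [xmin, ymin, xmax, ymax] (pixels) into
    the (t_x, t_y, t_w, t_h) labels of a grid x grid output feature map,
    following the YOLOv3-style mapping described in the text."""
    xmin, ymin, xmax, ymax = box
    bx = (xmin + xmax) / 2 / img * grid   # centre x on the feature map
    by = (ymin + ymax) / 2 / img * grid   # centre y on the feature map
    bw = (xmax - xmin) / img * grid       # width on the feature map
    bh = (ymax - ymin) / img * grid       # height on the feature map
    cx, cy = int(bx), int(by)             # grid cell S holding the centre
    tx, ty = bx - cx, by - cy             # offsets from the cell's corner
    tw = math.log(bw / anchor[0])         # ln of width ratio k_w
    th = math.log(bh / anchor[1])         # ln of height ratio k_h
    return tx, ty, tw, th, (cx, cy)
```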
The formula is as follows:
loss=coordErr+confiErr+clsErr;
The errors confiErr and clsErr for the confidence C and the class probabilities p(c) are computed with the sigmoid cross-entropy function; the errors coordErr_x and coordErr_y for t_x and t_y in the coordinate box are computed with the smooth L1 function; and the errors of t_w and t_h are computed with a sum-of-squares error. λ_coord is the weight of coordErr in the total loss; the confidence of the jth template box in the ith grid cell enters the confidence term; x_i, y_i, w_i and h_i are the predictions of t_x, t_y, t_w and t_h for the ith grid cell in the output feature map, with the predicted values constrained to the range (0, 1) by the sigmoid function; C_i′ is the predicted value of the confidence C for the ith cell, and p_i′(c) is the prediction of p(c), the probability that the target-box object belongs to class c; S² is the number of grid cells the original image is divided into; and B = 3, meaning each cell uses three template boxes to predict target boxes.
After the loss function is defined, model training starts with the following hyper-parameters: learning rate 0.001, train_batch_size 10, 118287 training samples, 5000 validation samples, regularization decay value 0.99 and 50 training epochs; the model.ckpt model is obtained by training in the TensorFlow deep-learning framework on a GTX 1070 GPU.
S14: after training is finished, importing the test set with its 6 clustering template boxes into the trained deep learning network to judge whether training has converged;
In a specific implementation of the invention, importing the test set with its 6 clustering template boxes into the trained deep learning network to judge whether training has converged includes: importing the test set with its 6 clustering template boxes into the trained deep learning network, and outputting test results; and judging convergence according to whether the accuracy in the test results exceeds a preset probability.
Specifically, the test set with its 6 clustering template boxes is input into the trained deep learning network and the test results are output; whether the trained deep network has converged is then confirmed by computing the accuracy of the output results.
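The convergence test reduces to an accuracy threshold; a minimal sketch (the 0.9 value and the function name are assumptions, since the text only says "a preset probability"):

```python
def has_converged(predictions, labels, preset_probability=0.9):
    """Judge convergence as in the text: training is considered converged
    when test-set accuracy exceeds a preset probability (0.9 is an
    assumed placeholder)."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels) > preset_probability
```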
In the specific implementation of the invention, if training has not converged, the trained deep learning network's node parameters are updated with the back-propagation algorithm, and after updating is completed the training set of the data set is used to retrain.
S15: and, if converged, inputting images to be recognized into the converged deep learning network model for helmet-wearing detection.
In the specific implementation of the invention, inputting the image to be recognized into the converged deep learning network model for helmet-wearing detection includes: inputting the image to be recognized into the converged model and outputting the recognized image; and calculating the overlap rate of the head target box and the helmet target box in the image, and judging from this overlap rate whether the worker is wearing a safety helmet.
Specifically, the image to be recognized is scaled to 416 × 416, and the four predicted coordinate values are x_i, y_i, w_i and h_i. First the target-box coordinates are solved; next the target box is drawn in the original test picture, with a deviation adjustment applied to its coordinates; finally the predicted target-box coordinates box = [box_mins_x, box_mins_y, box_maxes_x, box_maxes_y] are normalized back to the original image, i.e. boxes / corresponding grid size × 416. In addition, the object in the target box is classified: the index i of the maximum confidence C·p(c) indicates that the target box belongs to class i; a threshold of 0.3 is set, and if the maximum C·p(c) is less than 0.3 the target box is considered absent.
b_x = σ(x_i) + c_x;
b_y = σ(y_i) + c_y;
According to the target detection algorithm, if the overlap rate (IOU) of the head target box and the safety-helmet target box is greater than 0.3, the person is judged to be wearing the helmet correctly; otherwise the helmet is worn incorrectly or not worn at all.
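The final wearing decision can be sketched with a standard IOU computation (the 0.3 threshold is from the text; the helper names are assumptions):

```python
def iou(a, b):
    """Overlap rate of two [xmin, ymin, xmax, ymax] boxes:
    S(overlap) / [S(a) + S(b) - S(overlap)]."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def wears_helmet(head_box, helmet_box, threshold=0.3):
    """Decision rule from the text: IOU of the head box and the helmet box
    greater than 0.3 means the helmet is worn correctly."""
    return iou(head_box, helmet_box) > threshold
```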
In the embodiment of the invention, a deep learning network is constructed on the basis of the original Yolov3 network and is then trained and tested; after convergence, the converged network is used to detect helmet wearing. The wearing condition of safety helmets can thus be detected in real time, greatly reducing the accident rate on construction sites and safeguarding workers' lives.
Examples
Referring to fig. 2, fig. 2 is a schematic structural composition diagram of a safety helmet wearing detection device based on deep learning in an embodiment of the present invention.
As shown in fig. 2, a helmet wearing detection apparatus based on deep learning, the apparatus comprising:
the data acquisition module 21: used for acquiring more than a preset number of construction-site worker operation pictures, and labeling the construction-site worker operation pictures with the labelImg labeling tool to obtain a data set, wherein the data set comprises a training set and a test set;
in a specific implementation of the invention, labeling the construction-site worker operation pictures with the labelImg labeling tool to obtain a data set comprises: labeling target frames on the construction-site worker operation pictures with the labelImg tool to obtain a data set. The size of the pictures is not fixed; each target-frame label comprises the target frame coordinates and the category of the object in the frame. There are three categories, marked person, hat and head respectively.
Further, the number ratio of the training set to the test set is 7: 3.
Specifically, more than 5000 original worker operation pictures are acquired by downloading from picture websites, and both the length and width of each picture are required to exceed 300 pixels. After acquisition, target frames are labeled on the construction-site worker operation pictures with the labelImg labeling tool to obtain the data set: rectangular frames are mainly labeled on the safety helmets and the human heads in the pictures, and label categories are assigned. The picture sizes are not fixed; each target-frame label comprises the target frame coordinates and the category of the object in the frame. There are three categories, marked person, hat and head respectively.
Also, the data set needs to be divided into training sets and test sets, where the number ratio of training sets to test sets is 7: 3.
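The 7:3 division can be sketched as follows (the function name and fixed seed are illustrative assumptions):

```python
import random

def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle and split labeled picture entries into training and
    test sets with the 7:3 number ratio used by the method."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```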
The data preprocessing module 22: used for performing K-means clustering preprocessing on the target frames of the safety helmet data set to obtain clustering template frames of 6 different sizes;
in a specific implementation of the invention, performing K-means clustering preprocessing on the target frames of the safety helmet data set to obtain clustering template frames of 6 different sizes comprises: performing scale normalization on the data in the data set, normalizing to the 416 × 416 scale to obtain a normalized data set; then performing K-means clustering on the target frames of the normalized safety helmet data set, with cluster centres initialized via an extended binary ordering tree, to obtain 6 clustering template frames of different sizes.
Further, K in the K-means clustering is 6; the 6 clustering template frames of different sizes are [[w1, h1], [w2, h2], [w3, h3], [w4, h4], [w5, h5], [w6, h6]], sorted from large to small by height-width values.
Specifically, scale normalization is first performed on the data in the data set, normalizing to the 416 × 416 scale to obtain a normalized data set; since the picture size changes, the target frame coordinate labels are changed correspondingly when the training set is made. Because the data set made here is small relative to data sets such as COCO, it needs to be expanded, including colour-brightness adjustment, picture-angle rotation and random cropping; the data set is then preprocessed by mean subtraction to eliminate noise influence. The normalized data set then needs to be clustered: specifically, K-means clustering of the safety helmet data set target frames is performed with cluster centres initialized via an extended binary ordering tree, obtaining 6 clustering template frames of different sizes.
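Two of the preprocessing steps above, mean subtraction and colour-brightness adjustment, can be sketched with NumPy (names are illustrative; using the per-image mean when no dataset-wide mean is supplied is an assumption):

```python
import numpy as np

def subtract_mean(image, mean=None):
    """Mean-subtraction preprocessing: convert an (H, W, 3) image to a
    zero-mean float array to reduce noise influence. When no dataset-wide
    per-channel mean is given, the image's own mean is used (assumption)."""
    img = image.astype(np.float32)
    if mean is None:
        mean = img.mean(axis=(0, 1))
    return img - mean

def adjust_brightness(image, factor):
    """Simple colour-brightness adjustment by scaling pixel values,
    clipped back into the valid 8-bit range."""
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
```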
The clustering template frames of 6 target frames of different scales in the data set are obtained by the K-means clustering algorithm, with K set to 6 and the cluster centres initialized via an extended binary ordering tree. The specific algorithm is as follows. Input: the number of clusters k and a data set containing n objects. Output: 6 initial cluster centre points. Step 1: create an extended binary ordering tree for the data set. Step 2: calculate the density and median of each partition. Step 3: select the median point with maximum density as the first initial point. Step 4: select the 2nd to 6th initial points: for t = 2 to 6, for j = 1 to p, calculate d_j = min_{k=1,…,t−1} e(C_k, m_j) · ρ_j; select the point with maximum d_j as C_t.
The sizes of the target frames in the training set are clustered with the K-means algorithm, where K is 6; the size frames representing the 6 classes are [[w1, h1], [w2, h2], [w3, h3], [w4, h4], [w5, h5], [w6, h6]], and the array is sorted from small to large by height-width values.
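A minimal sketch of clustering the training-set box sizes into 6 template frames. Plain k-means with random initialization is used here in place of the patent's extended-binary-ordering-tree initialization, and all names are illustrative:

```python
import numpy as np

def kmeans_anchors(box_wh, k=6, iters=50, seed=0):
    """Cluster (width, height) pairs of training-set target frames into
    k anchor/template frames, returned sorted by area ascending."""
    boxes = np.asarray(box_wh, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # random initialization (the patent uses an extended binary ordering tree)
    centers = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # squared Euclidean distance of every box to every centre
        d = ((boxes[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = boxes[labels == j].mean(axis=0)
    order = (centers[:, 0] * centers[:, 1]).argsort()
    return centers[order]
```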
The network construction and training module 23: the method comprises the steps of constructing a deep learning network for helmet wearing detection based on a Yolov3 network, and inputting cluster template boxes with 6 different sizes in the training set into the deep learning network for training;
in a specific implementation of the invention, building the deep learning network for helmet wearing detection based on the YOLOv3 network comprises: changing the three groups of output tensors of the YOLOv3 network to two groups, removing the output at the 13 × 13 scale, and obtaining the deep learning network for safety helmet wearing detection, whose output feature layers are 26 × 26 × 21 and 52 × 52 × 21 respectively; here 21 corresponds to the two categories, four coordinate frame values and one confidence C of each of the three target frames.
Specifically, the deep learning network consists of the YOLOv3 network components, including the Darknet-53 backbone, DBL blocks (darknetconv2d_bn_leaky), upsampling layers and concatenation layers, with the fully connected layer removed. The input is a picture of size 416 × 416 × 3; since the pictures of the data set have no fixed size, size normalization is required. Darknet-53 with the fully connected layer removed consists of: DBL, res1, res2, res8, res8, res4, where resn denotes n residual units; each residual unit comprises two DBL structures, and each DBL structure comprises a convolutional layer (conv2d), a batch normalization layer (bn) and a leaky ReLU activation function. YOLOv3 sets each grid to predict three frames, each frame needing five basic parameters (x, y, w, h, confidence); to improve detection precision at different target scales, YOLOv3 obtains the anchor boxes of six target frames of different scales in the data set through the k-means clustering algorithm. The output here comprises two groups, the predicted values at the two remaining target scales (the shaded part of the figure represents what is removed relative to the original YOLOv3 network); the output feature layers are 26 × 26 × 21 and 52 × 52 × 21 respectively, where 21 corresponds to the two categories, four coordinate frame values and one confidence C of each of the three target frames.
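The channel depth of each retained output layer follows directly from the counts above; a minimal sketch (function and variable names are illustrative):

```python
def output_channels(num_anchors=3, num_classes=2, num_box_values=4, num_confidence=1):
    """Depth of each retained YOLO output layer: for each of the three
    template frames per grid, four coordinate frame values, one
    confidence C and two category scores -> 3 * (4 + 1 + 2) = 21."""
    return num_anchors * (num_box_values + num_confidence + num_classes)

# the two output feature layers kept after removing the 13 x 13 scale
output_shapes = [(s, s, output_channels()) for s in (26, 52)]
```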
The size frames thus obtained are the manually derived template frames. Suppose the original image contains a target frame whose category is safety helmet, labeled [xmin, ymin, xmax, ymax, label]. The original image is divided into 26 × 26 and 52 × 52 grids, corresponding to the first two dimensions of the two output feature map arrays. The anchor-box size frames represent rectangular frames of 6 sizes; each rectangular frame is made to coincide with the centre point of the target frame, and the overlap rate (IOU) of each anchor box with the target frame is calculated as S(overlap) / [S(anchor box) + S(target box) − S(overlap)]. The anchor box_max with the largest overlap rate is selected. The 6 anchors form two groups, [boxes_set01] and [boxes_set02], corresponding to the 26 × 26 × 21 and 52 × 52 × 21 output feature maps; if the anchor box_max belongs to [boxes_set01], the output feature map of dimension 26 × 26 × 21 is labeled. Assuming the other case, it is further determined that the index of the anchor box_max in [boxes_set02] is i; the output feature map of dimension 52 × 52 × 21 is reshaped to dimension 52 × 52 × 3 × 7, and, according to the kth of the 52 × 52 grids in which the centre point of the target frame lies, only the 1 × 7 array of the kth grid corresponding to the ith anchor box_max needs to be marked, with all other values of the output feature map being 0. The first four values of the last dimension of this array are (xmin + xmax)/2/416, (ymin + ymax)/2/416, (xmax − xmin)/416 and (ymax − ymin)/416; the fifth value is 1, representing the confidence of this grid for the best-matching template; and if the target frame belongs to class c (the classes being 0 and 1), the (c + 6)th value is 1.
If the original image has a plurality of target frames, the output feature map is labeled according to the labeling process.
Mapping the original-image target frame into the output feature map, taking the above example, only the first four coordinate values in the marked 1 × 7 array are modified: b_x = (xmin + xmax)/2/416 × 52, b_y = (ymin + ymax)/2/416 × 52, b_w = (xmax − xmin)/416 × 52, b_h = (ymax − ymin)/416 × 52, where (b_x, b_y, b_w, b_h) are respectively the centre-point coordinates and the width and height of the original-image target frame mapped into the output feature map. The centre point lies in a grid cell S, located at (c_x, c_y) in the 52 × 52 two-dimensional coordinates; b_x − c_x gives the horizontal distance t_x of the centre point from the upper-left corner of cell S, and b_y − c_y gives the vertical distance t_y. Since the ratio of the width and height of the original target frame to the width and height of its matched anchor box_max is the same on the original image as mapped onto the output feature map, the ratios k_w and k_h are obtained, and t_w and t_h are obtained by taking the natural logarithm of k_w and k_h respectively. Thus t_x, t_y, t_w, t_h are the values the model needs to predict, and form the final label of the four coordinate values of the original-image target frame: in the 1 × 7 array of dimension corresponding to the ith anchor box_max at the kth grid, the first four values are t_x, t_y, t_w, t_h, and the remaining values, confidence C and category c, are unchanged. The loss function contains three parts: coordinate loss, confidence loss and category loss.
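The mapping above, inverted into label form, can be sketched as follows (names are illustrative; the width/height ratio against the matched anchor is scale invariant, so it is taken on the original image directly):

```python
import math

def encode_box(xmin, ymin, xmax, ymax, grid_size, anchor_w, anchor_h, img_size=416):
    """Map an original-image target frame onto the output feature map label.

    Returns the grid cell S = (cx, cy) containing the box centre and the
    four label values (tx, ty, tw, th) the model must predict.
    """
    # centre of the target frame in grid units, e.g. on the 52 x 52 map
    bx = (xmin + xmax) / 2 / img_size * grid_size
    by = (ymin + ymax) / 2 / img_size * grid_size
    cx, cy = int(bx), int(by)      # upper-left corner of grid cell S
    tx, ty = bx - cx, by - cy      # offsets of the centre inside the cell
    # tw, th are natural logarithms of the width/height ratios k_w, k_h
    # against the matched anchor box_max
    tw = math.log((xmax - xmin) / anchor_w)
    th = math.log((ymax - ymin) / anchor_h)
    return (cx, cy), (tx, ty, tw, th)
```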
The formula is as follows:
loss=coordErr+confiErr+clsErr;
the confidence error confiErr and the class error clsErr for C and p(c) are solved with the sigmoid cross-entropy function; for t_x and t_y in the coordinate frame, the errors coordErr_x and coordErr_y are solved with the smoothL1 function (prelu is an activation function), and the errors of t_w and t_h are obtained with a squared error. λ_coord is the degree of contribution of coordErr to the total loss; Ĉ_ij is the confidence of the jth template frame in the ith grid; x_i, y_i, w_i, h_i are respectively the predicted values of t_x, t_y, t_w, t_h for the ith grid in the output feature map, mapped into the range (0, 1) by the sigmoid function; C_i′ is the predicted value of the confidence C for the ith grid in the output feature map; p_i′(c) is the prediction of p(c) for the ith grid, where p(c) is the probability that the target-frame object belongs to class c; S² means the original image is divided into S² grids; and B is 3, meaning three template frames predict target frames per grid.
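A sketch of the three-part loss under the stated choices (sigmoid cross-entropy for confidence and class, smoothL1 for t_x and t_y, squared error for t_w and t_h); the weight λ_coord = 5.0 and the reduction by summation are assumptions, and all names are illustrative:

```python
import numpy as np

def sigmoid_cross_entropy(logits, labels):
    """Numerically stable sigmoid cross-entropy, used for the
    confidence error confiErr and the class error clsErr."""
    return np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

def smooth_l1(pred, target):
    """smoothL1 error, used here for the tx, ty coordinate errors."""
    d = np.abs(pred - target)
    return np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)

def yolo_loss(pred_txy, true_txy, pred_twh, true_twh,
              conf_logits, true_conf, cls_logits, true_cls, lambda_coord=5.0):
    """loss = coordErr + confiErr + clsErr (lambda_coord weights the
    coordinate term's contribution to the total loss; value assumed)."""
    coord_err = lambda_coord * (smooth_l1(pred_txy, true_txy).sum()
                                + ((pred_twh - true_twh) ** 2).sum())
    confi_err = sigmoid_cross_entropy(conf_logits, true_conf).sum()
    cls_err = sigmoid_cross_entropy(cls_logits, true_cls).sum()
    return coord_err + confi_err + cls_err
```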
After the loss function is defined, model training starts with the following hyper-parameters: learning rate 0.001, train_batch_size 10, 118287 training samples, 5000 validation samples, regularization decay value 0.99 and 50 training epochs; training is carried out in the deep learning framework TensorFlow on a GTX 1070 GPU, yielding a model.ckpt model.
The convergence judgment module 24: after training is finished, importing the clustering template frames with 6 different sizes in the test set to the trained deep learning network to judge whether the training is converged;
in a specific implementation process of the present invention, the importing the 6 clustering template frames with different sizes in the test set to the trained deep learning network to determine whether the training is converged includes: importing 6 clustering template frames with different sizes in the test set to the trained deep learning network, and outputting test results; and judging whether convergence occurs or not according to whether the accuracy in the test result is greater than a preset probability or not.
Specifically, 6 clustering template boxes with different sizes in a test set are input into a trained deep learning network, and a test result is output; and then, whether the deep network is converged after training is confirmed by calculating the accuracy in the output result.
In a specific implementation of the invention, if training has not converged, the trained deep learning network node parameters are updated based on the back-propagation algorithm, and after the update is complete, retraining is performed with the training set in the data set.
The recognition detection module 25: used for, if converged, inputting the image to be recognized into the converged deep learning network model for helmet wearing detection.
In a specific implementation of the invention, inputting the image to be recognized into the converged deep learning network model for helmet wearing detection comprises the following steps: inputting the image to be recognized into the converged deep learning network model and obtaining the network output; then calculating the overlap rate of the head target frame and the safety helmet target frame in the image to be recognized, and judging from the overlap rate whether the worker wears the safety helmet.
Specifically, the image to be recognized is first scaled to 416 × 416; from the four predicted coordinate values x_i, y_i, w_i, h_i, the coordinates of the target frame are solved. Next, the target frame is drawn in the original test picture and its coordinates are deviation-adjusted. Finally, the predicted target frame coordinates [box_mins_x, box_mins_y, box_maxes_x, box_maxes_y] are normalized back to the original image, i.e. boxes / corresponding grid size × 416. In addition, the object in the target frame is classified: the index value i of the maximum confidence C·p(c) indicates that the target frame belongs to class i; a threshold of 0.3 is set, and if the maximum C·p(c) is less than 0.3, the target frame is considered absent.
bx=σ(xi)+cx;
by=σ(yi)+cy;
According to the target detection algorithm, if the overlap rate (IOU) of the head target frame and the safety helmet target frame is greater than 0.3, the worker is judged to have correctly worn the safety helmet; otherwise, the safety helmet is worn incorrectly or not worn.
In the embodiment of the invention, the deep learning network is built on the basis of the original YOLOv3 network and is trained and tested; after convergence, the converged deep learning network is used to detect the wearing condition of safety helmets. The wearing condition can thus be detected in real time, which greatly reduces the accident rate on construction sites and safeguards the life safety of workers.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
In addition, the method and device for safety helmet wearing detection based on deep learning provided by the embodiments of the invention have been described in detail above; a specific example is used to explain the principle and implementation of the invention, and the description of the embodiments is only intended to help understand the method and its core idea. Meanwhile, for a person skilled in the art, there may be variations in specific embodiments and application scope according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.
Claims (10)
1. A safety helmet wearing detection method based on deep learning is characterized by comprising the following steps:
the method comprises the steps of obtaining construction site worker operation pictures with the number more than a preset number, and marking the construction site worker operation pictures by using a labelImg marking tool to obtain a data set, wherein the data set comprises a training set and a testing set;
performing K-means clustering preprocessing on the target frames of the safety helmet data set to obtain clustering template frames of 6 different sizes;
building a deep learning network for helmet wearing detection based on a Yolov3 network, and inputting cluster template boxes with 6 different sizes in the training set into the deep learning network for training;
after training is finished, importing 6 clustering template frames with different sizes in the test set to the trained deep learning network to judge whether training is converged;
and if the images are converged, inputting the images to be recognized into the convergence deep learning network model for helmet wearing detection.
2. The method for detecting the wearing of the safety helmet according to claim 1, wherein the labeling of the working pictures of the construction worker by using a labelImg labeling tool to obtain a data set comprises:
the labelImg marking tool is used for marking the target frame of the construction site worker operation picture to obtain a data set;
the size of the construction site worker operation picture is not fixed, and the target frame mark comprises a target frame coordinate and the category of an object in the frame; the categories are three and are respectively marked as person, hat and head.
3. The headgear wear detection method of claim 1, wherein a number ratio of the training set to the test set is 7: 3.
4. The method for detecting the wearing of the safety helmet according to claim 1, wherein the K-means clustering preprocessing of the target frames of the safety helmet data set to obtain clustering template frames of 6 different sizes comprises:
carrying out scale normalization processing on the data in the data set, and normalizing to 416 × 416 scales to obtain a normalized data set;
and performing K-means clustering on the target frames of the data set of the safety helmet by utilizing a clustering center initialization mode of an expanded binary ordering tree on the normalized data set to obtain 6 clustering template frames with different sizes.
5. The headgear wear detection method of claim 4, wherein K in the pair of K-means clusters is 6;
the clustering template frames of 6 different sizes are: [[w1, h1], [w2, h2], [w3, h3], [w4, h4], [w5, h5], [w6, h6]], sorted from large to small by height-width values.
6. The helmet wearing detection method according to claim 1, wherein the building of the deep learning network for helmet wearing detection based on the Yolov3 network comprises the following steps:
changing the three groups of output tensors of the YOLOv3 network to two groups, removing the output at the 13 × 13 scale, and obtaining the deep learning network for safety helmet wearing detection, whose output feature layers are 26 × 26 × 21 and 52 × 52 × 21 respectively;
where 21 represents two categories, four coordinate box values and one confidence C in each of the three object boxes.
7. The method for detecting wearing of a safety helmet according to claim 1, wherein the importing the clustering template frames of 6 different sizes in the test set to the trained deep learning network determines whether training is converged, including:
importing 6 clustering template frames with different sizes in the test set to the trained deep learning network, and outputting test results;
and judging whether convergence occurs or not according to whether the accuracy in the test result is greater than a preset probability or not.
8. The headgear wear detection method of claim 1, further comprising:
and if not converged, updating the trained deep learning network node parameters based on the back-propagation algorithm, and after the update is complete, retraining with the training set in the data set.
9. The method for detecting the wearing of the safety helmet according to claim 1, wherein the inputting the image to be identified into the convergence deep learning network model for detecting the wearing of the safety helmet comprises:
inputting an image to be recognized into a convergence deep learning network model, and outputting the image to be recognized;
and calculating the overlapping rate of the head target frame and the safety helmet target frame in the image to be recognized, and judging whether the worker wears the safety helmet or not according to the overlapping rate.
10. A safety helmet wearing detection device based on deep learning, characterized in that the device comprises:
a data acquisition module: the method comprises the steps of obtaining construction site worker operation pictures with the number more than a preset number, and marking the construction site worker operation pictures by using a labelImg marking tool to obtain a data set, wherein the data set comprises a training set and a testing set;
a data preprocessing module: used for performing K-means clustering preprocessing on the target frames of the safety helmet data set to obtain clustering template frames of 6 different sizes;
a network construction and training module: the method comprises the steps of constructing a deep learning network for helmet wearing detection based on a Yolov3 network, and inputting cluster template boxes with 6 different sizes in the training set into the deep learning network for training;
a convergence judgment module: after training is finished, importing the clustering template frames with 6 different sizes in the test set to the trained deep learning network to judge whether the training is converged;
the identification detection module: and if the image to be recognized is converged, inputting the image to be recognized into the convergence deep learning network model for helmet wearing detection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911349221.9A CN111160440B (en) | 2019-12-24 | 2019-12-24 | Deep learning-based safety helmet wearing detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111160440A true CN111160440A (en) | 2020-05-15 |
CN111160440B CN111160440B (en) | 2023-11-21 |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598040A (en) * | 2020-05-25 | 2020-08-28 | 中建三局第二建设工程有限责任公司 | Construction worker identity identification and safety helmet wearing detection method and system |
CN112131983A (en) * | 2020-09-11 | 2020-12-25 | 桂林理工大学 | Helmet wearing detection method based on improved YOLOv3 network |
CN112163572A (en) * | 2020-10-30 | 2021-01-01 | 国网北京市电力公司 | Method and device for identifying object |
CN112232199A (en) * | 2020-10-15 | 2021-01-15 | 燕山大学 | Wearing mask detection method based on deep learning |
CN112257620A (en) * | 2020-10-27 | 2021-01-22 | 广州华微明天软件技术有限公司 | Safe wearing condition identification method |
CN112287812A (en) * | 2020-10-27 | 2021-01-29 | 广东电网有限责任公司 | Climbing condition identification method |
CN112633308A (en) * | 2020-09-15 | 2021-04-09 | 北京华电天仁电力控制技术有限公司 | Detection method and detection system for whether power plant operating personnel wear safety belts |
CN112906533A (en) * | 2021-02-07 | 2021-06-04 | 成都睿码科技有限责任公司 | Safety helmet wearing detection method based on self-adaptive detection area |
CN113221749A (en) * | 2021-05-13 | 2021-08-06 | 扬州大学 | Crop disease remote sensing monitoring method based on image processing and deep learning |
CN113269234A (en) * | 2021-05-10 | 2021-08-17 | 青岛理工大学 | Connecting piece assembly detection method and system based on target detection |
CN113920547A (en) * | 2021-12-14 | 2022-01-11 | 成都考拉悠然科技有限公司 | Glove detection method and system based on neural network |
CN114003058A (en) * | 2021-10-25 | 2022-02-01 | 上海宝冶冶金工程有限公司 | Intelligent inspection system and method for wearing safety helmet |
CN114627425A (en) * | 2021-06-11 | 2022-06-14 | 珠海路讯科技有限公司 | Method for detecting whether worker wears safety helmet or not based on deep learning |
WO2022155967A1 (en) * | 2021-01-25 | 2022-07-28 | 京东方科技集团股份有限公司 | Method for detecting object in real-time by utilizing object real-time detection model and optimization method |
CN116229570A (en) * | 2023-02-21 | 2023-06-06 | 四川轻化工大学 | Aloft work personnel behavior situation identification method based on machine vision |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110033453A (en) * | 2019-04-18 | 2019-07-19 | 国网山西省电力公司电力科学研究院 | Based on the power transmission and transformation line insulator Aerial Images fault detection method for improving YOLOv3 |
CN110119686A (en) * | 2019-04-17 | 2019-08-13 | 电子科技大学 | A kind of safety cap real-time detection method based on convolutional neural networks |
CN110163836A (en) * | 2018-11-14 | 2019-08-23 | 宁波大学 | Based on deep learning for the excavator detection method under the inspection of high-altitude |
CN110263665A (en) * | 2019-05-29 | 2019-09-20 | 朗坤智慧科技股份有限公司 | Safety cap recognition methods and system based on deep learning |
CN110310259A (en) * | 2019-06-19 | 2019-10-08 | 江南大学 | It is a kind of that flaw detection method is tied based on the wood for improving YOLOv3 algorithm |
CN110399905A (en) * | 2019-07-03 | 2019-11-01 | 常州大学 | The detection and description method of safety cap wear condition in scene of constructing |
CN110490905A (en) * | 2019-08-15 | 2019-11-22 | 江西联创精密机电有限公司 | A kind of method for tracking target based on YOLOv3 and DSST algorithm |
CN110502965A (en) * | 2019-06-26 | 2019-11-26 | 哈尔滨工业大学 | A kind of construction safety helmet wearing monitoring method based on the estimation of computer vision human body attitude |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109447168A (en) * | 2018-11-05 | 2019-03-08 | 江苏德劭信息科技有限公司 | A kind of safety cap wearing detection method detected based on depth characteristic and video object |
CN110070033B (en) * | 2019-04-19 | 2020-04-24 | 山东大学 | Method for detecting wearing state of safety helmet in dangerous working area in power field |
CN110263675B (en) * | 2019-06-03 | 2024-02-20 | 武汉联一合立技术有限公司 | Garbage target identification system and method of community security robot |
-
2019
- 2019-12-24 CN CN201911349221.9A patent/CN111160440B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110163836A (en) * | 2018-11-14 | 2019-08-23 | 宁波大学 | Based on deep learning for the excavator detection method under the inspection of high-altitude |
CN110119686A (en) * | 2019-04-17 | 2019-08-13 | 电子科技大学 | A kind of safety cap real-time detection method based on convolutional neural networks |
CN110033453A (en) * | 2019-04-18 | 2019-07-19 | 国网山西省电力公司电力科学研究院 | Based on the power transmission and transformation line insulator Aerial Images fault detection method for improving YOLOv3 |
CN110263665A (en) * | 2019-05-29 | 2019-09-20 | 朗坤智慧科技股份有限公司 | Safety cap recognition methods and system based on deep learning |
CN110310259A (en) * | 2019-06-19 | 2019-10-08 | 江南大学 | It is a kind of that flaw detection method is tied based on the wood for improving YOLOv3 algorithm |
CN110502965A (en) * | 2019-06-26 | 2019-11-26 | 哈尔滨工业大学 | A kind of construction safety helmet wearing monitoring method based on the estimation of computer vision human body attitude |
CN110399905A (en) * | 2019-07-03 | 2019-11-01 | 常州大学 | The detection and description method of safety cap wear condition in scene of constructing |
CN110490905A (en) * | 2019-08-15 | 2019-11-22 | 江西联创精密机电有限公司 | A kind of method for tracking target based on YOLOv3 and DSST algorithm |
Non-Patent Citations (2)
Title |
---|
张迪等: "基于YOLOV3的输电线路故障检测方法", 《仪器仪表与检测技术》 * |
施辉等: "改进YOLO v3的安全帽佩戴检测方法", 《计算机工程与应用》 * |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598040B (en) * | 2020-05-25 | 2024-05-14 | China Construction Third Engineering Bureau Second Construction Engineering Co., Ltd. | Construction worker identity recognition and safety helmet wearing detection method and system |
CN111598040A (en) * | 2020-05-25 | 2020-08-28 | China Construction Third Engineering Bureau Second Construction Engineering Co., Ltd. | Construction worker identity recognition and safety helmet wearing detection method and system |
CN112131983A (en) * | 2020-09-11 | 2020-12-25 | Guilin University of Technology | Helmet wearing detection method based on an improved YOLOv3 network |
CN112633308A (en) * | 2020-09-15 | 2021-04-09 | Beijing Huadian Tianren Electric Power Control Technology Co., Ltd. | Detection method and system for whether power plant operators are wearing safety belts |
CN112232199A (en) * | 2020-10-15 | 2021-01-15 | Yanshan University | Mask wearing detection method based on deep learning |
CN112257620B (en) * | 2020-10-27 | 2021-10-26 | Guangzhou Huawei Mingtian Software Technology Co., Ltd. | Safety wearing condition identification method |
CN112287812A (en) * | 2020-10-27 | 2021-01-29 | Guangdong Power Grid Co., Ltd. | Climbing condition identification method |
CN112257620A (en) * | 2020-10-27 | 2021-01-22 | Guangzhou Huawei Mingtian Software Technology Co., Ltd. | Safety wearing condition identification method |
CN112163572A (en) * | 2020-10-30 | 2021-01-01 | State Grid Beijing Electric Power Company | Method and device for identifying objects |
WO2022155967A1 (en) * | 2021-01-25 | 2022-07-28 | BOE Technology Group Co., Ltd. | Method for detecting objects in real time using a real-time object detection model, and optimization method |
CN112906533A (en) * | 2021-02-07 | 2021-06-04 | Chengdu Ruima Technology Co., Ltd. | Safety helmet wearing detection method based on an adaptive detection area |
CN112906533B (en) * | 2021-02-07 | 2023-03-24 | Chengdu Ruima Technology Co., Ltd. | Safety helmet wearing detection method based on an adaptive detection area |
CN113269234A (en) * | 2021-05-10 | 2021-08-17 | Qingdao University of Technology | Connector assembly inspection method and system based on target detection |
CN113221749A (en) * | 2021-05-13 | 2021-08-06 | Yangzhou University | Crop disease remote sensing monitoring method based on image processing and deep learning |
CN114627425A (en) * | 2021-06-11 | 2022-06-14 | Zhuhai Luxun Technology Co., Ltd. | Method for detecting whether a worker is wearing a safety helmet based on deep learning |
CN114627425B (en) * | 2021-06-11 | 2024-05-24 | Zhuhai Luxun Technology Co., Ltd. | Method for detecting whether a worker is wearing a safety helmet based on deep learning |
CN114003058A (en) * | 2021-10-25 | 2022-02-01 | Shanghai Baoye Metallurgical Engineering Co., Ltd. | Intelligent inspection system and method for safety helmet wearing |
CN114003058B (en) * | 2021-10-25 | 2024-01-26 | Shanghai Baoye Metallurgical Engineering Co., Ltd. | Intelligent inspection system and method for safety helmet wearing |
CN113920547A (en) * | 2021-12-14 | 2022-01-11 | Chengdu Koala Youran Technology Co., Ltd. | Glove detection method and system based on neural networks |
CN116229570A (en) * | 2023-02-21 | 2023-06-06 | Sichuan University of Science and Engineering | Behavior recognition method for personnel working at heights based on machine vision |
CN116229570B (en) * | 2023-02-21 | 2024-01-23 | Sichuan University of Science and Engineering | Behavior recognition method for personnel working at heights based on machine vision |
Also Published As
Publication number | Publication date |
---|---|
CN111160440B (en) | 2023-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160440B (en) | Deep learning-based safety helmet wearing detection method and device | |
CN110399905B (en) | Method for detecting and describing wearing condition of safety helmet in construction scene | |
CN108428229B (en) | Lung texture recognition method based on appearance and geometric features extracted by deep neural network | |
CN110059694B (en) | Intelligent identification method for character data in complex scene of power industry | |
CN109934121B (en) | Orchard pedestrian detection method based on YOLOv3 algorithm | |
CN105139004B (en) | Facial expression recognition method based on video sequences | |
CN108681692B (en) | Method for identifying newly added buildings in remote sensing image based on deep learning | |
CN109583489A (en) | Defect classification and identification method, device, computer equipment and storage medium | |
CN112784736B (en) | Character interaction behavior recognition method based on multi-modal feature fusion | |
CN106951840A (en) | Facial feature point detection method | |
CN109741268B (en) | Damaged image completion method for murals | |
CN111209864B (en) | Power equipment target identification method | |
CN112613097A (en) | BIM rapid modeling method based on computer vision | |
CN108564120A (en) | Feature point extraction method based on deep neural networks | |
CN113011509B (en) | Lung bronchus classification method and device, electronic equipment and storage medium | |
CN110503102A (en) | Vehicle identification code detection method, device, computer equipment and storage medium | |
CN114359702A (en) | Method and system for identifying illegal buildings in homestead remote sensing images based on Transformer | |
CN107506769A (en) | Extraction method and system for urban water body information | |
CN112102296A (en) | Power equipment target identification method based on human concept | |
CN111259950B (en) | Method for training YOLO neural network based on 3D model | |
CN113902792A (en) | Building height detection method and system based on improved RetinaNet network and electronic equipment | |
CN116309791B (en) | Method for detecting feather area parameters of poultry | |
CN110555384A (en) | Beef marbling automatic grading system and method based on image data | |
CN114972335A (en) | Image classification method and device for industrial detection and computer equipment | |
CN109636838A (en) | Gas hazard analysis method and device based on remote sensing image change detection | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||