CN111428604B - Facial mask recognition method, device, equipment and storage medium - Google Patents
- Publication number
- CN111428604B (application CN202010194398.2A)
- Authority
- CN
- China
- Prior art keywords
- mask
- feature map
- sample image
- target feature
- facial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to the field of biometric recognition, and discloses a facial mask recognition method, device, equipment and storage medium. The facial mask recognition method comprises the following steps: obtaining sample images, and labeling each sample image to obtain label information of each sample image, wherein the sample images comprise facial images with a mask worn and facial images without a mask; inputting the sample images and the corresponding label information into a preset MASK R-CNN model, extracting an eye feature map and an overall facial feature map of each sample image through the MASK R-CNN model, and training to obtain a recognition model for identifying whether a mask is worn on the face; and acquiring an image to be detected, inputting it into the recognition model for recognition, and outputting a recognition result. The embodiment can improve the accuracy of a face recognition model in detecting whether a mask is worn on a face.
Description
Technical Field
The invention relates to the field of biometric recognition, and in particular to a facial mask recognition method, device, equipment and storage medium.
Background
Face recognition is a technique for performing identity recognition based on the facial features of a person. With the development of deep learning, face recognition accuracy has become higher and higher, and the technique is increasingly applied in public places, replacing manual work in certain fields. For example, in the traffic field, a face recognition device can detect whether the occupants of an automobile are wearing seat belts, and on a construction site it can recognize whether workers are wearing safety helmets.
However, existing face recognition models are built to judge faces on the basis of a complete face. Because a mask shields part of the facial information, the recognition rate of conventional face recognition models in judging whether a mask is worn on a face is currently only about 30%. If manual judgment is adopted instead, the inspecting personnel are exposed to a risk of infection.
Disclosure of Invention
The main aim of the present invention is to solve the technical problem of the low accuracy of face recognition models in detecting whether a mask is worn.
The first aspect of the present invention provides a facial mask recognition method, comprising:
obtaining sample images, and labeling each sample image to obtain label information of each sample image, wherein the sample images comprise facial images with a mask worn and facial images without a mask;
inputting the sample image and corresponding label information into a preset MASK R-CNN model, extracting an eye feature map and an overall facial feature map of the sample image through the MASK R-CNN model, and training to obtain a recognition model for identifying whether a mask is worn on the face;
and acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result.
Optionally, in a first implementation manner of the first aspect of the present invention, the MASK R-CNN model sequentially includes: a target feature extraction network, an RPN network, an ROI alignment layer and an FCN network;
the target feature extraction network is used for extracting a target feature map of the sample image, wherein the target feature map comprises an eye feature map, an overall facial feature map and a fusion feature map;
the RPN network is used for generating a pre-selection frame corresponding to the target feature map;
the ROI alignment layer is used for dividing the pre-selection frame in the target feature map and fusing the pre-selection frame and the target feature map by endpoint pooling, so as to generate a labeling feature map;
and the FCN network is used for predicting each pixel point of the labeling feature map to obtain a prediction result corresponding to the sample image.
Optionally, in a second implementation manner of the first aspect of the present invention, the training method of the MASK R-CNN model includes:
inputting the sample image and corresponding label information into a preset MASK R-CNN model;
extracting a target feature map of the sample image through the target feature extraction network;
inputting the target feature map into the RPN network so as to generate a pre-selected frame corresponding to the target feature map through the RPN network according to preset anchor frame information;
inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and dividing the pre-selection frame and pooling its endpoints to obtain a labeling feature map;
inputting the labeling feature map into the FCN network so as to predict each pixel point of the labeling feature map through the FCN network, obtaining a prediction result corresponding to the sample image and outputting the prediction result;
and optimizing parameters of the MASK R-CNN model according to the prediction result and the label information until the MASK R-CNN model converges to obtain an identification model.
Optionally, in a third implementation manner of the first aspect of the present invention, the generating, by the RPN network, a pre-selected box corresponding to the target feature map according to the preset anchor box information includes:
Acquiring preset anchor frame information through the RPN network, and generating candidate frames of each pixel point in the target feature map according to the anchor frame information;
judging whether the candidate frame contains a face wearing a mask;
if yes, retaining the candidate frame, and adjusting the candidate frame to obtain a pre-selection frame of the target feature map.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the extracting, by the target feature extraction network, a target feature map of the sample image includes:
extracting an overall facial feature map corresponding to the sample image through the target feature extraction network;
extracting an eye feature map from the overall facial feature map based on a preset eye feature attention mechanism;
and carrying out multi-level feature fusion on the eye feature map and the overall facial feature map to obtain a fusion feature map.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the predicting, by the FCN network, each pixel point of the labeling feature map, to obtain a prediction result corresponding to the sample image, and output the prediction result includes:
convolving the labeling feature map through the FCN network to generate a mask corresponding to the pre-selection frame and a first heat map containing the predicted values of all pixel points;
upsampling the first heat map to obtain a second heat map consistent with the size of the sample image;
and outputting the mask, the pre-selection frame and the second heat map as a prediction result corresponding to the sample image.
Optionally, in a sixth implementation manner of the first aspect of the present invention, the optimizing parameters of the MASK R-CNN model according to the prediction result and the tag information until the MASK R-CNN model converges, and obtaining the identification model includes:
calculating a loss value between the prediction result and the tag information according to a preset loss function;
back-propagating the loss value into the MASK R-CNN model, and optimizing the parameters of the MASK R-CNN model according to the stochastic gradient descent method;
and if the MASK R-CNN model converges, taking the current MASK R-CNN model as an identification model.
A second aspect of the present invention provides a facial mask recognition device comprising:
the acquisition module is used for acquiring sample images and labeling the sample images to obtain label information of the sample images, wherein the sample images comprise facial images with a mask worn and facial images without a mask;
the training module is used for inputting the sample image and the corresponding label information into a preset MASK R-CNN model, so as to extract an eye feature map and an overall facial feature map of the sample image through the MASK R-CNN model for training, and obtain a recognition model for identifying whether a mask is worn on the face;
the identification module is used for acquiring an image to be detected, inputting the image to be detected into the identification model for identification, and outputting an identification result.
Optionally, in a first implementation manner of the second aspect of the present invention, the MASK R-CNN model includes a target feature extraction network, an RPN network, an ROI alignment layer, and an FCN network;
the target feature extraction network is used for extracting a target feature map of the sample image, wherein the target feature map comprises an eye feature map, an overall facial feature map and a fusion feature map;
the RPN network is used for generating a pre-selection frame corresponding to the target feature map;
the ROI alignment layer is used for dividing the pre-selection frame in the target feature map and fusing the pre-selection frame and the target feature map by endpoint pooling, so as to generate a labeling feature map;
and the FCN network is used for predicting each pixel point of the labeling feature map to obtain a prediction result corresponding to the sample image.
Optionally, in a second implementation manner of the second aspect of the present invention, the training module includes:
the input unit is used for inputting the sample image and the corresponding label information into a preset MASK R-CNN model;
an extraction unit configured to extract a target feature map of the sample image through the target feature extraction network;
the preselection frame unit is used for inputting the target feature map into the RPN network so as to generate a preselection frame corresponding to the target feature map through the RPN network according to preset anchor frame information;
the processing unit is used for inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and divide the pre-selection frame and pool its endpoints to obtain a labeling feature map;
the output unit is used for inputting the labeling feature diagram into the FCN network so as to predict each pixel point of the labeling feature diagram through the FCN network, and obtaining and outputting a corresponding prediction result of the sample image;
and the convergence unit is used for optimizing the parameters of the MASK R-CNN model according to the prediction result and the label information until the MASK R-CNN model converges to obtain an identification model.
Optionally, in a third implementation manner of the second aspect of the present invention, the pre-selection box unit is specifically configured to:
acquiring preset anchor frame information through the RPN network, and generating candidate frames of each pixel point in the target feature map according to the anchor frame information;
judging whether the candidate frame contains a face wearing a mask;
if yes, the candidate frame is retained, and the coordinates of the candidate frame are adjusted by regression to obtain a pre-selection frame of the target feature map.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the extracting unit is specifically configured to:
extracting a facial integral feature map corresponding to the sample image through the target feature extraction network;
extracting an eye feature map in the whole facial feature map based on a preset eye feature attention mechanism;
and carrying out multi-level feature fusion on the eye feature map and the whole facial feature map to obtain a fusion feature map.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the output unit is specifically configured to:
convolving the labeling feature map through the FCN network to generate a mask corresponding to the pre-selection frame and a first heat map containing the predicted values of all pixel points;
upsampling the first heat map to obtain a second heat map consistent with the size of the sample image;
and outputting the mask, the pre-selection frame and the second heat map as a prediction result corresponding to the sample image.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the convergence unit is specifically configured to:
calculating a loss value between the prediction result and the tag information according to a preset loss function;
back-propagating the loss value into the MASK R-CNN model, and optimizing the parameters of the MASK R-CNN model according to the stochastic gradient descent method;
and if the MASK R-CNN model converges, taking the current MASK R-CNN model as an identification model.
A third aspect of the present invention provides a facial mask recognition apparatus, comprising: a memory and at least one processor, wherein instructions are stored in the memory, and the memory and the at least one processor are interconnected by a line; the at least one processor invokes the instructions in the memory to cause the facial mask recognition apparatus to perform the facial mask recognition method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having instructions stored therein that, when run on a computer, cause the computer to perform the facial mask recognition method described above.
In the technical scheme provided by the invention, facial images with a mask worn and facial images without a mask are first taken as sample images, and the sample images are then labeled to obtain the corresponding label information. The sample images and label information are input into a preset MASK R-CNN model. The MASK R-CNN model is a target recognition model capable of instance segmentation, with higher accuracy and faster training speed than existing models. In the present invention, the MASK R-CNN model comprises a target feature extraction network, an RPN network, an ROI alignment layer and an FCN network. The target feature extraction network contains an eye attention mechanism, and can extract the overall facial feature map and the eye feature map of a sample image and generate a fusion feature map combining the two; through multi-level fusion of the overall facial feature map and the eye feature map, the accuracy of identifying whether a face wears a mask is improved.
Drawings
fig. 1 is a schematic view of a first embodiment of a facial mask recognition method according to an embodiment of the present invention;
fig. 2 is a schematic view of a second embodiment of a facial mask recognition method according to an embodiment of the present invention;
fig. 3 is a schematic view of a third embodiment of a facial mask recognition method according to an embodiment of the present invention;
fig. 4 is a schematic view of a fourth embodiment of a facial mask recognition method according to an embodiment of the present invention;
fig. 5 is a schematic view of a fifth embodiment of a facial mask recognition method according to an embodiment of the present invention;
fig. 6 is a schematic view of an embodiment of a facial mask recognition device according to an embodiment of the present invention;
fig. 7 is a schematic view of another embodiment of a facial mask recognition device according to an embodiment of the present invention;
fig. 8 is a schematic view of an embodiment of a facial mask recognition apparatus according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a facial mask recognition method, device, equipment and storage medium. The MASK R-CNN model is used as the initial model; it is a target recognition model capable of instance segmentation, with higher accuracy and faster training speed than existing models. In the invention, the MASK R-CNN model comprises a target feature extraction network containing an eye attention mechanism, which can extract the overall facial feature map and the eye feature map of a sample image and generate a fusion feature map combining the two; through multi-level fusion of the overall facial feature map and the eye feature map, the accuracy of identifying whether a face wears a mask is improved.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, the specific flow of an embodiment of the present invention is described below with reference to fig. 1. A first embodiment of the facial mask recognition method according to an embodiment of the present invention comprises:
101. obtaining sample images, and labeling each sample image to obtain label information of each sample image, wherein the sample images comprise facial images with a mask worn and facial images without a mask;
It is to be understood that the execution body of the present invention may be a facial mask recognition device, or may be a terminal or a server, which is not limited herein. The embodiment of the invention is described taking a server as the execution body as an example.
In this embodiment, facial images with and without a mask are acquired through a network, a camera, or the like, and taken as sample images.
The face wearing a mask in a sample image is marked by circling it with a circle, a rectangle or an irregular polygon, and the marked position coordinates are stored as label information.
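As a concrete illustration of this labeling step, the record below sketches one possible label format. The field names (`image_id`, `polygon`, `wearing_mask`) are assumptions for illustration only; the patent specifies only that the marked position coordinates are stored as label information.

```python
# Hypothetical label record for one annotated sample image.
# "polygon" holds the vertices of the circled region (rectangle, circle
# approximation, or irregular polygon, per the description above).
def make_label(image_id, polygon, wearing_mask):
    return {
        "image_id": image_id,
        "polygon": polygon,            # list of (x, y) vertices
        "wearing_mask": wearing_mask,  # True: mask worn, False: not worn
    }

label = make_label("img_0001", [(10, 20), (110, 20), (110, 140), (10, 140)], True)
```

A rectangle needs four vertices; an irregular polygon would simply carry more.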
102. Inputting the sample image and corresponding label information into a preset MASK R-CNN model, extracting an eye feature map and an overall facial feature map of the sample image through the MASK R-CNN model, and training to obtain a recognition model for identifying whether a mask is worn on the face;
MASK R-CNN, proposed in 2017, is a target detection model capable of instance segmentation. It adds a fully convolutional segmentation sub-network on the basis of the earlier Faster R-CNN, extending the original classification and regression tasks into classification, regression and segmentation, and mainly comprises a target feature extraction network, an RPN network, an ROI alignment layer and an FCN network.
The target feature extraction network may be any of various convolutional neural networks, including the VGG series, the AlexNet series, and the like. First, the sample image is input into the target feature extraction network, which extracts the eye feature map and the overall facial feature map of the sample image; a fusion feature map combining the eye feature map and the overall facial feature map is then obtained through multi-level fusion. These three feature maps are passed into the RPN network as the target feature maps.
In the RPN (Region Proposal Network), preset anchor frame information is acquired, and whether each anchor frame contains the recognition target is judged; if so, the anchor frame is retained and its position is adjusted, thereby obtaining the pre-selection frames of the sample image.
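The per-pixel candidate-frame generation described above can be sketched as follows. The stride, scales and aspect ratios below are illustrative defaults, not values given in the patent.

```python
import itertools

# Sketch of anchor (candidate) frame generation: at every position of the
# feature map, one box is placed for each (scale, ratio) combination.
# stride maps feature-map coordinates back to image coordinates.
def generate_anchors(feat_h, feat_w, stride=16, scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    anchors = []
    for y, x in itertools.product(range(feat_h), range(feat_w)):
        cx, cy = x * stride + stride / 2, y * stride + stride / 2
        for s, r in itertools.product(scales, ratios):
            w, h = s * r ** 0.5, s / r ** 0.5  # r ~ width/height ratio
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

boxes = generate_anchors(2, 2)
# 2*2 positions x 2 scales x 3 ratios = 24 candidate frames
```

The RPN would then score each box for "contains a face wearing a mask" and regress the retained boxes into pre-selection frames.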
The target feature map and the pre-selection frame are fused through an ROI Align (Region of Interest Align) layer to obtain the labeling feature map. ROI Align is a region-of-interest matching method that solves the region mismatch caused by the two quantization steps in the ROI Pooling operation.
Then, through the FCN (Fully Convolutional Network), the pre-selection frame of the labeling feature map, the probability value of each pixel point, and the mask are obtained and output as the recognition result. The fully convolutional network replaces the conventional fully connected layer and is used to output a prediction result for the whole image.
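The heat-map step described in the claims (a first heat map of per-pixel predicted values, upsampled to a second heat map at sample-image size) can be illustrated with a minimal nearest-neighbour upsampler; a real FCN would typically use transposed convolution or bilinear interpolation instead.

```python
# Sketch: upsample a low-resolution per-pixel prediction map ("first heat
# map") to the sample-image size ("second heat map") by nearest-neighbour
# index mapping. Values are illustrative per-pixel probabilities.
def upsample(heatmap, out_h, out_w):
    in_h, in_w = len(heatmap), len(heatmap[0])
    return [
        [heatmap[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

second = upsample([[0.1, 0.9], [0.4, 0.6]], 4, 4)  # 2x2 -> 4x4
```

Each output pixel simply copies its nearest source pixel, which is enough to show how the prediction is brought back to image resolution.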
Finally, the parameters of the model are optimized according to the stochastic gradient descent method, thereby obtaining the recognition model.
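A minimal sketch of one stochastic-gradient-descent parameter update, theta ← theta − lr·grad. The learning rate and gradient values here are illustrative; the patent only states that a preset loss function and stochastic gradient descent are used.

```python
# One SGD step: each parameter moves against its gradient, scaled by the
# learning rate. In training this step repeats per mini-batch until the
# MASK R-CNN model converges.
def sgd_step(params, grads, lr=0.01):
    return [p - lr * g for p, g in zip(params, grads)]

params = [0.5, -0.3]
grads = [1.0, -2.0]
params = sgd_step(params, grads, lr=0.1)
# params is now approximately [0.4, -0.1]
```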
103. And acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result.
After the recognition model is obtained, the image to be detected is input into the recognition model. The recognition model extracts a target feature map through the target feature extraction network, the RPN network generates pre-selection frames on the target feature map, the ROI alignment layer generates a labeling feature map, and finally the labeling feature map is input into the FCN network to obtain the recognition result.
In the embodiment of the invention, facial images with a mask worn and facial images without a mask are taken as sample images and then labeled to obtain the corresponding label information. The sample images and label information are input into a preset MASK R-CNN model. The MASK R-CNN model is a target recognition model capable of instance segmentation, with higher accuracy and faster training speed than existing models. In addition, during training of the MASK R-CNN model, the eye feature map and the overall facial feature information of a sample image can be extracted, thereby improving recognition accuracy.
Referring to fig. 2, a second embodiment of a face mask recognition method according to an embodiment of the present invention includes:
201. Acquiring sample images, and labeling each sample image to obtain label information of each sample image;
202. inputting the sample image and corresponding label information into a preset MASK R-CNN model, wherein the MASK R-CNN model sequentially comprises: a target feature extraction network, an RPN network, an ROI alignment layer and an FCN network;
the target feature extraction network is used for extracting a target feature map of the sample image, wherein the target feature map comprises an eye feature map, a facial integral feature map and a fusion feature map;
the RPN network is used for generating a pre-selection frame corresponding to the target feature map;
the ROI alignment layer is used for dividing the pre-selection frame in the target feature map and fusing the pre-selection frame and the target feature map by endpoint pooling, so as to generate a labeling feature map;
the FCN network is used for predicting each pixel point of the labeling feature map to obtain a prediction result corresponding to the sample image;
203. extracting a facial integral feature map corresponding to the sample image through the target feature extraction network;
the present embodiment preferably employs ResNet as the target feature extraction network. In the ResNet network constructed in this embodiment, there are a total of 5 stages, and each stage performs convolution operation. As stage deepens, the depth of the obtained feature map becomes deeper. The resulting graphs for these five stages are designated C1, C2, C3, C4 and C5, respectively. stage 3 extracts a face overall feature map, and C3 is a face overall feature extraction map.
204. Extracting an eye feature map in the facial overall feature map based on a preset eye feature attention mechanism;
Stage 5 extracts the eye feature map of the face, so C5 is the eye feature map. Notably, stage 5 is combined with a preset eye attention mechanism when extracting the eye feature map.
In this embodiment, the eye attention mechanism is used to extract the eye features, which improves the accuracy of eye recognition and better localizes the face. This embodiment preferably employs a soft attention mechanism to build the eye feature attention mechanism.
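A minimal sketch of soft attention over feature-map positions: scores are softmax-normalised into weights that reweight the features, so high-scoring (eye-region) positions dominate. In the real network the scores would be learned; here they are supplied as input, and scalars stand in for feature vectors.

```python
import math

# Soft attention sketch: softmax the scores, then reweight the features.
# A learned scoring sub-network is assumed to have produced `scores`.
def soft_attention(features, scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [w * f for w, f in zip(weights, features)]

attended = soft_attention([1.0, 2.0, 3.0], [0.0, 0.0, 4.0])
# the third (high-score) position dominates the attended output
```

Because the weighting is differentiable ("soft"), it can be trained end to end with the rest of the network.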
205. Performing multi-level feature fusion on the eye feature map and the whole facial feature map to obtain a fusion feature map;
At the feature fusion layer, a 1x1 convolution is applied to C5 to form feature map P5. C4 likewise undergoes a convolution operation, and the resulting map is summed with P5 to generate P4; continuing in this way yields P2, P3, P4 and P5. The feature maps produced by each stage are input into the feature fusion layer. To suppress aliasing, the feature fusion layer preferably applies a 3x3 convolution to all input feature maps and then downsamples P5, thereby implementing multi-level feature fusion and generating P6, where P6 is the fusion feature map.
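The top-down fusion described above follows the general FPN pattern. A rough numpy sketch (the 1x1 and 3x3 convolutions are elided and replaced by identity/sum for brevity, and the 2x scale relationship between levels is an assumption):

```python
import numpy as np

def fuse_pyramid(c_maps):
    """Top-down fusion sketch: P5 comes from the deepest map, each lower P
    level sums an upsampled coarser P with its lateral C map, and P6 is a
    2x-downsampled P5. c_maps: dict level -> (H, W) array, {2: C2, ..., 5: C5}."""
    p = {5: c_maps[5]}                      # stand-in for the 1x1 conv on C5
    for lvl in (4, 3, 2):
        up = p[lvl + 1].repeat(2, axis=0).repeat(2, axis=1)  # 2x upsample
        p[lvl] = c_maps[lvl] + up           # lateral sum
    p[6] = p[5][::2, ::2]                   # downsample P5 -> P6
    return p

c = {lvl: np.ones((2 ** (7 - lvl),) * 2) for lvl in (2, 3, 4, 5)}
p = fuse_pyramid(c)
```

In a real network each `p[lvl]` would additionally pass through a 3x3 convolution, which is the anti-aliasing step the passage above refers to.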
206. Inputting the target feature map into the RPN network so as to generate a pre-selected frame corresponding to the target feature map through the RPN network according to preset anchor frame information;
The target feature map is input into the RPN (Region Proposal Network). The RPN acquires the preset anchor frame information, establishes anchor frames for each pixel point in the target feature map, and adjusts the anchor frames that contain a face wearing a mask, so that the framed range is more accurate.
207. Inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and carrying out segmentation and endpoint pooling on the pre-selection frame to obtain a labeling feature map;
the ROI alignment identifies the location of the candidate box on the target feature map, thereby fusing the candidate box and the target feature map. The ROI alignment uses bilinear interpolation to divide the pre-selected frame, and then the divided end points are maximally pooled. The bilinear interpolation value effectively avoids the accumulation of boundary non-integer factors in the segmentation process, so that the fusion precision of the two is higher.
208. Inputting the labeling feature map into the FCN network so as to predict each pixel point of the labeling feature map through the FCN network, obtaining a prediction result corresponding to the sample image and outputting the prediction result;
The FCN network classifies the image at the pixel level, thereby solving image segmentation at the semantic level. The FCN converts the fully connected layers of a conventional CNN into convolution layers; the map generated after the last convolution layer is called a heat map, and each pixel on the heat map carries the probability that the pixel belongs to a certain class.
If, for example, a pixel has an 80% probability of belonging to a mask-wearing face, the corresponding pixel of the image is color-marked, yielding the mask for that pixel. Pixels within the range of the pre-selected frame in the first heat map are classified and marked in this way, yielding the mask corresponding to the pre-selected frame.
The first heat map is then upsampled to enlarge it, obtaining a second heat map consistent with the sample image size.
Finally, the mask, the pre-selection frame and the second heat map are output as the prediction result corresponding to the sample image.
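Steps 207-208 reduce to: threshold the per-pixel probabilities into a mask, then upsample back toward the image size. A minimal sketch (the 0.5 threshold and nearest-neighbor upsampling are illustrative choices, not from the disclosure):

```python
import numpy as np

def mask_from_heatmap(heatmap, threshold=0.5, scale=2):
    """Label each pixel of a 'wearing-mask' probability heat map, then
    nearest-neighbor upsample the labels toward the image size.
    heatmap: (H, W) probabilities in [0, 1]."""
    mask = (heatmap >= threshold).astype(np.uint8)        # per-pixel class
    upsampled = mask.repeat(scale, axis=0).repeat(scale, axis=1)
    return mask, upsampled

hm = np.array([[0.8, 0.2], [0.3, 0.9]])
mask, up = mask_from_heatmap(hm)
```

A trained FCN would instead upsample the probability map with learned deconvolutions before thresholding; the thresholding logic is the same.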
209. Optimizing parameters of the MASK R-CNN model according to the prediction result and the label information until the MASK R-CNN model converges to obtain an identification model;
In the MASK R-CNN model, a loss value between the prediction result and the label information is calculated based on a preset loss function. The loss value is then passed back into the MASK R-CNN model by back propagation, and the parameters of each network are optimized according to a stochastic gradient descent method. If the model converges, the current model is stored as the identification model.
210. And acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result.
A common object detection model performs multi-level feature extraction. As the number of layers of the neural network increases, feature extraction is refined from low level to high level. For example, a low level extracts the contour of a person's face, while a high level extracts finer features such as the eyes and nose. However, as the network deepens, each layer may lose some information, and the losses eventually add up. The invention aims to combine a person's eye feature data and facial overall feature data to judge whether a mask is worn, so the feature map of the image is extracted in a multi-scale feature-fusion manner, which greatly improves recognition accuracy.
Referring to fig. 3, a third embodiment of a face wearing mask recognition method according to an embodiment of the present invention includes:
301. acquiring sample images, and labeling each sample image to obtain label information of each sample image;
302. inputting the sample image and corresponding label information into a preset MASK R-CNN model;
303. extracting a target feature map of the sample image through the target feature extraction network;
304. Acquiring preset anchor frame information through the RPN network, and generating candidate frames of each pixel point in the target feature map according to the anchor frame information;
anchor frame information such as size and number is preset. In this embodiment, the RPN network establishes nine anchor frames with different sizes for each pixel point in the feature map, and uses all the anchor frames as candidate frames.
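The nine anchors per pixel typically come from crossing a few scales with a few aspect ratios. An illustrative generator (the scale and ratio values are assumptions; the patent only fixes the count at nine):

```python
import numpy as np

def anchors_at(cx, cy, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Nine illustrative anchors (3 scales x 3 aspect ratios) centered on
    one feature-map pixel, returned as (x1, y1, x2, y2) boxes. Width and
    height are chosen so each box keeps the area scale**2."""
    boxes = []
    for s in scales:
        for r in ratios:
            w, h = s * np.sqrt(r), s / np.sqrt(r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(boxes)

a = anchors_at(100, 100)
```

Repeating this at every feature-map pixel gives the full candidate-frame set the RPN then scores.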
305. Judging whether the candidate frame contains a face for wearing a mask;
The RPN network is also provided with a softmax classifier, through which it can be judged whether the candidate frame contains a face wearing a mask.
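The softmax judgment amounts to turning a pair of logits into class probabilities and keeping the box when the masked-face probability dominates. A minimal sketch (the two-class layout and 0.5 cutoff are assumptions):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax: logits for (background, masked-face)
    become probabilities summing to 1."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

p = softmax(np.array([0.2, 1.4]))
keep = p[1] > 0.5  # treat the candidate frame as containing a masked face
```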
306. If yes, reserving the candidate frame, and adjusting the candidate frame to obtain a preselected frame of the target feature map;
if yes, the candidate box is saved, and the candidate box is input into a reshape layer and a proposal layer in the RPN network.
Since multiple candidate frames may overlap in the selected area, the overlap between candidate frames can be computed and redundant frames removed by non-maximum suppression. In this embodiment, an overlap threshold may be set in advance; candidate frames whose overlap with a higher-scoring frame exceeds the threshold are deleted, and the remaining candidate frames are retained. The coordinates of the retained candidate frames are then adjusted and regressed through a preset candidate-frame offset formula, so that the framed area is closer to the true range.
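Standard non-maximum suppression keeps the highest-scoring box and discards any candidate whose overlap with it exceeds the threshold, then repeats. A compact sketch (the 0.7 default threshold is illustrative):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.7):
    """Greedy NMS: keep the best-scoring box, drop candidates overlapping
    it beyond the threshold, repeat on the remainder."""
    order = np.argsort(scores)[::-1].tolist()
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]])
kept = nms(boxes, np.array([0.9, 0.8, 0.7]), thresh=0.5)
```

Here the second box overlaps the first heavily and is suppressed, while the distant third box survives.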
307. Inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and carrying out segmentation and endpoint pooling on the pre-selection frame to obtain a labeling feature map;
308. inputting the labeling feature map into the FCN network so as to predict each pixel point of the labeling feature map through the FCN network, obtaining a prediction result corresponding to the sample image and outputting the prediction result;
309. optimizing parameters of the MASK R-CNN model according to the prediction result and the label information until the MASK R-CNN model converges to obtain the identification model;
310. and acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result.
In this embodiment, the process by which the MASK R-CNN model generates the pre-selected frame of a sample image is described in detail. Because the softmax classifier is used for judgment, and the reshape layer and the proposal layer further adjust the candidate frames, the softmax classifier, the reshape layer and the proposal layer can all be trained during model training, thereby improving the accuracy with which the pre-selected frame selects the identification target.
Referring to fig. 4, a fourth embodiment of the facial mask recognition method according to the embodiment of the present invention includes:
401. acquiring sample images, and labeling each sample image to obtain label information of each sample image;
402. inputting the sample image and corresponding label information into a preset MASK R-CNN model;
403. extracting a target feature map of the sample image through the target feature extraction network;
404. inputting the target feature map into the RPN network so as to generate a pre-selected frame corresponding to the target feature map through the RPN network according to preset anchor frame information;
405. Inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and carrying out segmentation and endpoint pooling on the pre-selection frame to obtain a labeling feature map;
406. convolving the labeling feature map through the FCN network to generate a mask corresponding to the preselected frame and a first heat map containing predicted values corresponding to all pixel points;
The FCN network comprises a plurality of convolution layers. After the input labeling feature map is convolved and mapped several times, the resulting map becomes smaller and its resolution lower. The map generated after the last convolution layer is called a heat map, and each pixel on the heat map carries a predicted value, i.e. the probability that the pixel belongs to a certain class.
If, for example, a pixel has an 80% probability of wearing a mask and a 20% probability of not wearing one, the corresponding pixel of the image is color-marked, yielding the mask for that pixel. Pixels within the range of the pre-selected frame in the first heat map are classified and marked in this way, yielding the mask corresponding to the pre-selected frame.
407. Upsampling the first heat map to obtain a second heat map consistent with the size of the sample image;
That is, the first heat map is upsampled to enlarge it, obtaining a second heat map consistent with the sample image size.
408. Outputting the mask, the pre-selection frame and the second heat map as prediction results corresponding to the sample image;
409. optimizing parameters of the MASK R-CNN model according to the prediction result and the label information until the MASK R-CNN model converges to obtain the identification model;
410. and acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result.
In this embodiment, since the final recognition result includes the mask, the heat map and the pre-selected frame, the model can learn and train on the mask, the probability value of each pixel point and the pre-selected frame position, thereby improving the accuracy of the recognition result.
Referring to fig. 5, a fifth embodiment of a face mask recognition method according to an embodiment of the present invention includes:
501. acquiring sample images, and labeling each sample image to obtain label information of each sample image;
502. inputting the sample image and corresponding label information into a preset MASK R-CNN model;
503. extracting a target feature map of the sample image through the target feature extraction network;
504. inputting the target feature map into the RPN network so as to generate a pre-selected frame corresponding to the target feature map through the RPN network according to preset anchor frame information;
505. Inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and carrying out segmentation and endpoint pooling on the pre-selection frame to obtain a labeling feature map;
506. inputting the labeling feature map into the FCN network so as to predict each pixel point of the labeling feature map through the FCN network, obtaining a prediction result corresponding to the sample image and outputting the prediction result;
507. calculating a loss value between the prediction result and the tag information according to a preset loss function;
In the MASK R-CNN model, the loss function is L = L_cls + L_box + L_mask, where L_cls is the classification loss, i.e. the loss in the accuracy of pixel-level classification; L_box is the loss of the pre-selected frame, obtained by comparing the position coordinates of the pre-selected frame with the position coordinates of the frame in the label information; and L_mask is the loss of the mask. The loss value between the prediction result and the label information can be calculated by this preset loss function.
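The patent only states that the total loss is the sum of the three terms; the form of each term is not disclosed. A rough sketch using common stand-ins (cross-entropy for classification, mean absolute error for the box, per-pixel binary cross-entropy for the mask -- all illustrative):

```python
import numpy as np

def total_loss(p_cls, t_cls, pred_box, true_box, p_mask, t_mask):
    """Illustrative L = L_cls + L_box + L_mask.
    p_cls: class probabilities; t_cls: true class index;
    pred_box/true_box: (x1, y1, x2, y2); p_mask/t_mask: per-pixel arrays."""
    l_cls = -np.log(p_cls[t_cls] + 1e-9)              # cross-entropy
    l_box = np.abs(np.asarray(pred_box, float)
                   - np.asarray(true_box, float)).mean()
    l_mask = -np.mean(t_mask * np.log(p_mask + 1e-9)
                      + (1 - t_mask) * np.log(1 - p_mask + 1e-9))
    return l_cls + l_box + l_mask

loss = total_loss(np.array([0.1, 0.9]), 1,
                  [0, 0, 10, 10], [1, 1, 11, 11],
                  np.array([[0.9, 0.1]]), np.array([[1.0, 0.0]]))
```

Because the three terms are simply summed, gradients from the mask, box and classification heads all flow back through the shared backbone during training.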
508. Propagating the loss value back into the MASK R-CNN model, and optimizing parameters of the MASK R-CNN model according to a stochastic gradient descent method;
the loss value is transmitted back into the MASK R-CNN model through back propagation, and the parameters of each network are optimized according to the stochastic gradient descent method.
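The parameter update itself is the elementary SGD step: each parameter moves a small step against its back-propagated gradient. A minimal sketch (the learning rate is illustrative; real training would also use momentum, weight decay, and minibatching):

```python
import numpy as np

def sgd_step(params, grads, lr=0.01):
    """One stochastic-gradient-descent update over a dict of parameters."""
    return {k: params[k] - lr * grads[k] for k in params}

params = {"w": np.array([1.0, 2.0])}
grads = {"w": np.array([10.0, -10.0])}   # as produced by back propagation
params = sgd_step(params, grads)
```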
509. If the MASK R-CNN model converges, taking the current MASK R-CNN model as the identification model;
510. and acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result.
This embodiment specifically describes how the model learns and trains according to the prediction result once it is obtained. The invention mainly propagates the loss value back into the model and optimizes the parameters by stochastic gradient descent; since the calculated loss covers the mask, the pre-selected frame and the classification, this learning and training scheme can further optimize all three results, thereby improving recognition efficiency.
The method for identifying the face-worn mask in the embodiment of the present invention is described above, and the face-worn mask identifying device in the embodiment of the present invention is described below, referring to fig. 6, where one embodiment of the face-worn mask identifying device in the embodiment of the present invention includes:
the acquiring module 601 is configured to acquire sample images, and label each sample image to obtain label information of each sample image, where the sample images include a facial image of a mask worn and a facial image of a mask not worn;
the training module 602 is configured to input the sample image and the corresponding tag information into a preset MASK R-CNN model, so as to extract an eye feature map and a facial overall feature map of the sample image through the MASK R-CNN model for training, and obtain an identification model for identifying whether the MASK is worn on the face;
the recognition module 603 is configured to obtain an image to be detected, input the image to be detected into the recognition model for recognition, and output a recognition result.
In the embodiment of the invention, facial images of the worn mask and the unworn mask are taken as sample images, and then marked to obtain corresponding label information. And inputting the sample image and the label information into a preset MASK R-CNN model. The MASK R-CNN model is a target recognition model capable of realizing example segmentation, and has higher accuracy and faster training speed compared with the existing model. In addition, in the process of training the MASK R-CNN model, an eye feature image and overall facial feature information in a sample image can be extracted, so that the recognition accuracy is improved.
Referring to fig. 7, another embodiment of the facial mask recognition device according to the present invention includes:
the acquiring module 701 is configured to acquire sample images, and label each sample image to obtain label information of each sample image, where the sample image includes a facial image of a mask worn and a facial image of a mask not worn;
the training module 702 is configured to input the sample image and the corresponding tag information into a preset MASK R-CNN model, so as to extract an eye feature map and a whole facial feature map of the sample image through the MASK R-CNN model for training, and obtain an identification model for identifying whether the face is wearing a MASK;
the recognition module 703 is configured to obtain an image to be detected, input the image to be detected into the recognition model for recognition, and output a recognition result.
Optionally, the MASK R-CNN model includes a target feature extraction network, an RPN network, an ROI alignment layer, and an FCN network,
the target feature extraction network is used for extracting a target feature map of the sample image, wherein the target feature map comprises an eye feature map, a face integral feature map and a fusion feature map;
the RPN network is used for generating a pre-selection frame corresponding to the target feature map;
The ROI alignment layer is used for dividing a preselected frame in the target feature map and carrying out endpoint pooling to generate a labeling feature map;
and the FCN network is used for predicting each pixel point of the labeling feature map to obtain a prediction result corresponding to the sample image.
Wherein the training module 702 comprises:
an input unit 7021, configured to input the sample image and the corresponding label information into a preset MASK R-CNN model;
an extraction unit 7022 for extracting a target feature map of the sample image through the target feature extraction network;
a preselection frame unit 7023, configured to input the target feature map into the RPN network, so that a preselection frame corresponding to the target feature map is generated through the RPN network according to preset anchor frame information;
a processing unit 7024, configured to input the pre-selection frame and the target feature map into the ROI alignment layer, so that the pre-selection frame and the target feature map are fused through the ROI alignment layer, and the pre-selection frame is segmented and endpoint-pooled to obtain a labeling feature map;
the output unit 7025 is configured to input the labeling feature map into the FCN network, so that each pixel point of the labeling feature map is predicted by the FCN network, and a recognition result corresponding to the sample image is obtained and output;
And a convergence unit 7026, configured to optimize parameters of the MASK R-CNN model according to the identification result and the tag information until the MASK R-CNN model converges, so as to obtain an identification model.
Optionally, the preselection frame unit 7023 is specifically configured to:
acquiring preset anchor frame information through the RPN network, and generating candidate frames of each pixel point in the target feature map according to the anchor frame information;
judging whether the candidate frame contains a face for wearing a mask;
if yes, the candidate frame is reserved, and coordinates of the candidate frame are adjusted and regressed to obtain a preselected frame of the target feature map.
Optionally, the extracting unit 7022 is specifically configured to:
extracting a facial integral feature map corresponding to the sample image through the target feature extraction network;
extracting an eye feature map in the whole facial feature map based on a preset eye feature attention mechanism;
and adopting the feature fusion layer to perform multistage feature fusion on the eye feature map and the whole facial feature map to obtain a fusion feature map.
Optionally, the output unit 7025 is specifically configured to:
convoluting the marked feature map through an FCN network to generate a mask corresponding to the preselected frame and a first heat map containing predicted values corresponding to all pixel points;
Upsampling the first heat map to obtain a second heat map consistent with the size of the sample image;
and outputting the mask, the pre-selection frame and the second heat map as a prediction result corresponding to the sample image.
Optionally, the convergence unit 7026 is specifically configured to:
calculating a loss value between the initial classification result and the label information according to a preset loss function;
propagate the loss value back into the MASK R-CNN model, and optimize parameters of the MASK R-CNN model according to a stochastic gradient descent method;
and if the MASK R-CNN model converges, taking the current MASK R-CNN model as an identification model.
In the technical scheme provided by the invention, on the basis of the above embodiment, the MASK R-CNN model comprises a target feature extraction network, an RPN network, an ROI alignment layer and an FCN network. The target feature extraction network generates a fusion feature map that fuses the facial overall feature map and the eye feature map; this multi-level fusion of the two maps improves the accuracy of identifying whether a mask is worn on the face. The final recognition result comprises the mask, the heat map and the pre-selection frame, so the model can learn and train on the mask, the probability value of each pixel point and the pre-selection frame position, improving the accuracy of the mask, the classification and the pre-selection frame position in the recognition result.
The facial mask recognition device in the embodiment of the present invention is described in detail above from the point of view of modularized functional entities in fig. 6 and 7; the facial mask recognition apparatus in the embodiment of the present invention is described in detail below from the point of view of hardware processing.
Fig. 8 is a schematic structural diagram of a facial mask recognition device according to an embodiment of the present invention. The facial mask recognition device 800 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPU) 810 (e.g., one or more processors), a memory 820, and one or more storage media 840 (e.g., one or more mass storage devices) storing application programs 833 or data 832. The memory 820 and the storage medium 840 may be transitory or persistent storage. The program stored in the storage medium 840 may include one or more modules (not shown), each of which may include a series of instruction operations on the facial mask recognition device 800. Still further, the processor 810 may be configured to communicate with the storage medium 840 to execute a series of instruction operations in the storage medium 840 on the facial mask recognition device 800.
The facial mask recognition device 800 may also include one or more power supplies, one or more wired or wireless network interfaces 880, one or more input/output interfaces 880, and/or one or more operating systems 831, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be appreciated by those skilled in the art that the structure of the facial mask recognition device illustrated in fig. 8 is not limiting, and the device may include more or fewer components than illustrated, may combine certain components, or may arrange the components differently.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and which may also be a volatile computer readable storage medium, the computer readable storage medium having stored therein instructions that, when executed on a computer, cause the computer to perform the steps of the facial mask recognition method.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not described in detail herein.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (9)
1. The face wearing mask identification method is characterized by comprising the following steps:
obtaining sample images, and labeling each sample image to obtain label information of each sample image, wherein the sample images comprise facial images of a worn mask and facial images of a non-worn mask;
inputting the sample image and corresponding label information into a preset MASK R-CNN model, extracting an eye feature image and a facial integral feature image of the sample image through the MASK R-CNN model, and training to obtain an identification model for identifying whether a MASK is worn on the face;
acquiring an image to be detected, inputting the image to be detected into the recognition model for recognition, and outputting a recognition result;
the MASK R-CNN model sequentially comprises: a target feature extraction network, an RPN network, an ROI alignment layer and an FCN network;
the target feature extraction network is used for extracting a target feature map of the sample image, wherein the target feature map comprises an eye feature map, a facial integral feature map and a fusion feature map;
the RPN network is used for generating a pre-selection frame corresponding to the target feature map;
the ROI alignment layer is used for dividing the preselected frame in the target feature map and fusing the preselected frame with the target feature map by endpoint pooling, so as to generate a labeling feature map;
And the FCN network is used for predicting each pixel point of the labeling feature map to obtain a prediction result corresponding to the sample image.
2. The face mask recognition method according to claim 1, wherein the training method of the recognition model comprises:
inputting the sample image and corresponding label information into a preset MASK R-CNN model;
extracting a target feature map of the sample image through the target feature extraction network;
inputting the target feature map into the RPN network so as to generate a pre-selected frame corresponding to the target feature map through the RPN network according to preset anchor frame information;
inputting the pre-selection frame and the target feature map into the ROI alignment layer, so as to fuse the pre-selection frame and the target feature map through the ROI alignment layer, and carrying out segmentation and endpoint pooling on the pre-selection frame to obtain a labeling feature map;
inputting the labeling feature map into the FCN network so as to predict each pixel point of the labeling feature map through the FCN network, obtaining a prediction result corresponding to the sample image and outputting the prediction result;
and optimizing parameters of the MASK R-CNN model according to the prediction result and the label information until the MASK R-CNN model converges to obtain the identification model.
3. The method of claim 2, wherein the generating, by the RPN network, a pre-selected box corresponding to the target feature map according to preset anchor box information comprises:
acquiring preset anchor frame information through the RPN network, and generating candidate frames of each pixel point in the target feature map according to the anchor frame information;
judging whether the candidate frame contains a face for wearing a mask;
if yes, reserving the candidate frame, and adjusting the candidate frame to obtain a preselected frame of the target feature map.
4. The face mask recognition method of claim 2, wherein the extracting the target feature map of the sample image through the target feature extraction network comprises:
extracting a facial integral feature map corresponding to the sample image through the target feature extraction network;
extracting an eye feature map in the whole facial feature map based on a preset eye feature attention mechanism;
and carrying out multi-level feature fusion on the eye feature map and the whole facial feature map to obtain a fusion feature map.
5. The face recognition method according to claim 2, wherein predicting each pixel point of the labeling feature map through the FCN network, obtaining a prediction result corresponding to the sample image, and outputting the prediction result includes:
Convolving the labeling feature map through the FCN network to generate a mask corresponding to the preselected frame and a first heat map containing predicted values corresponding to all pixel points;
upsampling the first heat map to obtain a second heat map consistent with the size of the sample image;
and outputting the mask, the pre-selection frame and the second heat map as a prediction result corresponding to the sample image.
6. The method for identifying a MASK for face wearing according to claim 2, wherein optimizing parameters of the MASK R-CNN model according to the prediction result and the tag information until the MASK R-CNN model converges, obtaining the identification model includes:
calculating a loss value between the prediction result and the tag information according to a preset loss function;
propagating the loss value back into the MASK R-CNN model, and optimizing parameters of the MASK R-CNN model according to a stochastic gradient descent method;
and if the MASK R-CNN model converges, taking the current MASK R-CNN model as the identification model.
7. A facial mask recognition device, characterized in that the facial mask recognition device comprises:
The device comprises an acquisition module, a labeling module and a display module, wherein the acquisition module is used for acquiring sample images and labeling each sample image to obtain label information of each sample image, and the sample images comprise facial images of a worn mask and facial images of an unworn mask;
the training module is used for inputting the sample image and the corresponding label information into a preset MASK R-CNN model, so as to extract an eye feature map and a facial overall feature map of the sample image through the MASK R-CNN model for training, and obtain an identification model for identifying whether a mask is worn on the face;
an identification module, configured to acquire an image to be detected, input the image to be detected into the identification model for identification, and output an identification result;
the MASK R-CNN model comprises a target feature extraction network, an RPN network, a ROIAlign layer and an FCN network;
the target feature extraction network is used for extracting a target feature map of the sample image, wherein the target feature map comprises an eye feature map, a face integral feature map and a fusion feature map;
the RPN network is used for generating a pre-selection frame corresponding to the target feature map;
the ROIAlign layer is used for dividing the preselected frame in the target feature map and fusing the preselected frame and the target feature map in an endpoint pooling mode to generate the labeling feature map;
and the FCN network is used for predicting each pixel point of the labeling feature map to obtain a prediction result corresponding to the sample image.
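The data flow through the four sub-networks of claim 7 can be sketched as a simple pipeline. Each stage below is a placeholder callable standing in for the real sub-network; the class name, argument names and stand-in outputs are illustrative, not taken from the patent:

```python
class MaskRCNNPipeline:
    """Minimal sketch of the claim-7 data flow: feature extraction,
    region proposal, ROI alignment, then per-pixel prediction."""

    def __init__(self, feature_extractor, rpn, roi_align, fcn):
        self.feature_extractor = feature_extractor  # -> target feature map
        self.rpn = rpn                              # -> preselected frames
        self.roi_align = roi_align                  # -> labeling feature map
        self.fcn = fcn                              # -> per-pixel prediction

    def forward(self, sample_image):
        feature_map = self.feature_extractor(sample_image)
        boxes = self.rpn(feature_map)
        labeled = self.roi_align(feature_map, boxes)
        return self.fcn(labeled)

# Wire the stages with trivial stand-ins to show only the call order
pipeline = MaskRCNNPipeline(
    feature_extractor=lambda img: {"features": img},
    rpn=lambda fm: [(0, 0, 2, 2)],
    roi_align=lambda fm, boxes: {"fm": fm, "boxes": boxes},
    fcn=lambda labeled: {"prediction": labeled},
)
result = pipeline.forward("image")
```

The point of the sketch is the ordering: the RPN consumes the target feature map, ROIAlign fuses the proposals back into that feature map, and only the fused labeling feature map reaches the FCN.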
8. A facial mask recognition device, characterized in that the facial mask recognition device comprises: a memory and at least one processor, the memory having instructions stored therein, the memory and the at least one processor being interconnected by a line;
the at least one processor invoking the instructions in the memory to cause the facial mask recognition device to perform the facial mask recognition method of any one of claims 1-6.
9. A computer-readable storage medium having a computer program stored thereon, wherein the computer program when executed by a processor implements the facial mask recognition method of any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010194398.2A CN111428604B (en) | 2020-03-19 | 2020-03-19 | Facial mask recognition method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111428604A CN111428604A (en) | 2020-07-17 |
CN111428604B (en) | 2023-06-13
Family
ID=71548133
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010194398.2A Active CN111428604B (en) | 2020-03-19 | 2020-03-19 | Facial mask recognition method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111428604B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111739027B (en) * | 2020-07-24 | 2024-04-26 | 腾讯科技(深圳)有限公司 | Image processing method, device, equipment and readable storage medium |
CN112036245A (en) * | 2020-07-30 | 2020-12-04 | 拉扎斯网络科技(上海)有限公司 | Image detection method, information interaction method and device and electronic equipment |
CN111931623A (en) * | 2020-07-31 | 2020-11-13 | 南京工程学院 | Face mask wearing detection method based on deep learning |
CN111985374B (en) * | 2020-08-12 | 2022-11-15 | 汉王科技股份有限公司 | Face positioning method and device, electronic equipment and storage medium |
CN112001872B (en) | 2020-08-26 | 2021-09-14 | 北京字节跳动网络技术有限公司 | Information display method, device and storage medium |
CN112115818B (en) * | 2020-09-01 | 2022-03-11 | 燕山大学 | Mask wearing identification method |
CN112052789B (en) * | 2020-09-03 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Face recognition method and device, electronic equipment and storage medium |
CN112183461A (en) * | 2020-10-21 | 2021-01-05 | 广州市晶华精密光学股份有限公司 | Vehicle interior monitoring method, device, equipment and storage medium |
CN112597867B (en) * | 2020-12-17 | 2024-04-26 | 佛山科学技术学院 | Face recognition method and system for wearing mask, computer equipment and storage medium |
CN112560756A (en) * | 2020-12-24 | 2021-03-26 | 北京嘀嘀无限科技发展有限公司 | Method, device, electronic equipment and storage medium for recognizing human face |
US11436881B2 (en) | 2021-01-19 | 2022-09-06 | Rockwell Collins, Inc. | System and method for automated face mask, temperature, and social distancing detection |
CN113239739B (en) * | 2021-04-19 | 2023-08-01 | 深圳市安思疆科技有限公司 | Wearing article identification method and device |
CN114663966B (en) * | 2022-05-25 | 2023-06-16 | 深圳市博德致远生物技术有限公司 | Information acquisition management method and related device based on artificial intelligence |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011118588A (en) * | 2009-12-02 | 2011-06-16 | Honda Motor Co Ltd | Mask wearing determination apparatus |
CN109101923A (en) * | 2018-08-14 | 2018-12-28 | 罗普特(厦门)科技集团有限公司 | A kind of personnel wear the detection method and device of mask situation |
CN109902584A (en) * | 2019-01-28 | 2019-06-18 | 深圳大学 | A kind of recognition methods, device, equipment and the storage medium of mask defect |
Non-Patent Citations (1)
Title |
---|
Deng Huangxiao. Mask-wearing detection method based on transfer learning and RetinaNet. Electronic Technology &amp; Software Engineering, 2020, (05), full text. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111428604B (en) | Facial mask recognition method, device, equipment and storage medium | |
US11188783B2 (en) | Reverse neural network for object re-identification | |
WO2021212659A1 (en) | Video data processing method and apparatus, and computer device and storage medium | |
CN106548127B (en) | Image recognition method | |
JP6330385B2 (en) | Image processing apparatus, image processing method, and program | |
CN109858372B (en) | Lane-level precision automatic driving structured data analysis method | |
CN103136504B (en) | Face identification method and device | |
CN109242869A (en) | A kind of image instance dividing method, device, equipment and storage medium | |
CN113361495B (en) | Method, device, equipment and storage medium for calculating similarity of face images | |
WO2020062360A1 (en) | Image fusion classification method and apparatus | |
KR20200060194A (en) | Method of predicting depth values of lines, method of outputting 3d lines and apparatus thereof | |
JPWO2015025704A1 (en) | Video processing apparatus, video processing method, and video processing program | |
CN104915642B (en) | Front vehicles distance measuring method and device | |
JP2005190400A (en) | Face image detection method, system, and program | |
CN117157678A (en) | Method and system for graph-based panorama segmentation | |
CN117392733B (en) | Acne grading detection method and device, electronic equipment and storage medium | |
CN112966618A (en) | Dressing identification method, device, equipment and computer readable medium | |
CN112669343A (en) | Zhuang minority nationality clothing segmentation method based on deep learning | |
CN110969642B (en) | Video filtering method and device, electronic equipment and storage medium | |
CN113705294A (en) | Image identification method and device based on artificial intelligence | |
WO2023279799A1 (en) | Object identification method and apparatus, and electronic system | |
JP2019109843A (en) | Classification device, classification method, attribute recognition device, and machine learning device | |
CN112766176B (en) | Training method of lightweight convolutional neural network and face attribute recognition method | |
CN113221667A (en) | Face and mask attribute classification method and system based on deep learning | |
Agunbiade et al. | Enhancement performance of road recognition system of autonomous robots in shadow scenario |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||