CN115527254A - Face recognition method, model training method, face recognition device, model training device, electronic equipment and storage medium - Google Patents

Face recognition method, model training method, face recognition device, model training device, electronic equipment and storage medium

Info

Publication number
CN115527254A
Authority
CN
China
Prior art keywords
mask
face
face recognition
feature
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211154023.9A
Other languages
Chinese (zh)
Other versions
CN115527254B (en)
Inventor
魏梦
随海亮
赵欲苗
陈智超
户磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Dilusense Technology Co Ltd
Original Assignee
Beijing Dilusense Technology Co Ltd
Hefei Dilusense Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dilusense Technology Co Ltd, Hefei Dilusense Technology Co Ltd filed Critical Beijing Dilusense Technology Co Ltd
Priority to CN202211154023.9A priority Critical patent/CN115527254B/en
Publication of CN115527254A publication Critical patent/CN115527254A/en
Application granted granted Critical
Publication of CN115527254B publication Critical patent/CN115527254B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention relate to the field of artificial intelligence and disclose a face recognition method, a model training method, a face recognition device, a model training device, an electronic device and a storage medium. The model training method comprises the following steps: acquiring a face feature map of an input image, wherein the input image comprises mask-wearing face images and non-mask-wearing face images; generating a mask feature mask in which the features of the mask coverage area are set to zero, according to the face feature map and the mask coverage area position information of the input image, the mask coverage area position information being determined based on a face key point detection result; acquiring a feature vector of the input image according to the mask feature mask and the face feature map; and acquiring a loss value of the face recognition result according to the feature vector and a preset loss function, and updating the parameters of the face recognition model according to the loss value. When a mask-wearing face image is recognized, the generated mask feature mask reduces the contribution of the facial features in the mask coverage area, so that face recognition of mask-wearing face images is completed accurately.

Description

Face recognition method, model training method, face recognition device, model training device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the field of artificial intelligence, in particular to a face recognition method, a model training method, a face recognition device, a model training device, electronic equipment and a storage medium.
Background
Face recognition systems in wide use today have excellent recognition capability in cooperative scenarios, but still face challenges in complex real-world applications. An important factor is face occlusion: when a face region is occluded, the visible information of the face is reduced, and the recognition capability of a general-purpose face recognition system without a dedicated design drops sharply. In particular, mask wearing has become common in recent years, and improving the recognition of mask-wearing faces is a pressing problem in the face recognition field.
For recognition of mask-wearing faces there are generally two lines of solution. One idea is to recover the occluded face region with a face generation model before recognition. This approach depends on the accuracy of the generation model in restoring the occluded area; face generation is itself a difficult problem, and the generated face easily loses the identity information of the original face, so the recognition accuracy is insufficient and the algorithm is relatively complex.
The other idea is to discard the features corrupted by mask occlusion and perform face recognition using only the valid features of the visible part of the face. This idea can in turn be divided into two methods. One performs occlusion detection before face recognition, judging whether the face region is occluded by a mask and segmenting the occluded face. The other adds a feature mask generator to the face recognition model, predefines several rectangular feature masks, adds feature mask labels to the training samples, and uses them for supervised training of the generator and the recognition model. However, the feature masks are limited to the predefined mask types and cover only a limited range of occlusion cases, so face recognition is easily disturbed and the recognition accuracy is unstable; moreover, the feature mask generator and the face recognition model need both separate and joint training stages, and parameter tuning during training is difficult.
Disclosure of Invention
An object of some embodiments of the present application is to provide a face recognition method, a model training method, corresponding apparatuses, an electronic device and a storage medium that reduce the difficulty and cost of training a face recognition model, such that when the trained model receives a mask-wearing face image, it can generate a mask feature mask that reduces the contribution of the facial features in the mask coverage area during recognition, thereby accurately completing face recognition of mask-wearing face images.
In order to solve the above technical problem, an embodiment of the present application provides a face recognition model training method, comprising: acquiring a face feature map of an input image, wherein the input image comprises mask-wearing face images and non-mask-wearing face images; generating a mask feature mask in which the features of the mask coverage area are set to zero, according to the face feature map and the mask coverage area position information of the input image, the mask coverage area position information being determined based on a face key point detection result; acquiring a feature vector of the input image according to the mask feature mask and the face feature map; and acquiring a loss value of the face recognition result according to the feature vector and a preset loss function, and updating the parameters of the face recognition model according to the loss value.
In order to solve the above technical problem, an embodiment of the present application further provides a face recognition method, comprising: obtaining a trained face recognition model according to the above face recognition model training method; and acquiring a face image to be recognized and feeding it to the face recognition model as input to obtain a face recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides a face recognition model training apparatus, comprising: an acquisition module for acquiring a face feature map of an input image, the input image comprising mask-wearing and non-mask-wearing face images; a generating module for generating a mask feature mask with the mask coverage area features set to zero according to the face feature map and the mask coverage area position information of the input image, the mask coverage area position information being determined based on a face key point detection result; an extraction module for acquiring a feature vector of the input image according to the mask feature mask and the face feature map; and a training module for acquiring a loss value of the face recognition result according to the feature vector and a preset loss function, and updating the parameters of the face recognition model according to the loss value.
In order to solve the above technical problem, an embodiment of the present application further provides a face recognition apparatus, comprising: an acquisition module for obtaining a trained face recognition model according to the above face recognition model training method; and a recognition module for acquiring a face image to be recognized and feeding it to the face recognition model as input to obtain a face recognition result.
In order to solve the above technical problem, an embodiment of the present application further provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the above-mentioned face recognition model training method or face recognition method.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the above face recognition model training method or the face recognition method.
According to the face recognition model training method provided by the embodiments of the present application, during training, feature extraction is performed on input mask-wearing and non-mask-wearing face images to obtain a face feature map; a mask feature mask with the mask coverage area features set to zero is then generated from the face feature map and the mask coverage area position information determined from the face key point detection result, and a feature vector of the input image is obtained from the mask feature mask and the face feature map. A loss value of the face recognition result is then obtained from the feature vector and a preset loss function, and the model parameters are updated based on the loss value. Because the mask feature mask is generated using mask coverage area position information determined from the face key point detection result, no mask labels need to be added for supervised training, which reduces the difficulty of parameter tuning and the training cost. Because the feature vectors used for training are obtained from the mask feature mask and the face feature map, the trained model, when given a mask-wearing face image, can directly generate a mask feature mask that keeps the contribution of the mask coverage area features small during recognition, and thus recognize mask-wearing faces accurately.
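The training flow just summarized — zero mask-covered features, pool them into a feature vector, and compute a loss value from the vector — can be sketched in miniature. The mean-pooling and the softmax cross-entropy here are illustrative stand-ins chosen for this sketch; the patent does not fix the pooling operator or the "preset loss function".

```python
import math

def masked_feature_vector(feature_map, mask_region):
    """Zero mask-covered cells, then mean-pool each channel into one value.
    feature_map: list of C channels, each an HxW grid (lists of lists).
    mask_region: HxW grid of 0/1, 1 where the mask covers the face."""
    vec = []
    for channel in feature_map:
        total, count = 0.0, 0
        for i, row in enumerate(channel):
            for j, v in enumerate(row):
                # Masked cells contribute 0 but still enter the mean,
                # mirroring element-wise multiplication by a 0/1 mask.
                total += 0.0 if mask_region[i][j] else v
                count += 1
        vec.append(total / count)
    return vec

def softmax_cross_entropy(logits, label):
    """Toy stand-in for the patent's unspecified 'preset loss function'."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return -math.log(exps[label] / sum(exps))
```

A training step would compute the masked feature vector, score it against the identity classes, take the cross-entropy as the loss value, and update the model parameters from that loss.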
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals refer to similar elements and which are not drawn to scale unless otherwise specified.
FIG. 1 is a flowchart of a face recognition model training method provided in an embodiment of the present application;
FIG. 2 is a diagram illustrating a key point detection result in an embodiment of the present application;
FIG. 3 is a schematic view of a mask-wearing face image and the mask covering area in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a feature extraction network model in an embodiment of the present application;
FIG. 5 is a schematic diagram of the model operation in an embodiment of the present application;
FIG. 6 is a flowchart of a face recognition method according to another embodiment of the present application;
FIG. 7 is a schematic structural diagram of a face recognition model training apparatus according to another embodiment of the present application;
FIG. 8 is a schematic structural diagram of a face recognition apparatus according to another embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to another embodiment of the present application.
Detailed Description
As is known from the background art, in the existing mask-feature-mask-based training of mask-wearing face recognition models, parameter tuning is difficult, the supervised training is limited by self-defined mask types, the trained model's face recognition is easily disturbed, and the recognition accuracy is unstable. How to train, simply and efficiently, a face recognition model that recognizes mask-wearing face images accurately is therefore an urgent problem.
In order to train, efficiently and accurately, a face recognition model that handles mask-wearing face images well, an embodiment of the invention provides a face recognition model training method comprising the following steps: acquiring a face feature map of an input image, wherein the input image comprises mask-wearing and non-mask-wearing face images; generating a mask feature mask with the mask coverage area features set to zero according to the face feature map and the mask coverage area position information of the input image, the mask coverage area position information being determined based on a face key point detection result; acquiring a feature vector of the input image according to the mask feature mask and the face feature map; and obtaining a loss value of the face recognition result according to the feature vector and a preset loss function, and updating the parameters of the face recognition model according to the loss value.
According to the face recognition model training method provided by the embodiments of the present application, during training, feature extraction is performed on input mask-wearing and non-mask-wearing face images to obtain a face feature map; a mask feature mask with the mask coverage area features set to zero is then generated from the face feature map and the mask coverage area position information determined from the face key point detection result, and a feature vector of the input image is obtained from the mask feature mask and the face feature map. A loss value of the face recognition result is then obtained from the feature vector and a preset loss function, and the model parameters are updated based on the loss value. Because the mask feature mask is generated using mask coverage area position information determined from the face key point detection result, no mask labels need to be added for supervised training, which reduces the difficulty of parameter tuning and the training cost. Because the feature vectors used for training are obtained from the mask feature mask and the face feature map, the trained model, when given a mask-wearing face image, can directly generate a mask feature mask that keeps the contribution of the mask coverage area features small during recognition, and thus recognize mask-wearing faces accurately.
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in these embodiments to help the reader understand the present application; the technical solution claimed can, however, be implemented without these details, and with various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description only, should not limit the specific implementation of the invention, and the embodiments may be combined and cross-referenced where not contradictory.
Implementation details of the face recognition model training method described in the present application are set out below with reference to specific embodiments; the following description is provided only to aid understanding and is not necessary for implementing the present invention.
In a specific application, the face recognition model training method may run on a terminal capable of data interaction and computation, such as an electronic device like a computer or a mobile phone. This embodiment takes a computer as an example; the face recognition model training process, with reference to fig. 1, comprises the following steps:
step 101, acquiring a face feature map of an input image, wherein the input image comprises a mask wearing face image and a mask not wearing face image.
Specifically, during training the computer obtains the face recognition model to be trained, together with the training samples used for training, namely a number of mask-wearing face images and a number of non-mask-wearing face images, from a preset storage address or local storage space. The training samples are then fed one by one into the face recognition model to be trained, and a feature extraction network in the model extracts features from the input image to obtain its face feature map. The training samples may be obtained by reading pre-stored processed images, or generated in real time by obtaining unprocessed images and processing them.
In one example, the non-mask-wearing face images used in training may be obtained as follows: detect face key points in a number of non-mask-wearing face images according to a preset face detection algorithm; then, for each image, correct the face key points according to the detection result, generating the non-mask-wearing face images.
Specifically, a number of unprocessed non-mask-wearing face images may be collected in advance or acquired in real time. Then, for each image, the face region and the face key points, including the five positions of the left and right eyes, the nose tip and the left and right mouth corners, are detected according to a preset face detection algorithm, yielding a face key point detection result that includes the face contour. The face key points are then corrected according to the detection result, and the corrected image is used as a non-mask-wearing face image in the training samples. Correcting the key points according to the detection result ensures the accuracy of the facial features in the input images used during training, improving training efficiency and effect.
After key point correction, several specific key points are chosen in the corrected detection result according to the key points contained in a template and aligned with the corresponding template key points; the image is then normalized to obtain a face crop of a fixed size, which is used as the non-mask-wearing face image. Processing the input images to a fixed size improves training efficiency.
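The alignment step above — matching selected key points to a template and then normalizing — is commonly realized as a least-squares similarity transform. Below is a minimal pure-Python sketch using complex arithmetic; the template points and the choice of a similarity (rather than affine) transform are assumptions for illustration, not details given in the patent.

```python
def similarity_align(src_pts, dst_pts):
    """Least-squares similarity transform (scale + rotation + translation)
    mapping src_pts onto dst_pts. Points are (x, y) tuples; internally each
    point is treated as a complex number, so the transform is z' = s*z + t."""
    p = [complex(x, y) for x, y in src_pts]
    q = [complex(x, y) for x, y in dst_pts]
    mp = sum(p) / len(p)                    # centroids
    mq = sum(q) / len(q)
    pc = [z - mp for z in p]                # centered point sets
    qc = [z - mq for z in q]
    s = sum(a.conjugate() * b for a, b in zip(pc, qc)) \
        / sum(abs(a) ** 2 for a in pc)      # closed-form scale+rotation
    t = mq - s * mp                         # translation
    return s, t

def apply_transform(s, t, pts):
    """Warp points with the fitted transform z' = s*z + t."""
    return [((s * complex(x, y) + t).real, (s * complex(x, y) + t).imag)
            for x, y in pts]
```

In practice the fitted transform would be applied to the whole image (e.g. with an image-warping library) rather than to points alone; the point version keeps the math visible.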
In another example, mask-wearing face images may be obtained as follows: select a number of the non-mask-wearing face images; then, for each selected image, add a simulated mask according to a preset simulated-mask adding algorithm, generating a number of mask-wearing face images.
Specifically, because directly collecting mask-wearing face images is costly and difficult, a number of images can be selected from the generated non-mask-wearing face images instead. For each selected image, a template of the simulated mask is obtained according to the preset simulated-mask adding algorithm, the simulated mask is added to the image according to the template, and the position information of the simulated mask's coverage area is determined from the face key point detection result and stored. Adding simulated masks reduces the cost of acquiring mask-wearing face images.
It should be mentioned that when selecting images and adding simulated masks, the images may be selected randomly, at fixed intervals, by image number, and so on, and the simulated mask may be added by three-dimensional modeling, AR simulation, and so on, with the region covered by the simulated mask defined accordingly. This embodiment does not limit the specific manner of image selection or simulated-mask addition.
In another example, the preset face detection algorithm is any one of the following: a 68-point, 98-point or 128-point key point detection algorithm.
Specifically, according to the accuracy requirements on the trained face recognition model, a suitable face detection algorithm is selected when generating the input images for training, adjusting the feature accuracy of the input images and improving the applicability of the training method.
For example, when a 68-key-point detection algorithm is used to detect and correct unprocessed images into training inputs, the face region and the five key points, namely the left eye, right eye, nose tip, left mouth corner and right mouth corner, are detected according to the 68-key-point algorithm, yielding face contour key points covering the face outline, eyebrows, eyes, nose and mouth; the detection result can refer to fig. 2. The face image is then cropped as needed, and the 5 key points are corrected against the 68 key points to obtain more accurate positions. The corrected 5 points are aligned with the template's 5 points, the image is normalized, and the resulting fixed-size face crop is used as the non-mask-wearing face image. A number of these images are then selected, simulated masks are added according to the preset algorithm to generate mask-wearing face images, and the position information of the simulated mask's coverage area is determined from the face key point detection result. Referring to fig. 3, in the mask-wearing face image with the simulated mask added, the mask covering area may be represented by the region enclosed by the black outline.
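The simulated-mask step can be illustrated with a toy rasterizer: take the key points selected as the mask boundary, fill the enclosed polygon on the image grid, and record the covered cells as the coverage-area position information. The ray-casting test and the flat fill value are illustrative stand-ins; the patent's actual adding algorithms (3-D modeling, AR simulation) are far richer.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting point-in-polygon test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge crosses the ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def add_simulated_mask(image, boundary_pts, fill=255):
    """Paint the polygon enclosed by the boundary key points onto a 2-D grid
    (a toy stand-in for the 'simulated mask adding algorithm') and return the
    covered cells as the mask coverage area position information."""
    h, w = len(image), len(image[0])
    covered = []
    for i in range(h):
        for j in range(w):
            if point_in_polygon(j + 0.5, i + 0.5, boundary_pts):
                image[i][j] = fill
                covered.append((i, j))
    return image, covered
```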
Step 102: generate a mask feature mask with the mask coverage area features set to zero, according to the face feature map and the mask coverage area position information of the input image; the mask coverage area position information is determined based on the face key point detection result.
Specifically, after the face feature map of the input image is acquired, the mask coverage area position information of the input image is read. This information is determined based on the face key point detection result: for a mask-wearing image it is determined from the simulated mask boundary and the key point detection result, while for a non-mask-wearing image it is 0 or the empty set. A mask feature mask with the mask coverage area features set to zero is then generated from the face feature map and this position information.
In one example, generating the mask feature mask with the mask coverage area features set to zero comprises: obtaining the weight corresponding to each feature value in the face feature map according to a preset function; and, according to the mask coverage area position information, setting the weights of the feature values in the mask coverage area to zero to generate the mask feature mask.
Specifically, to generate the mask feature mask, a masker is added to the face recognition model in advance. Its inputs are the face feature map extracted by the recognition model and the mask coverage area position information of the input image. After the face feature map is acquired, the masker performs convolution on it with convolution layers matching the number of channels of the feature map, obtaining the weight of each feature. Then, based on the mask coverage area position information of the input image, for example the 68-key-point area position information, the weights of the feature values in the mask coverage area are set to zero, yielding the mask feature mask of the face feature map. Zeroing these weights according to the mask coverage area position information ensures that the generated mask feature mask accurately reduces the contribution of the mask-covered features during face recognition.
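The masker described above can be sketched as follows. Here a sigmoid of the per-cell channel mean stands in for the learned per-channel convolution that produces the weights — an assumption made purely for illustration; the zeroing of the mask coverage area follows the text.

```python
import math

def generate_mask_feature_mask(feature_map, mask_cells):
    """Produce an HxW weight mask for a CxHxW feature map.
    feature_map: list of C channels, each an HxW grid (lists of lists).
    mask_cells: set of (i, j) cells inside the mask coverage area."""
    c = len(feature_map)
    h, w = len(feature_map[0]), len(feature_map[0][0])
    mask = []
    for i in range(h):
        row = []
        for j in range(w):
            if (i, j) in mask_cells:
                row.append(0.0)  # force mask-covered weights to zero
            else:
                mean = sum(feature_map[k][i][j] for k in range(c)) / c
                # Sigmoid of the channel mean: a hypothetical stand-in for
                # the weight the learned convolution would output here.
                row.append(1.0 / (1.0 + math.exp(-mean)))
        mask.append(row)
    return mask
```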
In another example, the mask coverage area position information may be obtained as follows: select a group of face key points as the boundary of the mask coverage area according to the face key point detection result and a preset selection rule; and use the codes and positions of the selected key points as the mask coverage area position information.
Specifically, when obtaining the mask coverage area position information, a group of face key points is selected, according to the face key point detection result and the simulated-mask addition algorithm, as the boundary of the area covered after the simulated mask is added; the codes and position information of the selected key points are then used as the mask coverage area position information. Selecting a group of face key points as the mask boundary based on the key point detection result yields a mask feature mask that better fits the shape of the face, requires no additional labels for supervised training, avoids the cost of collecting mask labels, and sidesteps the convergence difficulty that arises when a masker composited with a face recognition model lacks pretrained weight parameters.
It is worth mentioning that the key points can be selected according to the scene the simulated mask corresponds to, so that various mask coverage areas fitting different face shapes can be combined; this embodiment does not limit the specific selection of key points.
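As a rough illustration of turning selected key points into a coverage area, the following sketch treats the chosen landmarks as a polygon boundary and rasterizes it with a ray-casting test; the landmark coordinates, index list, and square boundary are all hypothetical:

```python
import numpy as np

def point_in_polygon(px, py, poly):
    """Ray-casting point-in-polygon test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def mask_region_from_keypoints(keypoints, indices, h, w):
    """keypoints: (N, 2) landmark coordinates; indices: the landmark ids
    chosen as the mask boundary (the selection rule itself is assumed)."""
    poly = [tuple(keypoints[i]) for i in indices]
    region = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            region[y, x] = point_in_polygon(x, y, poly)
    return region

kps = np.array([[1, 1], [6, 1], [6, 6], [1, 6]])  # hypothetical boundary landmarks
region = mask_region_from_keypoints(kps, [0, 1, 2, 3], h=8, w=8)
```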
In another example, generating a mask feature mask in which the mask coverage area features are set to zero, according to the face feature map and the mask coverage area position information of the input image, includes: acquiring the original face feature map of the ith layer output by the feature extraction network, where i is a positive integer; obtaining a weight corresponding to each feature value in the original face feature map of the ith layer according to a preset function; and setting the weights of the feature values in the mask coverage area to zero according to the mask coverage area position information, to generate the mask feature mask of the ith layer. Acquiring the face feature map of the input image then includes: multiplying the original face feature map of the ith layer by the mask feature mask of the ith layer channel by channel, and taking the product as the face feature map of the ith layer.
Specifically, the face recognition model to be trained is constructed by adding a mask masker on top of a convolutional neural network model capable of updating its parameters by back-propagating the loss gradient; for example, it can be built on a VGG-Net deep convolutional neural network, a ResNet residual convolutional neural network, or a DenseNet convolutional neural network. Since the feature extraction network in the face recognition model comprises multiple feature extraction layers, a mask masker can be added after the output of each layer. After the feature extraction network outputs the original face feature map of the ith layer, the mask masker acquires it, computes the weight corresponding to each feature value according to a preset function, and then sets the weights of the feature values in the mask coverage area to zero according to the mask coverage area position information, generating the mask feature mask of the ith layer. The original face feature map of the ith layer is then multiplied by the mask feature mask of the ith layer channel by channel, and the product is taken as the face feature map of the ith layer for subsequent use. The output of every layer of the feature extraction network is processed in the same way, until a one-dimensional face feature vector processed by the mask feature masks is finally output.
By adding a mask masker after each feature layer and taking the channel-wise product of the mask feature mask output by the masker and the original face feature map of the ith layer as the face feature map of that layer, interference from the degraded features of the mask-occluded area is avoided as much as possible during feature extraction.
In another example, setting the feature values of the mask coverage area to zero according to the mask coverage area position information includes: acquiring the feature size of the original face feature map of the ith layer; scaling the mask coverage area position information according to that feature size; and setting the weights of the feature values in the mask coverage area of the original face feature map of the ith layer to zero according to the scaled mask coverage area position information.
Specifically, the feature maps output by the different layers of the feature extraction network have different sizes. When generating the mask feature mask of the ith layer, the feature size of the original face feature map of the ith layer is first acquired, the mask coverage area position information is scaled according to the ratio of that feature size to the original image size, and the weights of the feature values in the mask coverage area of the original face feature map of the ith layer are then set to zero according to the scaled position information. Scaling the mask coverage area position information to match the feature size of the ith layer ensures that the mask feature mask is generated accurately, so that the influence of the degraded mask-area features is eliminated without losing features of the visible face area.
For example, taking ResNet-50 as the base model: its feature extraction network comprises 5 stages, and a mask masker is added to each of stages 2 to 5; a schematic diagram of the feature extraction network after adding the maskers is shown in fig. 4, where the scaling of the mask coverage area position information is depicted in picture form. The operating principle of the model with maskers can be seen in fig. 5. Taking the feature map generation of the second stage as an example: the original face feature map of layer 2 output by the feature extraction network has size H × W × C, and the initial mask feature mask generated by the masker through its convolution layer likewise has spatial size H × W. The masker then scales the mask coverage area position information according to the ratio between its original resolution and the current feature map size, and sets the weights of the feature values in the mask coverage area of the initial mask feature mask to zero according to the scaled position information, obtaining the layer-2 mask feature mask. This mask feature mask is multiplied channel by channel with the input layer-2 original face feature map, and the product is taken as the layer-2 face feature map. Layers 3 to 5 are processed similarly, until a one-dimensional face feature vector processed by the mask feature masks is finally generated for subsequent face recognition.
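A minimal sketch of the stage-wise operation — scaling the coverage map down to the current feature size, zeroing the covered weights, and multiplying channel by channel; the nearest-neighbour resize and the toy sizes are assumptions for illustration:

```python
import numpy as np

def resize_nearest(region, h, w):
    """Nearest-neighbour resize of a boolean mask coverage map."""
    H, W = region.shape
    ys = np.arange(h) * H // h
    xs = np.arange(w) * W // w
    return region[np.ix_(ys, xs)]

def apply_masker(feature_map, full_region, init_mask):
    """feature_map: (h, w, c) stage features; full_region: (H, W) bool at
    input resolution; init_mask: (h, w, c) initial weights in (0, 1)."""
    h, w, _ = feature_map.shape
    region = resize_nearest(full_region, h, w)   # scale coverage to feature size
    mask = init_mask.copy()
    mask[region] = 0.0                           # zero weights in the covered area
    return feature_map * mask                    # channel-wise multiplication

full = np.zeros((8, 8), dtype=bool)
full[4:, :] = True                               # lower half of the face covered
fmap = np.ones((4, 4, 2))
init = np.full((4, 4, 2), 0.5)
out = apply_masker(fmap, full, init)
```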
In another example, obtaining the weight corresponding to each feature value in the face feature map according to a preset function includes: normalizing the feature value of each point in the mask feature mask into the interval (0, 1) according to the preset function.
Specifically, in the process of generating the mask feature mask from the acquired face feature map, after generating the initial mask feature mask through the convolution calculation, the masker may normalize each feature weight in the initial mask feature mask according to a preset function, for example a sigmoid function, mapping the feature values to continuous values in the interval (0, 1). Because the weights are continuous values in (0, 1) rather than hard 0/1 decisions, the trained model weakens the feature contribution of the mask-occluded area without completely discarding it, which avoids information loss when the occluded area is estimated with error. Moreover, as network depth increases and the mask feature mask operation is applied repeatedly, the feature values of the truly occluded area become smaller and smaller, so positions misjudged at early layers are not irrecoverably removed, which strengthens feature robustness.
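The sigmoid normalization can be illustrated directly; note that the outputs stay strictly inside (0, 1), which is what prevents any feature from being removed outright, while repeated application across layers shrinks occluded-area weights further:

```python
import numpy as np

def sigmoid(x):
    # preset normalization: maps any activation to a weight in the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

vals = np.array([-5.0, 0.0, 5.0])
w = sigmoid(vals)
# stacking the mask operation across layers only shrinks the weights further
stacked = w * w
```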
Step 103, acquiring a feature vector of the input image according to the mask feature mask and the face feature map.
Specifically, the face recognition model to be trained acquires, layer by layer, the face feature map of the input image and the generated mask feature mask, multiplies them channel by channel, and takes the product as the face feature map of each layer. Feature extraction and mask feature mask processing then proceed layer by layer until a one-dimensional face feature vector corresponding to the input image is obtained.
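The layer-by-layer flow of step 103 can be caricatured as below; it is purely illustrative — real layers would be convolutions feeding into one another, and the masks would come from the maskers rather than being supplied as lists:

```python
import numpy as np

def forward_with_maskers(layer_feats, layer_masks):
    """layer_feats: per-layer feature maps (h, w, c); layer_masks: matching
    mask feature masks. Each layer's map is multiplied channel by channel
    with its mask feature mask; the last masked map is pooled to a vector."""
    out = None
    for feat, mask in zip(layer_feats, layer_masks):
        out = feat * mask                  # channel-wise masking per layer
    # global average pooling turns the final map into a 1-D feature vector
    return out.mean(axis=(0, 1))

feats = [np.ones((4, 4, 3)), np.ones((2, 2, 3))]
masks = [np.full((4, 4, 3), 0.5), np.full((2, 2, 3), 0.25)]
vec = forward_with_maskers(feats, masks)
```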
Step 104, obtaining a loss value of the face recognition result according to the feature vector and a preset loss function, and updating the parameters of the face recognition model according to the loss value.
Specifically, after the feature vector of the input image is obtained, it is used as the input to the recognition head of the face recognition model, with the face identity information corresponding to the input image serving as the supervision signal: for an input face image, the model predicts the identity of the face from the feature vector, and the loss value of the face recognition result is computed from the prediction and the supervision signal through a preset loss function. The loss function may be a classification loss based on the Softmax function or a variant thereof, or a loss function based on metric learning. A loss-based gradient update is then applied to the face recognition model. Training proceeds with multiple mask-wearing and non-mask-wearing face images as input until the computed loss value converges, at which point model training is judged complete.
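A minimal sketch of the Softmax classification loss mentioned in step 104, with a toy feature vector and identity-classifier weights (both hypothetical):

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Classification loss over identity classes for a single sample."""
    z = logits - logits.max()              # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

feat = np.array([0.2, -0.1, 0.4])          # toy one-dimensional face feature vector
W = np.eye(3)                              # toy identity-classifier weights
logits = W @ feat
loss = softmax_cross_entropy(logits, label=2)
```

In training, the gradient of this loss with respect to the model parameters drives the update; a metric-learning loss would replace `softmax_cross_entropy` without changing the surrounding flow.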
Because the face recognition model is trained on a mixture of mask-wearing and non-mask-wearing face images until the loss converges, the mask masker in the trained model can, after acquiring a face feature map, automatically determine from it whether the input image shows a masked face: for a mask-wearing face image it adaptively generates a mask feature mask with low weights in the mask-worn area, while for a non-mask-wearing face image it generates a mask feature mask whose weights are all 1. The face feature vector of the input image is then obtained from this mask feature mask, and accurate face recognition of both mask-wearing and non-mask-wearing face images is completed from that feature vector.
Another aspect of the embodiments of the present application provides a face recognition method, where a face recognition process may refer to fig. 6, and the method includes the following steps:
step 601, obtaining a trained face recognition model.
Specifically, a face recognition model is trained in real time according to the above face recognition model training method, or a face recognition model trained in advance according to that method is read from a preset memory address.
Step 602, a face image to be recognized is obtained, and the face image to be recognized is used as the input of a face recognition model to obtain a face recognition result.
Specifically, after the trained face recognition model is obtained, the face image to be recognized is acquired through communication or real-time capture, and is fed to the face recognition model as input to obtain the face recognition result.
Another aspect of the embodiments of the present application provides a face recognition model training apparatus, with reference to fig. 7, including:
the acquiring module 701 is configured to acquire a face feature map of an input image, where the input image includes a mask-worn face image and a mask-not-worn face image.
A generating module 702, configured to generate a mask feature mask with mask coverage area features set to zero according to the face feature map and mask coverage area position information of the input image; and the position information of the mask coverage area is determined based on the detection result of the key points of the face.
The extraction module 703 is configured to obtain a feature vector of the input image according to the mask feature mask and the face feature map.
And the training module 704 is configured to obtain a loss value of the face recognition result according to the feature vector and a preset loss function, and perform parameter updating on the face recognition model according to the loss value.
It is to be understood that this embodiment is an embodiment of an apparatus corresponding to the embodiment of the face recognition model training method, and the embodiment may be implemented in cooperation with the embodiment of the face recognition model training method. Relevant technical details mentioned in the embodiment of the face recognition model training method are still valid in the embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related technical details mentioned in the embodiment can also be applied to the embodiment of the face recognition model training method.
It should be noted that, all the modules involved in this embodiment are logic modules, and in practical application, one logic unit may be one physical unit, may also be a part of one physical unit, and may also be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, a unit which is not so closely related to solve the technical problem proposed by the present invention is not introduced in the present embodiment, but this does not indicate that there is no other unit in the present embodiment.
Another aspect of the embodiments of the present application provides a face recognition apparatus, with reference to fig. 8, including:
an obtaining module 801, configured to obtain a trained face recognition model according to the face recognition model training method;
the recognition module 802 is configured to obtain a face image to be recognized, and obtain a face recognition result by using the face image to be recognized as an input of a face recognition model.
It is obvious that this embodiment is an embodiment of an apparatus corresponding to the embodiment of the face recognition method, and this embodiment can be implemented in cooperation with the embodiment of the face recognition method. Related technical details mentioned in the embodiment of the face recognition method are still valid in the embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related technical details mentioned in the embodiment can also be applied to the embodiment of the face recognition method.
It should be noted that, all the modules involved in this embodiment are logic modules, and in practical application, one logic unit may be one physical unit, may also be a part of one physical unit, and may also be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, a unit which is not so closely related to solve the technical problem proposed by the present invention is not introduced in the present embodiment, but this does not indicate that there is no other unit in the present embodiment.
Another aspect of the embodiments of the present application further provides an electronic device, with reference to fig. 9, including: at least one processor 901; and, memory 902 communicatively connected to at least one processor 901; the memory 902 stores instructions executable by the at least one processor 901, and the instructions are executed by the at least one processor 901, so that the at least one processor 901 can execute the face recognition model training method or the face recognition method described in any of the above method embodiments.
The memory 902 and the processor 901 are coupled by a bus, which may comprise any number of interconnected buses and bridges that couple one or more of the various circuits of the processor 901 and the memory 902. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor 901 is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor 901.
The processor 901 is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory 902 may be used for storing data used by processor 901 in performing operations.
Another aspect of the embodiments of the present application also provides a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the present application, and that various changes in form and details may be made therein without departing from the spirit and scope of the present application in practice.

Claims (14)

1. A face recognition model training method is characterized by comprising the following steps:
acquiring a face feature map of an input image, wherein the input image comprises a mask-wearing face image and a mask-not-wearing face image;
generating a mask characteristic mask with the mask coverage area characteristic set to zero according to the face characteristic image and mask coverage area position information of the input image; the mask covering area position information is determined based on a face key point detection result;
acquiring a feature vector of the input image according to the mask feature mask and the face feature map;
and acquiring a loss value of a face recognition result according to the feature vector and a preset loss function, and updating parameters of the face recognition model according to the loss value.
2. The training method of the face recognition model according to claim 1, wherein the generating of the mask feature mask with the mask coverage area feature set to zero according to the face feature map and the mask coverage area position information of the input image comprises:
acquiring weights corresponding to all characteristic values in the face characteristic diagram according to a preset function;
and according to the mask covering area position information, setting the weight of the characteristic value of the mask covering area to zero to generate the mask characteristic mask.
3. The training method of the face recognition model according to claim 2, wherein the generating of the mask feature mask with the mask coverage area feature set to zero according to the face feature map and mask coverage area position information of the input image includes:
acquiring an original face feature map of an ith layer output by a feature extraction network; wherein i is a positive integer;
acquiring weights corresponding to all characteristic values in the original face characteristic diagram of the ith layer according to the preset function;
according to the mask covering area position information, the weight of the characteristic value of the mask covering area is set to be zero, and the mask characteristic mask of the ith layer is generated;
the method for acquiring the face feature map of the input image comprises the following steps:
and acquiring a multiplication result of the original face feature map of the ith layer and the mask feature mask of the ith layer according to corresponding channels, and taking the multiplication result as the face feature map of the ith layer.
4. The training method of the face recognition model according to claim 3, wherein the setting of the feature value of the mask coverage area to zero according to the mask coverage area position information comprises:
acquiring the feature size of the original face feature map of the ith layer;
according to the feature size of the original face feature map of the ith layer, carrying out size scaling on the position information of the mask coverage area;
and setting the weight of the characteristic value of the mask coverage area in the original face characteristic diagram of the ith layer to zero according to the position information of the mask coverage area after the size is scaled.
5. The training method of the face recognition model according to claim 2, wherein the obtaining the weight corresponding to each feature value in the face feature map according to a preset function comprises:
and normalizing the characteristic value of each point in the mask characteristic mask to be within an interval from 0 to 1 according to the preset function.
6. The training method for the face recognition model according to any one of claims 1 to 5, wherein the non-mask face image is obtained by:
respectively detecting key points of the human face of a plurality of human face images without wearing a mask according to a preset human face detection algorithm;
and for each face image without the mask, performing face key point correction on the face image without the mask according to the face key point detection result to generate a plurality of face images without the mask.
7. The training method of the face recognition model according to claim 6, wherein the face image of the mask wearing face is obtained by:
selecting a plurality of face images without wearing a mask;
and for each selected non-mask-wearing face image, adding a simulation mask for the non-mask-wearing face image according to a preset simulation mask adding algorithm to generate a plurality of mask-wearing face images.
8. The training method of the face recognition model according to claim 6, wherein the preset face detection algorithm is any one of the following detection algorithms: 68 keypoint detection algorithm, 98 keypoint detection algorithm, and 128 keypoint detection algorithm.
9. The training method of the face recognition model according to claim 6, wherein the position information of the mask covering area is obtained by:
selecting a group of face key points as the boundary of a mask covering area according to the face key point detection result and a preset selection rule;
and using the selected codes and position information of the key points of the human face as position information of the mask covering area.
10. A face recognition method, comprising:
the face recognition model training method according to any one of claims 1 to 9, acquiring a trained face recognition model;
and acquiring a face image to be recognized, and taking the face image to be recognized as the input of the face recognition model to acquire a face recognition result.
11. A face recognition model training device, comprising:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a face feature map of an input image, and the input image comprises a mask wearing face image and a mask not wearing face image;
the generating module is used for generating a mask feature mask with zero mask coverage area features according to the face feature map and mask coverage area position information of the input image; the mask coverage area position information is determined based on a face key point detection result;
the extraction module is used for acquiring a feature vector of the input image according to the mask feature mask and the face feature map;
and the training module is used for acquiring a loss value of a face recognition result according to the feature vector and a preset loss function, and updating parameters of the face recognition model according to the loss value.
12. A face recognition apparatus, comprising:
an obtaining module, configured to obtain a trained face recognition model according to the face recognition model training method according to any one of claims 1 to 9;
and the recognition module is used for acquiring a face image to be recognized, and taking the face image to be recognized as the input of the face recognition model to acquire a face recognition result.
13. An electronic device, comprising: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a face recognition model training method as claimed in any one of claims 1 to 9, or a face recognition method as claimed in claim 10.
14. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the face recognition model training method according to any one of claims 1 to 9 or the face recognition method according to claim 10.
CN202211154023.9A 2022-09-21 2022-09-21 Face recognition and model training method and device, electronic equipment and storage medium Active CN115527254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211154023.9A CN115527254B (en) 2022-09-21 2022-09-21 Face recognition and model training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115527254A true CN115527254A (en) 2022-12-27
CN115527254B CN115527254B (en) 2023-06-20

Family

ID=84699408

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783601A (en) * 2020-06-24 2020-10-16 北京百度网讯科技有限公司 Training method and device of face recognition model, electronic equipment and storage medium
CN111881770A (en) * 2020-07-06 2020-11-03 上海序言泽网络科技有限公司 Face recognition method and system
CN111932439A (en) * 2020-06-28 2020-11-13 深圳市捷顺科技实业股份有限公司 Method and related device for generating face image of mask
CN112818901A (en) * 2021-02-22 2021-05-18 成都睿码科技有限责任公司 Wearing mask face recognition method based on eye attention mechanism
WO2021174880A1 (en) * 2020-09-01 2021-09-10 平安科技(深圳)有限公司 Feature extraction model training method, facial recognition method, apparatus, device and medium
CN113420731A (en) * 2021-08-23 2021-09-21 北京的卢深视科技有限公司 Model training method, electronic device and computer-readable storage medium
EP3958173A1 (en) * 2020-06-24 2022-02-23 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for training facial recognition model, electronic device, and storage medium
CN114373210A (en) * 2021-12-31 2022-04-19 北京工业大学 Face recognition method under mask shielding scene
CN114898450A (en) * 2022-07-14 2022-08-12 中国科学院自动化研究所 Face confrontation mask sample generation method and system based on generation model

Similar Documents

Publication Publication Date Title
JP6330385B2 (en) Image processing apparatus, image processing method, and program
CN110659582A (en) Image conversion model training method, heterogeneous face recognition method, device and equipment
CN108805016B (en) Head and shoulder area detection method and device
CN109063584B (en) Facial feature point positioning method, device, equipment and medium based on cascade regression
CN113420731B (en) Model training method, electronic device and computer-readable storage medium
KR102400609B1 (en) A method and apparatus for synthesizing a background and a face by using a deep learning network
CN113449704B (en) Face recognition model training method and device, electronic equipment and storage medium
CN112836625A (en) Face liveness detection method and device, and electronic equipment
CN113591823B (en) Depth prediction model training and face depth image generation method and device
CN112488067B (en) Face pose estimation method and device, electronic equipment and storage medium
CN112613471B (en) Face liveness detection method, device, and computer-readable storage medium
CN113033524B (en) Occlusion prediction model training method and device, electronic equipment and storage medium
CN114494347A (en) Single-camera multi-modal gaze tracking method and device, and electronic equipment
CN111914748A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN111680573B (en) Face recognition method, device, electronic equipment and storage medium
CN111862040A (en) Portrait picture quality evaluation method, device, equipment and storage medium
CN115620022A (en) Object detection method, device, equipment and storage medium
CN111723688B (en) Human body action recognition result evaluation method and device and electronic equipment
CN110414522A (en) Character recognition method and device
CN113642479A (en) Human face image evaluation method and device, electronic equipment and storage medium
CN116959113A (en) Gait recognition method and device
CN115527254B (en) Face recognition and model training method and device, electronic equipment and storage medium
CN112686851B (en) Image detection method, device and storage medium
CN110751163A (en) Target positioning method and device, computer readable storage medium and electronic equipment
CN112749705B (en) Training model updating method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230404

Address after: 230091 room 611-217, R & D center building, China (Hefei) international intelligent voice Industrial Park, 3333 Xiyou Road, high tech Zone, Hefei, Anhui Province

Applicant after: Hefei Dilusense Technology Co., Ltd.

Address before: 100083 room 3032, North B, bungalow, building 2, A5 Xueyuan Road, Haidian District, Beijing

Applicant before: Beijing Dilusense Technology Co., Ltd.

Applicant before: Hefei Dilusense Technology Co., Ltd.

GR01 Patent grant