WO2024021504A1 - Face recognition model training method, recognition method, apparatus, device and medium - Google Patents

Face recognition model training method, recognition method, apparatus, device and medium

Info

Publication number
WO2024021504A1
WO2024021504A1 · PCT/CN2022/142236 · CN2022142236W
Authority
WO
WIPO (PCT)
Prior art keywords
face
image
recognition model
model
face recognition
Application number
PCT/CN2022/142236
Other languages
English (en)
French (fr)
Inventor
吴鹏
肖嵘
王孝宇
Original Assignee
成都云天励飞技术有限公司
深圳云天励飞技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 成都云天励飞技术有限公司 and 深圳云天励飞技术股份有限公司
Publication of WO2024021504A1 publication Critical patent/WO2024021504A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/30 Erosion or dilatation, e.g. thinning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the present application relates to the field of face recognition technology, and in particular to a face recognition model training method, recognition method, device, equipment and medium.
  • the main purpose of this application is to provide a face recognition model training method, recognition method, device, equipment and medium, aiming to make the trained face recognition model more accurate and to improve the accuracy of low-resolution face image recognition.
  • this application provides a face recognition model training method, which includes the following steps:
  • a preset face recognition model is trained according to the plurality of first face images and the plurality of second face images until the face recognition model converges.
  • this application also provides a face recognition method, including:
  • the identity information of the person corresponding to the face image to be recognized is determined.
  • this application also provides a face recognition model training device.
  • the face recognition model training device includes a first acquisition module, a generation module and a training module, wherein:
  • the first acquisition module is used to acquire a plurality of first sample face images and an identity identification code corresponding to each of the first sample face images;
  • the generation module is used to perform preset augmentation processing on the plurality of first sample face images to obtain a plurality of first face images, and to perform augmentation processing on the plurality of first sample face images according to the preset image augmentation model to obtain multiple second face images.
  • the first sample face image has the same identity identification code as the corresponding second face image.
  • the image augmentation model is used to perform image blur augmentation on the first face image;
  • the training module is used to train a preset face recognition model according to the plurality of first face images and the plurality of second face images until the face recognition model converges.
  • the present application also provides a terminal device, which includes a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein, when the computer program is executed by the processor, the steps of the face recognition model training method and/or the face recognition method mentioned above are implemented.
  • the present application also provides a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium; when the computer program is executed by a processor, the steps of the above-mentioned face recognition model training method and/or face recognition method are implemented.
  • This application provides a face recognition model training method, recognition method, device, equipment and medium.
  • This application obtains multiple first sample face images and the identity identification code corresponding to each first sample face image; then performs preset augmentation processing on the plurality of first sample face images to obtain a plurality of first face images, and performs augmentation processing on the plurality of first sample face images according to the preset image augmentation model to obtain multiple second face images, where the first sample face image has the same identity identification code as the corresponding second face image and the image augmentation model is used to perform image blur augmentation on the first face image; finally, the preset face recognition model is trained according to the multiple first face images and multiple second face images until the face recognition model converges.
  • This solution uses preset augmentation and an image augmentation model to augment multiple first sample face images, obtaining a large number of first face images and second face images and greatly increasing the number of training samples. By jointly training the preset face recognition model on the first face images and the second face images, the trained face recognition model is made more accurate.
  • Figure 1 is a schematic flow chart of a face recognition model training method provided by an embodiment of the present application
  • Figure 2 is a schematic flowchart of an image augmentation model training provided by an embodiment of the present application
  • Figure 3 is a schematic flowchart of the sub-steps of image augmentation model training in Figure 2;
  • Figure 4 is a schematic flowchart of the sub-steps of the face recognition model training method in Figure 1;
  • Figure 5 is a schematic flow chart of the steps of the face recognition method provided by the embodiment of the present application.
  • Figure 6 is a schematic block diagram of a face recognition model training device provided by an embodiment of the present application.
  • Figure 7 is a schematic block diagram of a sub-module of the face recognition model training device provided by the embodiment of the present application.
  • Figure 8 is a schematic block diagram of an image augmentation model training device provided by an embodiment of the present application.
  • Figure 9 is a schematic block diagram of a face recognition device provided by an embodiment of the present application.
  • Figure 10 is a schematic structural block diagram of a terminal device provided by an embodiment of the present application.
  • Embodiments of the present application provide a face recognition model training method, recognition method, device, equipment and medium.
  • the face recognition model training method can be applied to terminal devices, which can be electronic devices such as mobile phones, tablet computers, notebook computers, desktop computers, personal digital assistants, and wearable devices.
  • FIG. 1 is a schematic flowchart of a face recognition model training method provided by an embodiment of the present application.
  • the face recognition model training method includes steps S101 to S103.
  • Step S101 Obtain a plurality of first sample face images and an identity identification code corresponding to each first sample face image.
  • the identity identification code is the identity identification corresponding to the first sample face image.
  • the identity identification code can be set according to the actual situation. This is not specifically limited in the embodiment of the present invention.
  • the identity identification code can be an ID card number.
  • the preset resolution can be set according to the actual situation, and the embodiment of the present invention does not specifically limit this.
  • the preset resolution is 720P.
  • a plurality of first sample face images and the identity identification code corresponding to each first sample face image are obtained, where the resolution of each first sample face image is less than or equal to the preset resolution.
  • the acquisition method of the first sample face image can be selected according to the actual situation, and this is not specifically limited in the embodiments of the present application.
  • the first sample face image can be an image intercepted from a video, or an image collected by a shooting device.
  • the shooting device may be selected according to the actual situation. This is not specifically limited in the embodiment of the present invention.
  • the shooting device may be a camera, a camcorder, a mobile phone, and other devices.
  • Step S102: Perform preset augmentation processing on the plurality of first sample face images to obtain a plurality of first face images, and perform augmentation processing on the plurality of first sample face images according to the preset image augmentation model to obtain multiple second face images.
  • Preset augmentation processing and image augmentation model processing are performed on multiple first sample face images to obtain more samples.
  • FIG. 2 is a schematic flowchart of an image augmentation model training provided by an embodiment of the present application.
  • the image augmentation model training includes steps S201 to S202.
  • Step S201 Acquire a plurality of second sample face images, and add noise to each second sample face image to obtain a plurality of third sample face images.
  • the second sample face image may be an image captured from a video or an image collected by a shooting device.
  • the shooting device may be selected according to the actual situation, and this is not specifically limited in the embodiments of the present application.
  • the shooting device may be a camera, a video camera, a mobile phone, or other devices.
  • Preset photon noise, readout noise and quantization noise are obtained; the photon noise, readout noise and quantization noise are added to each second sample face image according to the resolution of each second sample face image, obtaining multiple third sample face images.
  • the preset photon noise, readout noise and quantization noise can be set according to actual conditions, and this is not specifically limited in the embodiment of the present invention.
  • the photon noise is the optical noise generated by the photoelectric effect when photons are converted into electrons during image collection
  • the readout noise is the error produced by inherent factors of the circuit, such as the thermal movement of electrons in the device, during the conversion of electrons into voltage when collecting images
  • the quantization noise is the information loss caused when voltage is converted into numbers during image collection, that is, when a continuous signal is converted into a digital signal; this quantization error, or rounding error, is the quantization noise.
  • the method of obtaining photon noise can also be: obtaining the number of photons I received by the sensor when the image is collected, and fitting the photon count I with a Poisson distribution to obtain the photon noise.
  • by fitting the number of received photons with a Poisson distribution, the photon noise can be accurately obtained.
  • the method of obtaining the readout noise can also be: obtain the error in the process of converting electrons into voltage during the image acquisition process, perform Gaussian distribution processing on the error and process it through the preset Tukey lambda distribution, and generate the readout noise. By processing the error in the conversion of electrons into voltage during image acquisition, the readout noise can be accurately obtained.
  • the method of obtaining the quantization noise may also be: obtaining the quantization noise distribution, which is [-0.5q, 0.5q], where q is the quantization step.
  • when the quantization step is 1 (that is, q is 1), the quantization noise distribution is [-0.5, 0.5]; when the quantization step is 2 (that is, q is 2), the quantization noise distribution is [-1, 1].
  • the resolution of each second sample face image is obtained, and photon noise, readout noise and quantization noise are added to each second sample face image according to its resolution, obtaining multiple third sample face images. By adding photon noise, readout noise and quantization noise to the second sample face images, sample images more consistent with the characteristics of low-resolution images can be obtained.
  • the noise superposition formula is obtained, in which N2 is the per-pixel readout noise and N3 is the per-pixel quantization noise; the photon noise, readout noise and quantization noise are superposed to obtain the total noise.
  • the photon noise gain value is set according to the imaging system, and can be set according to the actual situation; this is not specifically limited in the embodiments of the present application. According to the resolution of the second sample face image, the total noise is added to the second sample face image to generate a third sample face image.
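As an illustration of the noise model described above, the following sketch adds photon noise (Poisson), readout noise (Gaussian) and quantization noise (uniform over [-0.5q, 0.5q]) to an image. The gain, read_sigma and q values are illustrative placeholders, not values taken from the application.

```python
import numpy as np

def add_sensor_noise(img, gain=0.01, read_sigma=2.0, q=1.0, rng=None):
    """Add photon, readout and quantization noise to an 8-bit image.

    gain, read_sigma and q are illustrative placeholder values.
    """
    rng = rng if rng is not None else np.random.default_rng()
    img = img.astype(np.float64)
    # Photon noise: photon counts follow a Poisson distribution.
    photons = rng.poisson(img / gain) * gain
    # Readout noise: zero-mean Gaussian error from the readout circuit.
    readout = rng.normal(0.0, read_sigma, img.shape)
    # Quantization noise: uniform over [-0.5q, 0.5q].
    quant = rng.uniform(-0.5 * q, 0.5 * q, img.shape)
    noisy = photons + readout + quant
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

In practice the three noise terms would be scaled per the image resolution as described above; here they are simply summed per pixel.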
  • Step S202 Train a preset image augmentation model based on a plurality of the third sample face images until the image augmentation model converges.
  • the image augmentation model includes an image downsampling model and a Gaussian blur model.
  • step S202 includes sub-steps S2021 to sub-step S2023.
  • Sub-step S2021 Process each of the third sample face images through a preset image augmentation model to obtain a plurality of third face images.
  • Each third sample face image is down-sampled through an image down-sampling model, and the down-sampled image is processed with a Gaussian blur model to obtain a third face image corresponding to each third sample face image.
  • Using the image downsampling model to downsample the third sample face image can make the sample image fit the size of the display area and generate corresponding image thumbnails. Gaussian blur processing on the thumbnails can accurately obtain the third face image.
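The downsample-then-blur step can be sketched as follows; the stride-based downsampling, the downsampling factor and the Gaussian kernel width are illustrative assumptions, since the application leaves these parameters configurable.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Normalized 1-D Gaussian kernel for separable blurring."""
    radius = radius or max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def downsample_and_blur(img, factor=4, sigma=1.5):
    """Downsample by striding, then apply a separable Gaussian blur.

    factor and sigma stand in for the patent's tunable parameters.
    """
    small = img[::factor, ::factor].astype(np.float64)
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(small, pad, mode="edge")
    # Separable convolution: filter rows first, then columns.
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, "valid"), 1, padded)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, "valid"), 0, blurred)
    return blurred
```

The output has the same shape as the strided image, so the blurred thumbnail can be fed directly to the next training step.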
  • Sub-step S2022 Determine whether the image augmentation model converges based on a plurality of second sample face images and a plurality of third face images.
  • The facial feature similarity between the two second sample face images matching each identity identification code is calculated to obtain at least one face similarity corresponding to each identity identification code, and a first similarity histogram is established based on each face similarity. The facial feature similarity between the two third face images matching each identity identification code is likewise calculated to obtain at least one face similarity corresponding to each identity identification code, and a second similarity histogram is established based on each face similarity.
  • Curve fitting is performed on the first similarity histogram to obtain a first curve, and on the second similarity histogram to obtain a second curve. The first area enclosed by the first curve and the coordinate axes, and the second area enclosed by the second curve and the coordinate axes, are determined. When the area of the intersection of the first area and the second area is greater than or equal to the preset area threshold, it is determined that the image augmentation model has converged; when the area of the intersection is less than the preset area threshold, it is determined that the image augmentation model has not converged.
  • the preset area threshold can be set according to actual conditions, and this is not specifically limited in the embodiments of the present application. Whether the image augmentation model converges can be accurately determined from the area of the intersection of the first area, enclosed by the curve fitted to the second sample face image similarities, and the second area, enclosed by the curve fitted to the third face image similarities.
  • the method of calculating the facial feature similarity between two second sample face images matching each identity identification code may be: obtaining the two second sample face images matching the identity identification code, calculating the cosine distance between the features of the two images, and taking it as the similarity of the two second sample face images. By calculating the cosine distance of the features, the similarity of the two second sample face images can be accurately obtained.
  • the method of establishing the first similarity histogram based on the similarity of each face may be: using the face similarity as the abscissa, and using the number of the same face similarities as the ordinate to establish a rectangular coordinate system, according to The similarity of each face and the number of similarities of the same face establish a first similarity histogram.
  • the method of calculating the facial feature similarity between the two third face images matching each identity identification code, and of obtaining at least one face similarity corresponding to each identity identification code, can refer to the method of calculating the facial feature similarity between the two second sample face images matching each identity identification code; the method of establishing the second similarity histogram based on each face similarity can refer to the method of establishing the first similarity histogram based on each face similarity.
  • a preset curve fitting method is obtained, curve fitting is performed on the first similarity histogram based on the preset curve fitting method to obtain the first curve, and the second similarity histogram is performed. Curve fitting to obtain the second curve.
  • the preset curve fitting method can be selected according to the actual situation; this is not specifically limited in the embodiments of the present application.
  • the preset curve fitting method can be plotting with the mlab module in matplotlib or with the distplot function in the seaborn library. Through this curve fitting method, the first curve corresponding to the first similarity histogram and the second curve corresponding to the second similarity histogram can be accurately obtained.
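A minimal sketch of the convergence test: build the two similarity histograms and measure their intersection area, treating an overlap above a preset area threshold as convergence. The bin count, the assumed similarity range [0, 1], and the threshold value are illustrative, and a direct histogram intersection stands in for the fitted-curve intersection described in the text.

```python
import numpy as np

def histogram_overlap(sims_real, sims_aug, bins=50):
    """Intersection area of two similarity distributions in [0, 1]."""
    h1, edges = np.histogram(sims_real, bins=bins, range=(0.0, 1.0), density=True)
    h2, _ = np.histogram(sims_aug, bins=bins, range=(0.0, 1.0), density=True)
    width = edges[1] - edges[0]
    # Area under the pointwise minimum of the two densities.
    return np.minimum(h1, h2).sum() * width

def augmentation_converged(sims_real, sims_aug, area_threshold=0.8):
    """Converged when the intersection area reaches the preset threshold."""
    return histogram_overlap(sims_real, sims_aug) >= area_threshold
```

Identical distributions give an overlap of 1.0, disjoint ones give 0.0, so the threshold directly controls how closely the augmented similarity distribution must match the real one.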
  • Sub-step S2023 If the image augmentation model has not converged, adjust the model parameters of the image augmentation model to update the image augmentation model, and continue to train the updated image augmentation model until the Image augmentation model converges.
  • the image augmentation model includes an image downsampling model and a Gaussian blur model.
  • when the image augmentation model is adjusted, the downsampling parameters of the image downsampling model and the model parameters of the Gaussian blur model are adjusted to update the image downsampling model and the Gaussian blur model, and training of the updated models continues until the image downsampling model and the Gaussian blur model converge, yielding a converged image augmentation model.
  • in this way, a converged image augmentation model can be accurately obtained.
  • the method of adjusting the downsampling parameters of the image downsampling model and the model parameters of the Gaussian blur model in the image augmentation model can be: selecting one parameter from the preset downsampling parameter library as the adjusted downsampling parameter of the image downsampling model, and one parameter from the preset model parameter library as the adjusted model parameter of the Gaussian blur model.
  • the preset downsampling parameter library and the preset model parameter library can be set according to actual conditions, and this is not specifically limited in the embodiment of the present invention.
  • the downsampling parameters included in the downsampling parameter library can be 10 times, 20 times, and 50 times;
  • the model parameters include Gaussian kernel parameters, and the Gaussian kernel parameters can be 0.5, 5, and 8.
  • by selecting the downsampling parameters and model parameters from the preset downsampling parameter library and the preset model parameter library, the downsampling parameters of the image downsampling model and the model parameters of the Gaussian blur model in the image augmentation model can be accurately adjusted.
  • a plurality of first sample face images are subjected to preset augmentation processing to obtain a plurality of first face images.
  • the preset augmentation can be selected according to the actual situation, and this is not specifically limited in the embodiments of the present application.
  • the preset augmentation can be random flipping, brightness contrast adjustment, image grayscale, random erasure, etc.
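The preset augmentations listed above (random flipping, brightness/contrast adjustment, grayscale conversion, random erasure) might be sketched as follows; all probabilities and parameter ranges are illustrative assumptions, not values from the application.

```python
import numpy as np

def preset_augment(img, rng=None):
    """Apply the flip / brightness-contrast / grayscale / random-erase
    augmentations named in the text; parameter ranges are illustrative."""
    rng = rng if rng is not None else np.random.default_rng()
    out = img.astype(np.float64)
    if rng.random() < 0.5:                       # random horizontal flip
        out = out[:, ::-1]
    alpha = rng.uniform(0.8, 1.2)                # contrast factor
    beta = rng.uniform(-20, 20)                  # brightness shift
    out = out * alpha + beta
    if out.ndim == 3 and rng.random() < 0.2:     # grayscale conversion
        out = out.mean(axis=2, keepdims=True).repeat(3, axis=2)
    if rng.random() < 0.3:                       # random erasure
        h, w = out.shape[:2]
        eh, ew = h // 4, w // 4
        y, x = rng.integers(0, h - eh), rng.integers(0, w - ew)
        out[y:y + eh, x:x + ew] = 0
    return np.clip(out, 0, 255).astype(np.uint8)
```

Each call applies a random subset of the augmentations, so repeated calls on the same first sample face image yield many distinct first face images.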
  • a plurality of first sample face images are augmented according to a preset image augmentation model to obtain a plurality of second face images, in which the first sample face image and the corresponding second face image have the same identity identification code.
  • the first sample face image is augmented through an image augmentation model to obtain multiple second face images.
  • Step S103 Train a preset face recognition model based on the plurality of first face images and the plurality of second face images until the face recognition model converges.
  • the face recognition model is a neural network model.
  • the specific type of the neural network model can be selected according to the actual situation. This is not specifically limited in the embodiment of the present invention.
  • the neural network model can be a knowledge distillation neural network model.
  • the face recognition model can be a face recognition model based on knowledge distillation neural network, a face recognition model based on convolutional neural network, and other models.
  • the structure included in the face recognition model can be backbone+L2 norm.
  • the backbone can be selected according to the actual situation. This embodiment of the present invention does not specifically limit this.
  • the backbone can be MobileFaceNet, iresnet, and vit.
  • the structure of the face recognition model can be MobileFaceNet+L2 norm, iresnet+L2 norm, vit+L2 norm and other structural models.
  • step S103 includes sub-steps S1031 to sub-step S1034.
  • Sub-step S1031 Input the first face image to the face recognition model for processing to obtain a first feature vector.
  • the first face image is input to the face recognition model for processing to obtain a first feature vector.
  • the first feature vector corresponding to the first face image can be accurately obtained through the face recognition model.
  • Sub-step S1032 Input the second face image to the face recognition model for processing to obtain a second feature vector.
  • the second face image is input to the face recognition model for processing to obtain a second feature vector.
  • the second feature vector corresponding to the second face image can be accurately obtained through the face recognition model.
  • Sub-step S1033 Determine the target loss value of the face recognition model based on the first feature vector and the second feature vector, and determine whether the face recognition model converges based on the target loss value.
  • the method of generating the first loss value may be: obtaining a preset first loss value formula and generating the first loss value according to it.
  • the first loss value can be accurately calculated through the first loss value formula.
  • the method of generating the second loss value based on the second feature vector and the first feature vector may be: performing distillation learning on the second feature vector and the first feature vectors. Specifically, a triplet is constructed from the second feature vector and the first feature vectors: the second feature vector is used as the anchor; the first feature vector that has the same identity identification code as the second feature vector and the smallest feature similarity is used as the positive; and the first feature vector that has a different identity identification code from the second feature vector and the greatest feature similarity is used as the negative.
  • the triplet loss value is then calculated to obtain the second loss value.
  • the second loss value can be accurately calculated based on the constructed triplet and the triplet loss principle. It should be noted that the triplet loss principle is based on the Euclidean distance.
  • a preset second loss value formula is obtained, in which the mapping function is f(x).
  • the second loss value formula is simplified, and the simplified second loss value formula is obtained:
  • L2 = (1/N) * sum_n max(||f(a_n) - f(p_n)||^2 - ||f(a_n) - f(n_n)||^2 + m, 0)
  • where L2 is the second loss value, N is the total number of samples, n indexes the samples, a is the anchor, p is the positive, n is the negative, and m is a constant margin.
  • the constant m and the mapping function f(x) can be set according to the actual situation, and this is not specifically limited in the embodiments of the present application.
  • a second loss value is generated.
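Given precomputed (anchor, positive, negative) feature triplets, the Euclidean triplet loss described in the text can be computed as below; the margin value is an illustrative assumption.

```python
import numpy as np

def triplet_second_loss(anchors, positives, negatives, margin=0.3):
    """Batched triplet loss over (anchor, positive, negative) feature
    vectors, using squared Euclidean distances; margin is illustrative."""
    d_ap = np.sum((anchors - positives) ** 2, axis=1)  # anchor-positive
    d_an = np.sum((anchors - negatives) ** 2, axis=1)  # anchor-negative
    # Hinge on the distance gap, averaged over the batch.
    return float(np.mean(np.maximum(d_ap - d_an + margin, 0.0)))
```

The loss is zero once every anchor is closer to its positive than to its negative by at least the margin, which is the desired distillation behavior.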
  • the first loss value and the second loss value are weighted and summed to obtain the target loss value as follows: the first weight parameter and the second weight parameter are obtained; the first weight parameter is multiplied by the first loss value to obtain the third loss value; the second weight parameter is multiplied by the second loss value to obtain the fourth loss value; and the third loss value and the fourth loss value are summed to obtain the target loss value.
  • the first weight parameter and the second weight parameter can be set according to the actual situation; this is not specifically limited in the embodiments of the present application. By performing a weighted sum of the first loss value and the second loss value, the target loss value can be accurately obtained.
  • after the target loss value is obtained, it is determined whether the target loss value is less than or equal to the preset threshold. If the target loss value is less than or equal to the preset threshold, it is determined that the face recognition model has converged; if the target loss value is greater than the preset threshold, it is determined that the face recognition model has not converged.
  • the preset threshold can be set according to the actual situation, and the embodiment of the present invention does not specifically limit this.
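The weighted sum and the threshold test can be sketched as follows; the weight values and the threshold are illustrative placeholders for the configurable parameters described above.

```python
def target_loss(first_loss, second_loss, w1=1.0, w2=0.5):
    """Weighted sum of the two loss values; weights are illustrative."""
    third = w1 * first_loss     # third loss value
    fourth = w2 * second_loss   # fourth loss value
    return third + fourth

def has_converged(target, threshold=0.01):
    """Convergence test against the preset threshold (illustrative value)."""
    return target <= threshold
```

For example, with w1 = 0.5 and w2 = 0.25, losses of 2.0 and 4.0 give a target loss of 2.0, which would not pass the 0.01 threshold, so training would continue.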
  • Sub-step S1034: If the face recognition model has not converged, adjust the model parameters of the face recognition model to update the face recognition model, and continue to train the updated face recognition model. If the face recognition model converges, the converged face recognition model is obtained.
  • it is determined whether the target loss value is less than or equal to the preset threshold: if so, the face recognition model is determined to have converged; if the target loss value is greater than the preset threshold, the face recognition model is determined not to have converged, its model parameters are adjusted to update it, and training of the updated face recognition model continues. If the target loss value of the updated face recognition model is less than or equal to the preset threshold, the face recognition model is determined to have converged. By updating the model parameters and continuing training whenever the face recognition model has not converged, a converged face recognition model is obtained.
  • the face recognition model training method obtains a plurality of first sample face images and the identity identification code corresponding to each first sample face image; then performs preset augmentation processing on the plurality of first sample face images to obtain a plurality of first face images, and performs augmentation processing on the plurality of first sample face images according to a preset image augmentation model to obtain a plurality of second face images, where the first sample face image has the same identity identification code as the corresponding second face image; and finally trains the preset face recognition model based on the plurality of first face images and the plurality of second face images until the face recognition model converges.
  • This solution uses preset augmentation and an image augmentation model to augment multiple first sample face images, obtaining a large number of first face images and second face images and greatly increasing the number of training samples. By jointly training the preset face recognition model on the first face images and the second face images, the trained face recognition model is made more accurate.
  • FIG. 5 is a schematic flowchart of the steps of the face recognition method provided by the embodiment of the present application.
  • the face recognition method includes steps S301 to S303.
  • Step S301 Obtain the face image to be recognized.
  • the face image may be a face photo or a frame of face image in a video. This is not specifically limited in the embodiment of the present invention.
  • Step S302 Input the face image to be recognized to the face recognition model to obtain the identity characteristics of the person corresponding to the face image to be recognized.
  • the face recognition model is trained through the aforementioned face recognition model training method.
  • the identity characteristics of the person corresponding to the face image can be accurately obtained.
  • Step S303 Determine the identity information of the person corresponding to the face image to be recognized based on the identity characteristics and the preset identity information database.
  • the preset identity information database is an identity information database established in advance based on the identity information of each person, and each identity information in the identity information database maps the preset identity characteristics of each person.
  • the preset identity information database can be established according to actual conditions, and this is not specifically limited in the embodiment of the present invention.
  • The similarity between the identity feature and each preset identity feature in the identity information database is calculated to obtain a similarity for each preset identity feature; the preset identity feature with the largest similarity is selected from the similarity queue as the target identity feature, and the identity information corresponding to the target identity feature is taken as the identity information of the person corresponding to the face image to be recognized.
  • the similarity between the identity feature and each preset identity feature in the identity information database is calculated.
  • The similarity of each preset identity feature in the identity information database can be obtained with a preset cosine similarity formula:
  • L₃ = (A · B) / (‖A‖ × ‖B‖), where L₃ is the identity feature similarity, A is the identity feature, and B is the preset identity feature. Substituting the identity feature and a preset identity feature into the cosine similarity formula yields the similarity between them.
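As a minimal sketch, the cosine similarity formula above can be computed as follows (a NumPy version; `A` and `B` follow the text's naming):

```python
# Cosine similarity between an identity feature A and a preset identity
# feature B: L3 = (A . B) / (||A|| * ||B||), in [-1, 1] for nonzero vectors.
import numpy as np

def cosine_similarity(A, B):
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return float(np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B)))

sim_same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # parallel vectors
sim_orth = cosine_similarity([1.0, 0.0], [0.0, 1.0])            # orthogonal vectors
```

Parallel vectors give a similarity of 1 and orthogonal vectors give 0, which is why the largest similarity identifies the best-matching preset identity feature.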
  • The face recognition method of the above embodiments acquires the face image to be recognized, inputs it into the face recognition model to obtain the identity feature of the corresponding person, and then determines the identity information of that person based on the identity feature and the preset identity information database. Because the face image is fed into the trained face recognition model, even images of low resolution can be recognized accurately, greatly improving the accuracy of face recognition.
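Assuming the preset identity information database is stored as a simple mapping from identity information to preset identity features (an illustrative structure, not one specified by this application), the matching across steps S301 to S303 might look like:

```python
# Hedged sketch: pick the identity whose preset feature has the highest
# cosine similarity to the extracted identity feature. Database contents
# and key names are purely illustrative.
import numpy as np

def cosine(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(identity_feature, identity_db):
    """identity_db: {identity_info: preset_feature}; returns the best match."""
    return max(identity_db, key=lambda info: cosine(identity_feature, identity_db[info]))

db = {"person_a": [1.0, 0.0, 0.0], "person_b": [0.0, 1.0, 0.0]}
who = identify([0.9, 0.1, 0.0], db)
```

In practice the identity feature would come from the trained face recognition model rather than being written by hand, and the database could hold many preset features per person.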
  • Figure 6 is a schematic block diagram of a face recognition model training device provided by an embodiment of the present application.
  • the face recognition model training device 400 includes a first acquisition module 410, a generation module 420 and a first training module 430, where:
  • the first acquisition module 410 is used to acquire a plurality of first sample face images and the identity identification code corresponding to each of the first sample face images;
  • The generation module 420 is used to perform preset augmentation on the plurality of first sample face images to obtain a plurality of first face images, and to augment the plurality of first sample face images according to the preset image augmentation model to obtain a plurality of second face images.
  • Each first sample face image and its corresponding second face image share the same identification code, and the image augmentation model is used to perform image blur augmentation on the first face image;
  • The first training module 430 is used to train a preset face recognition model based on the plurality of first face images and the plurality of second face images until the face recognition model converges.
  • the first training module 430 also includes a first processing module 431, a second processing module 432, a first determination module 433 and an update module 434, wherein:
  • the first processing module 431 is used to input the first face image to the face recognition model for processing to obtain a first feature vector
  • the second processing module 432 is used to input the second face image to the face recognition model for processing to obtain a second feature vector
  • the first determination module 433 is configured to determine the target loss value of the face recognition model based on the first feature vector and the second feature vector, and determine the face recognition model based on the target loss value. Whether to converge;
  • The update module 434 is configured to, if the face recognition model has not converged, adjust the model parameters of the face recognition model to update it and continue training the updated model; if the face recognition model has converged, the converged face recognition model is obtained.
  • The first determination module 433 is also used to: generate a first loss value based on the first feature vector and the identification code corresponding to the first feature vector; generate a second loss value based on the second feature vector and the first feature vector; and perform a weighted sum of the first loss value and the second loss value to obtain the target loss value.
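A minimal sketch of the weighted summation above; the weight parameters `w1` and `w2` are illustrative assumptions, since the application leaves the weight parameters to be set according to actual conditions:

```python
# Target loss as a weighted sum of the first loss value (classification loss
# on the first feature vectors) and the second loss value (distillation /
# triplet loss between second and first feature vectors).

def target_loss(first_loss, second_loss, w1=0.7, w2=0.3):
    """L = w1 * L1 + w2 * L2; w1 and w2 are hypothetical weight parameters."""
    return w1 * first_loss + w2 * second_loss

loss = target_loss(first_loss=0.5, second_loss=1.0)  # 0.7*0.5 + 0.3*1.0
```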
  • FIG. 8 is a schematic block diagram of an image augmentation model training device provided by an embodiment of the present application.
  • The image augmentation model training device 500 includes a second acquisition module 510, an adding module 520 and a second training module 530, wherein:
  • the second acquisition module 510 is used to acquire a plurality of second sample face images
  • Adding module 520 is used to add noise to each of the second sample face images to obtain multiple third sample face images
  • The second training module 530 is used to train a preset image augmentation model based on the plurality of third sample face images until the image augmentation model converges.
  • The adding module 520 is also used to: obtain preset photon noise, readout noise and quantization noise; and, according to the resolution of each second sample face image, add the photon noise, readout noise and quantization noise to each second sample face image to obtain a plurality of third sample face images.
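One possible sketch of this noise-addition step, modeling photon noise as Poisson-distributed, readout noise as zero-mean Gaussian, and quantization noise as rounding to integer gray levels, as described in the embodiments; the gain and sigma values are illustrative assumptions:

```python
# Hedged sketch: add the three noise sources to a grayscale image in [0, 255].
import numpy as np

def add_low_resolution_noise(image, gain=1.0, read_sigma=2.0, seed=0):
    rng = np.random.default_rng(seed)
    img = np.asarray(image, dtype=float)
    photon = rng.poisson(np.clip(img, 0, None)) * gain       # photon (shot) noise
    readout = rng.normal(0.0, read_sigma, img.shape)         # readout noise
    noisy = photon + readout
    # quantization noise arises from converting the voltage to digital levels,
    # sketched here as rounding to integer gray values
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

clean = np.full((4, 4), 128.0)   # illustrative flat 4x4 test image
noisy = add_low_resolution_noise(clean)
```

A production version would set the gain from the imaging system and could scale the noise strength with the resolution of each second sample face image, as the text suggests.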
  • the second training module 530 is also used to:
  • If the image augmentation model has not converged, the model parameters of the image augmentation model are adjusted to update it, and training of the updated image augmentation model continues until the image augmentation model converges.
  • The second training module 530 is also used to: calculate the face-feature similarities between second sample face images matching each identification code and between third face images matching each identification code, build first and second similarity histograms, fit them to first and second curves, and determine that the image augmentation model has converged when the area of the intersection of the regions enclosed by the two curves and the coordinate axes is greater than or equal to a preset area threshold.
  • FIG. 9 is a schematic block diagram of a face recognition device provided by an embodiment of the present application.
  • the face recognition device 600 includes a third acquisition module 610, a recognition module 620 and a second determination module 630, wherein:
  • the third acquisition module 610 is used to acquire the face image to be recognized
  • the recognition module 620 is used to input the face image to be recognized to the face recognition model and obtain the identity characteristics of the person corresponding to the face image to be recognized;
  • the second determination module 630 is used to determine the identity information of the person corresponding to the face image to be recognized based on the identity characteristics and the preset identity information database.
  • FIG. 10 is a schematic structural block diagram of a terminal device provided by an embodiment of the present application.
  • the terminal device 700 includes a processor 701 and a memory 702.
  • the processor 701 and the memory 702 are connected through a bus 703, which is, for example, an I2C (Inter-integrated Circuit) bus.
  • the processor 701 is used to provide computing and control capabilities to support the operation of the entire terminal device 700 .
  • the processor 701 can be a central processing unit (Central Processing Unit, CPU).
  • The processor 701 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
  • The memory 702 may be a Flash chip, a read-only memory (ROM, Read-Only Memory) disk, an optical disk, a USB flash drive, a removable hard disk, etc.
  • FIG. 10 is only a block diagram of the partial structure related to the solution of the present invention and does not constitute a limitation on the terminal devices to which the solution is applied; a specific terminal device 700 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • The processor 701 is used to run a computer program stored in the memory to implement the following steps: acquire a plurality of first sample face images and the identification code corresponding to each first sample face image; perform preset augmentation on the plurality of first sample face images to obtain a plurality of first face images, and augment the plurality of first sample face images according to a preset image augmentation model to obtain a plurality of second face images; and train a preset face recognition model based on the plurality of first face images and the plurality of second face images until the face recognition model converges.
  • When implementing the training of the preset face recognition model based on the plurality of first face images and the plurality of second face images until the face recognition model converges, the processor 701 is used to implement: if the face recognition model has not converged, adjust the model parameters of the face recognition model to update it and continue training the updated model; if the face recognition model has converged, obtain the converged face recognition model.
  • the processor 701 when determining the target loss value of the face recognition model based on the first feature vector and the second feature vector, is configured to:
  • the first loss value and the second loss value are weighted and summed to obtain a target loss value.
  • Before implementing the acquisition of the plurality of first sample face images and the identification code corresponding to each first sample face image, the processor 701 is also configured to implement: acquire a plurality of second sample face images and add noise to each of them to obtain a plurality of third sample face images; and train a preset image augmentation model based on the plurality of third sample face images until the image augmentation model converges.
  • When adding noise to each second sample face image to obtain a plurality of third sample face images, the processor 701 is configured to implement: obtain preset photon noise, readout noise and quantization noise; and, according to the resolution of each second sample face image, add the photon noise, readout noise and quantization noise to each second sample face image to obtain a plurality of third sample face images.
  • When training the preset image augmentation model based on the plurality of third sample face images until the image augmentation model converges, the processor 701 is configured to implement: if the image augmentation model has not converged, adjust the model parameters of the image augmentation model to update it, and continue training the updated image augmentation model until it converges.
  • When determining whether the image augmentation model has converged based on the plurality of second sample face images and the plurality of third face images, the processor 701 is configured to implement: calculate the face-feature similarity between each pair of second sample face images matching the same identification code and between each pair of third face images matching the same identification code; build first and second similarity histograms from those similarities; fit each histogram to a curve; determine the regions enclosed by the two curves and the coordinate axes; and determine that the image augmentation model has converged when the area of the intersection of the two regions is greater than or equal to a preset area threshold.
  • The processor 701 is also used to implement: acquire a face image to be recognized; input the face image to be recognized into the face recognition model to obtain the identity feature of the corresponding person; and determine the identity information of the person corresponding to the face image to be recognized based on the identity feature and a preset identity information database.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium.
  • the computer program includes program instructions.
  • For the method implemented when the program instructions are executed, reference may be made to the various embodiments of the face recognition method of the present application.
  • the computer-readable storage medium may be an internal storage unit of the computer device described in the previous embodiment, such as a hard disk or memory of the computer device.
  • the computer-readable storage medium may be non-volatile or volatile.
  • The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the computer device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The present application provides a face recognition model training method, recognition method, apparatus, device and medium. The method includes: acquiring a plurality of first sample face images and the identification code corresponding to each first sample face image; performing preset augmentation on the plurality of first sample face images to obtain a plurality of first face images, and augmenting the plurality of first sample face images according to a preset image augmentation model to obtain a plurality of second face images; and training a preset face recognition model based on the plurality of first face images and the plurality of second face images until the face recognition model converges. By jointly training the preset face recognition model on the first face images and the second face images, the present application makes the trained face recognition model more accurate.

Description

人脸识别模型训练方法、识别方法、装置、设备及介质
本申请要求于2022年7月29日提交中国专利局,申请号为202210914189.X、发明名称为“人脸识别模型训练方法、识别方法、装置、设备及介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人脸识别技术领域,尤其涉及一种人脸识别模型训练方法、识别方法、装置、设备及介质。
背景技术
人脸作为区分个体之间的一个基本属性,在计算机视觉和多媒体应用领域频繁识别。这些应用中,人脸识别模型需要被重新部署在移动手机甚至智能摄像头中,用于相机自动对焦、人机交互、照片管理、城市安防监控、智能驾驶等诸多领域。当前,人脸识别在开放环境条件下的实际应用中,经常需要识别低分辨率人脸图像,但是目前对低分辨率人脸图像识别准确性较差。目前为了提高对低分辨率人脸图像识别的准确性采用基于增强的方法和基于嵌入的方法,但是这两种处理方法并不理想,并不能达到用户要求。因此,如何提高对低分辨率人脸图像识别的准确性是目前亟待解决的问题。
发明内容
本申请的主要目的在于提供一种人脸识别模型训练方法、识别方法、装置、设备及介质,旨在使训练出来的人脸识别模型更加准确,以提高对低分辨人脸图像识别的准确性。
第一方面,本申请提供一种人脸识别方法,所述人脸识别方法包括以下步骤:
获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码;
对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,所述图像增广模型用于对所述第一人脸图像进行图像模糊增广;
根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
第二方面,本申请还提供一种人脸识别方法,包括:
获取待识别的人脸图像;
将所述待识别的人脸图像输入至人脸识别模型,得到所述待识别的人脸图像对应的人物的身份特征,其中,所述人脸识别模型是通过所述的人脸识别模型训练方法进行训练得到的;
根据所述身份特征和预设的身份信息库,确定所述待识别的人脸图像对应的人物的身份信息。
第三方面,本申请还提供一种人脸识别模型训练装置,所述人脸识别模型训练装置包括第一获取模块、生成模块和训练模块,其中:
所述第一获取模块,用于获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码;
所述生成模块,用于对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,所述图像增广模型用于对所述第一人脸图像进行图像模糊增广;
所述第一训练模块,用于根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
第四方面,本申请还提供一种终端设备,所述终端设备包括处理器、存储器、以及存储在所述存储器上并可被所述处理器执行的计算机程序,其中所述计算机程序被所述处理器执行时,实现如上述的人脸识别模型训练方法和/或人脸识别方法的步骤。
第五方面,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,其中所述计算机程序被处理器执行时,实现如上述的人脸识别模型训练方法和/或人脸识别方法的步骤。
本申请提供一种人脸识别模型训练方法、识别方法、装置、设备及介质,本申请通过获取多个第一样本人脸图像以及每个第一样本人脸图像对应的身份标识码;然后对多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,图像增广模型用于对第一人脸图像进行图像模糊增广;根据多个第一人脸图像和多个第二人脸图像,对预设的人脸识别模型进行训练,直至人脸识别模型收敛。本方案通过预设增广和图像增广模型对多个第一样本人脸图像进行增广处理,能够得到大量的第一人脸图像和第二人脸图像,极大地增加了训练样本的数量,通过对第一人脸图像和第二人脸图像对预设的人脸识别模型进行联合训练,使训练出的人脸识别模型更加准确。
附图说明
为了更清楚地说明本申请实施例技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本申请的实施例提供的一种人脸识别模型训练方法的流程示意图;
图2为本申请的实施例提供的一种图像增广模型训练的流程示意图;
图3为图2中的图像增广模型训练的子步骤流程示意图;
图4为图1中的人脸识别模型训练方法的子步骤流程示意图;
图5为本申请实施例提供的人脸识别方法的步骤流程示意图;
图6为本申请实施例提供的一种人脸识别模型训练装置的示意性框图;
图7为本申请实施例提供的人脸识别模型训练装置的子模块的示意性框图;
图8为本申请实施例提供的一种图像增广模型训练装置的示意性框图;
图9为本申请实施例提供的一种人脸识别装置的示意性框图;
图10为本申请实施例提供的一种终端设备的结构示意性框图。
本申请目的的实现、功能特点及优点将结合实施例,参照附图做进一步说明。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
附图中所示的流程图仅是示例说明,不是必须包括所有的内容和操作/步骤,也不是必须按所描述的顺序执行。例如,有的操作/步骤还可以分解、组合或部分合并,因此实际执行的顺序有可能根据实际情况改变。
本申请实施例提供一种人脸识别模型训练方法、识别方法、装置、设备及介质。其中,该人脸识别模型训练方法可应用于终端设备中,该终端设备可以是手机、平板电脑、笔记本电脑、台式电脑、个人数字助理和穿戴式设备等电子设备。
下面结合附图,对本申请的一些实施方式作详细说明。在不冲突的情况下,下述的实施例及实施例中的特征可以相互组合。
请参照图1,图1为本申请的实施例提供的一种人脸识别模型训练方法的流程示意图。
如图1所示,该人脸识别模型训练方法包括步骤S101至步骤S103。
步骤S101、获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码。
其中,身份标识码为第一样本人脸图像对应的身份标识,该身份标识码可以根据实际情况进行设置,本发明实施例对此不做具体限定,例如,该身份标识码可以是身份证号码,该预设分辨率可以根据实际情况进行设置,本发明实施例对此不做具体限定,例如,该预设分辨率为720P。
在一实施例中,获取多个第一样本人脸图像和每个第一样本人脸图像对应的身份标识码,得到多个样本人脸图像以及每个第一样本人脸图像对应的 身份标识码,该每个第一样本人脸图像的分辨率均小于或等于预设分辨率。
需要说明的是,第一样本人脸图像的获取方式可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,第一样本人脸图像可以是从视频中截取的图像,也可以是通过拍摄设备采集的图像,该拍摄设备可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,该拍摄设备可以是照相机、摄像机和手机等设备。
步骤S102、对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像。
通过多个第一样本人脸图像进行预设增广处理和图像增广模型处理,以获得更多的样本。
在一实施例中,请参照图2,图2为本申请的实施例提供的一种图像增广模型训练的流程示意图。
如图2所示,该图像增广模型训练包括步骤S201至步骤S202。
步骤S201、获取多个第二样本人脸图像,并给每个所述第二样本人脸图像添加噪声,得到多个第三样本人脸图像。
获取多个第二样本人脸图像,其中,获取该多个第二样本人脸图像的方式可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,第二样本人脸图像可以是从视频中截取的图像,也可以是通过拍摄设备采集的图像,该拍摄设备可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,该拍摄设备可以是照相机、摄像机和手机等设备。
在一实施例中,获取预设的光子噪声、读出噪声和量化噪声;根据每个第二样本人脸图像的分辨率,给每个第二样本人脸图像添加光子噪声、读出噪声和量化噪声,得到多个第三样本人脸图像。其中,该预设的光子噪声、读出噪声和量化噪声可以根据实际情况进行设置设备,本发明实施例对此不做具体限定。通过对每个第二样本人脸图像添加噪声,使得训练图像增广模型的样本图像更加切合实际情况,以使训练出的图像增广模型更加准确。
需要说明的是,该光子噪声为采集图像时光子转化为电子发生光电效应所产生的光噪声;该读出噪声为采集图像时电子转换为电压过程中电路固有的因素,如器件中电子的热运动等,造成结果的不精确,所产生误差称为读 出噪声;该量化噪声为采集图像时将电压转换为数字,由连续信号转化为数字信号,所造成的信息损失称之为量化误差或取整误差,即为量化噪声。
在一实施例中,获取光子噪声的方式还可以为:获取采集图像是传感器接收光子数I,使用泊松分布拟合该光子数I,得到光子噪声。通过泊松分布对接收到光子数进行拟合,能够准确地得到光子噪声。
在一实施例中,获取读出噪声的方式还可以为:获取采集图像过程中电子转换为电压过程中的误差,对该误差进行高斯分布处理并通过预设Tukey lambda分布进行处理,生成读出噪声。通过对采集图像过程中电子转换为电压过程中的误差进行处理,能够准确地得到读出噪声。
在一实施例中,获取量化噪声的方式还可以为:获取量化噪声分布,该量化噪声分布为[-0.5q,0.5q],其中,q为量化步骤的次数。获取量化步骤的次数,并对量化步骤的次数和量化噪声分布进行运算,得到量化噪声。例如,该量化步骤的次数为1(即q为1),量化噪声为[-0.5,0.5];该量化步骤的次数为2(即q为2),量化噪声为[-0.1,0.1]。
在一实施例中,获取每个第二样本人脸图像的分辨率,根据每个第二样本人脸图像的分辨率,给每个第二样本人脸图像添加光子噪声、读出噪声和量化噪声,得到多个第三样本人脸图像。通过对第二样本人脸图像添加光子噪声、读出噪声和量化噪声,能够得到更加符合低分辨图像特征的样本图像。
示例性的,获取噪声叠加公式,该噪声叠加公式为N=kN₁+N₂+N₃,其中,N为总噪声,k为光子噪声增益值,N₁为每个像素点的光子噪声,N₂为每个像素点的读出噪声,N₃为每个像素点的量化噪声,获取光子噪声、读出噪声和量化噪声,基于该噪声叠加公式,对光子噪声、读出噪声和量化噪声进行叠加得到总噪声。其中,该光子噪声增益值是根据成像系统进行设置,该光子噪声增益值可以根据实际情况进行设置,本发明实施例对此不做具体限定。根据第二样本人脸图像的分辨率,对第二样本人脸图像添加总噪声,生成第三样本人脸图像。
步骤S202、根据多个所述第三样本人脸图像,对预设的图像增广模型进行训练,直至所述图像增广模型收敛。
其中,该图像增广模型包括图像下采样模型和高斯模糊模型。
在一实施例中,如图3所示,步骤S202包括子步骤S2021至子步骤S2023。
子步骤S2021、通过预设的图像增广模型对各所述第三样本人脸图像进行处理,得到多个第三人脸图像。
通过图像下采样模型对各第三样本人脸图像进行下采样处理,并对下采样后的图像进行高斯模糊模型处理,得到各第三样本人脸图像对应的第三人脸图像。通过图像下采样模型对第三样本人脸图像进行下采样处理,能够使得样本图像符合显示区域的大小和生成对应图像缩略图,对缩略图进行高斯模糊处理能够准确地得到第三人脸图像。
子步骤S2022、根据多个所述第二样本人脸图像和多个所述第三人脸图像,确定所述图像增广模型是否收敛。
计算各身份标识码相匹配的两个第二样本人脸图像之间的人脸特征相似度,得到各身份标识码对应的至少一个人脸相似度,并根据各人脸相似度建立第一相似度直方图;计算各身份标识码相匹配的两个第三人脸图像之间的人脸特征相似度,得到各身份标识码对应的至少一个人脸相似度,并根据各人脸相似度建立第二相似度直方图;对第一相似度直方图进行曲线拟合,得到第一曲线,并对第二相似度直方图进行曲线拟合,得到第二曲线;确定第一曲线与坐标轴所围成的第一区域以及第二曲线与坐标轴所围成的第二区域;在第一区域与第二区域的交集区域的面积大于或等于预设面积阈值时,确定图像增广模型已收敛;在第一区域与第二区域的交集区域的面积小于预设面积阈值时,确定图像增广模型未收敛。其中,该预设面积阈值可以根据实际情况进行设置,本发明实施例对此不做具体限定。通过确定第一曲线与坐标轴围成的第一区域和第二曲线与坐标轴围成的第二区域的交集区域的面积,能够准确知晓图像增广模型是否收敛。
在一实施例中,计算各身份标识码相匹配的两个第二样本人脸图像之间的人脸特征相似度的方式可以为:获取身份标识码相匹配的两个第二样本人脸图像,对两个第二样本人脸图像进行特征的余弦距离计算,得到两个第二样本人脸图像的相似度。通过对两个第二样本人脸图像进行特征的余弦距离计算,能够准确地得到两个第二样本人脸图像的相似度。
在一实施例中,根据各人脸相似度建立第一相似度直方图的方式可以为:以人脸相似度为横坐标,以相同人脸相似度的数量为纵坐标建立直角坐标系,根据各人脸相似度和相同人脸相似度的数量建立第一相似度直方图。通过对 各人脸相似度建立第一相似度直方图,能够提高模型训练的准确性。
需要说明的是,计算各身份标识码相匹配的两个第三人脸图像之间的人脸特征相似度,得到各身份标识码对应的至少一个人脸相似度的方式,可以参照计算各身份标识码相匹配的两个第二样本人脸图像之间的人脸特征相似度,得到各身份标识码对应的至少一个人脸相似度的方式;根据各人脸相似度建立第二相似度直方图的方式可以参照根据各人脸相似度建立第一相似度直方图的方式,因此,对于计算各身份标识码相匹配的两个第三人脸图像之间的人脸特征相似度,得到各身份标识码对应的至少一个人脸相似度,并根据各人脸相似度建立第二相似度直方图不做过多的赘述。
在一实施例中,获取预设的曲线拟合方式,基于该预设的曲线拟合方式对第一相似度直方图进行曲线拟合,得到第一曲线,并对第二相似度直方图进行曲线拟合,得到第二曲线。其中,该预设的曲线拟合方式的可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,该预设的曲线拟合方式可以为采用matplotlib中的mlab模块或采用seaborn库中的distplot绘制。通过该曲线拟合方式能够准确得到第一相似度直方图对应的第一曲线,以及第二相似度直方图对应的第二曲线。
子步骤S2023、若所述图像增广模型未收敛,调整所述图像增广模型的模型参数,以更新所述图像增广模型,并继续训练更新后的所述图像增广模型,直至所述图像增广模型收敛。
确定第一区域与第二区域的交集区域的面积是否大于或等于预设面积阈值时,在第一区域与第二区域的交集区域的面积大于或等于预设面积阈值时,确定图像增广模型已收敛。在第一区域与第二区域的交集区域的面积小于预设面积阈值时,确定图像增广模型未收敛,其中,图像增广模型包括图像下采样模型和高斯模糊模型,调整图像增广模型中图像下采样模型的下采样参数和调整高斯模糊模型的模型参数,以更新图像下采样模型和高斯模糊模型,继续训练更新后的图像下采样模型和高斯模糊模型,直至图像下采样模型和高斯模糊模型收敛,得到收敛的图像增广模型。当确定图像增广模型未收敛,通过调整图像增广模型的模型参数,并对更新模型参数的图像增广模型继续训练,能够准确地得到收敛的图像增广模型。
在一实施例,调整图像增广模型中图像下采样模型的下采样参数和调整 高斯模糊模型的模型参数的方式可以为:从预设的下采样参数库和预设的模型参数库各选取一个参数作为此次调整图像下采样模型的下采样参数和调整高斯模糊模型的模型参数。其中,该预设的下采样参数库和预设的模型参数库可以根据实际情况进行设置,本发明实施例对此不做具体限定。例如,该下采样参数库中包括的下采样参数可以为10倍、20倍和50倍等参数;该模型参数包括高斯核参数,高斯核参数可以为0.5、5和8等参数。通过从预设的下采样参数库和预设的模型参数库中选取下采样参数和模型参数,能够准确地调整图像增广模型中图像下采样模型的下采样参数和调整高斯模糊模型的模型参数。
在一实施例中,对多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像。其中,该预设增广可以根据实际情况进行选择,本发明实施例的对此不做具体,该预设增广可以是随机翻转、亮度对比度调整、图像灰度化和随机擦除等方式。通过对多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,以丰富样本图像。
在一实施例中,根据预设的图像增广模型,对多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,其中,第一样本人脸图像与对应第二人脸图像的身份标识码相同。通过图像增广模型对第一样本人脸图像进行增广处理,得到多个第二人脸图像。
步骤S103、根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
其中,该人脸识别模型为神经网络模型,该神经网络模型的具体类型可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,该神经网络模型可以为知识蒸馏神经网络模型。例如,该人脸识别模型可以是基于知识蒸馏神经网络的人脸识别模型和基于卷积神经网络的人脸识别模型等模型。
示例性的,该人脸识别模型包括的结构可以是backbone+L2 norm,该backbone可以根据实际情况进行选择,本发明实施例对此不做具体限定,例如,该backbone可以是MobileFaceNet、iresnet和vit;例如,该人脸识别模型的结构可以为MobileFaceNet+L2 norm、iresnet+L2 norm和vit+L2 norm等结构模型。
在一实施例中,如图4所示,步骤S103包括子步骤S1031至子步骤S1034。
子步骤S1031、将所述第一人脸图像输入至所述人脸识别模型进行处理,得到第一特征向量。
将第一人脸图像输入至人脸识别模型进行处理,得到第一特征向量。通过该人脸识别模型能够准确地得到第一人脸图像对应的第一特征向量。
子步骤S1032、将所述第二人脸图像输入至所述人脸识别模型进行处理,得到第二特征向量。
将第二人脸图像输入至人脸识别模型进行处理,得到第二特征向量。通过该人脸识别模型能够准确地得到第二人脸图像对应的第二特征向量。
子步骤S1033、根据所述第一特征向量和所述第二特征向量,确定所述人脸识别模型的目标损失值,并根据所述目标损失值,确定所述人脸识别模型是否收敛。
根据第一特征向量以及第一特征向量对应的所述身份标识码,生成第一损失值;根据第二特征向量和第一特征向量,生成第二损失值;对第一损失值和所述第二损失值进行加权求和,得到目标损失值。通过确定第一损失值和第二损失值,并对第一损失值和第二损失值进行加权求和,能够准确地得到人脸识别模型的目标损失值。
在一实施例中,根据第一特征向量以及第一特征向量对应的所述身份标识码,生成第一损失值的方式可以为:获取预设的第一损失值公式,该第一损失值公式为
L₁ = -(1/N) Σ_{i=1}^{N} log( e^{s·cos(θ_{y_i}+m)} / ( e^{s·cos(θ_{y_i}+m)} + Σ_{j=1, j≠y_i}^{n} e^{s·cos θ_j} ) )
其中,L₁为第一损失值,N为小批次图片数量,n为参与训练的第一样本人脸图像的身份标识码个数,m为角度距离,s为第一特征向量余弦距离扩大倍数,θ_{y_i}为第一特征向量与对应身份标识码的特征原型的夹角。基于该第一损失值公式,并根据第一特征向量以及所述第一特征向量对应的所述身份标识码,生成第一损失值。通过该第一损失值公式能够准确地运算出第一损失值。
在一实施例中,根据第二特征向量和第一特征向量,生成第二损失值的方式可以为:对第二特征向量和所述第一特征向量进行蒸馏学习,具体为:将第二特征向量和各所述第一特征向量设定三元组,将第二特征向量作为anchor,将与第二特征向量的身份标识码相同且特征相似度最小的第一特征向量作为positive,将与第二特征向量的身份标识码不相同且特征相似度最 大的第一特征向量作为negative,基于各anchor对应的第二特征向量、positive对应的第一特征向量和negative对应的第一特征向量进行三元组损失值计算,得到第二损失值。通过对第二特性向量和各第一特征向量构建三元组,基于构建的三元组和三元组损失原理能准确地计算出第二损失值。需要说明的是,三元组损失原理是基于欧式距离形式化原理。
示例性的,获取预设的第二损失值公式,该第二损失值公式为L 2=max{d(a,p)-d(a,n)+m,0},设样本为x和映射函数为f (x),对该第二损失值公式进行化简,得到化简后的第二损失值公式为
L₂ = Σ_{i=1}^{N} max{ ‖f(x_i^a) − f(x_i^p)‖₂² − ‖f(x_i^a) − f(x_i^n)‖₂² + m, 0 }
其中,L₂为第二损失值,N为样本总数量,i为样本序号,a为anchor,p为positive,n为negative,m为常数,其中,该常数m和映射函数f(x)可以根据实际情况进行设置,本发明实施例对此不做具体限定。基于该第二损失值公式,并根据第二特征向量和各第一特征向量构建三元组,生成第二损失值。
在一实施例中,对第一损失值和第二损失值进行加权求和,得到目标损失值的方式可以为:获取第一权重参数和第二权重参数,将第一权重参数与第一损失值进行乘法运算,得到第三损失值,将第二权重参数与第二损失值进行乘法运算,得到第四损失值,对第三损失值和第四损失值进行求和,得到目标损失值。其中,该第一权重参数和第二权重参数可以根据实际情况进行设置,本发明实施例对此不做具体限定,通过对第一损失值和第二损失值进行加权求和,能够准确地得到目标损失值。
在一实施例中,在得到目标损失值后,确定目标损失值是否小于或等于预设阈值,若目标损失值小于或等于预设阈值,则确定人脸识别模型已收敛;若目标损失值大于预设阈值,则确定人脸识别模型未收敛。其中,该预设阈值可以根据实际情况进行设置,本发明实施例对此不做具体限定,
子步骤S1034、若所述人脸识别模型未收敛,则调整所述人脸识别模型的模型参数,以更新所述人脸识别模型,并继续训练更新后的所述人脸识别模型,若所述人脸识别模型收敛,则得到收敛后的人脸识别模型。
确定目标损失值是否小于或等于预设阈值,若目标损失值小于或等于预设阈值,则确定人脸识别模型已收敛;若目标损失值大于预设阈值,则确定人脸识别模型未收敛,调整该人脸识别模型的模型参数,以更新人脸识别模型,并继续训练更新后的人脸识别模型,若更新后的人脸识别模型的目标损失值小于或等于预设阈值,则确定人脸识别模型已收敛。当人脸识别模型未 收敛时更新模型参数,并继续训练能够得到收敛人脸识别模型。
上述实施例提供的人脸识别模型训练方法,通过获取多个第一样本人脸图像以及每个第一样本人脸图像对应的身份标识码;然后对多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同;根据多个第一人脸图像和多个第二人脸图像,对预设的人脸识别模型进行训练,直至人脸识别模型收敛。本方案通过预设增广和图像增广模型对多个第一样本人脸图像进行增广处理,能够得到大量的第一人脸图像和第二人脸图像,极大地增加了训练样本的数量,通过对第一人脸图像和第二人脸图像对预设的人脸识别模型进行联合训练,使训练出的人脸识别模型更加准确。
请参照图5,图5为本申请实施例提供的人脸识别方法的步骤流程示意图。
如图5所示,该人脸识别方法包括步骤S301至步骤S303。
步骤S301、获取待识别的人脸图像。
获取待识别的人脸图像,该人脸图像可以是人脸照片,也是可以是视频中的一帧人脸图像,本发明实施例对此不做具体限定。
步骤S302、将所述待识别的人脸图像输入至人脸识别模型,得到所述待识别的人脸图像对应的人物的身份特征。
其中,所述人脸识别模型是通过前述的人脸识别模型训练方法进行训练得到的。
将该人脸图像输入至预设人脸识别模型中,得到人脸图像对应的人物的身份特征。通过将该人脸图像输入至预设人脸识别模型中,可以准确的得到人脸图像对应的人物的身份特征。
步骤S303、根据所述身份特征和预设的身份信息库,确定所述待识别的人脸图像对应的人物的身份信息。
其中,该预设的身份信息库为预先根据每个人物的身份信息建立的身份信息库,该身份信息库中每个身份信息均映射每个人物的预设身份特征。该预设的身份信息库可以根据实际情况进行建立,本发明实施例对此不做具体限定。
在一实施例中,计算该身份特征与身份信息库中每个预设身份特征的相似度,得到身份特征每个预设身份特征的相似度,从该相似度队列中选取最大的相似度对应的预设身份特征作为目标身份特征,将目标身份特征对应的身份信息作为待识别的人脸图像对应的人物的身份信息。通过计算身份特征与身份信息库中每个预设身份特征相似度,能准确地确定待识别的人脸图像对应的人物的身份信息。
在一实施例中,计算该身份特征与身份信息库中每个预设身份特征的相似度,得到身份特征每个预设身份特征的相似度的方式可以为:获取预设的余弦相似度公式,该余弦相似度公式为
L₃ = (A·B)/(‖A‖×‖B‖)
其中,L₃为身份特征相似度,A为身份特征,B为预设身份特征,将身份特征和预设身份特征代入该余弦相似度公式,得到身份特征与预设身份特征的相似度。
上述实施例提供的人脸识别方法,通过获取待识别的人脸图像;然后将人脸图像输入至人脸识别模型中,得到人脸图像对应的人物的身份特征,之后根据身份特征和预设的身份信息库,确定待识别的人脸图像对应的人物的身份信息。通过将该人脸图像输入至人脸识别模型中,可以准确地识别分辨率较低的图像,极大地提高了人脸识别的准确性。
请参阅图6,图6为本申请实施例提供的一种人脸识别模型训练装置的示意性框图。
如图6所示,人脸识别模型训练装置400包括第一获取模块410、生成模块420和第一训练模块430,其中:
所述第一获取模块410,用于获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码;
所述生成模块420,用于对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,所述图像增广模型用于对所述第一人脸图像进行图像模糊增广;
所述第一训练模块430,用于根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
在一实施例中,如图7所示,所述第一训练模块430还包括第一处理模块431、第二处理模块432、第一确定模块433和更新模块434,其中:
第一处理模块431、用于将所述第一人脸图像输入至所述人脸识别模型进行处理,得到第一特征向量;
第二处理模块432、用于将所述第二人脸图像输入至所述人脸识别模型进行处理,得到第二特征向量;
第一确定模块433、用于根据所述第一特征向量和所述第二特征向量,确定所述人脸识别模型的目标损失值,并根据所述目标损失值,确定所述人脸识别模型是否收敛;
更新模块434、用于若所述人脸识别模型未收敛,则调整所述人脸识别模型的模型参数,以更新所述人脸识别模型,并继续训练更新后的所述人脸识别模型,若所述人脸识别模型收敛,则得到收敛后的人脸识别模型。
在一实施例中,所述第一确定模块433,还用于:
根据所述第一特征向量以及所述第一特征向量对应的所述身份标识码,生成第一损失值;
根据所述第二特征向量和所述第一特征向量,生成第二损失值;
对所述第一损失值和所述第二损失值进行加权求和,得到目标损失值。
在一实施例中,请参阅图8,图8为本申请实施例提供的一种图像增广模型训练装置的示意性框图。该图像增广模型训练装置500包括第二获取模块510、添加模块520和第二训练模型530,其中:
第二获取模块510,用于获取多个第二样本人脸图像;
添加模块520,用于给每个所述第二样本人脸图像添加噪声,得到多个第三样本人脸图像;
第二训练模型530,用于根据多个所述第三样本人脸图像,对预设的图像增广模型进行训练,直至所述图像增广模型收敛。
在一实施例中,所述添加模块520,还用于:
获取预设的光子噪声、读出噪声和量化噪声;
根据每个所述第二样本人脸图像的分辨率,给每个所述第二样本人脸图像添加所述光子噪声、读出噪声和量化噪声,得到多个第三样本人脸图像。
在一实施例中,所述第二训练模块530,还用于:
通过预设的图像增广模型对各所述第三样本人脸图像进行处理,得到多个第三人脸图像;
根据多个所述第二样本人脸图像和多个所述第三人脸图像,确定所述图像增广模型是否收敛;
若所述图像增广模型未收敛,调整所述图像增广模型的模型参数,以更新所述图像增广模型,并继续训练更新后的所述图像增广模型,直至所述图像增广模型收敛。
在一实施例中,所述第二训练模块530,还用于:
计算各身份标识码相匹配的两个所述第二样本人脸图像之间的人脸特征相似度,得到各所述身份标识码对应的至少一个人脸相似度,并根据各所述人脸相似度建立第一相似度直方图;
计算各身份标识码相匹配的两个所述第三人脸图像之间的人脸特征相似度,得到各所述身份标识码对应的至少一个人脸相似度,并根据各所述人脸相似度建立第二相似度直方图;
对所述第一相似度直方图进行曲线拟合,得到第一曲线,并对所述第二相似度直方图进行曲线拟合,得到第二曲线;
确定所述第一曲线与坐标轴所围成的第一区域以及所述第二曲线与坐标轴所围成的第二区域;
在所述第一区域与所述第二区域的交集区域的面积大于或等于预设面积阈值时,确定所述图像增广模型已收敛。
需要说明的是,所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述人脸识别模型训练装置的具体工作过程,可以参考前述人脸识别模型训练方法实施例中的对应过程,在此不再赘述。
请参阅图9,图9为本申请实施例提供的一种人脸识别装置的示意性框图。该人脸识别装置600包括第三获取模块610、识别模块620和第二确定模块630,其中:
第三获取模块610,用于获取待识别的人脸图像;
识别模块620,用于将所述待识别的人脸图像输入至人脸识别模型,得到所述待识别的人脸图像对应的人物的身份特征;
所述第二确定模块630,用于根据所述身份特征和预设的身份信息库,确定所述待识别的人脸图像对应的人物的身份信息。
需要说明的是,所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述人脸识别装置的具体工作过程,可以参考前述人脸识别方法实施例中的对应过程,在此不再赘述。
请参阅图10,图10为本申请实施例提供的一种终端设备的结构示意性框图。
如图10所示,终端设备700包括处理器701和存储器702,处理器701和存储器702通过总线703连接,该总线比如为I2C(Inter-integrated Circuit)总线。
具体地,处理器701用于提供计算和控制能力,支撑整个终端设备700的运行。处理器701可以是中央处理单元(Central Processing Unit,CPU),该处理器701还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。其中,通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
具体地,存储器702可以是Flash芯片、只读存储器(ROM,Read-Only Memory)磁盘、光盘、U盘或移动硬盘等。
本领域技术人员可以理解,图10中示出的结构,仅仅是与本发明方案相关的部分结构的框图,并不构成对本发明方案所应用于其上的终端设备的限定,具体的终端设备700可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。其中,在一个实施例中,所述处理器701用于运行存储在存储器中的计算机程序,以实现如下步骤:
获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码;
对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,所述图像增广模型用于对所述第一人脸图像进行图像模糊增广;
根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
在一个实施例中,所述处理器701在实现所述根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛时,用于实现:
将所述第一人脸图像输入至所述人脸识别模型进行处理,得到第一特征向量;
将所述第二人脸图像输入至所述人脸识别模型进行处理,得到第二特征向量;
根据所述第一特征向量和所述第二特征向量,确定所述人脸识别模型的目标损失值,并根据所述目标损失值,确定所述人脸识别模型是否收敛;
若所述人脸识别模型未收敛,则调整所述人脸识别模型的模型参数,以更新所述人脸识别模型,并继续训练更新后的所述人脸识别模型,若所述人脸识别模型收敛,则得到收敛后的人脸识别模型。
在一个实施例中,所述处理器701在实现所述根据所述第一特征向量和第二特征向量,确定所述人脸识别模型的目标损失值时,用于实现:
根据所述第一特征向量以及所述第一特征向量对应的所述身份标识码,生成第一损失值;
根据所述第二特征向量和所述第一特征向量,生成第二损失值;
对所述第一损失值和所述第二损失值进行加权求和,得到目标损失值。
在一个实施例中,所述处理器701在实现所述获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码之前,还用于实现:
获取多个第二样本人脸图像,并给每个所述第二样本人脸图像添加噪声,得到多个第三样本人脸图像;
根据多个所述第三样本人脸图像,对预设的图像增广模型进行训练,直至所述图像增广模型收敛。
在一个实施例中,所述处理器701在实现所述给每个所述第二样本人脸图像添加噪声,得到多个第三样本人脸图像时,用于实现:
获取预设的光子噪声、读出噪声和量化噪声;
根据每个所述第二样本人脸图像的分辨率,给每个所述第二样本人脸图像添加所述光子噪声、读出噪声和量化噪声,得到多个第三样本人脸图像。
在一个实施例中,所述处理器701在实现所述根据多个所述第三样本人脸图像,对预设的图像增广模型进行训练,直至所述图像增广模型收敛时,用于实现:
通过预设的图像增广模型对各所述第三样本人脸图像进行处理,得到多个第三人脸图像;
根据多个所述第二样本人脸图像和多个所述第三人脸图像,确定所述图像增广模型是否收敛;
若所述图像增广模型未收敛,调整所述图像增广模型的模型参数,以更新所述图像增广模型,并继续训练更新后的所述图像增广模型,直至所述图像增广模型收敛。
在一个实施例中,所述处理器701在实现所述根据多个所述第二样本人脸图像和多个所述第三人脸图像,确定所述图像增广模型是否收敛时,用于实现:
计算各身份标识码相匹配的两个所述第二样本人脸图像之间的人脸特征相似度,得到各所述身份标识码对应的至少一个人脸相似度,并根据各所述人脸相似度建立第一相似度直方图;
计算各身份标识码相匹配的两个所述第三人脸图像之间的人脸特征相似度,得到各所述身份标识码对应的至少一个人脸相似度,并根据各所述人脸相似度建立第二相似度直方图;
对所述第一相似度直方图进行曲线拟合,得到第一曲线,并对所述第二相似度直方图进行曲线拟合,得到第二曲线;
确定所述第一曲线与坐标轴所围成的第一区域以及所述第二曲线与坐标轴所围成的第二区域;
在所述第一区域与所述第二区域的交集区域的面积大于或等于预设面积阈值时,确定所述图像增广模型已收敛。
在一个实施例中,所述处理器701用于实现:
获取待识别的人脸图像;
将所述待识别的人脸图像输入至人脸识别模型,得到所述待识别的人脸图像对应的人物的身份特征;
根据所述身份特征和预设的身份信息库,确定所述待识别的人脸图像对应的人物的身份信息。
需要说明的是,所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述终端设备的具体工作过程,可以参考前述人脸识别模型训练方法和/或人脸识别方法实施例中的对应过程,在此不再赘述。
本申请实施例还提供一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,所述计算机程序中包括程序指令,所述程序指令被执行时所实现的方法可参照本申请人脸识别方法的各个实施例。
其中,所述计算机可读存储介质可以是前述实施例所述的计算机设备的内部存储单元,例如所述计算机设备的硬盘或内存。所述计算机可读存储介质可以是非易失性的,也可以是易失性的。所述计算机可读存储介质也可以是所述计算机设备的外部存储设备,例如所述计算机设备上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。
应当理解,在此本申请说明书中所使用的术语仅仅是出于描述特定实施例的目的而并不意在限制本申请。如在本申请说明书和所附权利要求书中所使用的那样,除非上下文清楚地指明其它情况,否则单数形式的“一”、“一个”及“该”意在包括复数形式。
还应当理解,在本申请说明书和所附权利要求书中使用的术语“和/或”是指相关联列出的项中的一个或多个的任何组合以及所有可能组合,并且包括这些组合。需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者***不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者***所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者***中还存在另外的相同要素。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (11)

  1. 一种人脸识别模型训练方法,其特征在于,包括:
    获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码;
    对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,所述图像增广模型用于对所述第一人脸图像进行图像模糊增广;
    根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
  2. 如权利要求1所述的人脸识别模型训练方法,其特征在于,所述根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛,包括:
    将所述第一人脸图像输入至所述人脸识别模型进行处理,得到第一特征向量;
    将所述第二人脸图像输入至所述人脸识别模型进行处理,得到第二特征向量;
    根据所述第一特征向量和所述第二特征向量,确定所述人脸识别模型的目标损失值,并根据所述目标损失值,确定所述人脸识别模型是否收敛;
    若所述人脸识别模型未收敛,则调整所述人脸识别模型的模型参数,以更新所述人脸识别模型,并继续训练更新后的所述人脸识别模型,若所述人脸识别模型收敛,则得到收敛后的人脸识别模型。
  3. 如权利要求2所述的人脸识别模型训练方法,其特征在于,所述根据所述第一特征向量和第二特征向量,确定所述人脸识别模型的目标损失值,包括:
    根据所述第一特征向量以及所述第一特征向量对应的所述身份标识码,生成第一损失值;
    根据所述第二特征向量和所述第一特征向量,生成第二损失值;
    对所述第一损失值和所述第二损失值进行加权求和,得到目标损失值。
  4. 如权利要求1所述的人脸识别模型训练方法,其特征在于,所述获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码之前,还包括:
    获取多个第二样本人脸图像,并给每个所述第二样本人脸图像添加噪声,得到多个第三样本人脸图像;
    根据多个所述第三样本人脸图像,对预设的图像增广模型进行训练,直至所述图像增广模型收敛。
  5. 如权利要求4所述的人脸识别模型训练方法,其特征在于,所述给每个所述第二样本人脸图像添加噪声,得到多个第三样本人脸图像,包括:
    获取预设的光子噪声、读出噪声和量化噪声;
    根据每个所述第二样本人脸图像的分辨率,给每个所述第二样本人脸图像添加所述光子噪声、读出噪声和量化噪声,得到多个第三样本人脸图像。
  6. 如权利要求4所述的人脸识别模型训练方法,其特征在于,所述根据多个所述第三样本人脸图像,对预设的图像增广模型进行训练,直至所述图像增广模型收敛,包括:
    通过预设的图像增广模型对各所述第三样本人脸图像进行处理,得到多个第三人脸图像;
    根据多个所述第二样本人脸图像和多个所述第三人脸图像,确定所述图像增广模型是否收敛;
    若所述图像增广模型未收敛,调整所述图像增广模型的模型参数,以更新所述图像增广模型,并继续训练更新后的所述图像增广模型,直至所述图像增广模型收敛。
  7. 如权利要求6所述的人脸识别模型训练方法,其特征在于,所述根据多个所述第二样本人脸图像和多个所述第三人脸图像,确定所述图像增广模型是否收敛,包括:
    计算各身份标识码相匹配的两个所述第二样本人脸图像之间的人脸特征相似度,得到各所述身份标识码对应的至少一个人脸相似度,并根据各所述人脸相似度建立第一相似度直方图;
    计算各身份标识码相匹配的两个所述第三人脸图像之间的人脸特征相似度,得到各所述身份标识码对应的至少一个人脸相似度,并根据各所述人脸相似度建立第二相似度直方图;
    对所述第一相似度直方图进行曲线拟合,得到第一曲线,并对所述第二相似度直方图进行曲线拟合,得到第二曲线;
    确定所述第一曲线与坐标轴所围成的第一区域以及所述第二曲线与坐标轴所围成的第二区域;
    在所述第一区域与所述第二区域的交集区域的面积大于或等于预设面积阈值时,确定所述图像增广模型已收敛。
  8. 一种人脸识别方法,其特征在于,包括:
    获取待识别的人脸图像;
    将所述待识别的人脸图像输入至人脸识别模型,得到所述待识别的人脸图像对应的人物的身份特征,其中,所述人脸识别模型是通过权利要求1-7中任一项所述的人脸识别模型训练方法进行训练得到的;
    根据所述身份特征和预设的身份信息库,确定所述待识别的人脸图像对应的人物的身份信息。
  9. 一种人脸识别模型训练装置,其特征在于,所述人脸识别模型训练装置包括第一获取模块、生成模块和第一训练模块,其中:
    所述第一获取模块,用于获取多个第一样本人脸图像以及每个所述第一样本人脸图像对应的身份标识码;
    所述生成模块,用于对所述多个第一样本人脸图像进行预设增广处理,得到多个第一人脸图像,并根据预设的图像增广模型,对所述多个第一样本人脸图像进行增广处理,得到多个第二人脸图像,所述第一样本人脸图像与对应所述第二人脸图像的身份标识码相同,所述图像增广模型用于对所述第一人脸图像进行图像模糊增广;
    所述第一训练模块,用于根据所述多个第一人脸图像和所述多个第二人脸图像,对预设的人脸识别模型进行训练,直至所述人脸识别模型收敛。
  10. A terminal device, comprising a processor, a memory, and a computer program stored in the memory and executable by the processor, wherein when the computer program is executed by the processor, the steps of the face recognition model training method according to any one of claims 1 to 7 and/or the face recognition method according to claim 8 are implemented.
  11. A storage medium, having a computer program stored thereon, wherein when the computer program is executed by a processor, the steps of the face recognition model training method according to any one of claims 1 to 7 and/or the face recognition method according to claim 8 are implemented.
PCT/CN2022/142236 2022-07-29 2022-12-27 Face recognition model training method, recognition method, apparatus, device, and medium WO2024021504A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210914189.X 2022-07-29
CN202210914189.XA CN115410249A (zh) 2022-07-29 2022-07-29 Face recognition model training method, recognition method, apparatus, device, and medium

Publications (1)

Publication Number Publication Date
WO2024021504A1 true WO2024021504A1 (zh) 2024-02-01

Family

ID=84159815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/142236 WO2024021504A1 (zh) 2022-07-29 2022-12-27 Face recognition model training method, recognition method, apparatus, device, and medium

Country Status (2)

Country Link
CN (1) CN115410249A (zh)
WO (1) WO2024021504A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115410249A (zh) * 2022-07-29 2022-11-29 成都云天励飞技术有限公司 Face recognition model training method, recognition method, apparatus, device, and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104050472A (zh) * 2014-06-12 2014-09-17 浙江工业大学 Adaptive global threshold method for grayscale image binarization
CN110363047A (zh) * 2018-03-26 2019-10-22 普天信息技术有限公司 Face recognition method and apparatus, electronic device, and storage medium
CN111767906A (zh) * 2020-09-01 2020-10-13 腾讯科技(深圳)有限公司 Face detection model training method, face detection method, apparatus, and electronic device
CN112613385A (zh) * 2020-12-18 2021-04-06 成都三零凯天通信实业有限公司 Face recognition method based on surveillance video
CN112669244A (zh) * 2020-12-29 2021-04-16 中国平安人寿保险股份有限公司 Face image enhancement method and apparatus, computer device, and readable storage medium
CN114359397A (zh) * 2021-09-29 2022-04-15 大连中科创达软件有限公司 Image optimization method, apparatus, device, and storage medium
CN114783017A (zh) * 2022-03-17 2022-07-22 北京明略昭辉科技有限公司 Generative adversarial network optimization method and apparatus based on inverse mapping
CN115410249A (zh) * 2022-07-29 2022-11-29 成都云天励飞技术有限公司 Face recognition model training method, recognition method, apparatus, device, and medium


Also Published As

Publication number Publication date
CN115410249A (zh) 2022-11-29

Similar Documents

Publication Publication Date Title
CN108898086B (zh) Video image processing method and apparatus, computer-readable medium, and electronic device
WO2021036059A1 (zh) Image conversion model training method, heterogeneous face recognition method, apparatus, and device
WO2022027912A1 (zh) Face pose detection method, apparatus, terminal device, and storage medium
CN108399383B (zh) Expression transfer method, apparatus, storage medium, and program
EP3968179A1 (en) Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device
WO2020253127A1 (zh) Facial feature extraction model training method, facial feature extraction method, apparatus, device, and storage medium
WO2017096753A1 (zh) Facial key point tracking method, terminal, and non-volatile computer-readable storage medium
CN111507333B (zh) Image rectification method and apparatus, electronic device, and storage medium
WO2022156622A1 (zh) Gaze correction method for face images, apparatus, device, computer-readable storage medium, and computer program product
WO2021164269A1 (zh) Attention-mechanism-based disparity map acquisition method and apparatus
CN111582044A (zh) Face recognition method based on convolutional neural network and attention model
WO2023206944A1 (zh) Semantic segmentation method, apparatus, computer device, and storage medium
WO2023035531A1 (zh) Text image super-resolution reconstruction method and related device
WO2024045442A1 (zh) Training method for image rectification model, image rectification method, device, and storage medium
WO2022262474A1 (zh) Zoom control method, apparatus, electronic device, and computer-readable storage medium
CN112614110B (zh) Image quality evaluation method, apparatus, and terminal device
WO2024021504A1 (zh) Face recognition model training method, recognition method, apparatus, device, and medium
WO2023221790A1 (zh) Image encoder training method, apparatus, device, and medium
WO2023124040A1 (zh) Face recognition method and apparatus
WO2022213761A1 (zh) Image processing method and apparatus, electronic device, and storage medium
WO2021238586A1 (zh) Training method, apparatus, device, and computer-readable storage medium
CN112232506A (zh) Network model training method, image target recognition method, apparatus, and electronic device
CN106803077A (zh) Photographing method and terminal
Liu et al. Learning explicit shape and motion evolution maps for skeleton-based human action recognition
WO2022252640A1 (zh) Image classification preprocessing and image classification method, apparatus, device, and storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22952933

Country of ref document: EP

Kind code of ref document: A1