CN116884049A - Face recognition method, device and storage medium


Info

Publication number
CN116884049A
Authority
CN
China
Prior art keywords
image, adjustment, face, module, confidence
Legal status
Pending
Application number
CN202210302446.4A
Other languages
Chinese (zh)
Inventor
许剑清
Current Assignee
Tencent Technology Shanghai Co Ltd
Original Assignee
Tencent Technology Shanghai Co Ltd
Application filed by Tencent Technology Shanghai Co Ltd
Priority to CN202210302446.4A
Publication of CN116884049A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods


Abstract

The application discloses a face recognition method, device and storage medium, which can be applied to the field of maps. A face image is acquired; the face image is then adjusted multiple times to obtain an image set; a feature map corresponding to the image set is extracted and feature mapping is performed to obtain image features; confidence information corresponding to the face image is determined; the image features and the confidence information corresponding to the face image are fused to obtain fusion features; and the recognition result is then determined based on the fusion features. This realizes a face recognition process based on a single image: confidences of different adjusted forms of the single input image are estimated, the features of the adjusted images are fused according to those confidences, and similarity comparison is performed on the fused features, thereby simulating the distribution of face features under different scenes and improving face recognition accuracy.

Description

Face recognition method, device and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for face recognition, and a storage medium.
Background
With the rapid development of computer technology, face recognition technology is becoming more common. Face recognition systems are applied in real scenarios such as security, payment and access control, and the face recognition methods they use require the input face images to be high-definition and reliable. For face images with blur or various kinds of noise (such as occlusion or large pose), such methods map the features to regions of the feature space with poor discrimination, resulting in inaccurate recognition. In practical application scenarios, however, some face images are unsuitable for recognition by the face recognition system due to low image quality (occlusion, blur, large pose, etc.), which reduces the accuracy of the face recognition system.
In general, multiple face images of the same person can be collected and input into a face recognition system; the features output by the recognition system are modeled and estimated in Euclidean space to obtain per-dimension statistics, and the features are then fused and weighted to obtain a recognition result.
However, large variations easily arise when collecting multiple photos, so the weighting used during feature fusion may be unsuitable, which affects the accuracy of face recognition.
Disclosure of Invention
In view of the above, the present application provides a face recognition method, which can effectively improve the accuracy of face recognition.
The first aspect of the present application provides a face recognition method, which may be applied to a system or a program including a face recognition function in a terminal device, and specifically includes:
acquiring a face image of a target object;
performing multiple image adjustments on the face image, and respectively collecting the adjusted face images to obtain an image set;
extracting a feature map corresponding to the image set based on a depth network unit module, and inputting the feature map into a full-connection mapping module for feature mapping to obtain target image features;
inputting the feature map and the target image features into a confidence estimation module for estimation to determine confidence information corresponding to the face image, wherein the confidence estimation module is trained based on confidence information corresponding to a training sample and sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training sample is determined based on the uncertainty of the sample features in a hypersphere space;
carrying out feature fusion on the target image features and the confidence information corresponding to the face image to obtain fusion features;
and comparing the similarity between the fusion features and the registered images in the face database to determine a face recognition result corresponding to the target object.
Optionally, in some possible implementations of the present application, the performing multiple image adjustments on the face image and respectively collecting the adjusted face images to obtain an image set includes:
executing a first adjustment operation on the face image to obtain a first adjustment image;
determining a second adjustment operation based on the adjustment dimension corresponding to the first adjustment operation, wherein the adjustment dimension corresponding to the second adjustment operation is different from the adjustment dimension corresponding to the first adjustment operation;
executing a second adjustment operation on the face image to obtain a second adjustment image;
the first adjustment image and the second adjustment image are collected to obtain the image set.
Optionally, in some possible implementations of the present application, the performing multiple image adjustments on the face image and respectively collecting the adjusted face images to obtain an image set includes:
executing a third adjustment operation on the face image to obtain a third adjustment image;
performing a fourth adjustment operation on the third adjustment image to obtain a fourth adjustment image;
and collecting the third adjustment image and the fourth adjustment image to obtain the image set.
Optionally, in some possible implementations of the present application, the comparing the similarity between the fusion features and registered images in a face database to determine a face recognition result corresponding to the target object includes:
acquiring a false recognition rate indicated in a preset requirement;
determining a comparison threshold corresponding to the similarity comparison based on the false recognition rate;
performing similarity comparison between the fusion features and the registered images in the face database to determine a target similarity;
and comparing the target similarity with the comparison threshold to determine a face recognition result corresponding to the target object.
Optionally, in some possible implementations of the present application, the extracting, based on the depth network unit module, a feature map corresponding to the image set, and inputting the feature map into a full-connection mapping module for feature mapping to obtain target image features includes:
acquiring a first face image in a first sample set;
extracting the spatial features of the first face image based on the depth network unit module to obtain a first sample feature map containing spatial structure information;
mapping the first sample feature map to a first sample vector based on the full-connection mapping module;
performing cyclic calculation on the first sample vector based on a first objective function to determine corresponding first loss information when a first condition is reached;
parameter adjustment is carried out on the depth network unit module and the full-connection mapping module according to the first loss information;
and extracting a feature map corresponding to the image set based on the parameter-adjusted depth network unit module, and inputting the feature map into the parameter-adjusted full-connection mapping module for feature mapping to obtain the target image features.
Optionally, in some possible implementations of the present application, inputting the feature map and the target image feature into a confidence estimation module for estimation to determine confidence information corresponding to the face image includes:
acquiring a second face image in a second sample set;
performing image adjustment on the second face image to determine an adjustment image set;
determining a category center corresponding to the adjustment image set;
extracting the spatial features of the second face image based on the depth network unit module to obtain a second sample feature map containing spatial structure information;
mapping the second sample feature map to a second sample vector based on the full-connection mapping module;
calculating uncertainty of the second sample vector in the hypersphere space based on a confidence estimation module to determine sample confidence;
performing cyclic calculation on the second sample vector and the sample confidence based on a second objective function to determine second loss information obtained when a second condition is reached;
performing parameter adjustment on the confidence estimation module according to the second loss information;
and inputting the feature map and the target image features into the parameter-adjusted confidence estimation module for estimation to determine the confidence information corresponding to the face image.
Optionally, in some possible implementations of the present application, the performing image adjustment on the second face image to determine an adjusted image set includes:
counting the adjustment modes adopted in the process of performing multiple image adjustments on the face image, so as to determine adjustment mode distribution information;
determining a preset mode combination based on the adjustment mode distribution information;
and carrying out image adjustment on the second face image according to the preset mode combination so as to determine the adjustment image set.
A second aspect of the present application provides a face recognition apparatus, including:
the acquisition unit is used for acquiring the face image of the target object;
the adjusting unit is used for carrying out image adjustment on the face images for a plurality of times and respectively collecting the face images after adjustment to obtain an image set;
the recognition unit is used for extracting a feature map corresponding to the image set based on the depth network unit module, and inputting the feature map into the full-connection mapping module for feature mapping to obtain target image features;
the recognition unit is further configured to input the feature map and the target image feature into a confidence estimation module for estimation to determine confidence information corresponding to the face image, the confidence estimation module is trained based on confidence information corresponding to a training sample and sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training sample is determined based on uncertainty of the sample features in a hypersphere space;
the recognition unit is further used for carrying out feature fusion on the target image features and the confidence information corresponding to the face image to obtain fusion features;
the recognition unit is further used for comparing the similarity between the fusion features and the registered images in the face database, so as to determine a face recognition result corresponding to the target object.
Optionally, in some possible implementations of the present application, the adjusting unit is specifically configured to perform a first adjustment operation on the face image to obtain a first adjustment image;
the adjusting unit is specifically configured to determine a second adjusting operation based on an adjusting dimension corresponding to the first adjusting operation, where the adjusting dimension corresponding to the second adjusting operation is different from the adjusting dimension corresponding to the first adjusting operation;
the adjusting unit is specifically configured to perform a second adjusting operation on the face image to obtain a second adjusted image;
the adjusting unit is specifically configured to collect the first adjustment image and the second adjustment image, so as to obtain the image set.
Optionally, in some possible implementations of the present application, the adjusting unit is specifically configured to perform a third adjustment operation on the face image to obtain a third adjustment image;
the adjusting unit is specifically configured to perform a fourth adjustment operation on the third adjustment image to obtain a fourth adjustment image;
the adjusting unit is specifically configured to collect the third adjustment image and the fourth adjustment image, so as to obtain the image set.
Optionally, in some possible implementations of the present application, the identifying unit is specifically configured to obtain a false recognition rate indicated in a preset requirement;
the identification unit is specifically configured to determine a comparison threshold corresponding to the similarity comparison based on the false recognition rate;
the identification unit is specifically configured to compare the similarity between the fusion features and the registered images in the face database, so as to determine a target similarity;
the recognition unit is specifically configured to compare the target similarity with the comparison threshold to determine a face recognition result corresponding to the target object.
Optionally, in some possible implementations of the present application, the identifying unit is specifically configured to obtain a first face image in the first sample set;
the recognition unit is specifically configured to extract spatial features of the first face image based on the depth network unit module, so as to obtain a first sample feature map that includes spatial structure information;
the identifying unit is specifically configured to map the first sample feature map to a first sample vector based on the full-connection mapping module;
the identification unit is specifically configured to perform cyclic calculation on the first sample vector based on a first objective function, so as to determine first loss information obtained when a first condition is reached;
the identification unit is specifically configured to perform parameter adjustment on the depth network unit module and the full-connection mapping module according to the first loss information;
the identification unit is specifically configured to extract a feature map corresponding to the image set based on the parameter-adjusted depth network unit module, and to input the feature map into the parameter-adjusted full-connection mapping module for feature mapping to obtain the target image features.
Optionally, in some possible implementations of the present application, the identifying unit is specifically configured to obtain a second face image in the second sample set;
the identification unit is specifically configured to perform image adjustment on the second face image to determine an adjusted image set;
the identification unit is specifically configured to determine a category center corresponding to the adjustment image set;
the recognition unit is specifically configured to extract spatial features of the second face image based on the depth network unit module, so as to obtain a second sample feature map that includes spatial structure information;
the identifying unit is specifically configured to map the second sample feature map to a second sample vector based on the fully-connected mapping module;
the identification unit is specifically configured to calculate, based on a confidence estimation module, uncertainty of the second sample vector in the hypersphere space, so as to determine a sample confidence;
the identification unit is specifically configured to perform cyclic calculation on the second sample vector and the sample confidence based on a second objective function, so as to determine second loss information obtained when a second condition is reached;
the identification unit is specifically configured to perform parameter adjustment on the confidence estimation module according to the second loss information;
the recognition unit is specifically configured to input the feature map and the target image features into the parameter-adjusted confidence estimation module for estimation, so as to determine the confidence information corresponding to the face image.
Optionally, in some possible implementations of the present application, the identifying unit is specifically configured to count the adjustment modes adopted in the process of performing multiple image adjustments on the face image, so as to determine adjustment mode distribution information;
the identification unit is specifically configured to determine a preset mode combination based on the adjustment mode distribution information;
the identification unit is specifically configured to perform image adjustment on the second face image according to the preset mode combination, so as to determine the adjustment image set.
A third aspect of the present application provides a computer apparatus, including: a memory, a processor and a bus system; the memory is used for storing program code; and the processor is configured to execute the face recognition method of the first aspect, or of any implementation of the first aspect, according to the instructions in the program code.
A fourth aspect of the present application provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the face recognition method of the first aspect or of any implementation of the first aspect described above.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the face recognition method provided in the first aspect or in the various optional implementations of the first aspect.
From the above technical solutions, the embodiment of the present application has the following advantages:
a face image of a target object is acquired; multiple image adjustments are then performed on the face image, and the adjusted face images are respectively collected to obtain an image set; a feature map corresponding to the image set is further extracted based on the depth network unit module, and the feature map is input into the full-connection mapping module for feature mapping to obtain target image features; the feature map and the target image features are input into a confidence estimation module for estimation to determine confidence information corresponding to the face image, where the confidence estimation module is trained based on confidence information corresponding to training samples and on sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training samples is determined based on the uncertainty of the sample features in the hypersphere space; feature fusion is performed on the target image features and the confidence information corresponding to the face image to obtain fusion features; and similarity comparison is then performed between the fusion features and the registered images in the face database to determine the face recognition result corresponding to the target object. This realizes a face recognition process based on a single image: the confidences of different adjusted forms of the single input image are estimated, and similarity comparison is performed after fusing the features of the adjusted images according to those confidences, thereby simulating the distribution of face features under different scenes and improving face recognition accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for the embodiments or for the description of the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and a person skilled in the art can obtain other drawings from the provided drawings without inventive effort.
Fig. 1 is a network architecture diagram of the operation of a face recognition system;
Fig. 2 is a flow framework diagram of face recognition according to an embodiment of the present application;
Fig. 3 is a flowchart of a face recognition method according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of a face recognition method according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of another face recognition method according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of another face recognition method according to an embodiment of the present application;
Fig. 7 is a schematic flowchart of another face recognition method according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a face recognition device according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The embodiments of the present application provide a face recognition method and a related device, which can be applied to a system or program containing a face recognition function in a terminal device, and which can: acquire a face image of a target object; then perform multiple image adjustments on the face image and respectively collect the adjusted face images to obtain an image set; further extract a feature map corresponding to the image set based on the depth network unit module, and input the feature map into the full-connection mapping module for feature mapping to obtain target image features; input the feature map and the target image features into a confidence estimation module for estimation to determine confidence information corresponding to the face image, where the confidence estimation module is trained based on confidence information corresponding to training samples and on sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training samples is determined based on the uncertainty of the sample features in the hypersphere space; perform feature fusion on the target image features and the confidence information corresponding to the face image to obtain fusion features; and then perform similarity comparison between the fusion features and the registered images in the face database to determine the face recognition result corresponding to the target object. This realizes a face recognition process based on a single image: the confidences of different adjusted forms of the single input image are estimated, and similarity comparison is performed after fusing the features of the adjusted images according to those confidences, thereby simulating the distribution of face features under different scenes and improving face recognition accuracy.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "includes" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that the face recognition method provided by the present application can be applied to a system or program containing a face recognition function in a terminal device, for example a face recognition application. Specifically, the face recognition system can operate in the network architecture shown in fig. 1, which is a network architecture diagram of the operation of the face recognition system. As can be seen from the figure, the face recognition system can provide face recognition for multiple information sources: a terminal sends the corresponding face image to the server through interactive operation on the terminal side, and the server recognizes the face image. Various terminal devices are shown in fig. 1; the terminal devices may be computer devices. In an actual scenario there may be more or fewer terminal devices participating in the face recognition process, and the specific number and types are not limited here. In addition, one server is shown in fig. 1, but in an actual scenario multiple servers may participate; the specific number of servers depends on the actual scenario.
In this embodiment, the server may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a smart voice interaction device, a smart home appliance, a vehicle-mounted terminal, and the like. The terminals and servers may be directly or indirectly connected by wired or wireless communication, and the terminals and servers may be connected to form a blockchain network, which is not limited herein.
It will be appreciated that the face recognition system described above may run on a personal mobile terminal, for example as a face recognition application, and may also run on a server or a third-party device to provide face recognition and obtain face recognition results for an information source. The specific face recognition system may take the form of a program, may run as a system component in the device, or may serve as a cloud service program; the specific operation mode depends on the actual scenario and is not limited here.
With the rapid development of computer technology, face recognition technology is becoming more common. Face recognition systems are applied in real scenarios such as security, payment and access control, and the face recognition methods they use require the input face images to be high-definition and reliable. For face images with blur or various kinds of noise (such as occlusion or large pose), such methods map the features to regions of the feature space with poor discrimination, resulting in inaccurate recognition. In practical application scenarios, however, some face images are unsuitable for recognition by the face recognition system due to low image quality (occlusion, blur, large pose, etc.), which reduces the accuracy of the face recognition system.
Face recognition can be performed using computer vision (CV) technology. Computer vision is a science that studies how to make machines "see": it uses cameras and computers instead of human eyes to perform machine vision tasks such as recognition, detection and measurement on targets, and further performs graphics processing so that the computer produces images more suitable for human eyes to observe or for transmission to instruments for detection. As a scientific discipline, computer vision research on related theory and technology attempts to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technology typically includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
Generally, multiple face images of the same person can be acquired through computer vision technology and input into a face recognition system; the features output by the face recognition system are modeled and estimated in Euclidean space to obtain per-dimension statistics, and the features are fused and weighted to obtain a recognition result.
However, large variations easily arise when collecting multiple photos, so the weighting used during feature fusion may be unsuitable, which affects the accuracy of face recognition.
In order to solve the above problems, the present application provides a face recognition method, applied to the face recognition flow framework shown in fig. 2, which is a face recognition flow framework diagram provided by an embodiment of the present application. A user sends the corresponding face image to the server through interactive operation on a terminal; the server performs image enhancement on the face image, performs confidence calculation, and performs similarity recognition based on image features fused with the confidences. Specifically, distribution information of the face image features of the same person in different scenes (pose, illumination, occlusion, etc.) is obtained in the hypersphere space; multiple enhanced forms of the single input image are evaluated against the obtained distribution information; feature fusion is carried out according to the obtained confidences; and the fused features are finally input into the comparison module for recognition. By estimating the confidences of different enhancement modes of a single input image and fusing the features of the enhanced images according to those confidences, the recognition accuracy of the face recognition system can be further improved.
It can be understood that the method provided by the present application may be implemented as a program, serving as processing logic in a hardware system, or as a face recognition device that realizes this processing logic in an integrated or external manner. As one implementation, the face recognition device acquires a face image of a target object; then performs multiple image adjustments on the face image and respectively collects the adjusted face images to obtain an image set; further extracts a feature map corresponding to the image set based on the depth network unit module, and inputs the feature map into the full-connection mapping module for feature mapping to obtain target image features; inputs the feature map and the target image features into a confidence estimation module for estimation to determine confidence information corresponding to the face image, where the confidence estimation module is trained based on confidence information corresponding to training samples and on sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training samples is determined based on the uncertainty of the sample features in the hypersphere space; performs feature fusion on the target image features and the confidence information corresponding to the face image to obtain fusion features; and then performs similarity comparison between the fusion features and the registered images in the face database to determine the face recognition result corresponding to the target object. This realizes a face recognition process based on a single image: the confidences of different adjusted forms of the single input image are estimated, and similarity comparison is performed after fusing the features of the adjusted images according to those confidences, thereby simulating the distribution of face features under different scenes and improving face recognition accuracy.
The scheme provided by the embodiment of the application relates to an artificial intelligence computer vision technology, and is specifically described by the following embodiments:
Referring to fig. 3, which is a flowchart of a face recognition method provided by an embodiment of the present application, the method may be executed by a terminal or a server, and the embodiment of the present application includes at least the following steps:
301. Acquire a face image of the target object.
In this embodiment, the face image of the target object may be an instantly captured face image, for example a real-time face image captured by an access control system; it may also be acquired by a terminal and then uploaded to the server, for example a face image uploaded to the server for recognition during face-scan payment; or it may be a data set collected over a period of time, for example the list of people entering and leaving compiled by a security system. The specific form of face image acquisition depends on the actual scenario and is not limited here.
302. Perform multiple image adjustments on the face image, and respectively collect the adjusted face images to obtain an image set.
In this embodiment, image adjustment, i.e. image enhancement, is performed on the face image to simulate the face images with blur or various kinds of noise (such as occlusion or large pose) that occur in real scenes, thereby improving recognition accuracy. Specific image enhancement modes may include data enhancement modes such as left-right flipping, randomly adding black-block noise, and adjusting brightness and hue.
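The three enhancement modes named above are standard image operations; the following is a minimal sketch in Python, assuming Pillow and NumPy, with illustrative function names that are not from the patent:

```python
import random

import numpy as np
from PIL import Image, ImageEnhance

def flip_lr(img: Image.Image) -> Image.Image:
    """Left-right flip."""
    return img.transpose(Image.FLIP_LEFT_RIGHT)

def add_black_noise(img: Image.Image, max_frac: float = 0.2) -> Image.Image:
    """Paste a random black rectangle to simulate occlusion."""
    arr = np.array(img)
    h, w = arr.shape[:2]
    bh = random.randint(1, max(1, int(h * max_frac)))
    bw = random.randint(1, max(1, int(w * max_frac)))
    y, x = random.randint(0, h - bh), random.randint(0, w - bw)
    arr[y:y + bh, x:x + bw] = 0
    return Image.fromarray(arr)

def adjust_brightness(img: Image.Image, factor: float = 1.2) -> Image.Image:
    """Brightness adjustment; hue can be varied analogously in HSV space."""
    return ImageEnhance.Brightness(img).enhance(factor)
```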
Optionally, because different kinds of noise in real scenes are correlated to some degree (for example, an access control system captures faces at a fixed angle, so variation is mainly reflected in different lighting and the like), this correlation can be built into the image enhancement process. A parallel image enhancement process may be adopted: first, a first adjustment operation is performed on the face image to obtain a first adjustment image; then a second adjustment operation is determined based on the adjustment dimension corresponding to the first adjustment operation, where the adjustment dimension of the second adjustment operation differs from that of the first, so as to simulate the influence of different variable factors on the face image (for example, if the first adjustment operation is a lighting adjustment, the second may add black-block noise to simulate occlusion); the second adjustment operation is then performed on the face image to obtain a second adjustment image; and the first adjustment image and the second adjustment image are collected to obtain the image set.
It should be understood that two parallel adjustments are used here only as an example; more adjustments following the same parallel logic may be adopted, and the specific number is not limited.
In addition, in order to simulate gradually changing recognition scenes in practice, such as face recognition under regularly changing light, a serial enhancement process may be performed: a third adjustment operation is executed on the face image to obtain a third adjustment image; a fourth adjustment operation is then performed on the third adjustment image to obtain a fourth adjustment image; and the third adjustment image and the fourth adjustment image are collected to obtain the image set. This gradual adjustment process improves face recognition accuracy in scenes where a variable changes gradually.
It should be understood that two serial adjustments are described here only as an example; more may be adopted following the same serial logic, and the number of steps is not limited. The third adjustment operation and the fourth adjustment operation may be the same adjustment operation or different ones, depending on the actual scenario. A sketch of both composition modes follows.
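A minimal sketch of the two composition modes, parallel and serial, reusing the enhancement helpers above; all names are illustrative assumptions rather than the patent's implementation:

```python
def parallel_adjust(face, ops):
    """Parallel mode: each operation is applied independently to the original image."""
    return [op(face) for op in ops]

def serial_adjust(face, ops):
    """Serial mode: each operation is applied to the previous operation's output."""
    images, current = [], face
    for op in ops:
        current = op(current)
        images.append(current)
    return images

# Example: image_set = [face] + parallel_adjust(face, [flip_lr, add_black_noise])
```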
303. Extract a feature map corresponding to the image set based on the depth network unit module, and input the feature map into the full-connection mapping module for feature mapping to obtain the target image features.
In this embodiment, the depth network unit module and the full-connection mapping module are pre-trained feature extraction modules, described below with reference to a specific recognition scenario. As shown in fig. 4, which is a schematic flowchart of a face recognition method provided by an embodiment of the present application, there are two phases: a module training phase and a deployment-running phase. In the module training phase, the depth network unit module and the full-connection mapping module for face recognition are trained first; their parameters are then fixed and their outputs are used as inputs to the confidence estimation module, which is trained next. At that point only the parameters of the confidence estimation module are trained; the parameters of the depth network unit module and the full-connection mapping module are not updated. After the depth network unit module, the full-connection mapping module and the image confidence estimation module have been trained, the modules are integrated in the deployment-running phase, where the distance measurement module and the threshold comparison module cooperate with the feature fusion module to form a complete face recognition pipeline.
The training process of the depth network unit module and the full-connection mapping module is described below. Specifically, a first face image in a first sample set may first be acquired; the spatial features of the first face image are then extracted based on the depth network unit module to obtain a first sample feature map containing spatial structure information; the first sample feature map is mapped to a first sample vector based on the full-connection mapping module; cyclic calculation is then performed on the first sample vector based on a first objective function (such as softmax, or softmax variants with a margin) to determine the first loss information obtained when a first condition is reached; and parameter adjustment is then performed on the depth network unit module and the full-connection mapping module according to the first loss information, so that the feature map corresponding to the image set is extracted based on the parameter-adjusted depth network unit module and input into the parameter-adjusted full-connection mapping module for feature mapping to obtain the target image features. The first condition may be that the number of iterations reaches a threshold or that training converges.
The training process of the depth network unit module and the full-connection mapping module is described below with reference to the module scenario shown in fig. 4. As shown in fig. 5, which is a schematic flowchart of another face recognition method according to an embodiment of the present application, when training the depth network unit module and the full-connection mapping module, the flow and the functions of each module are as follows:
(a) Training data preparation module: this module reads face training data during training, combines the read data into a batch, and sends the batch to the depth network unit for processing.
(b) Depth network unit module: the function of this module is to extract the spatial features of the face image; the output feature map retains the spatial structure information of the face image. This module generally has a convolutional neural network (CNN) structure and includes operations such as convolution, nonlinear activation (ReLU) and pooling.
(c) Full-connection mapping unit module: the function of this module is to map the feature map extracted by the depth network unit module, which contains spatial structure information, into a 1 × n_d-dimensional vector μ.
(d) Face recognition objective function calculation module: this module takes the feature f output by the full-connection mapping unit and the label information of the face image from which the vector was generated as inputs to calculate an objective function value. The objective function may be a classification function (such as softmax, or softmax variants with a margin), or another type of objective function.
(e) Face recognition objective function optimization module: this module performs training optimization on the whole network based on gradient descent (such as stochastic gradient descent, SGD with momentum, Adam or Adagrad). Steps (a)-(d) are repeated during training until the training result meets the termination condition. The condition for ending model training is generally that the number of iterations reaches a set value, or that the loss calculated by the face recognition objective function is smaller than a set value; model training can then be completed. A training-loop sketch follows.
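A minimal PyTorch sketch of the training stage in (a)-(e): a small CNN backbone standing in for the depth network unit module, a fully connected layer standing in for the full-connection mapping module, and a plain softmax classification objective. The architecture and hyperparameters are illustrative assumptions, not the patent's configuration:

```python
import torch
import torch.nn as nn

class FaceEmbedder(nn.Module):
    def __init__(self, n_d: int = 512, num_ids: int = 1000):
        super().__init__()
        self.backbone = nn.Sequential(             # depth network unit module
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.fc = nn.Linear(64 * 4 * 4, n_d)       # full-connection mapping module
        self.classifier = nn.Linear(n_d, num_ids)  # used only by the objective

    def forward(self, x):
        fmap = self.backbone(x)
        mu = self.fc(fmap.flatten(1))              # 1 x n_d feature vector
        return fmap, mu

model = FaceEmbedder()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()                  # softmax objective

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    _, mu = model(images)
    loss = criterion(model.classifier(mu), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```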
It will be appreciated that the division of modules by function above is merely an example; modules with similar functions may also be integrated.
304. Input the feature map and the target image features into the confidence estimation module for estimation to determine the confidence information corresponding to the face image.
In this embodiment, the confidence estimation module is trained based on the confidence information corresponding to the training samples and on the sample features, and the sample features are determined based on the depth network unit module and the full-connection mapping module; that is, after the depth network unit module and the full-connection mapping module are trained in step 303, the confidence estimation module is trained on top of them. In addition, the confidence information corresponding to the training samples is determined based on the uncertainty of the sample features in the hypersphere space; because the hypersphere space better matches the feature space of faces, the features obtained by modeling faces in the hypersphere space are more accurate.
In this embodiment, estimation is performed by the confidence estimation module: confidence analysis is carried out on each enhanced form of the same face image, the features are fused using the image confidences, and the fused features are finally used for face recognition comparison, thereby improving the accuracy of face recognition.
The training process of the confidence estimation module is described below. Specifically, a second face image in a second sample set may first be acquired, where the second sample set may be the same set as the first sample set or a different one, which is not limited here. Image adjustment is then performed on the second face image to determine an adjustment image set, the specific adjustment modes being those exemplified in step 302 and not repeated here. A class center corresponding to the adjustment image set is then determined, for example by averaging the features. The spatial features of the second face image are extracted based on the depth network unit module to obtain a second sample feature map containing spatial structure information, and the second sample feature map is mapped to a second sample vector based on the full-connection mapping module. The uncertainty of the second sample vector in the hypersphere space is then calculated based on the confidence estimation module to determine the sample confidence. Cyclic calculation is performed on the second sample vector and the sample confidence based on a second objective function to determine the second loss information obtained when a second condition is reached, and parameter adjustment is performed on the confidence estimation module according to the second loss information. The feature map and the target image features are then input into the parameter-adjusted confidence estimation module for estimation to determine the confidence information corresponding to the face image.
The training process is described below with reference to the module scenario shown in fig. 4. As shown in fig. 6, which is a schematic flowchart of another face recognition method according to an embodiment of the present application, the training of the confidence estimation module (also referred to as the uncertainty estimation module) proceeds with the following modules, whose flow and functions are as follows:
(a) Training data preparation module: its function is consistent with that of the training data preparation module used in training the depth network unit module.
It will be appreciated that the depth network unit module trained in step 303 is used when training the uncertainty estimation module, and the parameters of that module are not updated during this training.
(b) Image data enhancement module: the function of this module is to perform data enhancement on different forms of the input image. In this embodiment, the module enables the confidence estimation network to learn confidence estimates for various enhanced forms of the same image, increasing estimation robustness. The main enhancement functions in the module are: left-right flipping, randomly adding black-block noise, and adjusting brightness and hue.
(c) Training sample class center acquisition module: the function of this module is to calculate the class center w_c of the face images of each ID in the face training set, where w_c denotes the class center of the class-c samples. The module may compute the center as the average of all image features in the class, or may use the per-class classification weights obtained when training the depth network unit module as the class centers. A minimal sketch of the first option follows.
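A minimal sketch of class centers as per-ID feature means, under the same illustrative assumptions as the earlier sketches:

```python
import torch

def class_centers(features: torch.Tensor, labels: torch.Tensor) -> dict:
    """features: (N, n_d) mapped feature vectors; labels: (N,) integer IDs."""
    return {
        c: features[labels == c].mean(dim=0)
        for c in labels.unique().tolist()
    }
```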
(d) Uncertainty estimation module: this module estimates the uncertainty k of each input face image feature in the hypersphere space. The module has a CNN-style structure; for example, several fully connected layers, or layers in ResNet form, may be adopted, with the specific form determined by the actual scenario.
In addition, a spatial distribution with a closed-form solution can be used to estimate the features in the hypersphere space, which reduces the intermediate training process.
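A minimal sketch of an uncertainty head that consumes the backbone feature map and the mapped feature μ and emits a positive uncertainty value k; the two-layer fully connected form is an illustrative assumption consistent with the description in (d):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UncertaintyHead(nn.Module):
    def __init__(self, fmap_dim: int, n_d: int = 512, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(fmap_dim + n_d, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, fmap: torch.Tensor, mu: torch.Tensor) -> torch.Tensor:
        x = torch.cat([fmap.flatten(1), mu], dim=1)
        return F.softplus(self.net(x))  # softplus keeps k > 0
```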
(e) Uncertainty objective function calculation module: the function of this module is to take the feature μ(x) output by the depth network unit module and the class center w_c of its class and calculate the second objective function (1), where k denotes the uncertainty of the image, μ the output feature of the depth network unit module, d the feature dimension output by the full-connection mapping module, r the radius of the hypersphere onto which the features are mapped, and I a modified Bessel function, whose series expansion is given by expression (2); in (2), α is the order determined by the second objective function, x is the argument, and m is the summation index.
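The expressions for (1) and (2) appear only as figures in the source. A hedged reconstruction of (1), assuming the von Mises-Fisher negative log-likelihood on a radius-r hypersphere that matches the variables listed above, is:

$$
\mathcal{L}(\mu, k) \;=\; \log I_{d/2-1}(k) \;-\; \left(\frac{d}{2}-1\right)\log k \;-\; k\,r\,\mu^{\top} w_c \;+\; \text{const.}
$$

Expression (2) is then the standard series for the modified Bessel function of the first kind of order α:

$$
I_{\alpha}(x) \;=\; \sum_{m=0}^{\infty} \frac{1}{m!\,\Gamma(m+\alpha+1)} \left(\frac{x}{2}\right)^{2m+\alpha}.
$$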
(f) Uncertainty objective function optimization module: this module optimizes the whole uncertainty estimation module based on gradient descent (such as stochastic gradient descent, SGD with momentum, Adam or Adagrad). The optimization gradients with respect to k and μ are given by formulas (3) and (4). Steps (a)-(d) are repeated during training until the training result meets the termination condition; the condition for ending model training is generally that the number of iterations reaches a set value, or that the loss calculated by the objective function is smaller than a set value.
Estimation is thus performed by the confidence estimation module: confidence analysis is carried out on each enhanced form of the same face image, the features are fused using the image confidences, and the fused features are finally used for face recognition comparison, thereby improving the accuracy of face recognition.
Optionally, the image adjustment (enhancement) process used in training may be matched with the one used in recognition. That is, when performing image adjustment on the second face image to determine the adjustment image set, the adjustment modes adopted during the multiple image adjustments of the face image may first be counted to determine adjustment mode distribution information (for example, the number of times different adjustment modes are used, the order of adjustments, and so on); a preset mode combination is then determined based on the adjustment mode distribution information (for example, the most frequent adjustment sequence combination); and image adjustment is then performed on the second face image according to the preset mode combination to determine the adjustment image set, thereby improving the accuracy of the trained model on features produced by that enhancement combination. In addition, a face image enhancement mode estimated by a network can be introduced to obtain a more accurate confidence estimation scheme.
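A minimal sketch of the adjustment-mode statistics, assuming adjustment logs are recorded as ordered tuples of mode names; the representation is an illustrative assumption:

```python
from collections import Counter

def most_common_combination(adjustment_logs: list) -> tuple:
    """adjustment_logs: e.g. [('flip', 'black_noise'), ('flip', 'brightness'), ...]"""
    return Counter(adjustment_logs).most_common(1)[0][0]
```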
305. Perform feature fusion on the target image features and the confidence information corresponding to the face image to obtain fusion features.
In this embodiment, the fusion method is as shown in formula (5):
where z is the fused feature, k is the image confidence, and μ is the feature of the single image.
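Formula (5) likewise appears only as a figure; given the description of z, k and μ, the natural reading is a confidence-weighted average over the images in the set, offered here as a hedged reconstruction:

$$
z \;=\; \frac{\sum_i k_i\,\mu_i}{\sum_i k_i},
$$

where i indexes the adjusted images in the image set.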
306. Compare the similarity between the fusion features and the registered images in the face database to determine the face recognition result corresponding to the target object.
In this embodiment, the similarity against the features of registered images in the face database may be calculated with a distance metric such as cosine similarity or Euclidean distance. If the similarity reaches a threshold (for example, 0.95), the face recognition result is that the two images show the same person, i.e. the target object is the object corresponding to the registered image in the face database.
In addition, different thresholds can be set for different false recognition rates, so as to realize recognition scenarios with different requirements. That is, the false recognition rate indicated in a preset requirement is first acquired; a comparison threshold corresponding to the similarity comparison is then determined based on the false recognition rate (for example, the smaller the false recognition rate, the larger the comparison threshold); similarity comparison is then performed between the fusion features and the registered images in the face database to determine the target similarity; and the target similarity is compared with the comparison threshold to determine the face recognition result corresponding to the target object: if the similarity is larger than the comparison threshold, the two are the same object.
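A minimal sketch of the similarity comparison and threshold decision, assuming cosine similarity; the mapping from false recognition rate to threshold is an illustrative assumption:

```python
import numpy as np

def cosine_similarity(z: np.ndarray, reg: np.ndarray) -> float:
    return float(z @ reg / (np.linalg.norm(z) * np.linalg.norm(reg)))

# Illustrative only: stricter (smaller) false recognition rates use larger thresholds.
FAR_TO_THRESHOLD = {1e-3: 0.80, 1e-4: 0.88, 1e-5: 0.95}

def recognize(fused_feature: np.ndarray, registered_feature: np.ndarray,
              far: float = 1e-5) -> bool:
    th = FAR_TO_THRESHOLD[far]
    return cosine_similarity(fused_feature, registered_feature) >= th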
The recognition process is described below with reference to the module scenario shown in fig. 4. The module deployment stage mainly combines and deploys the modules obtained in the module training stage to form a complete solution; the flow of this stage is shown in fig. 7, which is a schematic flowchart of another face recognition method provided by an embodiment of the present application. A single face image 1 is first acquired; the image enhancement stitching module performs a left-right flip on face image 1 to obtain flipped image 2, and a center crop of size (200, 200) is resized up to the input size to obtain image 3. The set of three images is then input into the depth network unit, which outputs a feature map carrying high-level representation information; the feature map passes through the full-connection mapping module to obtain an image sample feature μ of dimension d, and the feature map together with the image feature μ passes through the confidence estimation module to obtain the uncertainty estimation factor k of the image. The image features and the corresponding confidences enter the feature fusion module at the same time, and the feature fusion module fuses the input image features to obtain the fused feature for this ID. The fusion mode is as follows:
Where z is the fused feature, k is the image confidence, and μ is the feature of the single image.
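A minimal sketch of this single-image expansion using Pillow, assuming the 200×200 center crop of the example above and a hypothetical 112×112 network input size:

    from PIL import Image

    def expand_single_image(path, input_size=(112, 112), crop_size=(200, 200)):
        """Build the three-image set of fig. 7 from one face image: the resized
        original, its left-right flip, and an upscaled center crop."""
        src = Image.open(path).convert("RGB")
        img1 = src.resize(input_size)                             # image 1
        img2 = img1.transpose(Image.Transpose.FLIP_LEFT_RIGHT)    # image 2: flip
        w, h = src.size
        cw, ch = crop_size
        box = ((w - cw) // 2, (h - ch) // 2, (w + cw) // 2, (h + ch) // 2)
        img3 = src.crop(box).resize(input_size)                   # image 3: center crop, enlarged
        return [img1, img2, img3]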
In addition, for the similarity calculation module, the function of this module is to calculate the similarity between the fused feature and the features in the registry. A distance measurement function such as cosine similarity or Euclidean distance is adopted to output the similarity between the queried id and the ids in the base library.
For the threshold judging unit module, the function of this module is to judge whether two images come from the same person according to the image similarity output by the similarity calculation module. The threshold th is determined according to the false recognition rate that needs to be satisfied in the actual application scenario. The output of the threshold judgment module, L_out, is as shown below:

L_out = 1 (same person), if the similarity ≥ th; L_out = 0 (different persons), otherwise.
According to this embodiment, a group of models for estimating the confidence of face features in the hypersphere space is trained on the training images; the models perform confidence analysis on each enhanced form of a single image, and the features are finally fused according to these confidences, so that performing recognition with the resulting fused feature improves accuracy. The method does not require retraining the deployed face recognition model (system); only the model for estimating confidence needs to be trained, using the original training data. Since no new training data is adopted, the cost of collecting data is saved, and estimating the feature confidence in the hypersphere space has a theoretical basis. Meanwhile, only a single image needs to be acquired, and the image is enhanced and expanded in the manner of this embodiment, so that an upgrade imperceptible to the user is realized. This embodiment can be applied to all face recognition systems (methods) without being limited by application scene or method.
As can be seen from the above embodiments, a face image of a target object is acquired; image adjustment is then performed on the face image a plurality of times, and the adjusted face images are respectively collected to obtain an image set; a feature map corresponding to the image set is further extracted based on the depth network unit module, and the feature map is input into the full-connection mapping module for feature mapping to obtain target image features; the feature map and the target image features are input into a confidence estimation module for estimation to determine confidence information corresponding to the face image, where the confidence estimation module is trained based on the confidence information corresponding to training samples and sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training samples is determined based on uncertainty of the sample features in a hypersphere space; feature fusion is performed on the target image features and the confidence information corresponding to the face image to obtain a fusion feature; and similarity comparison is further performed between the fusion feature and registered images in the face database so as to determine the face recognition result corresponding to the target object. Therefore, a face recognition process based on a single image is realized: confidences are estimated for different adjusted forms of the single input image, the features of the adjusted images are fused according to these confidences, and similarity comparison is then performed, so that the face feature distribution under different scenes is simulated, the recognition accuracy of the face recognition system is improved, and the false recognition alarm rate is reduced.
In order to better implement the above-described aspects of the embodiments of the present application, the following provides related apparatuses for implementing the above-described aspects. Referring to fig. 8, fig. 8 is a schematic structural diagram of a face recognition device according to an embodiment of the present application, and the recognition device 800 includes:
an acquiring unit 801, configured to acquire a face image of a target object;
the adjusting unit 802 is configured to perform multiple image adjustments on the face image, and collect the adjusted face images respectively to obtain an image set;
the identifying unit 803 is configured to extract a feature map corresponding to the image set based on the depth network unit module, and input the feature map into the full-connection mapping module for feature mapping to obtain a target image feature;
the identifying unit 803 is further configured to input the feature map and the target image feature to a confidence estimation module for estimation to determine confidence information corresponding to the face image, where the confidence estimation module is obtained by training based on confidence information corresponding to a training sample and sample features, the sample features are determined based on the depth network unit module and the fully connected mapping module, and the confidence information corresponding to the training sample is determined based on uncertainty of the sample features in a hypersphere space;
The identifying unit 803 is further configured to perform feature fusion on the target image feature and confidence information corresponding to the face image, so as to obtain a fused feature;
the identifying unit 803 is further configured to perform similarity comparison with a registered image in a face database based on the fusion feature, so as to determine a face recognition result corresponding to the target object.
Optionally, in some possible implementations of the present application, the adjusting unit 802 is specifically configured to perform a first adjustment operation on the face image to obtain a first adjusted image;
the adjusting unit 802 is specifically configured to determine a second adjustment operation based on an adjustment dimension corresponding to the first adjustment operation, where the adjustment dimension corresponding to the second adjustment operation is different from the adjustment dimension corresponding to the first adjustment operation;
the adjusting unit 802 is specifically configured to perform a second adjusting operation on the face image to obtain a second adjusted image;
the adjusting unit 802 is specifically configured to collect the first adjustment image and the second adjustment image to obtain the image set.
Optionally, in some possible implementations of the present application, the adjusting unit 802 is specifically configured to perform a third adjustment operation on the face image to obtain a third adjusted image;
The adjusting unit 802 is specifically configured to perform a fourth adjustment operation on the third adjustment image to obtain a fourth adjustment image;
the adjusting unit 802 is specifically configured to collect the third adjustment image and the fourth adjustment image to obtain the image set.
Optionally, in some possible implementations of the present application, the identifying unit 803 is specifically configured to obtain a false recognition rate indicated in the preset requirement;
the identifying unit 803 is specifically configured to determine a comparison threshold corresponding to the similarity comparison based on the false recognition rate;
the identifying unit 803 is specifically configured to perform similarity comparison with a registered image in a face database based on the fusion feature, so as to determine a target similarity;
the identifying unit 803 is specifically configured to compare the target similarity with the comparison threshold to determine a face recognition result corresponding to the target object.
Optionally, in some possible implementations of the present application, the identifying unit 803 is specifically configured to obtain a first face image in the first sample set;
the identifying unit 803 is specifically configured to extract, based on the depth network unit module, a spatial feature of the first face image, so as to obtain a first sample feature map that includes spatial structure information;
The identifying unit 803 is specifically configured to map the first sample feature map to a first sample vector based on the fully-connected mapping module;
the identifying unit 803 is specifically configured to perform loop calculation on the first sample vector based on a first objective function, so as to determine first loss information when a first condition is reached;
the identifying unit 803 is specifically configured to perform parameter adjustment on the depth network unit module and the full-connection mapping module according to the first loss information;
the identifying unit 803 is specifically configured to extract a feature map corresponding to the image set based on the parameter-adjusted depth network unit module, and input the feature map into the parameter-adjusted fully connected mapping module for feature mapping to obtain the target image feature.
Optionally, in some possible implementations of the present application, the identifying unit 803 is specifically configured to obtain a second face image in the second sample set;
the identifying unit 803 is specifically configured to perform image adjustment on the second face image to determine an adjusted image set;
the identifying unit 803 is specifically configured to determine a category center corresponding to the adjusted image set;
The identifying unit 803 is specifically configured to extract, based on the depth network unit module, a spatial feature of the second face image, so as to obtain a second sample feature map that includes spatial structure information;
the identifying unit 803 is specifically configured to map the second sample feature map to a second sample vector based on the fully-connected mapping module;
the identifying unit 803 is specifically configured to calculate, based on a confidence estimation module, an uncertainty of the second sample vector in the hypersphere space, so as to determine a sample confidence;
the identifying unit 803 is specifically configured to perform a loop calculation on the second sample vector and the sample confidence based on a second objective function, so as to determine second loss information when a second condition is reached;
the identifying unit 803 is specifically configured to perform parameter adjustment on the confidence coefficient estimation module according to the second loss information;
the identifying unit 803 is specifically configured to estimate the feature map and the confidence coefficient estimation module after the feature input parameter of the target image is adjusted, so as to determine confidence coefficient information corresponding to the face image.
Optionally, in some possible implementations of the present application, the identifying unit 803 is specifically configured to count adjustment manners adopted in the process of performing multiple image adjustment on the face image, so as to determine adjustment manner distribution information;
The identifying unit 803 is specifically configured to determine a preset mode combination based on the adjustment mode distribution information;
the identifying unit 803 is specifically configured to perform image adjustment on the second face image according to the preset mode combination, so as to determine the adjusted image set.
By means of the above apparatus, a face image of a target object is acquired; image adjustment is then performed on the face image a plurality of times, and the adjusted face images are respectively collected to obtain an image set; a feature map corresponding to the image set is further extracted based on the depth network unit module, and the feature map is input into the full-connection mapping module for feature mapping to obtain target image features; the feature map and the target image features are input into a confidence estimation module for estimation to determine confidence information corresponding to the face image, where the confidence estimation module is trained based on the confidence information corresponding to training samples and sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training samples is determined based on uncertainty of the sample features in a hypersphere space; feature fusion is performed on the target image features and the confidence information corresponding to the face image to obtain a fusion feature; and similarity comparison is further performed between the fusion feature and registered images in the face database so as to determine the face recognition result corresponding to the target object. Therefore, a face recognition process based on a single image is realized: confidences are estimated for different adjusted forms of the single input image, the features of the adjusted images are fused according to these confidences, and similarity comparison is then performed, so that the face feature distribution under different scenes is simulated and the face recognition accuracy is improved.
The embodiment of the present application further provides a terminal device. As shown in fig. 9, a schematic structural diagram of another terminal device provided in an embodiment of the present application, for convenience of explanation only the portion related to this embodiment is shown; for specific technical details not disclosed, please refer to the method portion of the embodiments of the present application. The terminal may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point-of-sale (POS) terminal, a vehicle-mounted computer, and the like; the following takes a mobile phone as an example:
fig. 9 is a block diagram showing a part of the structure of a mobile phone related to a terminal provided by an embodiment of the present application. Referring to fig. 9, the mobile phone includes: radio frequency (RF) circuit 910, memory 920, input unit 930, display unit 940, sensor 950, audio circuit 960, wireless fidelity (WiFi) module 970, processor 980, and power supply 990. It will be appreciated by those skilled in the art that the structure shown in fig. 9 does not constitute a limitation on the mobile phone, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The following describes the components of the mobile phone in detail with reference to fig. 9:
The RF circuit 910 may be used for receiving and transmitting signals during a message or a call; in particular, after downlink information of a base station is received, it is passed to the processor 980 for processing, and uplink data is sent to the base station. Typically, the RF circuit 910 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 910 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, short message service (SMS), and the like.
The memory 920 may be used to store software programs and modules, and the processor 980 performs various functional applications and data processing by operating the software programs and modules stored in the memory 920. The memory 920 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, memory 920 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
The input unit 930 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile phone. In particular, the input unit 930 may include a touch panel 931 and other input devices 932. The touch panel 931, also referred to as a touch screen, may collect touch operations by a user on or near it (for example, operations performed on or near the touch panel 931 with a finger, a stylus, or any other suitable object or accessory, including hovering touch operations within a certain range of the panel) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 931 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends them to the processor 980, and can also receive commands from the processor 980 and execute them. In addition, the touch panel 931 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 930 may include other input devices 932 in addition to the touch panel 931. In particular, the other input devices 932 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 940 may be used to display information input by a user or information provided to the user and various menus of the mobile phone. The display unit 940 may include a display panel 941, and alternatively, the display panel 941 may be configured in the form of a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 931 may overlay the display panel 941, and when the touch panel 931 detects a touch operation thereon or thereabout, the touch operation is transferred to the processor 980 to determine a type of touch event, and then the processor 980 provides a corresponding visual output on the display panel 941 according to the type of touch event. Although in fig. 9, the touch panel 931 and the display panel 941 are implemented as two separate components for the input and output functions of the mobile phone, in some embodiments, the touch panel 931 may be integrated with the display panel 941 to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 950, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 941 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 941 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and direction when stationary, and can be used for applications of recognizing the gesture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc. that may also be configured with the handset are not described in detail herein.
The audio circuit 960, speaker 961, and microphone 962 may provide an audio interface between the user and the mobile phone. The audio circuit 960 may transmit an electrical signal converted from received audio data to the speaker 961, where it is converted into a sound signal for output; on the other hand, the microphone 962 converts collected sound signals into electrical signals, which are received by the audio circuit 960 and converted into audio data; the audio data is then processed by the processor 980 and sent, for example, to another mobile phone via the RF circuit 910, or output to the memory 920 for further processing.
WiFi belongs to a short-distance wireless transmission technology, and a mobile phone can help a user to send and receive emails, browse webpages, access streaming media and the like through a WiFi module 970, so that wireless broadband Internet access is provided for the user. Although fig. 9 shows a WiFi module 970, it is understood that it does not belong to the necessary constitution of the handset, and can be omitted entirely as needed within the scope of not changing the essence of the invention.
The processor 980 is a control center of the handset, connecting various parts of the entire handset using various interfaces and lines, performing various functions and processing data of the handset by running or executing software programs and/or modules stored in the memory 920, and invoking data stored in the memory 920, thereby performing overall monitoring of the handset. Optionally, processor 980 may include one or more processing units; alternatively, processor 980 may integrate an application processor with a modem processor, where the application processor primarily handles operating systems, user interfaces, applications programs, etc., and the modem processor primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 980.
The mobile phone further includes a power supply 990 (such as a battery) for powering the various components. Optionally, the power supply may be logically connected to the processor 980 through a power management system, so that functions such as charge management, discharge management, and power consumption management are implemented via the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In the embodiment of the present application, the processor 980 included in the terminal further has the function of executing each step of the face recognition method described above.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application. The server 1000 may vary considerably by configuration or performance, and may include one or more central processing units (CPUs) 1022 (e.g., one or more processors), a memory 1032, and one or more storage media 1030 (e.g., one or more mass storage devices) storing application programs 1042 or data 1044. The memory 1032 and the storage medium 1030 may be transitory or persistent storage. The program stored on the storage medium 1030 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 1022 may be configured to communicate with the storage medium 1030 to perform, on the server 1000, the series of instruction operations in the storage medium 1030.
The server 1000 may also include one or more power supplies 1026, one or more wired or wireless network interfaces 1050, one or more input/output interfaces 1058, and/or one or more operating systems 1041, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The steps performed by the face recognition device in the above embodiments may be based on the server structure shown in fig. 10.
In an embodiment of the present application, a computer-readable storage medium is further provided, in which face recognition instructions are stored; when run on a computer, the instructions cause the computer to perform the steps performed by the face recognition device in the methods described in the embodiments shown in fig. 3 to fig. 7.
An embodiment of the present application also provides a computer program product including face recognition instructions which, when run on a computer, cause the computer to perform the steps performed by the face recognition device in the methods described in the embodiments shown in fig. 3 to fig. 7.
The embodiment of the application also provides a face recognition system, which can comprise a face recognition device in the embodiment shown in fig. 8, or a terminal device in the embodiment shown in fig. 9, or a server shown in fig. 10.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or partly in the form of a software product, or all or part of the technical solution, which is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a face recognition device, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method for face recognition, comprising:
acquiring a face image of a target object;
carrying out multiple image adjustment on the face image, and respectively collecting the adjusted face images to obtain an image set;
extracting a feature map corresponding to the image set based on a depth network unit module, and inputting the feature map into a full-connection mapping module for feature mapping to obtain target image features;
inputting the feature map and the target image feature into a confidence estimation module for estimation to determine confidence information corresponding to the face image, wherein the confidence estimation module is trained based on the confidence information corresponding to a training sample and sample features, the sample features are determined based on the depth network unit module and the fully-connected mapping module, and the confidence information corresponding to the training sample is determined based on uncertainty of the sample features in a hypersphere space;
Feature fusion is carried out on the target image features and the confidence information corresponding to the face image so as to obtain fusion features;
and comparing the similarity between the fusion characteristics and the registered images in the face database to determine a face recognition result corresponding to the target object.
2. The method according to claim 1, wherein the performing image adjustment on the face image a plurality of times, and collecting the adjusted face images respectively to obtain an image set, comprises:
executing a first adjustment operation on the face image to obtain a first adjustment image;
determining a second adjustment operation based on the adjustment dimension corresponding to the first adjustment operation, wherein the adjustment dimension corresponding to the second adjustment operation is different from the adjustment dimension corresponding to the first adjustment operation;
executing a second adjustment operation on the face image to obtain a second adjustment image;
the first adjustment image and the second adjustment image are collected to obtain the image set.
3. The method according to claim 1, wherein the performing image adjustment on the face image a plurality of times, and collecting the adjusted face images respectively to obtain an image set, comprises:
Executing a third adjustment operation on the face image to obtain a third adjustment image;
performing a fourth adjustment operation on the third adjustment image to obtain a fourth adjustment image;
and collecting the third adjustment image and the fourth adjustment image to obtain the image set.
4. The method according to claim 1, wherein the comparing similarity between the fusion feature and a registered image in a face database to determine a face recognition result corresponding to the target object includes:
acquiring the false recognition rate indicated in the preset demand;
determining a comparison threshold corresponding to similarity comparison based on the false recognition rate;
performing similarity comparison on the fusion characteristics and registered images in a face database to determine target similarity;
and comparing the target similarity with the comparison threshold value to determine a face recognition result corresponding to the target object.
5. The method according to claim 1, wherein the extracting, based on the depth network unit module, a feature map corresponding to the image set, and inputting the feature map into the full-connection mapping module for feature mapping to obtain target image features, comprises:
Acquiring a first face image in a first sample set;
extracting the spatial features of the first face image based on the depth network unit module to obtain a first sample feature map containing spatial structure information;
mapping the first sample feature map to a first sample vector based on the fully connected mapping module;
performing cyclic calculation on the first sample vector based on a first objective function to determine corresponding first loss information when a first condition is reached;
parameter adjustment is carried out on the depth network unit module and the full-connection mapping module according to the first loss information;
and extracting a feature map corresponding to the image set based on the parameter-adjusted depth network unit module, and inputting the feature map into the parameter-adjusted full-connection mapping module for feature mapping to obtain the target image feature.
6. The method of claim 1, wherein inputting the feature map and the target image feature into a confidence estimation module for estimation to determine confidence information corresponding to the face image comprises:
acquiring a second face image in a second sample set;
Performing image adjustment on the second face image to determine an adjustment image set;
determining a category center corresponding to the adjustment image set;
extracting the spatial features of the second face image based on the depth network unit module to obtain a second sample feature map containing spatial structure information;
mapping the second sample feature map to a second sample vector based on the fully connected mapping module;
calculating uncertainty of the second sample vector in the hypersphere space based on a confidence estimation module to determine sample confidence;
performing cyclic calculation on the second sample vector and the sample confidence based on a second objective function to determine second loss information when a second condition is reached;
performing parameter adjustment on the confidence coefficient estimation module according to the second loss information;
and inputting the feature map and the target image feature into the parameter-adjusted confidence estimation module for estimation, so as to determine the confidence information corresponding to the face image.
7. The method of claim 6, wherein said performing image adjustment on said second face image to determine an adjusted image set comprises:
Counting the adjustment modes adopted in the process of carrying out image adjustment on the face image for a plurality of times so as to determine the distribution information of the adjustment modes;
determining a preset mode combination based on the adjustment mode distribution information;
and carrying out image adjustment on the second face image according to the preset mode combination so as to determine the adjustment image set.
8. A face recognition apparatus, comprising:
the acquisition unit is used for acquiring the face image of the target object;
the adjusting unit is used for carrying out image adjustment on the face images for a plurality of times and respectively collecting the face images after adjustment to obtain an image set;
the recognition unit is used for extracting a feature map corresponding to the image set based on the depth network unit module, and inputting the feature map into the full-connection mapping module for feature mapping to obtain target image features;
the recognition unit is further configured to input the feature map and the target image feature into a confidence estimation module for estimation to determine confidence information corresponding to the face image, the confidence estimation module is trained based on confidence information corresponding to a training sample and sample features, the sample features are determined based on the depth network unit module and the full-connection mapping module, and the confidence information corresponding to the training sample is determined based on uncertainty of the sample features in a hypersphere space;
The recognition unit is further used for carrying out feature fusion on the target image features and confidence information corresponding to the face image to obtain fusion features;
the recognition unit is further used for comparing the similarity between the fusion characteristics and the registered images in the face database so as to determine a face recognition result corresponding to the target object.
9. A computer device, the computer device comprising a processor and a memory:
the memory is used for storing program codes; the processor is configured to perform the face recognition method of any one of claims 1 to 7 according to instructions in the program code.
10. A computer program product comprising computer programs/instructions stored on a computer readable storage medium, characterized in that the computer programs/instructions in the computer readable storage medium, when executed by a processor, implement the steps of the face recognition method of any one of the preceding claims 1 to 7.