WO2019128367A1 - Face verification method and apparatus based on triplet loss, and computer device and storage medium - Google Patents


Info

Publication number
WO2019128367A1
WO2019128367A1 (PCT/CN2018/109169, CN2018109169W)
Authority
WO
WIPO (PCT)
Prior art keywords
sample
face
image
training
neural network
Prior art date
Application number
PCT/CN2018/109169
Other languages
French (fr)
Chinese (zh)
Inventor
许丹丹
梁添才
章烈剽
龚文川
Original Assignee
广州广电运通金融电子股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州广电运通金融电子股份有限公司 filed Critical 广州广电运通金融电子股份有限公司
Publication of WO2019128367A1 publication Critical patent/WO2019128367A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Definitions

  • the present invention relates to the field of image processing technologies, and in particular, to a face authentication method, apparatus, computer device, and storage medium based on Triplet Loss.
  • Face authentication refers to comparing a scene photo captured on site with the ID photo from the person's identity document to determine whether they show the same person.
  • the key technology of face authentication is face recognition.
  • the method based on classification learning mainly computes a classification loss function (such as softmax loss, center loss, and related variants) on features extracted by a deep convolutional network, and optimizes the network accordingly.
  • the last layer of the network is used for classification.
  • the number of output nodes is always consistent with the total number of categories in the training data set.
  • This type of method is suitable when training samples are plentiful, especially when each category contains many samples; in that case the network achieves a good training effect and generalization ability.
  • However, when the number of categories reaches hundreds of thousands or more, the parameter count of the network's final classification layer (a fully connected layer) grows linearly with the category count and becomes very large, making the network difficult to train.
  • Another type of method is based on metric learning, which organizes training samples in the form of tuples (such as pairs or triplets). After feature extraction by the deep convolutional network, no classification layer is needed; instead, a metric loss between samples (such as contrastive loss or triplet loss) is computed directly from the convolutional feature vectors to optimize the network.
  • Because this method needs no classification layer, the network parameter count is not affected by the number of categories, and there is no limit on the number of categories in the training data set; it suffices to select same-class or different-class samples according to the corresponding strategy to construct suitable tuples.
  • The metric learning method is better suited to the case where the training data is broad but shallow (the number of sample categories is large, but same-class samples are few): by combining samples in different ways, a considerable amount of tuple data can be constructed for training. Moreover, metric learning focuses on the internal relationships within each tuple, which gives it an inherent advantage for 1:1 face verification.
  • However, existing metric-based learning methods use the Euclidean distance to measure the similarity between samples, and the Euclidean distance measures the absolute distance between points in space, which is directly related to the position coordinates of each point. This does not match the distribution properties of the face feature space and leads to low reliability of face recognition.
  • a face authentication method based on Triplet Loss, comprising:
  • the method further includes:
  • the training samples including a document face image and at least one scene face image labeled for each labeled object;
  • the triplet includes a reference sample, a positive sample, and a negative sample;
  • the convolutional neural network model is trained based on the supervision of the triplet loss function;
  • the triplet loss function is measured by the cosine distance, and the model parameters are optimized by a stochastic gradient descent algorithm;
  • the verification set data is input into the convolutional neural network model, and when the training end condition is reached, the trained convolutional neural network model for face authentication is obtained.
  • the step of training the convolutional neural network model according to the training samples and generating the triplet corresponding to each training sample by using OHEM includes:
  • the currently trained convolutional neural network model is used to extract features and measure the cosine distance between them; for each reference sample, from the images that do not belong to the same labeled object, the image with the smallest distance that belongs to a different category from the reference sample is selected as the negative sample of the reference sample.
  • the triplet loss function includes a definition of the cosine distance of same-class samples and a definition of the cosine distance of heterogeneous samples.
  • the triplet loss function is:
  • L = Σ_{i=1..N} { [cos(f(x_i^a), f(x_i^n)) - cos(f(x_i^a), f(x_i^p)) + α1]_+ + [α2 - cos(f(x_i^a), f(x_i^p))]_+ }
  • where cos(·,·) represents the cosine distance, calculated as cos(x, y) = x·y / (‖x‖ ‖y‖), and N represents the number of triplets.
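The cosine distance in the claim can be sketched in a few lines. This is an illustrative stdlib helper, not code from the patent:

```python
import math

def cosine_similarity(x, y):
    # cos(x, y) = x·y / (||x|| * ||y||); larger values mean the vectors
    # point in more similar directions, i.e. the two faces are more alike
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)
```

Note that the result depends only on vector direction, not magnitude, which is why the claim treats it as a similarity measure for feature vectors.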
  • the method further includes: initializing with basic model parameters trained on massive open-source face data, and adding a normalization layer and a triplet loss function layer after the feature output layer, to obtain the convolutional neural network model to be trained.
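The normalization layer mentioned above projects each feature vector onto the unit hypersphere, so that cosine comparisons reduce to dot products. A minimal sketch, with a hypothetical helper name and an assumed epsilon guard:

```python
import math

def l2_normalize(feat, eps=1e-12):
    # divide by the L2 norm so the output has unit length;
    # eps guards against division by zero for an all-zero vector
    norm = math.sqrt(sum(v * v for v in feat)) + eps
    return [v / norm for v in feat]
```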
  • a face authentication device based on Triplet Loss comprising: an image acquisition module, an image preprocessing module, a feature acquisition module, a calculation module, and an authentication module;
  • the image acquisition module is configured to obtain a document photo and a scene photo of the person based on the face authentication request;
  • the image preprocessing module is configured to perform face detection, key point positioning, and image preprocessing on the scene photo and the document photo, respectively, to obtain a scene face image corresponding to the scene photo and a document face image corresponding to the document photo;
  • the feature acquiring module is configured to input the scene face image and the document face image into a pre-trained convolutional neural network model for face authentication, and acquire the output of the convolutional neural network model a first feature vector corresponding to the scene face image, and a second feature vector corresponding to the document face image; wherein the convolutional neural network model is obtained based on the supervision training of the triplet loss function;
  • the calculation module is configured to calculate the cosine distance between the first feature vector and the second feature vector;
  • the authentication module is configured to compare the cosine distance with a preset threshold, and determine the face authentication result according to the comparison result.
  • the apparatus further includes: a sample acquisition module, a triplet acquisition module, a training module, and a verification module;
  • the sample obtaining module is configured to acquire a labeled training sample, where the training sample includes a document face image and at least one scene face image that are marked for each mark object;
  • the triplet acquisition module is configured to train the convolutional neural network model according to the training samples, and generate the triplet corresponding to each training sample by using OHEM; the triplet includes a reference sample, a positive sample, and a negative sample;
  • the training module is configured to train the convolutional neural network model according to the triplet of each training sample under the supervision of the triplet loss function; the triplet loss function is measured by the cosine distance, and a stochastic gradient descent algorithm is used to optimize the model parameters;
  • the verification module is configured to input the verification set data into the convolutional neural network model, and when the training end condition is reached, obtain a trained convolutional neural network model for face authentication.
  • a computer apparatus comprising a memory, a processor, and a computer program stored on the memory and operable on the processor, the processor executing the computer program to implement the steps of the above-described Triplet Loss-based face authentication method.
  • a storage medium having a computer program stored thereon, wherein, when the computer program is executed by a processor, the steps of the above-described face authentication method based on Triplet Loss are implemented.
  • The face authentication method, apparatus, computer device, and storage medium based on Triplet Loss use a pre-trained convolutional neural network for face authentication, the convolutional neural network model being obtained through supervised training with the triplet loss function.
  • The similarity between the scene face image and the document face image is calculated as the cosine distance between the first feature vector corresponding to the scene face image and the second feature vector corresponding to the document face image. The cosine distance measures the angle between space vectors and reflects differences in direction, which better matches the distribution properties of the face feature space and thus improves the reliability of face authentication.
  • FIG. 1 is a schematic structural diagram of a face authentication system based on Triplet Loss according to an embodiment;
  • FIG. 2 is a flow chart of a face authentication method based on Triplet Loss in an embodiment
  • FIG. 3 is a flow chart showing the steps of training a convolutional neural network model for face authentication in one embodiment
  • Figure 4 is a schematic diagram showing the probability of sample misclassification in the case where the interval between classes is uniform and the variance within the class is large;
  • Figure 5 is a schematic diagram showing the probability of sample misclassification in the case where the interval between classes is uniform and the variance within the class is small;
  • FIG. 6 is a schematic diagram of a migration learning process of face authentication based on Triplet Loss in an embodiment
  • FIG. 7 is a schematic structural diagram of a convolutional neural network model for face authentication in an embodiment
  • FIG. 8 is a schematic flowchart of a face authentication method based on Triplet Loss in an embodiment
  • FIG. 9 is a structural block diagram of a face authentication device based on a Triplet Loss in an embodiment
  • FIG. 10 is a structural block diagram of a face authentication device based on Triplet Loss in another embodiment.
  • FIG. 1 is a schematic structural diagram of a face authentication system based on Triplet Loss according to an embodiment.
  • the face authentication system includes a server 101 and an image capture device 102.
  • the server 101 is connected to the network of the image collection device 102.
  • the image collection device 102 collects a real-time scene photo of the user to be authenticated, and a photo of the ID, and transmits the collected real-time scene photo and ID photo to the server 101.
  • the server 101 determines whether the person in the scene photo and the person in the photo ID are the same person, and authenticates the identity of the authenticated user.
  • The image capture device 102 can be a camera or a user terminal having a camera function; for example, it can be a standalone camera, or a mobile terminal with an imaging function.
  • the face authentication system may further include a card reader for reading the ID photo stored in a document chip (such as an ID card chip).
  • FIG. 2 is a flow chart of a face authentication method based on Triplet Loss in one embodiment. As shown in Figure 2, the method includes:
  • The document photo refers to a photo associated with a document that can prove the person's identity, such as the ID photo printed on an identity card or the ID photo stored in its chip.
  • The document photo can be obtained by photographing the document, or by reading the ID photo stored in the document chip through a card reader.
  • the documents in this embodiment may be an identity card, a driver's license or a social security card.
  • the scene photo of the character is a photo taken by the user to be authenticated at the time of authentication, and the user to be authenticated is in the live environment.
  • the on-site environment refers to the environment in which the user is taking pictures, and the on-site environment is not limited.
  • the scene photo may be obtained by using a mobile terminal having a camera function to collect a scene photo and send it to the server.
  • Face authentication refers to comparing the photos of the scenes collected by the scene and the photos of the IDs in the identity information to determine whether they are the same person.
  • the face authentication request is triggered based on an actual application operation, for example, a face authentication request is triggered based on the user's account opening request.
  • the application prompts the user to perform a photo collection operation on the display interface of the user terminal, and after the photo collection is completed, sends the collected photo to the server for face authentication.
  • Face detection refers to recognizing a photo and obtaining a face area in the photo.
  • Key point positioning refers to detecting the facial key points in a photo and determining the position of each key point.
  • Key points of the face include the eyes, the tip of the nose, the tip of the mouth, the eyebrows, and the outline points of the various parts of the face.
  • The cascaded convolutional neural network (MTCNN) method based on multi-task joint learning can be used to perform face detection and face key point detection simultaneously; alternatively, a face detection method based on LBP features and a face key point detection method based on shape regression can be used.
  • Image pre-processing refers to performing portrait alignment and cropping processing according to the position of the detected face key point in each picture, thereby obtaining a size-normalized scene face image and a document face image.
  • The scene face image refers to the face image obtained by performing face detection, key point positioning, and image preprocessing on the scene photo.
  • The document face image refers to the face image obtained by performing face detection, key point positioning, and image preprocessing on the document photo.
  • the convolutional neural network model based on the supervision of the triplet loss function is pre-trained according to the training samples in advance.
  • the convolutional neural network includes a convolutional layer, a pooling layer, an activation function layer, and a fully connected layer, and each neuron parameter of each layer is determined by training.
  • Through forward propagation of the trained network, the first feature vector of the scene face image and the second feature vector of the document face image are obtained from the fully connected layer of the convolutional neural network model.
  • A triplet is constructed by randomly selecting a sample from the training data set as the reference sample, then randomly selecting a sample that belongs to the same person as the reference sample as the positive sample, and selecting a sample that does not belong to the same person as the negative sample.
  • These samples form a (reference sample, positive sample, negative sample) triplet.
  • The triplet has two main combinations: when a document image is the reference sample, both the positive sample and the negative sample are scene photos; when a scene image is the reference sample, both the positive sample and the negative sample are document photos.
  • A parameter-shared network is trained to obtain the feature representations of the three triplet elements.
  • The purpose of the improved triplet loss is to learn a mapping such that the distance between the feature representations of the reference sample and the positive sample is as small as possible, the distance between the feature representations of the reference sample and the negative sample is as large as possible, and there is a minimum margin between these two distances.
  • The cosine distance, also known as cosine similarity, measures the difference between two individuals using the cosine of the angle between their vectors in the vector space.
  • The larger the angle between the two feature vectors, the smaller the similarity between the scene face image and the document face image.
  • In conventional methods, the Euclidean distance is used to measure the similarity between samples.
  • The Euclidean distance measures the absolute distance between points in space and is directly related to the position coordinates of each point, which does not conform to the distribution properties of the face feature space.
  • In this embodiment, the cosine distance is used to measure the similarity between samples.
  • The cosine distance measures the angle between space vectors and reflects differences in direction rather than position, which better matches the distribution properties of the face feature space.
  • The cosine distance is calculated as cos(x, y) = x·y / (‖x‖ ‖y‖), where x represents the first feature vector and y represents the second feature vector.
  • S210 Compare the cosine distance and the preset threshold, and determine a face authentication result according to the comparison result.
  • The authentication result includes authentication success, i.e., the document photo and the scene photo belong to the same person.
  • The authentication result also includes authentication failure, i.e., the document photo and the scene photo do not belong to the same person.
  • When the cosine distance is greater than or equal to the preset threshold, the authentication succeeds; when the cosine distance is less than the preset threshold, the similarity between the document photo and the scene photo is below the preset threshold and the authentication fails.
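The decision rule above can be sketched as follows. The helper name and the threshold value 0.5 are illustrative assumptions; the embodiment does not fix a specific threshold:

```python
import math

def authenticate(scene_feat, id_feat, threshold=0.5):
    # compute the cosine similarity between the two feature vectors and
    # compare it against the preset threshold; True means "same person"
    dot = sum(a * b for a, b in zip(scene_feat, id_feat))
    norm = math.sqrt(sum(a * a for a in scene_feat)) * math.sqrt(sum(b * b for b in id_feat))
    return (dot / norm) >= threshold
```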
  • The above face authentication method based on Triplet Loss uses a pre-trained convolutional neural network for face authentication, where the convolutional neural network model is obtained through supervised training with the triplet loss function, and the similarity between the scene face image and the document face image is calculated as the cosine distance between the first feature vector corresponding to the scene face image and the second feature vector corresponding to the document face image.
  • The cosine distance measures the angle between space vectors and reflects differences in direction; it therefore better matches the distribution properties of the face feature space and improves the reliability of face authentication.
  • the face authentication method further includes the step of training to obtain a convolutional neural network model for face authentication.
  • FIG. 3 is a flow diagram of the steps of training a convolutional neural network model for face authentication in one embodiment. As shown in Figure 3, this step includes:
  • The labeled object is a person.
  • The training samples are labeled by person.
  • The scene face image and the document face image belonging to the same person share the same label.
  • the scene face image and the document face image can be obtained by performing face detection, key point positioning, and image preprocessing on the marked scene photo and the ID photo.
  • Face detection refers to recognizing a photo and obtaining a face area in the photo.
  • Key point positioning refers to detecting the facial key points in a photo and determining the position of each key point.
  • Key points of the face include the eyes, the tip of the nose, the tip of the mouth, the eyebrows, and the outline points of the various parts of the face.
  • The cascaded convolutional neural network (MTCNN) method based on multi-task joint learning can be used to perform face detection and face key point detection simultaneously; alternatively, a face detection method based on LBP features and a face key point detection method based on shape regression can be used.
  • Image preprocessing refers to performing portrait alignment and cropping processing according to the position of the detected face key point in each picture, thereby obtaining a size normalized scene face image and a document face image.
  • the scene face image refers to the face image obtained by performing face detection, key point positioning and image preprocessing on the scene photo.
  • The document face image refers to the face image obtained after performing face detection, key point positioning, and image preprocessing on the document photo.
  • A sample is randomly selected as the reference sample; then a scene photo sample that belongs to the same person as the reference sample is randomly selected as the positive sample, and a scene sample that does not belong to the same person is selected as the negative sample, thereby forming a (reference sample, positive sample, negative sample) triplet.
  • the positive sample and the reference sample are the same kind of samples, that is, belong to the same person image.
  • a negative sample is a heterogeneous sample of a reference sample, that is, an image that does not belong to the same person.
  • The reference sample and the positive sample in the triplet are labeled in the training samples, while the negative sample is constructed online during training of the convolutional neural network.
  • The OHEM (Online Hard Example Mining) strategy is used to construct the triplets online, that is, within the network training loop.
  • In each iteration, the current network performs forward computation on the candidate triplets, and among the images in the training samples that do not belong to the same person as the reference sample, the image with the closest cosine distance is selected as the negative sample, thereby obtaining the triplet corresponding to each training sample.
  • The step of training the convolutional neural network based on the training samples and generating the triplet corresponding to each training sample comprises the following steps S1 and S2:
  • S1: randomly select an image as the reference sample, and select an image that belongs to the same labeled object but a different category from the reference sample as the positive sample.
  • the category refers to the type of image to which it belongs.
  • The categories of the training samples include the scene face image and the document face image. Because face authentication mainly compares a document photo with a scene photo, the reference sample and the positive sample should belong to different categories: if the reference sample is a scene face image, the positive sample is a document face image; if the reference sample is a document face image, the positive sample is a scene face image.
  • S2: the currently trained convolutional neural network model is used to extract features and measure the cosine distance between them; for each reference sample, from the images that do not belong to the same labeled object, the image with the smallest distance that belongs to a different category from the reference sample is selected as the negative sample of the reference sample.
  • The negative sample is selected from the labeled face images that do not belong to the same person as the reference sample.
  • The negative sample is constructed online using the OHEM strategy, that is, during each optimization iteration of the network.
  • Forward computation is performed on the candidate triplets, and the image in the training samples that does not belong to the same person as the reference sample, has the closest cosine distance, and belongs to a different category from the reference sample is selected as the negative sample. That is, the negative sample belongs to a different category from the reference sample: if a document photo is the reference sample of the triplet, both the positive sample and the negative sample are scene photos; conversely, if a scene photo is the reference sample, both the positive sample and the negative sample are document photos.
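The online hard-negative selection described above can be sketched as follows. The function names and the candidate-pool structure are illustrative assumptions, not the patent's code; features are assumed to come from the current network's forward pass:

```python
import math

def _cos(x, y):
    # cosine similarity between two feature vectors
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def hardest_negative(anchor_feat, anchor_person, anchor_category, candidates):
    # candidates: list of (feature, person_id, category) tuples; pick the
    # most similar image that belongs to a different person AND a
    # different category (photo type) than the anchor
    best_idx, best_sim = None, -2.0
    for i, (feat, person, category) in enumerate(candidates):
        if person == anchor_person or category == anchor_category:
            continue
        sim = _cos(anchor_feat, feat)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return best_idx
```

Selecting the most similar wrong-person image yields the "hard" negatives that make the triplet loss informative; random negatives would mostly already satisfy the margin and contribute zero gradient.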
  • In a typical application, the face verification terminal verifies the user's identity by comparing the photo in the user's ID chip with the scene photo.
  • The data collected in the background usually contains only two images per person, namely the document photo and the scene photo captured at comparison time, while the number of distinct individuals can be in the thousands. If data with so many categories and so few same-class samples were trained with a classification-based method, the classification layer parameters would be too large and the network very difficult to train, so a metric learning method is used instead.
  • Typical metric learning generally uses the triplet loss method: by constructing image triplets, an effective feature mapping is learned under which the feature distance between same-class samples is smaller than the feature distance between heterogeneous samples, thereby achieving correct comparison.
  • The purpose of the triplet loss is to make the distance between the feature representations of the reference sample and the positive sample as small as possible, to make the distance between the feature representations of the reference sample and the negative sample as large as possible, and to maintain a minimum margin between these two distances.
  • the triplet loss function includes a definition of the cosine distance of a homogeneous sample and a definition of the cosine distance of the heterogeneous sample.
  • the same type of sample refers to the reference sample and the positive sample
  • the heterogeneous sample refers to the reference sample and the negative sample.
  • the cosine distance of a similar sample refers to the cosine distance of the reference sample and the positive sample
  • the cosine distance of the heterogeneous sample refers to the cosine distance of the reference sample and the negative sample.
  • The original triplet loss method only considers the inter-class gap and does not consider the intra-class gap; if the intra-class distribution is not compact enough, the generalization ability of the network is weakened and its adaptability to the scene decreases.
  • In addition, the original triplet loss method uses the Euclidean distance to measure the similarity between samples, whereas in practice, after a face model is deployed, the cosine distance is more commonly used for feature comparison. The Euclidean distance measures the absolute distance between points in space and is directly related to the position coordinates of each point, while the cosine distance measures the angle between space vectors and reflects differences in direction rather than position, which better matches the distribution properties of the face feature space.
  • The triplet loss method performs iterative optimization by constructing triplet data online, feeding it into the network, and then back-propagating the metric loss of the triplet.
  • Each triplet contains three images: one reference sample, one positive sample of the same class as the reference sample, and one negative sample heterogeneous to the reference sample, labeled (anchor, positive, negative).
  • The basic idea of the original triplet loss is that, through metric learning, the distance between the reference sample and the positive sample is made smaller than the distance between the reference sample and the negative sample, and the difference between the two distances is greater than a minimum margin parameter α. The original triplet loss function is:
  • L = Σ_{i=1..N} [ ‖f(x_i^a) - f(x_i^p)‖² - ‖f(x_i^a) - f(x_i^n)‖² + α ]_+
  • where N is the number of triplets, f(x_i^a) is the feature vector of the reference sample, f(x_i^p) is the feature vector of the same-class positive sample, f(x_i^n) is the feature vector of the heterogeneous negative sample, and ‖·‖ denotes the L2 norm (Euclidean distance).
  • [·]_+ is defined as [z]_+ = max(z, 0).
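For a single triplet, the original Euclidean-distance loss can be sketched as follows; the function name and the margin value 0.2 are illustrative assumptions:

```python
def euclidean_triplet_loss(anchor, positive, negative, alpha=0.2):
    # [ ||a - p||^2 - ||a - n||^2 + alpha ]_+  for one triplet;
    # alpha is the minimum inter-class margin
    d_ap = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_an = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(d_ap - d_an + alpha, 0.0)
```

The loss is zero exactly when the negative is already at least `alpha` farther (in squared distance) from the anchor than the positive is, so only margin-violating triplets produce gradients.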
  • The original triplet loss function only constrains the relation between the same-class distance (anchor, positive) and the heterogeneous distance (anchor, negative); that is, it increases the inter-class interval as much as possible through the margin parameter α,
  • but it does not limit the intra-class distance, i.e., there is no constraint on the distance between same-class samples. If the intra-class distances are scattered and the variance is too large, the generalization ability of the network is weakened and the probability of sample misclassification increases.
  • Figure 4 is a schematic diagram showing the probability of sample misclassification when the interval between classes is uniform and the variance within the class is large.
  • Figure 5 is a schematic diagram showing the probability of sample misclassification when the interval between classes is uniform and the variance within each class is small.
  • the shaded part indicates the probability of sample misclassification.
  • With the same inter-class interval, the probability of sample misclassification when the intra-class variance is large is significantly greater than the probability when the intra-class variance is small.
  • Therefore, the present invention proposes an improved triplet loss method, which retains the inter-class distance constraint of the original method while adding a constraint on the intra-class distance, so that the intra-class distances are as concentrated as possible.
  • Its loss function expression is:
  • cos( ⁇ ) represents the cosine distance and is calculated as N is the number of triples
  • N represents the number of triples
  • The improved triplet loss function changes the metric from Euclidean distance to cosine distance, which keeps the training phase consistent with the deployment phase and improves the continuity of feature learning.
  • The first term of the new triplet loss function is consistent with the original triplet loss and is used to enlarge the inter-class gap.
  • The second term adds a distance constraint on same-class sample pairs (the positive pair) and is used to narrow the intra-class gap.
  • ⁇ 1 is an inter-class interval parameter, which ranges from 0 to 0.2
  • ⁇ 2 is an intra-class interval parameter, ranging from 0.8 to 1.0.
  • the obtained metric corresponds to the similarity between the two samples, so In the expression, only the samples with the cosine similarity of the negative tuple in the range of ⁇ 1 greater than the cosine similarity of the positive tuple will actually participate in the training.
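The two-term loss described above can be sketched as follows. This is a reconstruction from the surrounding description (the patent's formula images are not reproduced in this text), with illustrative values α₁ = 0.1 and α₂ = 0.9 taken from the stated ranges.

```python
import numpy as np

def cosine_sim(x, y):
    # cos(x, y) = x.y / (||x|| * ||y||), row-wise over (N, D) arrays
    return np.sum(x * y, axis=1) / (
        np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1))

def improved_triplet_loss(anchor, positive, negative, alpha1=0.1, alpha2=0.9):
    cos_ap = cosine_sim(anchor, positive)  # positive-pair similarity
    cos_an = cosine_sim(anchor, negative)  # negative-pair similarity
    # Inter-class term: active only when the negative pair's similarity
    # comes within alpha1 of the positive pair's similarity.
    inter = np.maximum(cos_an - cos_ap + alpha1, 0.0)
    # Intra-class term: pushes positive-pair similarity up to at least
    # alpha2, concentrating same-class samples.
    intra = np.maximum(alpha2 - cos_ap, 0.0)
    return float((inter + intra).sum())
```

Note how the inter-class term vanishes for easy triplets, while the intra-class term still penalizes positive pairs whose similarity falls below α₂.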
  • The model is trained under this improved triplet loss function; back-propagation optimizes the model under the joint constraint of the inter-class loss and the intra-class loss, keeping same-class samples as close as possible in the feature space and different-class samples as far apart as possible, which improves the discriminability of the model and thus the reliability of face authentication.
  • 90% of the data in the person-image data pool is taken as the training set, and the remaining 10% as the verification set.
  • The improved triplet loss value is calculated with the formula above and fed back into the convolutional neural network for iterative optimization.
  • The model's performance on the verification set is monitored; when verification performance no longer improves, the model has reached convergence and the training phase terminates.
  • The face authentication method above adds a constraint on intra-class sample distances to the loss function of the original triplet loss, reducing the intra-class gap while enlarging the inter-class gap and increasing the generalization ability of the model;
  • the metric of the original triplet loss is changed from Euclidean distance to cosine distance, keeping the training and deployment metrics consistent and improving the continuity of feature learning.
  • The step of training the convolutional neural network further comprises: initializing with basic model parameters pre-trained on massive open-source face data, and adding a normalization layer and the improved triplet loss function layer after the feature output layer, to obtain the convolutional neural network to be trained.
  • A deep face recognition model trained in the conventional way on massive Internet face data suffers a large performance drop in person-ID comparison applications in specific scenarios, yet the sources of person-ID data in such scenarios are limited.
  • Direct training often yields unsatisfactory results due to insufficient samples. It is therefore highly necessary to develop a method for effectively extending training on small data sets, so that the face recognition accuracy of the model in the specific application scenario meets the needs of market applications.
  • Deep learning algorithms often rely on training with massive amounts of data.
  • The comparison between a document photo and a scene photo is a heterogeneous sample comparison problem.
  • A conventional deep face recognition model trained on massive Internet face data will see its performance drop significantly in this comparison application.
  • The sources of person-ID data are limited (the same person's ID-card image and corresponding scene image are required), so the amount of data available for training is small, and direct training may give poor results due to insufficient samples; deep transfer learning is therefore used.
  • When training the person-ID verification model, the idea of transfer learning is commonly adopted: a basic model with reliable performance on open-source test sets is trained first, and extended training is then performed on the limited person-ID data, so that the model automatically learns the feature representation of the specific modality and improves its performance. This process is shown in Figure 6.
  • The entire network is initialized with the pre-trained basic model parameters, and an L2 normalization layer and an improved triplet loss layer are then added after the feature output layer of the network, yielding the convolutional neural network to be trained; the network structure is shown in Figure 7.
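As a minimal sketch of the added normalization step (the pre-trained backbone and the loss layer are assumed to exist elsewhere), L2 normalization projects each feature vector onto the unit hypersphere, so the cosine distance used during training reduces to a dot product:

```python
import numpy as np

def l2_normalize(features, eps=1e-12):
    """L2 normalization layer applied after the feature output layer.

    features: (N, D) array of raw feature vectors.
    Returns unit-length vectors; eps guards against division by zero.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)
```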
  • A schematic diagram of the face authentication method is shown in FIG. 8; it comprises three phases, namely a data acquisition and preprocessing phase, a training phase, and a deployment phase.
  • The card-reader module of the person-ID verification terminal reads the ID-card photo while the front camera captures a live photo; both pass through the face detector, key-point detector, and face alignment and cropping modules to obtain a normalized ID face image and scene face image.
  • In the training phase, 90% of the data in the person-image data pool is used as the training set and the remaining 10% as the verification set. Since person-ID comparison is mainly a comparison between a document photo and scene photos, if a document photo is taken as the anchor of a triplet, the other two images are scene photos; conversely, if a scene photo is taken as the anchor, the other two images are document photos.
  • The strategy of constructing triplets online with OHEM is as follows: during each iterative optimization of the network, the current network performs a forward pass over the candidate triplets, the valid triplets satisfying the conditions are filtered out, and the improved triplet loss is calculated with the formula above and fed back into the network for iterative optimization. Meanwhile, the model's performance on the verification set is monitored; when verification performance no longer improves, the model has reached convergence and the training phase terminates.
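A sketch of the hard-negative selection at the core of the OHEM strategy (function and variable names are illustrative, and features are assumed already L2-normalized so that cosine similarity is a dot product):

```python
import numpy as np

def hardest_negative_indices(anchor_feats, cand_feats, anchor_ids, cand_ids):
    """For each anchor, pick the candidate of a *different* identity whose
    cosine distance to the anchor is smallest, i.e. the hardest negative.
    """
    sims = anchor_feats @ cand_feats.T            # (A, C) cosine similarities
    same_id = anchor_ids[:, None] == cand_ids[None, :]
    sims = np.where(same_id, -np.inf, sims)       # exclude same-identity images
    return np.argmax(sims, axis=1)                # max similarity = min distance
```

Mining against the features of the network as it currently stands is what makes the triplet construction "online": the hardest negatives change as the model improves.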
  • In the deployment phase, the images acquired by the device pass through the same pre-processing procedure as in the training phase, and the feature vector of each face image is then obtained by a forward pass through the network.
  • The cosine distance is calculated to obtain the similarity between the two images, and a decision is made against a preset threshold: a similarity greater than the threshold means the same person, and otherwise not.
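The deployment-phase decision rule can be sketched as follows; the threshold value is illustrative and would in practice be chosen on a validation set for a target error rate:

```python
import numpy as np

def verify_faces(scene_feat, id_feat, threshold=0.75):
    """Compare two face feature vectors by cosine similarity.

    Returns (similarity, same_person); same_person is True when the
    similarity exceeds the preset threshold.
    """
    sim = float(np.dot(scene_feat, id_feat) /
                (np.linalg.norm(scene_feat) * np.linalg.norm(id_feat)))
    return sim, sim > threshold
```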
  • The original triplet loss function only defines the learning relationship for inter-class distances.
  • The face authentication method above improves the original triplet loss function by adding an intra-class distance constraint, so that during training the network minimizes the intra-class gap while enlarging the inter-class gap, improving the generalization ability of the network and thus the adaptability of the model.
  • The cosine distance replaces the Euclidean distance metric of the original triplet loss; it better matches the distribution properties of the face feature space and keeps the training phase consistent with the deployment phase, making comparison results more reliable.
  • A face authentication device comprising: an image acquisition module 902, an image pre-processing module 904, a feature acquisition module 906, a calculation module 908, and an authentication module 910.
  • The image acquisition module 902 is configured to obtain a document photo and a scene photo of the person based on a face authentication request.
  • The image pre-processing module 904 is configured to perform face detection, key-point positioning, and image pre-processing on the scene photo and the document photo respectively, to obtain a scene face image corresponding to the scene photo and a document face image corresponding to the document photo.
  • The feature acquisition module 906 is configured to input the scene face image and the document face image into a pre-trained convolutional neural network model for face authentication, and to obtain the first feature vector corresponding to the scene face image and the second feature vector corresponding to the document face image output by the convolutional neural network model.
  • the calculation module 908 is configured to calculate a cosine distance of the first feature vector and the second feature vector.
  • the authentication module 910 is configured to compare the cosine distance and the preset threshold, and determine a face authentication result according to the comparison result.
  • The face authentication device described above performs face authentication using a pre-trained convolutional neural network whose model is obtained through supervised training with the improved triplet loss function.
  • The similarity between the scene face image and the document face image is calculated as the cosine distance between the first feature vector corresponding to the scene face image and the second feature vector corresponding to the document face image.
  • The cosine distance measures the angle between space vectors and reflects differences in direction rather than position, which better matches the distribution properties of the face feature space and improves the reliability of face authentication.
  • The face authentication device further includes: a sample acquisition module 912, a triplet acquisition module 914, a training module 916, and a verification module 918.
  • The sample acquisition module 912 is configured to obtain labeled training samples, the training samples including, for each labeled object, one labeled document face image and at least one labeled scene face image.
  • The triplet acquisition module 914 is configured to train the convolutional neural network model according to the training samples and to generate the triplet elements corresponding to each training sample through OHEM; the triplet elements include a reference sample, a positive sample, and a negative sample.
  • The triplet acquisition module 914 is configured to randomly select an image as the reference sample, to select an image belonging to the same label object but of a different category from the reference sample as the positive sample, and, following the OHEM strategy, to use the convolutional neural network model currently being trained
  • to compute the cosine distances between features; for each reference sample, from the face images not belonging to the same label object, the image with the smallest distance and a different category from the reference sample is selected as the negative sample of that reference sample.
  • When a document photo is the reference sample, both the positive and negative samples are scene photos; when a scene photo is the reference sample, both the positive and negative samples are document photos.
  • The training module 916 is configured to train the convolutional neural network model from the triplet elements of the training samples under the supervision of the triplet loss function; the triplet loss function uses cosine distance as the metric, and model parameters are optimized by stochastic gradient descent.
  • The improved triplet loss function includes a constraint on the cosine distance of same-class samples and a constraint on the cosine distance of different-class samples.
  • The improved triplet loss function is:

    L = Σ_{i=1}^{N} ( [ cos(f_i^a, f_i^n) − cos(f_i^a, f_i^p) + α₁ ]₊ + [ α₂ − cos(f_i^a, f_i^p) ]₊ )

  • where cos(·,·) denotes the cosine distance, computed as cos(x, y) = x·y / (‖x‖‖y‖), and N is the number of triplets.
  • the verification module 918 is configured to input the verification set data into the convolutional neural network model, and when the training end condition is reached, obtain a trained convolutional neural network model for face authentication.
  • The face authentication device further includes a model initialization module 920, configured to initialize with basic model parameters pre-trained on massive open-source face data, and to add a normalization layer and a triplet loss function layer after the feature output layer, obtaining the convolutional neural network to be trained.
  • The face authentication device above adds a constraint on intra-class sample distances to the loss function of the original triplet loss, reducing the intra-class gap while enlarging the inter-class gap and increasing the generalization ability of the model;
  • the metric of the original triplet loss is changed from Euclidean distance to cosine distance, keeping the training and deployment metrics consistent and improving the continuity of feature learning.
  • A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the face authentication method of each of the above embodiments when executing the computer program.
  • A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the face authentication method of each of the above embodiments.


Abstract

The present invention relates to a face verification method and apparatus based on Triplet Loss, and a computer device and a storage medium. The method comprises: based on a face verification request, acquiring a certificate photograph and a scenario photograph of a person; respectively performing face detection, key point positioning and image pre-processing on the scenario photograph and the certificate photograph, so as to obtain a scenario face image corresponding to the scenario photograph and a certificate face image corresponding to the certificate photograph; inputting the scenario face image and the certificate face image into a pre-trained convolutional neural network model used for face verification, and acquiring a first feature vector corresponding to the scenario face image and a second feature vector corresponding to the certificate face image, which feature vectors are output by the convolutional neural network model; calculating a cosine distance between the first feature vector and the second feature vector; and comparing the cosine distance with a pre-set threshold value, and determining a face verification result according to a comparison result. The method improves the reliability of face verification.

Description

Face authentication method, apparatus, computer device and storage medium based on Triplet Loss

Technical field

The present invention relates to the field of image processing technologies, and in particular to a face authentication method, apparatus, computer device, and storage medium based on Triplet Loss.

Background art

Face authentication refers to comparing a scene photo of a person captured on site with the document photo in the person's identity information, to determine whether they are the same person. The key technology of face authentication is face recognition.

With the rise of deep learning technology, face recognition has continuously broken through traditional technical bottlenecks, and its performance has improved greatly. In research applying deep learning to face recognition, there are two mainstream families of methods: classification-learning-based methods and metric-learning-based methods. Classification-learning-based methods mainly compute a classification loss (such as softmax loss, center loss, and related variants) on the features extracted by a deep convolutional network to optimize the network; the last layer of the network is a fully connected layer used for classification, whose number of output nodes must match the total number of classes in the training data set. These methods suit cases with many training samples, especially when each class is richly represented, in which case the network achieves good training results and generalization ability. However, when the number of classes reaches hundreds of thousands or more, the number of parameters in the final classification (fully connected) layer grows linearly and becomes very large, making the network difficult to train.

The other family is metric-learning-based methods, which organize training samples as tuples (such as pairs or triplets). After the deep convolutional network, no classification layer is needed; instead, a metric loss between samples (such as contrastive loss or triplet loss) is computed directly on the convolutional feature vectors to optimize the network. Because no classification layer is trained, the number of network parameters is unaffected by the number of classes, and there is no limit on the number of classes in the training data set; it suffices to select same-class or different-class samples according to a suitable strategy to construct tuples. Compared with classification learning, metric learning is better suited to training data that is broad but shallow (many sample classes but few samples per class): the different combinations of samples yield abundant tuple data for training, and metric learning focuses on the internal relationships within each tuple, giving it an inherent advantage for same-or-not decisions such as 1:1 face verification.

In practical applications, many institutions require real-name registration, for example for bank account opening, mobile phone number registration, and financial account opening. Real-name registration traditionally requires the user to bring an ID card to a designated site, where staff verify that the person matches the ID-card photo before the account can be opened. With the development of Internet technology, more and more institutions offer convenient services and no longer require customers to visit designated outlets: the user's location is unrestricted, the ID card is uploaded, the image acquisition device of a mobile terminal captures an on-site scene photo of the person, the system performs face authentication, and the account is opened once authentication passes. Traditional metric-based learning methods use the Euclidean distance to measure the similarity between samples, but the Euclidean distance measures the absolute distance between points in space and is directly related to the position coordinates of each point, which does not match the distribution properties of the face feature space and leads to low reliability of face recognition.
Summary of the invention

Based on this, it is necessary to provide a face authentication method, apparatus, computer device, and storage medium based on Triplet Loss to address the low reliability of traditional face authentication methods.

A face authentication method based on Triplet Loss, comprising:

obtaining a document photo and a scene photo of a person based on a face authentication request;

performing face detection, key-point positioning, and image pre-processing on the scene photo and the document photo respectively, to obtain a scene face image corresponding to the scene photo and a document face image corresponding to the document photo;

inputting the scene face image and the document face image into a pre-trained convolutional neural network model for face authentication, and obtaining a first feature vector corresponding to the scene face image and a second feature vector corresponding to the document face image output by the convolutional neural network model, wherein the convolutional neural network model is obtained through supervised training with a triplet loss function;

calculating the cosine distance between the first feature vector and the second feature vector;

comparing the cosine distance with a preset threshold, and determining a face authentication result according to the comparison result.

In one embodiment, the method further includes:

obtaining labeled training samples, the training samples including, for each labeled object, one labeled document face image and at least one labeled scene face image;

training the convolutional neural network model according to the training samples, and generating the triplet elements corresponding to each training sample through OHEM, the triplet elements including a reference sample, a positive sample, and a negative sample;

training the convolutional neural network model from the triplet elements of the training samples under the supervision of the triplet loss function, the triplet loss function using cosine distance as the metric and optimizing model parameters by stochastic gradient descent;

inputting the verification set data into the convolutional neural network model, and obtaining the trained convolutional neural network model for face authentication when the training end condition is reached.

In another embodiment, the step of training the convolutional neural network model according to the training samples and generating the triplet elements corresponding to each training sample through OHEM includes:

randomly selecting an image as the reference sample, and selecting an image that belongs to the same label object but is of a different category from the reference sample as the positive sample;

according to the OHEM strategy, using the convolutional neural network model currently being trained to compute the cosine distances between features, and, for each reference sample, selecting from the images not belonging to the label object the image with the smallest distance and a different category from the reference sample as the negative sample of that reference sample.

In another embodiment, the triplet loss function includes a constraint on the cosine distance of same-class samples and a constraint on the cosine distance of different-class samples.

In another embodiment, the triplet loss function is:
    L = Σ_{i=1}^{N} ( [ cos(f_i^a, f_i^n) − cos(f_i^a, f_i^p) + α₁ ]₊ + [ α₂ − cos(f_i^a, f_i^p) ]₊ )

where cos(·,·) denotes the cosine distance, computed as cos(x, y) = x·y / (‖x‖‖y‖); N is the number of triplets; f_i^a is the feature vector of the reference sample, f_i^p the feature vector of the same-class positive sample, and f_i^n the feature vector of the different-class negative sample; [x]₊ is defined as [x]₊ = max(x, 0); α₁ is the inter-class margin parameter and α₂ is the intra-class margin parameter.
In another embodiment, the method further includes: initializing with basic model parameters pre-trained on massive open-source face data, and adding a normalization layer and a triplet loss function layer after the feature output layer, to obtain the convolutional neural network model to be trained.

A face authentication apparatus based on Triplet Loss, comprising: an image acquisition module, an image pre-processing module, a feature acquisition module, a calculation module, and an authentication module;

the image acquisition module being configured to obtain a document photo and a scene photo of a person based on a face authentication request;

the image pre-processing module being configured to perform face detection, key-point positioning, and image pre-processing on the scene photo and the document photo respectively, to obtain a scene face image corresponding to the scene photo and a document face image corresponding to the document photo;

the feature acquisition module being configured to input the scene face image and the document face image into a pre-trained convolutional neural network model for face authentication, and to obtain a first feature vector corresponding to the scene face image and a second feature vector corresponding to the document face image output by the convolutional neural network model, the convolutional neural network model being obtained through supervised training with a triplet loss function;

the calculation module being configured to calculate the cosine distance between the first feature vector and the second feature vector;

the authentication module being configured to compare the cosine distance with a preset threshold and to determine a face authentication result according to the comparison result.

In another embodiment, the apparatus further includes: a sample acquisition module, a triplet acquisition module, a training module, and a verification module;

the sample acquisition module being configured to obtain labeled training samples, the training samples including, for each labeled object, one labeled document face image and at least one labeled scene face image;

the triplet acquisition module being configured to train the convolutional neural network model according to the training samples and to generate the triplet elements corresponding to each training sample through OHEM, the triplet elements including a reference sample, a positive sample, and a negative sample;

the training module being configured to train the convolutional neural network model from the triplet elements of the training samples under the supervision of the triplet loss function, the triplet loss function using cosine distance as the metric and optimizing model parameters by stochastic gradient descent;

the verification module being configured to input the verification set data into the convolutional neural network model and to obtain the trained convolutional neural network model for face authentication when the training end condition is reached.

A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above Triplet Loss-based face authentication method when executing the computer program.

A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the above Triplet Loss-based face authentication method.

The Triplet Loss-based face authentication method, apparatus, computer device, and storage medium of the present invention perform face authentication with a pre-trained convolutional neural network. Because the convolutional neural network model is obtained through supervised training with a triplet loss function, the similarity between the scene face image and the document face image is calculated from the cosine distance between the first feature vector corresponding to the scene face image and the second feature vector corresponding to the document face image, and the cosine distance measures the angle between space vectors and reflects differences in direction rather than position, the method better matches the distribution properties of the face feature space and improves the reliability of face authentication.
附图说明DRAWINGS
图1为一个实施例的基于Triplet Loss的人脸认证***的结构示意图;1 is a schematic structural diagram of a facelet authentication system based on a Triplet Loss according to an embodiment;
图2为一个实施例中基于Triplet Loss的人脸认证方法的流程图;2 is a flow chart of a face authentication method based on Triplet Loss in an embodiment;
图3为一个实施例中训练得到用于人脸认证的卷积神经网络模型的步骤的流程图;3 is a flow chart showing the steps of training a convolutional neural network model for face authentication in one embodiment;
图4为在类间间隔一致、类内方差较大情况下,样本错分的概率示意图;Figure 4 is a schematic diagram showing the probability of sample misclassification in the case where the interval between classes is uniform and the variance within the class is large;
图5为在类间间隔一致、类内方差较小情况下,样本错分的概率示意图;Figure 5 is a schematic diagram showing the probability of sample misclassification in the case where the interval between classes is uniform and the variance within the class is small;
图6为一个实施例中基于Triplet Loss的人脸认证的迁移学习过程的示意图;6 is a schematic diagram of a migration learning process of face authentication based on Triplet Loss in an embodiment;
图7为一个实施例中用于人脸认证的卷积神经网络模型的结构示意图;7 is a schematic structural diagram of a convolutional neural network model for face authentication in an embodiment;
图8为一个实施例中基于Triplet Loss的人脸认证方法的流程示意图；FIG. 8 is a schematic flowchart of a face authentication method based on Triplet Loss in an embodiment;
图9为一个实施例中基于Triplet Loss的人脸认证装置的结构框图;9 is a structural block diagram of a face authentication device based on a Triplet Loss in an embodiment;
图10为另一个实施例中基于Triplet Loss的人脸认证装置的结构框图。FIG. 10 is a structural block diagram of a face authentication device based on Triplet Loss in another embodiment.
具体实施方式DETAILED DESCRIPTION
图1为一个实施例的基于Triplet Loss的人脸认证系统的结构示意图。如图1所示，人脸认证系统包括服务器101和图像采集装置102。其中，服务器101与图像采集装置102网络连接。图像采集装置102采集待认证用户的实时场景照片，以及证件照片，并将采集的实时场景照片和证件照片发送至服务器101。服务器101判断场景照片的人物与证件照中的人物是否为同一人，对待认证用户的身份进行认证。基于具体的应用场景，图像采集装置102可以为摄像头，或是具有摄像功能的用户终端。以在开户现场为例，图像采集装置102可以为摄像头；以通过互联网进行金融账号开户为例，图像采集装置102可以为具有摄像功能的移动终端。FIG. 1 is a schematic structural diagram of a Triplet Loss-based face authentication system according to an embodiment. As shown in FIG. 1, the face authentication system includes a server 101 and an image capture device 102, connected over a network. The image capture device 102 captures a real-time scene photo of the user to be authenticated as well as an ID photo, and sends them to the server 101. The server 101 determines whether the person in the scene photo and the person in the ID photo are the same person, thereby authenticating the identity of the user to be authenticated. Depending on the application scenario, the image capture device 102 can be a camera or a user terminal with a camera function: at an on-site account-opening counter, for example, it can be a camera; when a financial account is opened over the Internet, it can be a mobile terminal with a camera function.
在其它的实施例中，人脸认证系统还可以包括读卡器，用于读取证件（如身份证等）芯片内的证件照。In other embodiments, the face authentication system may further include a card reader for reading the ID photo stored in the chip of a certificate (such as an ID card).
图2为一个实施例中基于Triplet Loss的人脸认证方法的流程图。如图2所示,该方法包括:2 is a flow chart of a face authentication method based on Triplet Loss in one embodiment. As shown in Figure 2, the method includes:
S202,基于人脸认证请求,获取证件照片和人物的场景照片。S202. Acquire a photo of the ID and a photo of the scene of the character based on the face authentication request.
其中,证件照片是指能够证明人物身份的证件所对应的照片,例如身份证上所印制的证件照或芯片内的证件照。证件照片的获取方式可以采用对证件进行拍照获取,也可以通过读卡器读取证件芯片所存储的证件照片。本实施例中的证件可以为身份证,驾驶证或社会保障卡等。Among them, the photo of the certificate refers to the photo corresponding to the document that can prove the identity of the person, such as the photo of the ID printed on the ID card or the photo of the ID in the chip. The way to obtain the photo of the ID card can be obtained by taking a photo of the document, or reading the photo of the ID stored in the ID chip through the card reader. The documents in this embodiment may be an identity card, a driver's license or a social security card.
人物的场景照片是指待认证用户在认证时所采集的、该待认证用户在现场环境的照片。现场环境是指用户在拍照时的所处环境，现场环境不受限制。场景照片的获取方式可以为，利用具有摄像功能的移动终端采集场景照片并发送至服务器。A scene photo of a person is a photo of the user to be authenticated in the live environment, captured at the time of authentication. The live environment is the environment the user is in when the photo is taken, and it is not restricted. A scene photo can be obtained, for example, by capturing it with a mobile terminal that has a camera function and sending it to the server.
人脸认证，是指对比现场采集的人物场景照片以及身份信息中的证件照片，判断是否为同一个人。人脸认证请求基于实际的应用操作触发，例如，基于用户的开户请求，触发人脸认证请求。应用程序在用户终端的显示界面提示用户进行照片的采集操作，并在照片采集完成后，将采集的照片发送至服务器，进行人脸认证。Face authentication refers to comparing the scene photo of a person captured on site with the ID photo in the identity information to determine whether they are the same person. A face authentication request is triggered by an actual application operation; for example, a user's account-opening request triggers a face authentication request. The application prompts the user on the display interface of the user terminal to capture photos, and after capture is complete, sends the captured photos to the server for face authentication.
S204,对场景照片和证件照片分别进行人脸检测、关键点定位和图像预处理,得到场景照片对应的场景人脸图像,以及证件照片对应的证件人脸图像。S204, performing face detection, key point positioning, and image preprocessing on the scene photo and the ID photo, respectively, obtaining a scene face image corresponding to the scene photo, and a document face image corresponding to the ID photo.
人脸检测是指识别照片并获取照片中的人脸区域。Face detection refers to recognizing a photo and obtaining a face area in the photo.
关键点定位,是指对照片中检测的人脸区域,获取人脸关键点在每幅照片中的位置。人脸关键点包括眼睛,鼻尖、嘴角尖、眉毛以及人脸各部件轮廓点。Key point positioning refers to the location of the face key detected in the photo and the position of the face key in each photo. Key points of the face include the eyes, the tip of the nose, the tip of the mouth, the eyebrows, and the outline points of the various parts of the face.
本实施例中，可采用基于多任务联合学习的级联卷积神经网络MTCNN方法同时完成人脸检测和人脸关键点检测，亦可采用基于LBP特征的人脸检测方法和基于形状回归的人脸关键点检测方法。In this embodiment, the multi-task cascaded convolutional neural network (MTCNN) method based on joint learning can be used to perform face detection and facial key point detection simultaneously; alternatively, an LBP-feature-based face detection method and a shape-regression-based facial key point detection method can be used.
图像预处理是指将根据检测的人脸关键点在每张图片中的位置，进行人像对齐和剪切处理，从而得到尺寸归一化的场景人脸图像和证件人脸图像。其中，场景人脸图像是指对场景照片进行人脸检测、关键点定位和图像预处理后得到的人脸图像，证件人脸图像是指对证件照片进行人脸检测、关键点定位和图像预处理后得到的人脸图像。Image preprocessing refers to performing portrait alignment and cropping according to the positions of the detected facial key points in each picture, obtaining size-normalized scene face images and ID face images. A scene face image is the face image obtained by performing face detection, key point localization and image preprocessing on a scene photo; an ID face image is the face image obtained by performing the same operations on an ID photo.
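The portrait alignment step above can be sketched as a least-squares similarity transform that maps detected landmarks onto fixed template coordinates before cropping. The template positions, crop geometry and NumPy implementation below are illustrative assumptions, not values from the patent; a full pipeline would pass the resulting matrix to an image-warping routine such as cv2.warpAffine:

```python
import numpy as np

# Template landmark positions (left eye, right eye, nose tip, left/right
# mouth corner) in a hypothetical 112x96 output crop -- illustrative only.
TEMPLATE = np.array([
    [30.3, 51.7], [65.5, 51.5], [48.0, 71.7], [33.5, 92.4], [62.7, 92.2]
])

def similarity_transform(src, dst):
    """Solve the least-squares similarity transform mapping src -> dst.

    Returns a 2x3 matrix [[a, -b, tx], [b, a, ty]] encoding scale,
    rotation and translation, suitable for an affine image warp.
    """
    n = src.shape[0]
    A = np.zeros((2 * n, 4))
    b = np.zeros(2 * n)
    # Row pairs: u = a*x - b*y + tx ; v = b*x + a*y + ty
    A[0::2] = np.column_stack([src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)])
    A[1::2] = np.column_stack([src[:, 1],  src[:, 0], np.zeros(n), np.ones(n)])
    b[0::2] = dst[:, 0]
    b[1::2] = dst[:, 1]
    a_, b_, tx, ty = np.linalg.lstsq(A, b, rcond=None)[0]
    return np.array([[a_, -b_, tx], [b_, a_, ty]])

# Detected landmarks in the raw photo (hypothetical detector output):
# the template scaled up and shifted, as if the face sat off-center.
detected = TEMPLATE * 2.0 + np.array([40.0, 25.0])
M = similarity_transform(detected, TEMPLATE)

# Applying M to the detected points recovers the template layout.
aligned = detected @ M[:, :2].T + M[:, 2]
print(np.allclose(aligned, TEMPLATE, atol=1e-6))  # True for this exact case
```

The same matrix `M`, applied to the whole photo rather than just the landmark points, yields the size-normalized face crop described in the text.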
S206,将场景人脸图像和证件人脸图像输入到预先训练好的用于人脸认证的卷积神经网络模型,并获取卷积神经网络模型输出的场景人脸图像对应的第一特征向量,以及证件人脸图像对应的第二特征向量。S206. Input the scene face image and the document face image into a pre-trained convolutional neural network model for face authentication, and obtain a first feature vector corresponding to the scene face image output by the convolutional neural network model, And a second feature vector corresponding to the document face image.
其中，卷积神经网络模型基于三元组损失函数的监督预先根据训练样本提前训练好的。卷积神经网络包括卷积层、池化层、激活函数层和全连接层，每层的各个神经元参数通过训练确定。利用训练好的卷积神经网络，通过网络前向传播，获取卷积神经网络模型的全连接层输出的场景人脸图像的第一特征向量，以及证件人脸图像对应的第二特征向量。The convolutional neural network model is trained in advance on training samples under the supervision of a triplet loss function. The convolutional neural network includes convolutional layers, pooling layers, activation function layers and a fully connected layer, and the neuron parameters of each layer are determined by training. With the trained convolutional neural network, a forward pass through the network yields the first feature vector of the scene face image and the second feature vector of the ID face image, both output by the fully connected layer of the model.
三元组(triplet)是指从训练数据集中随机选一个样本，该样本称为参考样本，然后再随机选取一个和参考样本属于同一人的样本作为正样本，选取不属于同一人的样本作为负样本，由此构成一个(参考样本、正样本、负样本)三元组。由于人证比对主要是基于证件照与场景照的比对，而不是证件照与证件照、或者场景照与场景照的比对，因此三元组的模式主要有两种组合：以证件照图像为参考样本时，正样本和负样本均为场景照；以场景照图像为参考样本时，正样本和负样本均为证件照。A triplet is formed by randomly selecting a sample from the training data set, called the reference sample; then randomly selecting a sample belonging to the same person as the reference sample as the positive sample, and a sample not belonging to the same person as the negative sample, which together constitute a (reference sample, positive sample, negative sample) triplet. Since person-ID verification mainly compares an ID photo with a scene photo, rather than an ID photo with an ID photo or a scene photo with a scene photo, triplets come in two main combinations: when an ID photo image is the reference sample, both the positive and negative samples are scene photos; when a scene photo image is the reference sample, both the positive and negative samples are ID photos.
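The two triplet combinations above can be illustrated with a small sketch; the toy dataset, field layout and helper function below are hypothetical, standing in for the labeled training set:

```python
import random

# Toy labeled dataset: (person_id, category, image_name). Names are
# placeholders for preprocessed face images.
dataset = [
    (0, "id", "id_0"), (0, "scene", "scene_0a"), (0, "scene", "scene_0b"),
    (1, "id", "id_1"), (1, "scene", "scene_1a"),
    (2, "id", "id_2"), (2, "scene", "scene_2a"),
]

def build_triplet(data, rng=random):
    """Pick a random anchor; the positive shares its person but has the
    opposite category, the negative has a different person and the same
    (opposite-to-anchor) category."""
    anchor = rng.choice(data)
    other_cat = "scene" if anchor[1] == "id" else "id"
    positives = [s for s in data if s[0] == anchor[0] and s[1] == other_cat]
    negatives = [s for s in data if s[0] != anchor[0] and s[1] == other_cat]
    return anchor, rng.choice(positives), rng.choice(negatives)

a, p, n = build_triplet(dataset)
assert a[0] == p[0] and a[0] != n[0]   # same person / different person
assert p[1] == n[1] and p[1] != a[1]   # both opposite in category to anchor
```

Whichever sample is drawn as the anchor, the positive and negative always come from the other photo category, matching the two patterns in the text.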
针对三元组中的每个样本，训练一个参数共享的网络，得到三个元素的特征表达。改进三元组损失(triplet loss)的目的就是通过学习，让参考样本和正样本的特征表达之间的距离尽可能小，而参考样本和负样本的特征表达之间的距离尽可能大，并且要让参考样本和正样本的特征表达之间的距离和参考样本和负样本的特征表达之间的距离之间有一个最小的间隔。For each sample in the triplet, a parameter-sharing network is trained to obtain the feature representation of the three elements. The purpose of the improved triplet loss is to learn so that the distance between the feature representations of the reference sample and the positive sample is as small as possible, the distance between the feature representations of the reference sample and the negative sample is as large as possible, and there is a minimum margin between these two distances.
S208,计算第一特征向量和第二特征向量的余弦距离。S208. Calculate a cosine distance of the first feature vector and the second feature vector.
余弦距离，也称为余弦相似度，是用向量空间中两个向量夹角的余弦值作为衡量两个个体间差异的大小的度量。第一特征向量和第二特征向量的余弦距离越大，表示场景人脸图像和证件人脸图像的相似度越大，第一特征向量和第二特征向量的余弦距离越小，表示场景人脸图像和证件人脸图像的相似度越小。当场景人脸图像和证件人脸图像的余弦距离越接近于1时，两张图像属于同一人的机率越大，当场景人脸图像和证件人脸图像的余弦距离越小，两张图像属于同一人的机率越小。The cosine distance, also known as cosine similarity, uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals. The larger the cosine distance between the first and second feature vectors, the greater the similarity between the scene face image and the ID face image; the smaller the cosine distance, the smaller the similarity. The closer the cosine distance between the scene face image and the ID face image is to 1, the more likely the two images belong to the same person; the smaller the cosine distance, the less likely they belong to the same person.
传统的三元组损失(triplet loss)方法中,使用欧式距离来度量样本之间的相似度。而欧氏距离衡量的是空间各点的绝对距离,跟各个点所在的位置坐标直接相关,这并不符合人脸特征空间的分布属性。本实施例中,考虑人脸特征空间的分布属性和实际应用场景,采用余弦距离来度量样本之间的相似度。余弦距离衡量的是空间向量的夹角,更加体现在方向上的差异,而不是位置,从而更符合人脸特征空间的分布属性。In the traditional triplet loss method, the Euclidean distance is used to measure the similarity between samples. The Euclidean distance measures the absolute distance of each point in the space, which is directly related to the position coordinates of each point, which does not conform to the distribution property of the face feature space. In this embodiment, considering the distribution attribute of the face feature space and the actual application scenario, the cosine distance is used to measure the similarity between the samples. The cosine distance measures the angle between the space vectors, which is more reflected in the direction, not the position, which is more in line with the distribution properties of the face feature space.
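The contrast between the two metrics can be seen in a small numeric check: two vectors pointing in the same direction but with different magnitudes are far apart in Euclidean distance yet identical under the cosine measure. A sketch with illustrative vectors:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = 10.0 * x  # same direction as x, ten times the magnitude

euclidean = float(np.linalg.norm(x - y))
cosine = float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
print(euclidean)  # about 33.67: far apart in absolute position
print(cosine)     # about 1.0: the directions are identical
```

This is why the text argues that the cosine distance, which depends only on the angle between feature vectors, is the better fit for the face feature space.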
具体地,余弦距离的计算公式为:Specifically, the formula for calculating the cosine distance is:
cos(x, y) = (x · y) / (‖x‖ · ‖y‖)
其中,x表示第一特征向量,y表示第二特征向量。Where x represents the first feature vector and y represents the second feature vector.
S210,比较余弦距离和预设阈值,并根据比较结果确定人脸认证结果。S210: Compare the cosine distance and the preset threshold, and determine a face authentication result according to the comparison result.
认证结果包括认证通过，即证件照片和场景照片属于同一人。认证结果还包括认证失败，即证件照片和场景照片不属于同一人。The authentication result includes authentication success, i.e. the ID photo and the scene photo belong to the same person, and authentication failure, i.e. the ID photo and the scene photo do not belong to the same person.
具体地，将余弦距离与预设阈值进行比较，当余弦距离大于预设阈值时，表示证件照片与场景照片的相似度大于预设阈值，认证成功；当余弦距离小于预设阈值时，表示证件照片与场景照片的相似度小于预设阈值，认证失败。Specifically, the cosine distance is compared with a preset threshold. When the cosine distance is greater than the preset threshold, the similarity between the ID photo and the scene photo exceeds the threshold and authentication succeeds; when the cosine distance is less than the preset threshold, the similarity is below the threshold and authentication fails.
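Steps S208 and S210 together can be sketched as follows. The feature vectors and the 0.6 threshold are illustrative assumptions; the patent leaves the preset threshold as a tunable parameter:

```python
import numpy as np

def cosine_similarity(x, y):
    """cos(x, y) = (x . y) / (||x|| * ||y||), the metric of step S208."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def verify(scene_feat, id_feat, threshold=0.6):
    """Step S210: authentication succeeds iff similarity exceeds the
    preset threshold. The 0.6 default is an illustrative value."""
    return cosine_similarity(scene_feat, id_feat) > threshold

scene = np.array([1.0, 0.0, 1.0])   # hypothetical scene-photo feature
idcard = np.array([1.0, 0.1, 0.9])  # hypothetical ID-photo feature
print(verify(scene, idcard))  # True: vectors point in nearly the same direction
```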
上述的基于Triplet Loss的人脸认证方法，利用预先训练的卷积神经网络进行人脸认证，由于卷积神经网络模型基于三元组损失函数的监督训练得到，而场景人脸图像和证件人脸图像的相似度根据场景人脸图像对应的第一特征向量和证件人脸图像对应的第二特征向量的余弦距离计算得到，余弦距离衡量的是空间向量的夹角，更加体现在方向上的差异，从而更符合人脸特征空间的分布属性，提高了人脸认证的可靠性。The above Triplet Loss-based face authentication method performs face authentication with a pre-trained convolutional neural network. The model is trained under the supervision of a triplet loss function, and the similarity between the scene face image and the ID face image is computed from the cosine distance between the first feature vector of the scene face image and the second feature vector of the ID face image. Because the cosine distance measures the angle between vectors and reflects differences in direction, it better matches the distribution of the face feature space, improving the reliability of face authentication.
在另一个实施例中,人脸认证方法还包括训练得到用于人脸认证的卷积神经网络模型的步骤。图3为一个实施例中训练得到用于人脸认证的卷积神经网络模型的步骤的流程图。如图3所示,该步骤包括:In another embodiment, the face authentication method further includes the step of training to obtain a convolutional neural network model for face authentication. 3 is a flow diagram of the steps of training a convolutional neural network model for face authentication in one embodiment. As shown in Figure 3, this step includes:
S302,获取带标记的训练样本,训练样本包括标记了属于每个标记对象的一张证件人脸图像和至少一张场景人脸图像。S302. Acquire a labeled training sample, where the training sample includes a document face image and at least one scene face image that are marked for each mark object.
本实施例中，标记对象即人，训练样本以人为单位，标记了同属于一个人的场景人脸图像和证件人脸图像。具体地，场景人脸图像和证件人脸图像可通过对带标记的场景照片和证件照片进行人脸检测、关键点定位和图像预处理得到。In this embodiment, the labeled object is a person: training samples are organized per person, with the scene face images and the ID face image of the same person sharing a label. Specifically, the scene face images and ID face images can be obtained by performing face detection, key point localization and image preprocessing on the labeled scene photos and ID photos.
人脸检测是指识别照片并获取照片中的人脸区域。Face detection refers to recognizing a photo and obtaining a face area in the photo.
关键点定位,是指对照片中检测的人脸区域,获取人脸关键点在每幅照片中的位置。人脸关键点包括眼睛,鼻尖、嘴角尖、眉毛以及人脸各部件轮廓点。Key point positioning refers to the location of the face key detected in the photo and the position of the face key in each photo. Key points of the face include the eyes, the tip of the nose, the tip of the mouth, the eyebrows, and the outline points of the various parts of the face.
本实施例中，可采用基于多任务联合学习的级联卷积神经网络MTCNN方法同时完成人脸检测和人脸关键点检测，亦可采用基于LBP特征的人脸检测方法和基于形状回归的人脸关键点检测方法。In this embodiment, the multi-task cascaded convolutional neural network (MTCNN) method based on joint learning can be used to perform face detection and facial key point detection simultaneously; alternatively, an LBP-feature-based face detection method and a shape-regression-based facial key point detection method can be used.
图像预处理是指将根据检测的人脸关键点在每张图片中的位置，进行人像对齐和剪切处理，从而得到尺寸归一化的场景人脸图像和证件人脸图像。其中，场景人脸图像是指对场景照片进行人脸检测、关键点定位和图像预处理后得到的人脸图像，证件人脸图像是指对证件照片进行人脸检测、关键点定位和图像预处理后得到的人脸图像。Image preprocessing refers to performing portrait alignment and cropping according to the positions of the detected facial key points in each picture, obtaining size-normalized scene face images and ID face images. A scene face image is the face image obtained by performing face detection, key point localization and image preprocessing on a scene photo; an ID face image is the face image obtained by performing the same operations on an ID photo.
S304,根据训练样本训练卷积神经网络模型,通过OHEM产生各训练样本对应的三元组元素;三元组元素包括参考样本、正样本和负样本。S304. Train a convolutional neural network model according to the training sample, and generate a triple element corresponding to each training sample by using OHEM; the triplet element includes a reference sample, a positive sample, and a negative sample.
三元组有两种组合方式:以证件照图像为参考样本时,正样本和负样本均为场景照图像;以场景照图像为参考样本时,正样本和负样本均为证件照图像。There are two combinations of triads: when the ID photo is used as the reference sample, the positive and negative samples are scene images; when the scene image is used as the reference sample, both the positive and negative samples are ID images.
具体地,以证件照为参考图像为例,从训练数据集中随机选一个人的证件照样本,该样本称为参考样本,然后再随机选取一个和参考样本属于同一人的场景照样本作为正样本,选取不属于同一人的场景照样本作为负样本,由此构成一个(参考样本、正样本、负样本)三元组。Specifically, taking the document as a reference image as an example, randomly selecting a person's certificate photo sample from the training data set, the sample is called a reference sample, and then randomly selecting a scene photo sample that belongs to the same person as the reference sample as a positive sample. Select a scene sample that does not belong to the same person as a negative sample, thereby forming a (reference sample, positive sample, negative sample) triplet.
即正样本与参考样本为同类样本，即属于同一人图像。负样本是参考样本的异类样本，即不属于同一人的图像。其中，三元组元素中的参考样本和正样本是训练样本中已标记的，负样本在卷积神经网络的训练过程中，采用OHEM(Online Hard Example Mining)策略在线构造三元组，即在网络每次迭代优化的过程中，利用当前网络对候选三元组进行前向计算，选择训练样本中与参考样本不属于同一用户，且余弦距离最近的图像作为负样本，从而得到各训练样本对应的三元组元素。That is, the positive sample and the reference sample are of the same class, i.e. images of the same person, while the negative sample is a different-class sample, i.e. an image of a different person. The reference sample and positive sample of a triplet are already labeled in the training samples; the negative sample is constructed online during training of the convolutional neural network using the OHEM (Online Hard Example Mining) strategy: in each iteration of optimization, the current network performs a forward pass over candidate triplets and selects, as the negative sample, the training image that does not belong to the same user as the reference sample and has the closest cosine distance, thereby obtaining the triplet elements for each training sample.
一个实施例中,根据训练样本训练卷积神经网络,并产生各训练样本对应的三元组元素的步骤,包括以下步骤S1和S2:In one embodiment, the step of training the convolutional neural network based on the training samples and generating the corresponding triple elements of each training sample comprises the following steps S1 and S2:
S1:随机选择一个图像作为参考样本,选择属于同一标签对象、与参考样本类别不同的图像作为正样本。S1: randomly select an image as a reference sample, and select an image belonging to the same label object and different from the reference sample category as a positive sample.
类别是指所属的图像类型,本实施例中,训练样本的类别包括场景人脸图像和证件人脸图像。因为人脸认证主要是证件照和场景照之间的对比,因此,参考样本和正样本应当属于不同的类别,若参考样本为场景人脸图像,则正样本为证件人脸图像;若参考样本为证件人脸图像,则正样本为场景人脸图像。The category refers to the type of image to which it belongs. In this embodiment, the category of the training sample includes a scene face image and a document face image. Because the face authentication is mainly the comparison between the document photo and the scene photo, the reference sample and the positive sample should belong to different categories. If the reference sample is the scene face image, the positive sample is the document face image; if the reference sample is the document For the face image, the positive sample is the scene face image.
S2:根据OHEM策略，利用当前训练的卷积神经网络模型提取特征之间的余弦距离，对于每一个参考样本，从其它不属于同一标签对象的图像中，选择距离最小、与参考样本属于不同类别的图像，作为该参考样本的负样本。S2: According to the OHEM strategy, the currently trained convolutional neural network model is used to compute the cosine distance between extracted features; for each reference sample, from images not labeled with the same object, the image with the smallest distance and of a different category from the reference sample is selected as the negative sample of that reference sample.
负样本从与参考样本不属于同一人的标签的人脸图像中选择，具体地，负样本在卷积神经网络的训练过程中，采用OHEM策略在线构造三元组，即在网络每次迭代优化的过程中，利用当前网络对候选三元组进行前向计算，选择训练样本中与参考样本不属于同一用户，且余弦距离最近、与参考样本不属于同一类别的图像作为负样本。即，负样本与参考样本的类别不同。可以认为，三元组中若以证件照为参考样本，则正样本和负样本均是场景照；反之若以场景照为参考样本，则正样本和负样本均是证件照。The negative sample is selected from face images whose label does not belong to the same person as the reference sample. Specifically, during training of the convolutional neural network, triplets are constructed online with the OHEM strategy: in each iteration of optimization, the current network performs a forward pass over candidate triplets and selects, as the negative sample, the training image that does not belong to the same user as the reference sample, has the closest cosine distance, and does not belong to the same category as the reference sample. That is, the negative sample differs in category from the reference sample: if an ID photo is the reference sample of a triplet, both the positive and negative samples are scene photos; conversely, if a scene photo is the reference sample, both the positive and negative samples are ID photos.
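The OHEM hard-negative selection can be sketched with NumPy as picking, for each reference sample, the most similar candidate of a different identity. The candidates are assumed to already be of the opposite category (scene photos when the anchors are ID photos, or vice versa), and the random features below are placeholders for network embeddings:

```python
import numpy as np

def cos_sim_matrix(A, B):
    """Pairwise cosine similarity between rows of A and rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def mine_hard_negatives(anchor_feats, anchor_ids, cand_feats, cand_ids):
    """For each anchor, pick the candidate with the highest cosine
    similarity among candidates of a different identity: the 'hardest'
    negative under the OHEM strategy."""
    sims = cos_sim_matrix(anchor_feats, cand_feats)
    same_person = anchor_ids[:, None] == cand_ids[None, :]
    sims[same_person] = -np.inf  # a positive must never be chosen as negative
    return np.argmax(sims, axis=1)

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))      # stand-ins for anchor embeddings
anchor_ids = np.array([0, 1, 2, 3])
cands = rng.normal(size=(6, 8))        # stand-ins for candidate embeddings
cand_ids = np.array([0, 0, 1, 2, 3, 4])
neg_idx = mine_hard_negatives(anchors, anchor_ids, cands, cand_ids)
assert all(cand_ids[j] != anchor_ids[i] for i, j in enumerate(neg_idx))
```

Because the selection reruns against the current network on every iteration, the mined negatives track the model as it improves, as the text describes.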
S306，根据各训练样本的三元组元素，基于三元组损失函数的监督，训练卷积神经网络模型，该三元组损失函数以余弦距离作为度量方式，通过随机梯度下降算法来优化模型参数。S306: Train the convolutional neural network model under the supervision of a triplet loss function according to the triplet elements of each training sample; the triplet loss function uses the cosine distance as its metric, and the model parameters are optimized with a stochastic gradient descent algorithm.
人证核验终端通过比对用户证件芯片照与场景照是否一致来对用户身份进行验证，后台采集到的数据往往是单个人的样本只有两张图，即证件照与比对时刻抓拍到的场景照，而不同个体的数量却可以成千上万。这种类别数量较大而同类样例少的数据如果用基于分类的方法来进行训练，分类层参数会过于庞大而导致网络非常难以学习，因此考虑用度量学习的方法来解决。其中度量学习的典型的一般是用三元组损失(triplet loss)方法，通过构造图像三元组来学习一种有效的特征映射，在该映射下同类样本的特征距离小于异类样本的特征距离，从而达到正确比对的目的。A person-ID verification terminal verifies a user's identity by checking whether the photo stored in the ID chip matches the scene photo. In the data collected in the background, a single person typically has only two images, namely the ID photo and the scene photo captured at comparison time, while the number of distinct individuals can reach the thousands. If such data, with many classes but few samples per class, were trained with a classification-based method, the classification layer parameters would be so large that the network would be very difficult to learn, so a metric learning approach is considered instead. A typical metric learning approach is the triplet loss method, which constructs image triplets to learn an effective feature mapping under which the feature distance between same-class samples is smaller than that between different-class samples, thereby achieving correct comparison.
三元组损失(triplet loss)的目的就是通过学习，让参考样本和正样本的特征表达之间的距离尽可能小，而参考样本和负样本的特征表达之间的距离尽可能大，并且要让参考样本和正样本的特征表达之间的距离和参考样本和负样本的特征表达之间的距离之间有一个最小的间隔。The purpose of the triplet loss is to learn so that the distance between the feature representations of the reference sample and the positive sample is as small as possible, the distance between the feature representations of the reference sample and the negative sample is as large as possible, and there is a minimum margin between these two distances.
在另一个实施例中,三元组损失函数包括对同类样本的余弦距离的限定,以及对异类样本的余弦距离的限定。In another embodiment, the triplet loss function includes a definition of the cosine distance of a homogeneous sample and a definition of the cosine distance of the heterogeneous sample.
其中，同类样本是指参考样本和正样本，异类样本是指参考样本和负样本。同类样本的余弦距离是指参考样本和正样本的余弦距离，异类样本的余弦距离是指参考样本和负样本的余弦距离。Here, same-class samples refer to the reference sample and the positive sample, and different-class samples refer to the reference sample and the negative sample. The cosine distance of same-class samples is the cosine distance between the reference sample and the positive sample; the cosine distance of different-class samples is the cosine distance between the reference sample and the negative sample.
一方面，原始的triplet loss方法只是考虑了类间差距而没有考虑类内差距，如果类内分布不够聚敛，网络的泛化能力就会减弱，对场景适应性也会随之降低。另一方面，原始的triplet loss方法采用的是欧式距离来度量样本之间的相似度，实际上人脸模型部署后在特征比对环节，更多地会采用余弦距离来进行度量。欧氏距离衡量的是空间各点的绝对距离，跟各个点所在的位置坐标直接相关；而余弦距离衡量的是空间向量的夹角，更加体现在方向上的差异，而不是位置，从而更符合人脸特征空间的分布属性。On the one hand, the original triplet loss method only considers the inter-class gap and not the intra-class gap; if the intra-class distribution is not compact enough, the generalization ability of the network is weakened and its adaptability to new scenes decreases accordingly. On the other hand, the original triplet loss method uses the Euclidean distance to measure similarity between samples, whereas in practice, once a face model is deployed, the cosine distance is more commonly used in the feature comparison stage. The Euclidean distance measures the absolute distance between points in space and is directly tied to their position coordinates; the cosine distance measures the angle between vectors and reflects differences in direction rather than position, which better matches the distribution of the face feature space.
采用triplet loss方法，通过在线构造三元组数据输入网络，然后反向传播三元组的度量损失来进行迭代优化。每一个三元组包含三张图像，分别是一个参考样本，一个与参考样本同类的正样本，以及一个与参考样本异类的负样本，标记为(anchor,positive,negative)。原始triplet loss的基本思想是，通过度量学习使得参考样本与正样本之间的距离小于参考样本与负样本之间的距离，并且距离之差大于一个最小间隔参数α。因此原始的triplet loss损失函数如下：The triplet loss method constructs triplet data online as network input and then back-propagates the metric loss of the triplets for iterative optimization. Each triplet contains three images: a reference sample, a positive sample of the same class as the reference sample, and a negative sample of a different class, denoted (anchor, positive, negative). The basic idea of the original triplet loss is to use metric learning so that the distance between the reference sample and the positive sample is smaller than the distance between the reference sample and the negative sample, with the difference exceeding a minimum margin parameter α. The original triplet loss function is therefore as follows:
L = Σ_{i=1}^{N} [ ‖f(x_i^a) − f(x_i^p)‖_2^2 − ‖f(x_i^a) − f(x_i^n)‖_2^2 + α ]_+
其中，N是三元组数量，f(x_i^a)表示参考样本(anchor)的特征向量，f(x_i^p)表示同类正样本(positive)的特征向量，f(x_i^n)表示异类负样本(negative)的特征向量，‖·‖_2表示L2范式，即欧氏距离。[·]_+的含义如下：[z]_+ = max(z, 0)。Here N is the number of triplets, f(x_i^a) denotes the feature vector of the reference sample (anchor), f(x_i^p) the feature vector of a same-class positive sample, and f(x_i^n) the feature vector of a different-class negative sample; ‖·‖_2 denotes the L2 norm, i.e. the Euclidean distance; [·]_+ is defined as [z]_+ = max(z, 0).
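The original triplet loss above can be sketched in NumPy; the margin value and toy embeddings are illustrative, not values from the text:

```python
import numpy as np

def original_triplet_loss(a, p, n, alpha=0.2):
    """Sum over triplets of [||a - p||^2 - ||a - n||^2 + alpha]_+ .

    a, p, n are (N, d) arrays of anchor/positive/negative embeddings;
    alpha=0.2 is an illustrative margin.
    """
    d_ap = np.sum((a - p) ** 2, axis=1)  # squared Euclidean anchor-positive
    d_an = np.sum((a - n) ** 2, axis=1)  # squared Euclidean anchor-negative
    return float(np.sum(np.maximum(d_ap - d_an + alpha, 0.0)))

a = np.array([[0.0, 0.0]])
p = np.array([[0.1, 0.0]])  # close to the anchor
n = np.array([[1.0, 0.0]])  # far from the anchor
print(original_triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```

Only triplets that violate the margin contribute to the loss, which is exactly the property the OHEM strategy exploits when mining hard negatives.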
从上式可看出，原始的triplet loss函数只限定了同类样本(anchor,positive)与异类样本(anchor,negative)之间的距离，即通过间隔参数α尽可能增大类间距离，而对类内距离未作任何限定，即对同类样本之间的距离未作任何约束。如果类内距离比较分散，方差过大，网络的泛化能力就会减弱，样本被错分的概率就会更大。图4为在类间间隔一致、类内方差较大情况下，样本错分的概率示意图，图5为在类间间隔一致、类内方差较小情况下，样本错分的概率示意图，如图4和图5所示，阴影部分表示样本错分的概率，在类间间隔一致、类内方差较大情况下，样本错分的概率明显大于类间间隔一致、类内方差较小情况下样本错分的概率。As can be seen from the above formula, the original triplet loss function only constrains the distance between same-class pairs (anchor, positive) and different-class pairs (anchor, negative): it increases the inter-class distance as much as possible through the margin parameter α, but places no constraint on the intra-class distance, i.e. no constraint on the distance between same-class samples. If the intra-class distances are scattered and the variance is too large, the generalization ability of the network is weakened and the probability of misclassifying samples increases. FIG. 4 illustrates the misclassification probability when the inter-class margins are the same and the intra-class variance is large; FIG. 5 illustrates it when the inter-class margins are the same and the intra-class variance is small. As shown in FIGS. 4 and 5, the shaded areas represent the probability of misclassification: with the same inter-class margin, the misclassification probability is clearly larger when the intra-class variance is large than when it is small.
针对上述问题，本发明提出改进的triplet loss方法，一方面保留了原始方法中对类间距离的限定，同时增加了对类内距离的约束项，使得类内距离尽可能聚敛。其loss函数表达式为：To address the above problems, the present invention proposes an improved triplet loss method that retains the original method's constraint on inter-class distance while adding a constraint term on intra-class distance, making each class as compact as possible. Its loss function is:
L = Σ_{i=1}^{N} { [ cos(f(x_i^a), f(x_i^n)) − cos(f(x_i^a), f(x_i^p)) + α_1 ]_+ + [ α_2 − cos(f(x_i^a), f(x_i^p)) ]_+ }
其中，cos(·)表示余弦距离，其计算方式为 cos(x, y) = (x · y)/(‖x‖·‖y‖)；N是三元组数量，f(x_i^a)表示参考样本的特征向量，f(x_i^p)表示同类正样本的特征向量，f(x_i^n)表示异类负样本的特征向量；[·]_+的含义如下：[z]_+ = max(z, 0)；α_1为类间间隔参数，α_2为类内间隔参数。Here cos(·) denotes the cosine distance, computed as cos(x, y) = (x · y)/(‖x‖·‖y‖); N is the number of triplets; f(x_i^a), f(x_i^p) and f(x_i^n) denote the feature vectors of the reference sample, the same-class positive sample and the different-class negative sample respectively; [·]_+ is defined as [z]_+ = max(z, 0); α_1 is the inter-class margin parameter and α_2 the intra-class margin parameter.
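The improved loss can be sketched in NumPy as an inter-class term with margin α_1 plus an intra-class term with margin α_2, both computed on cosine similarities. The margin defaults are picked from the ranges the text gives (0–0.2 and 0.8–1.0), and the toy vectors are illustrative:

```python
import numpy as np

def cos_rows(x, y):
    """Row-wise cosine similarity between two (N, d) arrays."""
    return np.sum(x * y, axis=1) / (
        np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1))

def improved_triplet_loss(a, p, n, alpha1=0.1, alpha2=0.9):
    """Sum over triplets of the inter-class and intra-class terms."""
    cos_ap = cos_rows(a, p)
    cos_an = cos_rows(a, n)
    inter = np.maximum(cos_an - cos_ap + alpha1, 0.0)  # push negatives apart
    intra = np.maximum(alpha2 - cos_ap, 0.0)           # pull positives together
    return float(np.sum(inter + intra))

a = np.array([[1.0, 0.0]])
p_close = np.array([[1.0, 0.1]])  # nearly parallel to the anchor
p_far = np.array([[1.0, 1.0]])    # 45 degrees from the anchor
n = np.array([[0.0, 1.0]])        # orthogonal to the anchor
print(improved_triplet_loss(a, p_close, n))  # 0.0: both constraints met
print(improved_triplet_loss(a, p_far, n))    # > 0: intra-class term fires
```

The second case shows the new behavior: even though the negative is already well separated, a positive pair whose similarity falls below α_2 still incurs loss, which is what drives the intra-class compactness described in the text.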
Compared with the original triplet loss, the improved loss replaces the Euclidean distance with the cosine measure, which keeps the metric consistent between the training and deployment stages and improves the continuity of feature learning. The first term of the new loss plays the same role as the original triplet loss and enlarges the inter-class gap; the second term adds a distance constraint on same-class pairs (anchor, positive) to shrink the intra-class gap. α_1 is the inter-class margin parameter, with values in the range 0 to 0.2; α_2 is the intra-class margin parameter, with values in the range 0.8 to 1.0. It is worth noting that, because the cosine measure expresses the similarity between two samples, in the term [cos(f(x_i^a), f(x_i^n)) − cos(f(x_i^a), f(x_i^p)) + α_1]_+ only those triplets whose negative-pair similarity comes within α_1 of the positive-pair similarity actually participate in training.
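The improved loss above can be sketched in plain Python (an illustrative sketch only, not part of the claimed subject matter; the function names are assumptions, and the defaults α_1 = 0.2 and α_2 = 0.8 are taken from the parameter ranges stated above):

```python
import math

def cos_sim(x, y):
    # cosine similarity: (x . y) / (|x| * |y|)
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def improved_triplet_loss(triplets, alpha1=0.2, alpha2=0.8):
    # Sum, over all triplets, of the inter-class hinge term plus the
    # intra-class hinge term of the improved loss.
    loss = 0.0
    for anchor, positive, negative in triplets:
        ap = cos_sim(anchor, positive)   # same-class similarity
        an = cos_sim(anchor, negative)   # different-class similarity
        loss += max(an - ap + alpha1, 0.0)   # inter-class term: push negatives away
        loss += max(alpha2 - ap, 0.0)        # intra-class term: pull positives together
    return loss

# A triplet whose positive is already close and whose negative is far
# contributes nothing to the loss:
print(improved_triplet_loss([((1.0, 0.0), (1.0, 0.0), (0.0, 1.0))]))  # → 0.0
```

Only triplets whose hinge arguments are positive contribute, matching the remark above that merely "easy" triplets do not participate in training.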
The model is trained with the improved triplet loss function: the joint constraint of inter-class and intra-class losses drives the back-propagation optimization so that same-class samples lie as close together in feature space as possible while different-class samples lie as far apart as possible, sharpening the model's discriminative power and thereby improving the reliability of face verification.
S308: Input the validation-set data into the convolutional neural network; when the training termination condition is met, the trained convolutional neural network for face verification is obtained.
Specifically, 90% of the ID-scene image data pool is taken as the training set and the remaining 10% as the validation set. The improved triplet loss value is computed with the formula above and fed back into the convolutional neural network for iterative optimization. Meanwhile the model's performance on the validation set is monitored; when validation performance no longer rises, the model has reached convergence and the training phase terminates.
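The "stop when validation performance no longer rises" criterion can be sketched as a simple patience check (an illustrative sketch; the patent does not specify a patience window, so `patience=3` here is an assumption):

```python
def should_stop(val_accuracy_history, patience=3):
    # Stop once the last `patience` epochs have failed to beat the best
    # validation accuracy seen before them (assumed convergence criterion).
    if len(val_accuracy_history) <= patience:
        return False
    best_before = max(val_accuracy_history[:-patience])
    return max(val_accuracy_history[-patience:]) <= best_before
```

In practice the monitored quantity would be the verification accuracy (or equal-error rate) on the held-out 10% split.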
The face verification method above, on the one hand, adds a constraint on intra-class sample distances to the original triplet loss, shrinking the intra-class gap while enlarging the inter-class gap and improving the model's generalization ability; on the other hand, it changes the metric of the original triplet loss from Euclidean distance to cosine distance, keeping the training and deployment metrics consistent and improving the continuity of feature learning.
In another embodiment, the step of training the convolutional neural network further comprises: initializing with base-model parameters pre-trained on massive open-source face data, and adding a normalization layer and the improved triplet loss function layer after the feature output layer, to obtain the convolutional neural network to be trained.
Specifically, when deep learning is applied to ID-face matching, a conventional deep face recognition model trained on massive Internet face data suffers a sharp performance drop on ID-to-scene comparison in specific scenarios, while the sources of ID-scene data in such scenarios are limited; direct training often yields unsatisfactory results because of insufficient samples. There is therefore a strong need for a method that effectively performs extended training on small scenario-specific datasets, so as to raise the accuracy of the face recognition model in the target scenario and meet market demand.
Deep learning algorithms generally depend on massive training data. In ID-face matching, comparing an ID-card photo against a scene photo is a heterogeneous-sample matching problem, and conventional deep face recognition models trained on massive Internet face data degrade sharply on it. Since the sources of ID-scene data are limited (the same person's ID-card image and corresponding scene images are both required), little data is available for training, and direct training performs poorly due to insufficient samples. Model training for ID-face matching therefore usually adopts the idea of transfer learning: first train, on massive Internet face data, a base model that performs reliably on open-source test sets, then carry out a second, extended training stage on the limited ID-scene data, so that the model automatically learns feature representations of the specific modality and its performance improves. This process is shown in Figure 6.
During the second training stage, the whole network is initialized with the pre-trained base-model parameters, and an L2 normalization layer and the improved triplet loss layer are then added after the network's feature output layer. The structure of the convolutional neural network to be trained is shown in Figure 7.
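The appended L2 normalization layer simply rescales each feature vector to unit Euclidean length, after which the cosine similarity of two features reduces to their dot product. A minimal sketch (illustrative only; `eps` is an assumed guard against zero vectors):

```python
import math

def l2_normalize(v, eps=1e-12):
    # Rescale v to unit Euclidean norm; eps guards against division by zero.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]

# After normalization the vector lies on the unit sphere:
a = l2_normalize([3.0, 4.0])   # → [0.6, 0.8]
```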
In one embodiment, a face verification method proceeds as shown in the flow diagram of Figure 8 and comprises three phases: data acquisition and preprocessing, training, and deployment.
In the data acquisition and preprocessing phase, the card-reader module of the ID verification terminal reads the photo stored in the ID chip and the front camera captures a live photo; after passing through the face detector, landmark detector, and face alignment and cropping modules, size-normalized ID face images and scene face images are obtained.
In the training phase, 90% of the ID-scene image data pool is taken as the training set and the remaining 10% as the validation set. Since ID verification is mainly a comparison between ID photos and scene photos, if the ID photo serves as the anchor of a triplet, the other two images are both scene photos; conversely, if a scene photo serves as the anchor, the other two images are both ID photos. Triplets are constructed online with an OHEM (online hard example mining) strategy: during each iteration of network optimization, the current network performs a forward pass over the candidate triplets and selects the valid triplets that satisfy the condition; the improved triplet loss value is computed with the formula above and fed back into the network for iterative optimization. Meanwhile the model's performance on the validation set is monitored; when validation performance no longer rises, the model has reached convergence and the training phase terminates.
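The OHEM selection step, which keeps for each anchor the negative whose features lie closest under the cosine metric, can be sketched as follows (an illustrative sketch; in practice the candidate features come from the current network's forward pass, and the function names are assumptions):

```python
import math

def cos_sim(x, y):
    # cosine similarity between two feature vectors
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def hardest_negative(anchor, candidate_negatives):
    # The hardest negative is the different-subject sample most similar
    # to the anchor (highest cosine similarity = smallest cosine distance).
    return max(candidate_negatives, key=lambda f: cos_sim(anchor, f))
```

Mining the hardest negatives in this way ensures that the triplets fed to the loss are exactly those that still violate the margin and therefore carry gradient.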
In the deployment phase, the trained model is deployed to the ID verification terminal for use. Images captured by the device go through the same preprocessing as in the training phase; a forward pass through the network then yields a feature vector for each face image, the cosine distance between the two vectors gives the similarity of the two images, and a decision is made against a preset threshold: above the threshold, the two images belong to the same person; otherwise, to different persons.
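The deployment-stage decision rule reduces to thresholding the similarity of the two feature vectors (a sketch; the value `threshold=0.8` is an assumption for illustration, as the patent leaves the preset threshold unspecified):

```python
import math

def cos_sim(x, y):
    # cosine similarity between two feature vectors
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def same_person(id_feature, scene_feature, threshold=0.8):
    # Similarity at or above the preset threshold -> same person,
    # otherwise different persons.
    return cos_sim(id_feature, scene_feature) >= threshold
```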
In the face verification method above, whereas the original triplet loss function only constrains the inter-class distance relations, the improved loss adds an intra-class distance constraint, so that during training the network enlarges the inter-class gap while shrinking the intra-class gap as much as possible, improving the network's generalization ability and hence the model's adaptability to the target scenario. In addition, cosine distance replaces the Euclidean metric of the original triplet loss, which better matches the distribution of the face feature space and keeps the training-stage and deployment-stage metrics consistent, making the comparison results more reliable.
In one embodiment, a face verification apparatus is provided, as shown in Figure 9, comprising: an image acquisition module 902, an image preprocessing module 904, a feature acquisition module 906, a calculation module 908, and an authentication module 910.
The image acquisition module 902 is configured to acquire, based on a face verification request, an ID photo and a scene photo of a person.
The image preprocessing module 904 is configured to perform face detection, landmark localization, and image preprocessing on the scene photo and the ID photo respectively, obtaining a scene face image corresponding to the scene photo and an ID face image corresponding to the ID photo.
The feature acquisition module 906 is configured to input the scene face image and the ID face image into a pre-trained convolutional neural network model for face verification, and to obtain, from the model's output, a first feature vector corresponding to the scene face image and a second feature vector corresponding to the ID face image; the convolutional neural network model is obtained through supervised training with a triplet loss function.
The calculation module 908 is configured to compute the cosine distance between the first feature vector and the second feature vector.
The authentication module 910 is configured to compare the cosine distance against a preset threshold and determine the face verification result from the comparison.
The face verification apparatus above performs face verification with a pre-trained convolutional neural network. Because the model is trained under the supervision of the improved triplet loss function, and the similarity between the scene face image and the ID face image is computed as the cosine distance between the first and second feature vectors, a quantity that measures the angle between vectors and reflects differences in direction rather than in position, the method better matches the distribution of the face feature space and improves the reliability of face verification.
As shown in Figure 9, in another embodiment the face verification apparatus further comprises: a sample acquisition module 912, a triplet acquisition module 914, a training module 916, and a verification module 918.
The sample acquisition module 912 is configured to acquire labeled training samples, each comprising one ID face image and at least one scene face image labeled as belonging to each labeled subject.
The triplet acquisition module 914 is configured to train the convolutional neural network model on the training samples and to generate, via OHEM, the triplet elements corresponding to each training sample; the triplet elements comprise a reference sample, a positive sample, and a negative sample.
Specifically, the triplet acquisition module 914 randomly selects an image as the reference sample and selects an image of the same labeled subject but of a different category from the reference sample as the positive sample; following the OHEM strategy, it uses the currently trained convolutional neural network model to extract cosine distances between features and, for each reference sample, selects from the face images not belonging to the same labeled subject the image with the smallest distance and of a different category from the reference sample, as that reference sample's negative sample.
Specifically, when an ID photo is the reference sample, both the positive and the negative samples are scene photos; when a scene photo is the reference sample, both the positive and the negative samples are ID photos.
The training module 916 is configured to train the convolutional neural network model under the supervision of the triplet loss function according to the triplet elements of each training sample; this triplet loss function uses cosine distance as its metric and optimizes the model parameters via a stochastic gradient descent algorithm.
Specifically, the improved triplet loss function comprises a constraint on the cosine distance between same-class samples as well as a constraint on the cosine distance between different-class samples.
The improved triplet loss function is:
L = Σ_{i=1}^{N} ( [cos(f(x_i^a), f(x_i^n)) − cos(f(x_i^a), f(x_i^p)) + α_1]_+ + [α_2 − cos(f(x_i^a), f(x_i^p))]_+ )
where cos(·,·) denotes the cosine measure, computed as cos(x, y) = (x · y) / (‖x‖ ‖y‖); N is the number of triplets; f(x_i^a) denotes the feature vector of the reference sample, f(x_i^p) the feature vector of the same-class positive sample, and f(x_i^n) the feature vector of the different-class negative sample; [x]_+ means max(x, 0), i.e. [x]_+ equals x when x > 0 and 0 otherwise; α_1 is the inter-class margin parameter and α_2 is the intra-class margin parameter.
The verification module 918 is configured to input the validation-set data into the convolutional neural network model and, when the training termination condition is met, to obtain the trained convolutional neural network model for face verification.
In another embodiment, the face verification apparatus further comprises a model initialization module 920, configured to initialize with base-model parameters pre-trained on massive open-source face data and to add a normalization layer and the triplet loss function layer after the feature output layer, obtaining the convolutional neural network to be trained. The face verification apparatus above, on the one hand, adds a constraint on intra-class sample distances to the original triplet loss, shrinking the intra-class gap while enlarging the inter-class gap and improving the model's generalization ability; on the other hand, it changes the metric of the original triplet loss from Euclidean distance to cosine distance, keeping the training and deployment metrics consistent and improving the continuity of feature learning.
A computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the face verification method of the above embodiments are implemented.
A storage medium stores a computer program which, when executed by a processor, implements the steps of the face verification method of the above embodiments.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; nevertheless, any combination of these technical features that involves no contradiction shall be regarded as within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they shall not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art may make variations and improvements without departing from the concept of the present invention, and these all fall within the scope of protection of the present invention. The scope of protection of the present patent shall therefore be determined by the appended claims.

Claims (10)

  1. A face verification method based on triplet loss, comprising:
    acquiring, based on a face verification request, an ID photo and a scene photo of a person;
    performing face detection, landmark localization, and image preprocessing on the scene photo and the ID photo respectively, to obtain a scene face image corresponding to the scene photo and an ID face image corresponding to the ID photo;
    inputting the scene face image and the ID face image into a pre-trained convolutional neural network model for face verification, and obtaining, from the output of the convolutional neural network model, a first feature vector corresponding to the scene face image and a second feature vector corresponding to the ID face image, wherein the convolutional neural network model is obtained through supervised training with a triplet loss function;
    computing the cosine distance between the first feature vector and the second feature vector; and
    comparing the cosine distance against a preset threshold, and determining a face verification result according to the comparison.
  2. The method according to claim 1, further comprising:
    acquiring labeled training samples, each training sample comprising one ID face image and at least one scene face image labeled as belonging to each labeled subject;
    training a convolutional neural network model on the training samples, and generating, via OHEM, the triplet elements corresponding to each training sample, the triplet elements comprising a reference sample, a positive sample, and a negative sample;
    training the convolutional neural network model under the supervision of the triplet loss function according to the triplet elements of each training sample, the triplet loss function using cosine distance as its metric and optimizing model parameters via a stochastic gradient descent algorithm; and
    inputting validation-set data into the convolutional neural network model and, when a training termination condition is met, obtaining the trained convolutional neural network model for face verification.
  3. The method according to claim 2, wherein training the convolutional neural network model on the training samples and generating, via OHEM, the triplet elements corresponding to each training sample comprises:
    randomly selecting an image as the reference sample, and selecting an image of the same labeled subject but of a different category from the reference sample as the positive sample;
    according to the OHEM strategy, using the currently trained convolutional neural network model to extract cosine distances between features and, for each reference sample, selecting, from the images not belonging to the same labeled subject, the image with the smallest distance and of a different category from the reference sample, as the negative sample of that reference sample.
  4. The method according to claim 2, wherein the triplet loss function comprises a constraint on the cosine distance between same-class samples and a constraint on the cosine distance between different-class samples.
  5. The method according to claim 4, wherein the triplet loss function is:
    L = Σ_{i=1}^{N} ( [cos(f(x_i^a), f(x_i^n)) − cos(f(x_i^a), f(x_i^p)) + α_1]_+ + [α_2 − cos(f(x_i^a), f(x_i^p))]_+ )
    where cos(·,·) denotes the cosine measure, computed as cos(x, y) = (x · y) / (‖x‖ ‖y‖); N is the number of triplets; f(x_i^a) denotes the feature vector of the reference sample, f(x_i^p) the feature vector of the same-class positive sample, and f(x_i^n) the feature vector of the different-class negative sample; [x]_+ means max(x, 0), i.e. [x]_+ equals x when x > 0 and 0 otherwise; α_1 is the inter-class margin parameter and α_2 is the intra-class margin parameter.
  6. The method according to claim 2, further comprising: initializing with base-model parameters pre-trained on massive open-source face data, and adding a normalization layer and a triplet loss function layer after the feature output layer, to obtain the convolutional neural network model to be trained.
  7. A face verification apparatus based on triplet loss, comprising: an image acquisition module, an image preprocessing module, a feature acquisition module, a calculation module, and an authentication module;
    the image acquisition module being configured to acquire, based on a face verification request, an ID photo and a scene photo of a person;
    the image preprocessing module being configured to perform face detection, landmark localization, and image preprocessing on the scene photo and the ID photo respectively, to obtain a scene face image corresponding to the scene photo and an ID face image corresponding to the ID photo;
    the feature acquisition module being configured to input the scene face image and the ID face image into a pre-trained convolutional neural network model for face verification, and to obtain, from the output of the convolutional neural network model, a first feature vector corresponding to the scene face image and a second feature vector corresponding to the ID face image, wherein the convolutional neural network model is obtained through supervised training with a triplet loss function;
    the calculation module being configured to compute the cosine distance between the first feature vector and the second feature vector;
    the authentication module being configured to compare the cosine distance against a preset threshold and determine a face verification result according to the comparison.
  8. The apparatus according to claim 7, further comprising: a sample acquisition module, a triplet acquisition module, a training module, and a verification module;
    the sample acquisition module being configured to acquire labeled training samples, each training sample comprising one ID face image and at least one scene face image labeled as belonging to each labeled subject;
    the triplet acquisition module being configured to train a convolutional neural network model on the training samples and to generate, via OHEM, the triplet elements corresponding to each training sample, the triplet elements comprising a reference sample, a positive sample, and a negative sample;
    the training module being configured to train the convolutional neural network model under the supervision of the triplet loss function according to the triplet elements of each training sample, the triplet loss function using cosine distance as its metric and optimizing model parameters via a stochastic gradient descent algorithm;
    the verification module being configured to input validation-set data into the convolutional neural network model and, when a training termination condition is met, to obtain the trained convolutional neural network model for face verification.
  9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the triplet-loss-based face verification method according to any one of claims 1 to 6.
  10. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the triplet-loss-based face verification method according to any one of claims 1 to 6.
PCT/CN2018/109169 2017-12-26 2018-09-30 Face verification method and apparatus based on triplet loss, and computer device and storage medium WO2019128367A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711436879.4 2017-12-26
CN201711436879.4A CN108009528B (en) 2017-12-26 2017-12-26 Triple Loss-based face authentication method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2019128367A1 true WO2019128367A1 (en) 2019-07-04


Cited By (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414431A (en) * 2019-07-29 2019-11-05 广州像素数据技术股份有限公司 Face identification method and system based on elastic context relation loss function
CN110458233A (en) * 2019-08-13 2019-11-15 腾讯云计算(北京)有限责任公司 Combination grain object identification model training and recognition methods, device and storage medium
CN110516533A (en) * 2019-07-11 2019-11-29 同济大学 A kind of pedestrian based on depth measure discrimination method again
CN110555478A (en) * 2019-09-05 2019-12-10 东北大学 Fan multi-fault diagnosis method based on depth measurement network of difficult sample mining
CN110647880A (en) * 2019-08-12 2020-01-03 深圳市华付信息技术有限公司 Mobile terminal identity card image shielding judgment method
CN110647938A (en) * 2019-09-24 2020-01-03 北京市商汤科技开发有限公司 Image processing method and related device
CN110674637A (en) * 2019-09-06 2020-01-10 腾讯科技(深圳)有限公司 Character relation recognition model training method, device, equipment and medium
CN110705357A (en) * 2019-09-02 2020-01-17 深圳中兴网信科技有限公司 Face recognition method and face recognition device
CN110705393A (en) * 2019-09-17 2020-01-17 中国计量大学 Method for improving face recognition performance of community population
CN110796057A (en) * 2019-10-22 2020-02-14 上海交通大学 Pedestrian re-identification method and device and computer equipment
CN110852367A (en) * 2019-11-05 2020-02-28 上海联影智能医疗科技有限公司 Image classification method, computer device, and storage medium
CN110956098A (en) * 2019-11-13 2020-04-03 深圳和而泰家居在线网络科技有限公司 Image processing method and related equipment
CN111008550A (en) * 2019-09-06 2020-04-14 上海芯灵科技有限公司 Finger-vein identity authentication method based on multiple loss functions
CN111062430A (en) * 2019-12-12 2020-04-24 易诚高科(大连)科技有限公司 Pedestrian re-identification evaluation method based on probability density function
CN111079566A (en) * 2019-11-28 2020-04-28 深圳市信义科技有限公司 Large-scale face recognition model optimization system
CN111091089A (en) * 2019-12-12 2020-05-01 新华三大数据技术有限公司 Face image processing method and device, electronic equipment and storage medium
CN111126240A (en) * 2019-12-19 2020-05-08 西安工程大学 Three-channel feature fusion face recognition method
CN111126360A (en) * 2019-11-15 2020-05-08 西安电子科技大学 Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN111144240A (en) * 2019-12-12 2020-05-12 深圳数联天下智能科技有限公司 Image processing method and related equipment
CN111191563A (en) * 2019-12-26 2020-05-22 三盟科技股份有限公司 Face recognition method and system based on data sample and test data set training
CN111198964A (en) * 2020-01-10 2020-05-26 中国科学院自动化研究所 Image retrieval method and system
CN111209839A (en) * 2019-12-31 2020-05-29 上海涛润医疗科技有限公司 Face recognition method
CN111222411A (en) * 2019-11-28 2020-06-02 中国船舶重工集团公司第七一三研究所 Laser emission safe and rapid alarm method and device
CN111241925A (en) * 2019-12-30 2020-06-05 新大陆数字技术股份有限公司 Face quality evaluation method, system, electronic equipment and readable storage medium
CN111274946A (en) * 2020-01-19 2020-06-12 杭州涂鸦信息技术有限公司 Face recognition method, system and equipment
CN111368766A (en) * 2020-03-09 2020-07-03 云南安华防灾减灾科技有限责任公司 Cattle face detection and identification method based on deep learning
CN111414862A (en) * 2020-03-22 2020-07-14 西安电子科技大学 Expression recognition method based on neural network fusion key point angle change
CN111429414A (en) * 2020-03-18 2020-07-17 腾讯科技(深圳)有限公司 Artificial intelligence-based focus image sample determination method and related device
CN111507289A (en) * 2020-04-22 2020-08-07 上海眼控科技股份有限公司 Video matching method, computer device and storage medium
CN111539247A (en) * 2020-03-10 2020-08-14 西安电子科技大学 Hyper-spectrum face recognition method and device, electronic equipment and storage medium thereof
CN111582107A (en) * 2020-04-28 2020-08-25 浙江大华技术股份有限公司 Training method and recognition method of target re-recognition model, electronic equipment and device
CN111626212A (en) * 2020-05-27 2020-09-04 腾讯科技(深圳)有限公司 Method and device for identifying object in picture, storage medium and electronic device
CN111639535A (en) * 2020-04-29 2020-09-08 深圳英飞拓智能技术有限公司 Face recognition method and device based on deep learning
CN111738157A (en) * 2020-06-23 2020-10-02 平安科技(深圳)有限公司 Method and device for constructing data set of facial action units and computer equipment
CN111988614A (en) * 2020-08-14 2020-11-24 深圳前海微众银行股份有限公司 Hash coding optimization method and device and readable storage medium
CN112052821A (en) * 2020-09-15 2020-12-08 浙江智慧视频安防创新中心有限公司 Fire fighting channel safety detection method, device, equipment and storage medium
CN112069993A (en) * 2020-09-04 2020-12-11 西安西图之光智能科技有限公司 Dense face detection method and system based on facial features mask constraint and storage medium
CN112084956A (en) * 2020-09-11 2020-12-15 上海交通大学烟台信息技术研究院 Special face crowd screening system based on small sample learning prototype network
CN112257738A (en) * 2020-07-31 2021-01-22 北京京东尚科信息技术有限公司 Training method and device of machine learning model and classification method and device of image
CN112287765A (en) * 2020-09-30 2021-01-29 新大陆数字技术股份有限公司 Face living body detection method, device and equipment and readable storage medium
CN112307968A (en) * 2020-10-30 2021-02-02 天地伟业技术有限公司 Face recognition feature compression method
CN112328786A (en) * 2020-11-03 2021-02-05 平安科技(深圳)有限公司 Text classification method and device based on BERT, computer equipment and storage medium
CN112580011A (en) * 2020-12-25 2021-03-30 华南理工大学 Portrait encryption and decryption system facing biological feature privacy protection
CN112733574A (en) * 2019-10-14 2021-04-30 中移(苏州)软件技术有限公司 Face recognition method and device and computer readable storage medium
CN112766237A (en) * 2021-03-12 2021-05-07 东北林业大学 Unsupervised pedestrian re-identification method based on cluster feature point clustering
CN112836629A (en) * 2021-02-01 2021-05-25 清华大学深圳国际研究生院 Image classification method
CN112836719A (en) * 2020-12-11 2021-05-25 南京富岛信息工程有限公司 Indicator diagram similarity detection method fusing binary classification and triplets
CN112836566A (en) * 2020-12-01 2021-05-25 北京智云视图科技有限公司 Multitask neural network face key point detection method for edge equipment
CN112861626A (en) * 2021-01-04 2021-05-28 西北工业大学 Fine-grained expression classification method based on small sample learning
CN112949780A (en) * 2020-04-21 2021-06-11 佳都科技集团股份有限公司 Feature model training method, device, equipment and storage medium
CN112966724A (en) * 2021-02-07 2021-06-15 惠州市博实结科技有限公司 Method and device for classifying image single categories
CN113157956A (en) * 2021-04-23 2021-07-23 雅马哈发动机(厦门)信息***有限公司 Picture searching method, system, mobile terminal and storage medium
CN113344031A (en) * 2021-05-13 2021-09-03 清华大学 Text classification method
CN113362096A (en) * 2020-03-04 2021-09-07 驰众信息技术(上海)有限公司 Frame advertisement image matching method based on deep learning
CN113435545A (en) * 2021-08-14 2021-09-24 北京达佳互联信息技术有限公司 Training method and device of image processing model
CN113469253A (en) * 2021-07-02 2021-10-01 河海大学 Electricity theft detection method based on a triplet twin network
CN113486804A (en) * 2021-07-07 2021-10-08 科大讯飞股份有限公司 Object identification method, device, equipment and storage medium
CN113569991A (en) * 2021-08-26 2021-10-29 深圳市捷顺科技实业股份有限公司 Person-certificate comparison model training method, computer equipment and computer storage medium
CN113642481A (en) * 2021-08-17 2021-11-12 百度在线网络技术(北京)有限公司 Recognition method, training method, device, electronic equipment and storage medium
CN113762019A (en) * 2021-01-22 2021-12-07 北京沃东天骏信息技术有限公司 Training method of feature extraction network, face recognition method and device
CN113780461A (en) * 2021-09-23 2021-12-10 中国人民解放军国防科技大学 Robust neural network training method based on feature matching
CN113807122A (en) * 2020-06-11 2021-12-17 阿里巴巴集团控股有限公司 Model training method, object recognition method and device, and storage medium
CN113887653A (en) * 2021-10-20 2022-01-04 西安交通大学 Positioning method and system for tightly-coupled weak supervised learning based on ternary network
CN114049479A (en) * 2021-11-10 2022-02-15 苏州魔视智能科技有限公司 Self-supervision fisheye camera image feature point extraction method and device and storage medium
GB2600922A (en) * 2020-11-05 2022-05-18 Thales Holdings Uk Plc One shot learning for identifying data items similar to a query data item
CN114663965A (en) * 2022-05-24 2022-06-24 之江实验室 Person-certificate comparison method and device based on two-stage alternating learning
CN114817888A (en) * 2022-06-27 2022-07-29 中国信息通信研究院 Certificate registering and issuing method, device and storage medium
CN114882558A (en) * 2022-04-29 2022-08-09 陕西师范大学 Learning scene real-time identity authentication method based on face recognition technology
CN114926445A (en) * 2022-05-31 2022-08-19 哈尔滨工业大学 Twin network-based small sample crop disease image identification method and system
CN116206355A (en) * 2023-04-25 2023-06-02 鹏城实验室 Face recognition model training, image registration and face recognition method and device
CN116959064A (en) * 2023-06-25 2023-10-27 上海腾桥信息技术有限公司 Certificate verification method and device, computer equipment and storage medium
CN116977461A (en) * 2023-06-30 2023-10-31 北京开普云信息科技有限公司 Portrait generation method, device, storage medium and equipment for specific scene

Families Citing this family (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009528B (en) * 2017-12-26 2020-04-07 广州广电运通金融电子股份有限公司 Triplet-loss-based face authentication method and device, computer equipment and storage medium
CN108922542B (en) * 2018-06-01 2023-04-28 平安科技(深圳)有限公司 Sample triplet acquisition method and device, computer equipment and storage medium
CN108921033A (en) * 2018-06-04 2018-11-30 北京京东金融科技控股有限公司 Face picture comparison method, device, medium and electronic equipment
CN110598840B (en) * 2018-06-13 2023-04-18 富士通株式会社 Knowledge migration method, information processing apparatus, and storage medium
CN109145704B (en) * 2018-06-14 2022-02-22 西安电子科技大学 Face portrait recognition method based on face attributes
CN108921952B (en) * 2018-06-15 2022-09-06 深圳大学 Object functionality prediction method, device, computer equipment and storage medium
CN108985198A (en) * 2018-07-02 2018-12-11 四川斐讯信息技术有限公司 Cosine distance calculation method based on big-data feature vectors
CN110738071A (en) * 2018-07-18 2020-01-31 浙江中正智能科技有限公司 face algorithm model training method based on deep learning and transfer learning
CN109145956B (en) * 2018-07-26 2021-12-14 上海慧子视听科技有限公司 Scoring method, scoring device, computer equipment and storage medium
CN108960342B (en) * 2018-08-01 2021-09-14 中国计量大学 Image similarity calculation method based on improved Soft-Max loss function
CN108960209B (en) * 2018-08-09 2023-07-21 腾讯科技(深圳)有限公司 Identity recognition method, identity recognition device and computer readable storage medium
CN109165589B (en) * 2018-08-14 2021-02-23 北京颂泽科技有限公司 Vehicle weight recognition method and device based on deep learning
CN109145991B (en) * 2018-08-24 2020-07-31 北京地平线机器人技术研发有限公司 Image group generation method, image group generation device and electronic equipment
CN109271877A (en) * 2018-08-24 2019-01-25 北京智芯原动科技有限公司 A kind of human figure identification method and device
CN110874602A (en) * 2018-08-30 2020-03-10 北京嘀嘀无限科技发展有限公司 Image identification method and device
CN109344740A (en) * 2018-09-12 2019-02-15 上海了物网络科技有限公司 Face identification system, method and computer readable storage medium
CN109359541A (en) * 2018-09-17 2019-02-19 南京邮电大学 Sketch face recognition method based on deep transfer learning
CN109543524A (en) * 2018-10-18 2019-03-29 同盾控股有限公司 Image recognition method and device
CN109214361A (en) * 2018-10-18 2019-01-15 康明飞(北京)科技有限公司 Face recognition method and device, and ticket verification method and device
CN109492583A (en) * 2018-11-09 2019-03-19 安徽大学 Vehicle re-identification method based on deep learning
CN109685106A (en) * 2018-11-19 2019-04-26 深圳博为教育科技有限公司 Image recognition method, face attendance method, device and system
CN109522850B (en) * 2018-11-22 2023-03-10 中山大学 Action similarity evaluation method based on small sample learning
CN109685121B (en) * 2018-12-11 2023-07-18 中国科学院苏州纳米技术与纳米仿生研究所 Training method of image retrieval model, image retrieval method and computer equipment
CN111325223B (en) * 2018-12-13 2023-10-24 中国电信股份有限公司 Training method and device for deep learning model and computer readable storage medium
CN109711443A (en) * 2018-12-14 2019-05-03 平安城市建设科技(深圳)有限公司 Neural-network-based floor plan recognition method, device, equipment and storage medium
CN109815801A (en) * 2018-12-18 2019-05-28 北京英索科技发展有限公司 Face identification method and device based on deep learning
CN109657792A (en) * 2018-12-19 2019-04-19 北京世纪好未来教育科技有限公司 Method, apparatus and computer-readable medium for constructing a neural network
CN109711358B (en) * 2018-12-28 2020-09-04 北京远鉴信息技术有限公司 Neural network training method, face recognition system and storage medium
CN109871762B (en) * 2019-01-16 2023-08-08 平安科技(深圳)有限公司 Face recognition model evaluation method and device
CN111461152B (en) * 2019-01-21 2024-04-05 同方威视技术股份有限公司 Cargo detection method and device, electronic equipment and computer readable medium
CN109886186A (en) * 2019-02-18 2019-06-14 上海骏聿数码科技有限公司 Face recognition method and device
US10885385B2 (en) * 2019-03-19 2021-01-05 Sap Se Image search and training system
CN109948568A (en) * 2019-03-26 2019-06-28 东华大学 Embedded human face identifying system based on ARM microprocessor and deep learning
CN110147732A (en) * 2019-04-16 2019-08-20 平安科技(深圳)有限公司 Finger-vein recognition method, device, computer equipment and storage medium
CN111832364B (en) * 2019-04-22 2024-04-23 普天信息技术有限公司 Face recognition method and device
CN110147833B (en) * 2019-05-09 2021-10-12 北京迈格威科技有限公司 Portrait processing method, device, system and readable storage medium
CN110213660B (en) * 2019-05-27 2021-08-20 广州荔支网络技术有限公司 Program distribution method, system, computer device and storage medium
CN110674688B (en) * 2019-08-19 2023-10-31 深圳力维智联技术有限公司 Face recognition model acquisition method, system and medium for video monitoring scene
CN112580406A (en) * 2019-09-30 2021-03-30 北京中关村科金技术有限公司 Face comparison method and device and storage medium
CN111104846B (en) * 2019-10-16 2022-08-30 平安科技(深圳)有限公司 Data detection method and device, computer equipment and storage medium
CN110765933A (en) * 2019-10-22 2020-02-07 山西省信息产业技术研究院有限公司 Dynamic portrait sensing comparison method applied to driver identity authentication system
CN110929099B (en) * 2019-11-28 2023-07-21 杭州小影创新科技股份有限公司 Short video frame semantic extraction method and system based on multi-task learning
CN111062338B (en) * 2019-12-19 2023-11-17 厦门商集网络科技有限责任公司 License and portrait consistency comparison method and system
CN111178249A (en) * 2019-12-27 2020-05-19 杭州艾芯智能科技有限公司 Face comparison method and device, computer equipment and storage medium
CN111368644B (en) * 2020-02-14 2024-01-05 深圳市商汤科技有限公司 Image processing method, device, electronic equipment and storage medium
CN111401257B (en) * 2020-03-17 2022-10-04 天津理工大学 Face recognition method based on cosine loss under non-constraint condition
CN111401277A (en) * 2020-03-20 2020-07-10 深圳前海微众银行股份有限公司 Face recognition model updating method, device, equipment and medium
CN113538075A (en) * 2020-04-14 2021-10-22 阿里巴巴集团控股有限公司 Data processing method, model training method, device and equipment
CN111709313B (en) * 2020-05-27 2022-07-29 杭州电子科技大学 Pedestrian re-identification method based on local and channel combination characteristics
CN112492383A (en) * 2020-12-03 2021-03-12 珠海格力电器股份有限公司 Video frame generation method and device, storage medium and electronic equipment
CN113065495B (en) * 2021-04-13 2023-07-14 深圳技术大学 Image similarity calculation method, target object re-recognition method and system
CN113283359A (en) * 2021-06-02 2021-08-20 万达信息股份有限公司 Authentication method and system for handheld certificate photo and electronic equipment
CN113344875A (en) * 2021-06-07 2021-09-03 武汉象点科技有限公司 Abnormal image detection method based on self-supervision learning
CN113269155A (en) * 2021-06-28 2021-08-17 苏州市科远软件技术开发有限公司 End-to-end face recognition method, device, equipment and storage medium
CN113642468A (en) * 2021-08-16 2021-11-12 中国银行股份有限公司 Identity authentication method and device
CN113688793A (en) * 2021-09-22 2021-11-23 万章敏 Training method of face model and face recognition system
CN116188256A (en) * 2021-11-25 2023-05-30 北京字跳网络技术有限公司 Super-resolution image processing method, device, equipment and medium
CN114387457A (en) * 2021-12-27 2022-04-22 腾晖科技建筑智能(深圳)有限公司 Face intra-class interval optimization method based on parameter adjustment
DE102022132343A1 (en) * 2022-12-06 2024-06-06 Bundesdruckerei Gmbh Authentication device and method for authenticating a person using an identification document assigned to the person, as well as identity document and method for producing
CN116127298B (en) * 2023-02-22 2024-03-19 北京邮电大学 Small sample radio frequency fingerprint identification method based on triplet loss

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9129216B1 (en) * 2013-07-15 2015-09-08 Xdroid Kft. System, method and apparatus for computer aided association of relevant images with text
CN106599827A (en) * 2016-12-09 2017-04-26 浙江工商大学 Small target rapid detection method based on deep convolution neural network
CN107194341A (en) * 2017-05-16 2017-09-22 西安电子科技大学 Maxout multi-convolutional-neural-network fusion face recognition method and system
CN107423690A (en) * 2017-06-26 2017-12-01 广东工业大学 Face recognition method and device
CN108009528A (en) * 2017-12-26 2018-05-08 广州广电运通金融电子股份有限公司 Face authentication method, device, computer equipment and storage medium based on Triplet Loss

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AMOS, BRANDON: "OpenFace: A general-purpose face recognition library with mobile applications", CMU SCHOOL OF COMPUTER SCIENCE, TECH. REP., 30 June 2016 (2016-06-30), XP055378815 *
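The OpenFace technical report cited above, like the present application, trains a face embedding network with a triplet loss over (anchor, positive, negative) image triplets. For orientation, a minimal NumPy sketch of that loss follows; it is illustrative only, and the function and variable names are ours, not taken from the patent or the report.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on embedding vectors.

    Pulls the anchor-positive squared distance below the
    anchor-negative squared distance by at least `margin`.
    """
    d_ap = np.sum((anchor - positive) ** 2)  # squared L2, same identity
    d_an = np.sum((anchor - negative) ** 2)  # squared L2, different identity
    return max(0.0, d_ap - d_an + margin)

# Toy unit-norm embeddings (real systems L2-normalize network outputs).
a = np.array([1.0, 0.0])
p = np.array([0.8, 0.6])   # same identity: close to the anchor
n = np.array([0.0, 1.0])   # different identity: far from the anchor
print(triplet_loss(a, p, n))  # → 0.0 (margin already satisfied)
```

Training minimizes this quantity summed over mined triplets, so only triplets that violate the margin contribute a nonzero gradient.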

Cited By (119)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516533A (en) * 2019-07-11 2019-11-29 同济大学 Pedestrian re-identification method based on deep metric learning
CN110414431B (en) * 2019-07-29 2022-12-27 广州像素数据技术股份有限公司 Face recognition method and system based on elastic context relation loss function
CN110414431A (en) * 2019-07-29 2019-11-05 广州像素数据技术股份有限公司 Face identification method and system based on elastic context relation loss function
CN110647880A (en) * 2019-08-12 2020-01-03 深圳市华付信息技术有限公司 Mobile-terminal ID card image occlusion judgment method
CN110458233A (en) * 2019-08-13 2019-11-15 腾讯云计算(北京)有限责任公司 Combination grain object identification model training and recognition methods, device and storage medium
CN110458233B (en) * 2019-08-13 2024-02-13 腾讯云计算(北京)有限责任公司 Mixed granularity object recognition model training and recognition method, device and storage medium
CN110705357A (en) * 2019-09-02 2020-01-17 深圳中兴网信科技有限公司 Face recognition method and face recognition device
CN110555478A (en) * 2019-09-05 2019-12-10 东北大学 Fan multi-fault diagnosis method based on depth measurement network of difficult sample mining
CN110555478B (en) * 2019-09-05 2023-02-03 东北大学 Fan multi-fault diagnosis method based on depth measurement network of difficult sample mining
CN110674637A (en) * 2019-09-06 2020-01-10 腾讯科技(深圳)有限公司 Character relation recognition model training method, device, equipment and medium
CN111008550A (en) * 2019-09-06 2020-04-14 上海芯灵科技有限公司 Identification method for finger vein authentication identity based on Multiple loss function
CN110705393A (en) * 2019-09-17 2020-01-17 中国计量大学 Method for improving face recognition performance of community population
CN110705393B (en) * 2019-09-17 2023-02-03 中国计量大学 Method for improving face recognition performance of community population
CN110647938A (en) * 2019-09-24 2020-01-03 北京市商汤科技开发有限公司 Image processing method and related device
CN110647938B (en) * 2019-09-24 2022-07-15 北京市商汤科技开发有限公司 Image processing method and related device
CN112733574A (en) * 2019-10-14 2021-04-30 中移(苏州)软件技术有限公司 Face recognition method and device and computer readable storage medium
CN112733574B (en) * 2019-10-14 2023-04-07 中移(苏州)软件技术有限公司 Face recognition method and device and computer readable storage medium
CN110796057A (en) * 2019-10-22 2020-02-14 上海交通大学 Pedestrian re-identification method and device and computer equipment
CN110852367B (en) * 2019-11-05 2023-10-31 上海联影智能医疗科技有限公司 Image classification method, computer device, and storage medium
CN110852367A (en) * 2019-11-05 2020-02-28 上海联影智能医疗科技有限公司 Image classification method, computer device, and storage medium
CN110956098A (en) * 2019-11-13 2020-04-03 深圳和而泰家居在线网络科技有限公司 Image processing method and related equipment
CN111126360B (en) * 2019-11-15 2023-03-24 西安电子科技大学 Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN111126360A (en) * 2019-11-15 2020-05-08 西安电子科技大学 Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN111079566A (en) * 2019-11-28 2020-04-28 深圳市信义科技有限公司 Large-scale face recognition model optimization system
CN111222411B (en) * 2019-11-28 2023-09-01 中国船舶重工集团公司第七一三研究所 Laser emission safety rapid alarm method and device
CN111222411A (en) * 2019-11-28 2020-06-02 中国船舶重工集团公司第七一三研究所 Laser emission safe and rapid alarm method and device
CN111079566B (en) * 2019-11-28 2023-05-02 深圳市信义科技有限公司 Large-scale face recognition model optimization system
CN111062430A (en) * 2019-12-12 2020-04-24 易诚高科(大连)科技有限公司 Pedestrian re-identification evaluation method based on probability density function
CN111091089B (en) * 2019-12-12 2022-07-29 新华三大数据技术有限公司 Face image processing method and device, electronic equipment and storage medium
CN111062430B (en) * 2019-12-12 2023-05-09 易诚高科(大连)科技有限公司 Pedestrian re-identification evaluation method based on probability density function
CN111091089A (en) * 2019-12-12 2020-05-01 新华三大数据技术有限公司 Face image processing method and device, electronic equipment and storage medium
CN111144240A (en) * 2019-12-12 2020-05-12 深圳数联天下智能科技有限公司 Image processing method and related equipment
CN111126240B (en) * 2019-12-19 2023-04-07 西安工程大学 Three-channel feature fusion face recognition method
CN111126240A (en) * 2019-12-19 2020-05-08 西安工程大学 Three-channel feature fusion face recognition method
CN111191563A (en) * 2019-12-26 2020-05-22 三盟科技股份有限公司 Face recognition method and system based on data sample and test data set training
CN111241925B (en) * 2019-12-30 2023-08-18 新大陆数字技术股份有限公司 Face quality assessment method, system, electronic equipment and readable storage medium
CN111241925A (en) * 2019-12-30 2020-06-05 新大陆数字技术股份有限公司 Face quality evaluation method, system, electronic equipment and readable storage medium
CN111209839B (en) * 2019-12-31 2023-05-23 上海涛润医疗科技有限公司 Face recognition method
CN111209839A (en) * 2019-12-31 2020-05-29 上海涛润医疗科技有限公司 Face recognition method
CN111198964A (en) * 2020-01-10 2020-05-26 中国科学院自动化研究所 Image retrieval method and system
CN111198964B (en) * 2020-01-10 2023-04-25 中国科学院自动化研究所 Image retrieval method and system
CN111274946A (en) * 2020-01-19 2020-06-12 杭州涂鸦信息技术有限公司 Face recognition method, system and equipment
CN111274946B (en) * 2020-01-19 2023-05-05 杭州涂鸦信息技术有限公司 Face recognition method, system and equipment
CN113362096A (en) * 2020-03-04 2021-09-07 驰众信息技术(上海)有限公司 Frame advertisement image matching method based on deep learning
CN111368766A (en) * 2020-03-09 2020-07-03 云南安华防灾减灾科技有限责任公司 Cattle face detection and identification method based on deep learning
CN111368766B (en) * 2020-03-09 2023-08-18 云南安华防灾减灾科技有限责任公司 Deep learning-based cow face detection and recognition method
CN111539247B (en) * 2020-03-10 2023-02-10 西安电子科技大学 Hyper-spectrum face recognition method and device, electronic equipment and storage medium thereof
CN111539247A (en) * 2020-03-10 2020-08-14 西安电子科技大学 Hyper-spectrum face recognition method and device, electronic equipment and storage medium thereof
CN111429414B (en) * 2020-03-18 2023-04-07 腾讯科技(深圳)有限公司 Artificial intelligence-based focus image sample determination method and related device
CN111429414A (en) * 2020-03-18 2020-07-17 腾讯科技(深圳)有限公司 Artificial intelligence-based focus image sample determination method and related device
CN111414862A (en) * 2020-03-22 2020-07-14 西安电子科技大学 Expression recognition method based on neural network fusion key point angle change
CN111414862B (en) * 2020-03-22 2023-03-24 西安电子科技大学 Expression recognition method based on neural network fusion key point angle change
CN112949780A (en) * 2020-04-21 2021-06-11 佳都科技集团股份有限公司 Feature model training method, device, equipment and storage medium
CN112949780B (en) * 2020-04-21 2022-09-20 佳都科技集团股份有限公司 Feature model training method, device, equipment and storage medium
CN111507289A (en) * 2020-04-22 2020-08-07 上海眼控科技股份有限公司 Video matching method, computer device and storage medium
CN111582107B (en) * 2020-04-28 2023-09-29 浙江大华技术股份有限公司 Training method and recognition method of target re-recognition model, electronic equipment and device
CN111582107A (en) * 2020-04-28 2020-08-25 浙江大华技术股份有限公司 Training method and recognition method of target re-recognition model, electronic equipment and device
CN111639535B (en) * 2020-04-29 2023-08-22 深圳英飞拓智能技术有限公司 Face recognition method and device based on deep learning
CN111639535A (en) * 2020-04-29 2020-09-08 深圳英飞拓智能技术有限公司 Face recognition method and device based on deep learning
CN111626212B (en) * 2020-05-27 2023-09-26 腾讯科技(深圳)有限公司 Method and device for identifying object in picture, storage medium and electronic device
CN111626212A (en) * 2020-05-27 2020-09-04 腾讯科技(深圳)有限公司 Method and device for identifying object in picture, storage medium and electronic device
CN113807122A (en) * 2020-06-11 2021-12-17 阿里巴巴集团控股有限公司 Model training method, object recognition method and device, and storage medium
CN111738157A (en) * 2020-06-23 2020-10-02 平安科技(深圳)有限公司 Method and device for constructing data set of facial action units and computer equipment
CN111738157B (en) * 2020-06-23 2023-07-21 平安科技(深圳)有限公司 Face action unit data set construction method and device and computer equipment
CN112257738A (en) * 2020-07-31 2021-01-22 北京京东尚科信息技术有限公司 Training method and device of machine learning model and classification method and device of image
CN111988614B (en) * 2020-08-14 2022-09-13 深圳前海微众银行股份有限公司 Hash coding optimization method and device and readable storage medium
CN111988614A (en) * 2020-08-14 2020-11-24 深圳前海微众银行股份有限公司 Hash coding optimization method and device and readable storage medium
CN112069993B (en) * 2020-09-04 2024-02-13 西安西图之光智能科技有限公司 Dense face detection method and system based on facial-feature mask constraint and storage medium
CN112069993A (en) * 2020-09-04 2020-12-11 西安西图之光智能科技有限公司 Dense face detection method and system based on facial features mask constraint and storage medium
CN112084956A (en) * 2020-09-11 2020-12-15 上海交通大学烟台信息技术研究院 Special face crowd screening system based on small sample learning prototype network
CN112052821A (en) * 2020-09-15 2020-12-08 浙江智慧视频安防创新中心有限公司 Fire fighting channel safety detection method, device, equipment and storage medium
CN112052821B (en) * 2020-09-15 2023-07-07 浙江智慧视频安防创新中心有限公司 Fire-fighting channel safety detection method, device, equipment and storage medium
CN112287765B (en) * 2020-09-30 2024-06-04 新大陆数字技术股份有限公司 Face living body detection method, device, equipment and readable storage medium
CN112287765A (en) * 2020-09-30 2021-01-29 新大陆数字技术股份有限公司 Face living body detection method, device and equipment and readable storage medium
CN112307968A (en) * 2020-10-30 2021-02-02 天地伟业技术有限公司 Face recognition feature compression method
CN112328786A (en) * 2020-11-03 2021-02-05 平安科技(深圳)有限公司 Text classification method and device based on BERT, computer equipment and storage medium
GB2600922B (en) * 2020-11-05 2024-04-10 Thales Holdings Uk Plc One shot learning for identifying data items similar to a query data item
GB2600922A (en) * 2020-11-05 2022-05-18 Thales Holdings Uk Plc One shot learning for identifying data items similar to a query data item
CN112836566A (en) * 2020-12-01 2021-05-25 北京智云视图科技有限公司 Multitask neural network face key point detection method for edge equipment
CN112836719B (en) * 2020-12-11 2024-01-05 南京富岛信息工程有限公司 Indicator diagram similarity detection method fusing binary classification and triplets
CN112836719A (en) * 2020-12-11 2021-05-25 南京富岛信息工程有限公司 Indicator diagram similarity detection method fusing binary classification and triplets
CN112580011B (en) * 2020-12-25 2022-05-24 华南理工大学 Portrait encryption and decryption system facing biological feature privacy protection
CN112580011A (en) * 2020-12-25 2021-03-30 华南理工大学 Portrait encryption and decryption system facing biological feature privacy protection
CN112861626B (en) * 2021-01-04 2024-03-08 西北工业大学 Fine granularity expression classification method based on small sample learning
CN112861626A (en) * 2021-01-04 2021-05-28 西北工业大学 Fine-grained expression classification method based on small sample learning
CN113762019A (en) * 2021-01-22 2021-12-07 北京沃东天骏信息技术有限公司 Training method of feature extraction network, face recognition method and device
CN113762019B (en) * 2021-01-22 2024-04-09 北京沃东天骏信息技术有限公司 Training method of feature extraction network, face recognition method and device
CN112836629B (en) * 2021-02-01 2024-03-08 清华大学深圳国际研究生院 Image classification method
CN112836629A (en) * 2021-02-01 2021-05-25 清华大学深圳国际研究生院 Image classification method
CN112966724A (en) * 2021-02-07 2021-06-15 惠州市博实结科技有限公司 Method and device for classifying image single categories
CN112966724B (en) * 2021-02-07 2024-04-09 惠州市博实结科技有限公司 Method and device for classifying image single categories
CN112766237A (en) * 2021-03-12 2021-05-07 东北林业大学 Unsupervised pedestrian re-identification method based on cluster feature point clustering
CN113157956A (en) * 2021-04-23 2021-07-23 雅马哈发动机(厦门)信息***有限公司 Picture searching method, system, mobile terminal and storage medium
CN113344031A (en) * 2021-05-13 2021-09-03 清华大学 Text classification method
CN113344031B (en) * 2021-05-13 2022-12-27 清华大学 Text classification method
CN113469253B (en) * 2021-07-02 2024-05-14 河海大学 Electricity theft detection method based on a triplet twin network
CN113469253A (en) * 2021-07-02 2021-10-01 河海大学 Electricity theft detection method based on a triplet twin network
CN113486804B (en) * 2021-07-07 2024-02-20 科大讯飞股份有限公司 Object identification method, device, equipment and storage medium
CN113486804A (en) * 2021-07-07 2021-10-08 科大讯飞股份有限公司 Object identification method, device, equipment and storage medium
CN113435545A (en) * 2021-08-14 2021-09-24 北京达佳互联信息技术有限公司 Training method and device of image processing model
CN113642481A (en) * 2021-08-17 2021-11-12 百度在线网络技术(北京)有限公司 Recognition method, training method, device, electronic equipment and storage medium
CN113569991A (en) * 2021-08-26 2021-10-29 深圳市捷顺科技实业股份有限公司 Person-certificate comparison model training method, computer equipment and computer storage medium
CN113569991B (en) * 2021-08-26 2024-05-28 深圳市捷顺科技实业股份有限公司 Person-certificate comparison model training method, computer equipment and computer storage medium
CN113780461A (en) * 2021-09-23 2021-12-10 中国人民解放军国防科技大学 Robust neural network training method based on feature matching
CN113887653B (en) * 2021-10-20 2024-02-06 西安交通大学 Positioning method and system for tight coupling weak supervision learning based on ternary network
CN113887653A (en) * 2021-10-20 2022-01-04 西安交通大学 Positioning method and system for tightly-coupled weak supervised learning based on ternary network
CN114049479A (en) * 2021-11-10 2022-02-15 苏州魔视智能科技有限公司 Self-supervision fisheye camera image feature point extraction method and device and storage medium
CN114882558B (en) * 2022-04-29 2024-02-23 陕西师范大学 Learning scene real-time identity authentication method based on face recognition technology
CN114882558A (en) * 2022-04-29 2022-08-09 陕西师范大学 Learning scene real-time identity authentication method based on face recognition technology
CN114663965A (en) * 2022-05-24 2022-06-24 之江实验室 Person-certificate comparison method and device based on two-stage alternating learning
CN114663965B (en) * 2022-05-24 2022-10-21 之江实验室 Person-certificate comparison method and device based on two-stage alternating learning
CN114926445B (en) * 2022-05-31 2024-03-26 哈尔滨工业大学 Small sample crop disease image identification method and system based on twin network
CN114926445A (en) * 2022-05-31 2022-08-19 哈尔滨工业大学 Twin network-based small sample crop disease image identification method and system
CN114817888A (en) * 2022-06-27 2022-07-29 中国信息通信研究院 Certificate registering and issuing method, device and storage medium
CN116206355A (en) * 2023-04-25 2023-06-02 鹏城实验室 Face recognition model training, image registration and face recognition method and device
CN116959064A (en) * 2023-06-25 2023-10-27 上海腾桥信息技术有限公司 Certificate verification method and device, computer equipment and storage medium
CN116959064B (en) * 2023-06-25 2024-04-26 上海腾桥信息技术有限公司 Certificate verification method and device, computer equipment and storage medium
CN116977461A (en) * 2023-06-30 2023-10-31 北京开普云信息科技有限公司 Portrait generation method, device, storage medium and equipment for specific scene
CN116977461B (en) * 2023-06-30 2024-03-08 北京开普云信息科技有限公司 Portrait generation method, device, storage medium and equipment for specific scene

Also Published As

Publication number Publication date
CN108009528A (en) 2018-05-08
CN108009528B (en) 2020-04-07

Similar Documents

Publication Publication Date Title
WO2019128367A1 (en) Face verification method and apparatus based on triplet loss, and computer device and storage medium
US10755084B2 (en) Face authentication to mitigate spoofing
CN106780906B (en) Person-ID unified recognition method and system based on deep convolutional neural networks
WO2019024636A1 (en) Identity authentication method, system and apparatus
WO2019120115A1 (en) Facial recognition method, apparatus, and computer apparatus
US10002286B1 (en) System and method for face recognition robust to multiple degradations
CN105740780B (en) Method and device for detecting living human face
US9189686B2 (en) Apparatus and method for iris image analysis
WO2018086543A1 (en) Living body identification method, identity authentication method, terminal, server and storage medium
CN105956572A (en) Living body face detection method based on convolutional neural network
WO2022206319A1 (en) Image processing method and apparatus, and device, storage medium and computer program product
CN105740779B (en) Method and device for detecting living human face
WO2016150240A1 (en) Identity authentication method and apparatus
Islam et al. Multibiometric human recognition using 3D ear and face features
CN113033519B (en) Living body detection method, estimation network processing method, device and computer equipment
WO2020088029A1 (en) Liveness detection method, storage medium, and electronic device
Peter et al. Improving ATM security via face recognition
CN112686191B (en) Living body anti-counterfeiting method, system, terminal and medium based on three-dimensional information of human face
Qin et al. Finger-vein image quality evaluation based on the representation of grayscale and binary image
Goud et al. Smart attendance notification system using SMTP with face recognition
Valehi et al. A graph matching algorithm for user authentication in data networks using image-based physical unclonable functions
Yuan et al. SALM: Smartphone-based identity authentication using lip motion characteristics
CN114373213A (en) Juvenile identity recognition method and device based on face recognition
TWI632509B (en) Face recognition apparatus and method thereof, method for increasing image recognition accuracy, and computer-readable storage medium
Hemasree et al. Facial Skin Texture and Distributed Dynamic Kernel Support Vector Machine (DDKSVM) Classifier for Age Estimation in Facial Wrinkles.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 18897492
Country of ref document: EP
Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 Ep: pct application non-entry in european phase
Ref document number: 18897492
Country of ref document: EP
Kind code of ref document: A1