CN112560559A - Method and device for updating face library - Google Patents


Publication number
CN112560559A
Authority
CN
China
Prior art keywords
face
target
sample
face sample
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910917079.7A
Other languages
Chinese (zh)
Inventor
Zhao Guangyao (赵光耀)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201910917079.7A priority Critical patent/CN112560559A/en
Publication of CN112560559A publication Critical patent/CN112560559A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method and a device for updating a face library. The method includes: a face library updating device obtains a first face sample collected by a camera; determines a second face sample corresponding to the first face sample; compares the second face sample with the face image corresponding to a second target in a face library, to judge whether the target presented by the second face sample is identified as the second target, where the second target is an endorser of the first target; and, when the target presented by the second face sample is identified as the second target, updates the face library for the first target according to the first face sample. Because the second target endorses the first target, the accuracy of face library updates is guaranteed while the opportunities to update the face library are increased.

Description

Method and device for updating face library
Technical Field
The present application relates to the field of artificial intelligence, and more particularly, to a method and an apparatus for updating a face library.
Background
In practical applications of face recognition, a person's appearance may change; for example, facial features change as a person ages, which reduces the accuracy of face recognition. Automatically updating the face library can mitigate this problem: a collected face image is identified, and the face library is updated if identification succeeds. However, if the registered face image was collected long ago and the preset face recognition threshold is high, the user's face image is rarely identified successfully, so opportunities to automatically update the face library rarely arise.
Disclosure of Invention
The application provides a method and a device for updating a face library, which can reduce the error rate of face library updates and thereby improve recognition accuracy.
In a first aspect, a method for updating a face library is provided. The method includes: a face library updating device obtains a first face sample collected by a camera, where the first face sample is a sample for which it is to be judged whether the target it presents is identified as a first target; determines a second face sample corresponding to the first face sample; compares the second face sample with the face image corresponding to a second target in the face library, to judge whether the target presented by the second face sample is identified as the second target, where the second target is an endorser of the first target; and, when the target presented by the second face sample is identified as the second target, updates the face image corresponding to the first target in the face library according to the first face sample.
Optionally, the face library updating apparatus obtains a first face sample collected by the camera, and may obtain the first face sample by shooting through the camera.
Optionally, the face library updating apparatus obtains a first face sample collected by the camera, and may obtain the first face sample through a memory, where the memory is used to store the first face sample collected by the camera.
According to this scheme, when the target presented by the second face sample corresponding to the first face sample can be identified as the endorser, the first face sample is used to update the face image of the first target in the face library. Because an endorser is set for the first target, the first face sample can be used to update the face library while the accuracy of the update is ensured. Moreover, compared with updating the face library only after the first face sample itself is successfully identified, this scheme reduces the error rate of face library updates and increases the opportunities to update the face library while ensuring update accuracy.
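As a non-authoritative sketch of this endorser-based flow (the feature-vector representation, cosine similarity, and the 0.8 threshold are all illustrative assumptions; the application does not prescribe them):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def endorser_based_update(face_library, first_sample, second_sample,
                          first_target, endorser_of, threshold=0.8):
    """Update first_target's image in the face library using first_sample
    when the companion second_sample is recognized as first_target's
    endorser. face_library maps target id -> feature vector;
    endorser_of maps target id -> endorser id. All names are illustrative."""
    second_target = endorser_of.get(first_target)
    if second_target is None or second_target not in face_library:
        return False
    sim = cosine_similarity(second_sample, face_library[second_target])
    if sim >= threshold:  # endorser recognized, so trust first_sample
        face_library[first_target] = first_sample
        return True
    return False
```

In this sketch the update succeeds even if the first face sample itself would fail recognition, which is exactly the extra update opportunity the scheme describes.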
With reference to the first aspect, in certain implementations of the first aspect, the determining whether the target presented by the second face sample is identified as the second target includes: when the target presented by the first face sample is not recognized as the first target, judging whether the target presented by the second face sample is recognized as the second target.
According to this scheme, it is first judged whether the target presented by the first face sample is identified as the first target; when it is, the face library can be updated directly with the first face sample, which reduces the amount of computation.
With reference to the first aspect, in certain implementations of the first aspect, the judging whether the target presented by the second face sample is identified as the second target includes: when a first similarity between the first face sample and a first face image is the largest among a plurality of second similarities, judging whether the target presented by the second face sample is identified as the second target, where the first face image is the face image corresponding to the first target in the face library, and each second similarity is the similarity between the first face sample and a face image in the face library.
According to the scheme of the embodiment of the application, the accuracy of updating the face image corresponding to the first target is improved by judging the similarity relation between the first face sample and the face image corresponding to the first target.
With reference to the first aspect, in some implementations of the first aspect, the determining whether the target presented by the second face sample is identified as the second target when the first similarity between the first face sample and the first face image is the largest similarity among a plurality of second similarities includes: and when the first similarity between the first face sample and the first face image is the maximum similarity in a plurality of second similarities and is greater than or equal to a first threshold, judging whether a target presented by the second face sample is identified as a second target.
According to the scheme of the embodiment of the application, the relationship between the similarity between the first face sample and the face image corresponding to the first target and the first threshold is judged, so that the first face sample with low similarity is prevented from being used for updating the face image of the first target, and the accuracy of updating the face image corresponding to the first target is further improved.
With reference to the first aspect, in some implementations of the first aspect, the judging whether the target presented by the second face sample is identified as the second target includes: if a third similarity between the second face sample and a second face image is greater than or equal to a second threshold, identifying the target presented by the second face sample as the second target, where the second face image is the face image corresponding to the second target in the face library.
With reference to the first aspect, in certain implementations of the first aspect, the identifying, when the third similarity between the second face sample and the second face image is greater than or equal to the second threshold, the target presented by the second face sample as the second target includes: if the third similarity is greater than or equal to the second threshold and the third similarity is the maximum similarity among a plurality of fourth similarities, the second face sample is identified as the second target, and the fourth similarity is the similarity between the second face sample and the face images in the face library.
According to the scheme of the embodiment of the application, the relationship between the similarity between the second face sample and the face image corresponding to the second target and the second threshold is judged, so that the target presented by the second face sample with lower similarity is prevented from being identified as the second target, and the accuracy of updating the face image corresponding to the first target is ensured.
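Both recognition conditions in this aspect share the same shape: a sample is identified as a target only when its similarity to that target's library image clears a threshold and is also the largest similarity across the library. A minimal sketch under that reading (the dict representation and function name are illustrative assumptions):

```python
def identify_as(sample_sims, target, threshold):
    """sample_sims: target id -> similarity between a face sample and that
    target's library image. The sample's presented target is identified
    as `target` only when its similarity clears `threshold` and is the
    largest similarity in the library."""
    s = sample_sims.get(target)
    if s is None:
        return False
    return s >= threshold and s == max(sample_sims.values())
```

The same helper covers the first-sample check against the first threshold and the second-sample check against the second threshold.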
With reference to the first aspect, in certain implementations of the first aspect, the second face sample and the first face sample are face samples acquired at the same location, or face samples acquired within the same time period, or face samples acquired from the same video source.
With reference to the first aspect, in certain implementations of the first aspect, the first face sample is a face sample collected at a restricted location corresponding to the first target; or the second face sample is a face sample collected in a limited place corresponding to the second target.
According to this scheme, limiting the places where the first or second face sample is collected improves recognition accuracy, and narrowing the collection range reduces the amount of computation and network traffic.
With reference to the first aspect, in certain implementations of the first aspect, there are Q first face samples, where Q is an integer greater than 1, and the updating, according to the first face sample, of the face image corresponding to the first target in the face library when the target presented by the second face sample is identified as the second target includes: when the target presented by the second face sample is identified as the second target, determining the first face sample as a candidate face sample of the first target, where the number of candidate face samples is M, 1 < M ≤ Q, and M is an integer; determining a target candidate face sample from the M candidate face samples according to at least one of a first parameter and a second parameter, where the first parameter includes the similarity between each candidate face sample and a third face image, the third face image is the face image corresponding to the first target in the face library, and the second parameter is the number of effective second face samples corresponding to each candidate face sample, where the effective second face samples corresponding to the m-th candidate face sample are those among the T second face samples corresponding to the m-th candidate face sample whose presented target is identified as the second target, m ∈ [1, M], m is an integer, and T is a positive integer; and updating the face image corresponding to the first target in the face library according to the target candidate face sample.
According to this scheme, the number of candidate face samples of the first target is increased, and the target candidate face sample is then selected according to at least one of the similarity and the actual endorsement count, which improves the accuracy of the face library update. For example, the higher the similarity between a target candidate face sample and the face image corresponding to the first target in the face library, and the greater the number of actual endorsements corresponding to that sample, the higher the probability that the target it presents is the first target, and the more accurately it can update the face image corresponding to the first target.
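A sketch of one possible selection rule (the application allows either parameter alone or both together; ranking by endorsement count and then by similarity is an illustrative choice, and the dict keys are assumptions):

```python
def select_target_candidate(candidates):
    """candidates: list of dicts with keys 'sample', 'similarity' (to the
    first target's library image, the first parameter) and 'endorsements'
    (number of effective second face samples, the second parameter).
    Returns the sample of the candidate ranked highest by endorsement
    count, with similarity as the tiebreaker."""
    best = max(candidates,
               key=lambda c: (c['endorsements'], c['similarity']))
    return best['sample']
```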
With reference to the first aspect, in certain implementations of the first aspect, the determining of a target candidate face sample from the M candidate face samples according to at least one of a first parameter and a second parameter includes: clustering the M candidate face samples, and determining the target candidate face sample among the candidate face samples of a first class according to at least one of the first parameter and the second parameter, where the first class is the one or more classes whose number of candidate face samples is greater than or equal to a third threshold.
Optionally, the similarity between candidate face samples corresponding to the first target may be used as the inter-sample distance for clustering.
According to the scheme of the embodiment of the application, the M candidate face samples are clustered, the classes with fewer candidate face samples are discarded, and the target candidate face samples are determined in the remaining classes according to at least one parameter of the first parameter and the second parameter, so that the updating accuracy can be further improved.
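A sketch of this clustering step under illustrative assumptions (greedy single-link clustering with a pairwise similarity function and a cluster-membership threshold; the application does not prescribe a clustering algorithm, so this is only one plausible realization):

```python
def cluster_and_select(candidates, sim, third_threshold, cluster_sim=0.7):
    """Greedily cluster candidate samples: a candidate joins the first
    cluster containing a member with pairwise similarity >= cluster_sim,
    else starts a new cluster. Classes with fewer than third_threshold
    members are discarded; the best remaining candidate is returned
    (ranked by endorsement count, then similarity), or None."""
    clusters = []
    for c in candidates:
        for cl in clusters:
            if any(sim(c['sample'], o['sample']) >= cluster_sim for o in cl):
                cl.append(c)
                break
        else:
            clusters.append([c])
    kept = [c for cl in clusters if len(cl) >= third_threshold for c in cl]
    if not kept:
        return None
    return max(kept, key=lambda c: (c['endorsements'], c['similarity']))
```

Discarding small classes filters out isolated, likely mislabeled samples before the final selection, as the scheme describes.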
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: performing N times of identification aiming at the first target based on N third face samples, wherein N is an integer greater than 1; and determining the first target as a target to be updated according to the N recognition results.
According to this scheme, it is first judged whether a target is a target to be updated, and the face library is updated only for targets that need updating, which reduces the frequency of face library updates and thereby the amount of computation and network traffic.
With reference to the first aspect, in some implementations of the first aspect, a fifth similarity between the third face sample and the fourth face image is a maximum similarity among a plurality of sixth similarities, where the fourth face image is a face image corresponding to the first target in the face library, and the sixth similarity is a similarity between the third face sample and a face image in the face library.
With reference to the first aspect, in certain implementations of the first aspect, determining the first target as a target to be updated according to the N recognition results includes one of the following: when the number K of successful recognitions among the N results is less than or equal to a fourth threshold, determining the first target as a target to be updated, where K is an integer; when the ratio of the number K of successful recognitions to N is less than or equal to a fifth threshold, determining the first target as a target to be updated, where K is an integer; when the number L of failed recognitions among the N results is greater than or equal to a sixth threshold, determining the first target as a target to be updated, where L is an integer; when the ratio of the number L of failed recognitions to N is greater than or equal to a seventh threshold, determining the first target as a target to be updated, where L is an integer; or when the ratio of the number K of successful recognitions to the number L of failed recognitions is less than or equal to an eighth threshold, determining the first target as a target to be updated, where K and L are integers.
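The criteria above can be sketched as configurable checks over the N recognition results (parameter names and the boolean-list representation are illustrative; any single configured criterion triggers an update):

```python
def needs_update(results, k_max=None, k_ratio_max=None,
                 l_min=None, l_ratio_min=None, kl_ratio_max=None):
    """results: list of booleans, one per recognition attempt for the
    first target (True = success). Returns True when any configured
    threshold criterion is met."""
    n = len(results)
    k = sum(results)   # K: number of successful recognitions
    l = n - k          # L: number of failed recognitions
    if k_max is not None and k <= k_max:
        return True
    if k_ratio_max is not None and k / n <= k_ratio_max:
        return True
    if l_min is not None and l >= l_min:
        return True
    if l_ratio_min is not None and l / n >= l_ratio_min:
        return True
    if kl_ratio_max is not None and l > 0 and k / l <= kl_ratio_max:
        return True
    return False
```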
In a second aspect, a face library updating apparatus is provided, including: an obtaining unit, configured to obtain a first face sample collected by a camera, where the first face sample is a sample for which it is to be judged whether the target it presents is identified as a first target; and a processing unit, configured to: determine a second face sample corresponding to the first face sample; compare the second face sample with the face image corresponding to a second target in a face library, to judge whether the target presented by the second face sample is identified as the second target, where the second target is an endorser of the first target; and, when the target presented by the second face sample is identified as the second target, update the face image corresponding to the first target in the face library according to the first face sample.
Optionally, the face library updating apparatus obtains a first face sample collected by the camera, and may obtain the first face sample by shooting through the camera.
Optionally, the face library updating apparatus obtains a first face sample collected by the camera, and may obtain the first face sample through a memory, where the memory is used to store the first face sample collected by the camera.
According to this scheme, when the target presented by the second face sample corresponding to the first face sample can be identified as the endorser, the first face sample is used to update the face image of the first target in the face library. Because an endorser is set for the first target, the first face sample can be used to update the face library while the accuracy of the update is ensured. Moreover, compared with updating the face library only after the first face sample itself is successfully identified, this scheme reduces the error rate of face library updates and increases the opportunities to update the face library while ensuring update accuracy.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to: when the target presented by the first face sample is not recognized as the first target, judging whether the target presented by the second face sample is recognized as the second target.
According to this scheme, it is first judged whether the target presented by the first face sample is identified as the first target; when it is, the face library can be updated directly with the first face sample, which reduces the amount of computation.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to: when the first similarity between the first face sample and the first face image is the maximum similarity among a plurality of second similarities, judging whether a target presented by the second face sample is identified as a second target, wherein the first face image is a face image corresponding to the first target in the face library, and the second similarity is the similarity between the first face sample and the face image in the face library.
According to the scheme of the embodiment of the application, the accuracy of updating the face image corresponding to the first target is improved by judging the similarity relation between the first face sample and the face image corresponding to the first target.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to: and when the first similarity between the first face sample and the first face image is the maximum similarity in a plurality of second similarities and is greater than or equal to a first threshold, judging whether a target presented by the second face sample is identified as a second target.
According to the scheme of the embodiment of the application, the relationship between the similarity between the first face sample and the face image corresponding to the first target and the first threshold is judged, so that the first face sample with low similarity is prevented from being used for updating the face image of the first target, and the accuracy of updating the face image corresponding to the first target is further improved.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to: and if the third similarity between the second face sample and the second face image is greater than or equal to a second threshold value, identifying the target presented by the second face sample as the second target, wherein the second face image is a face image corresponding to the second target in the face library.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to: if the third similarity is greater than or equal to the second threshold and the third similarity is the maximum similarity among a plurality of fourth similarities, the second face sample is identified as the second target, and the fourth similarity is the similarity between the second face sample and the face images in the face library.
According to the scheme of the embodiment of the application, the relationship between the similarity between the second face sample and the face image corresponding to the second target and the second threshold is judged, so that the target presented by the second face sample with lower similarity is prevented from being identified as the second target, and the accuracy of updating the face image corresponding to the first target is ensured.
With reference to the second aspect, in some implementations of the second aspect, the second face sample and the first face sample are face samples acquired at the same location, or the second face sample and the first face sample are face samples acquired at the same time interval, or the second face sample and the first face sample are face samples acquired in the same video source.
With reference to the second aspect, in some implementations of the second aspect, the first face sample is a face sample collected at a restricted location corresponding to the first target; or the second face sample is a face sample collected in a limited place corresponding to the second target.
According to this scheme, limiting the places where the first or second face sample is collected improves recognition accuracy, and narrowing the collection range reduces the amount of computation and network traffic.
With reference to the second aspect, in certain implementations of the second aspect, there are Q first face samples, where Q is an integer greater than 1, and the processing unit is configured to: when the target presented by the second face sample is identified as the second target, determine the first face sample as a candidate face sample of the first target, where the number of candidate face samples is M, 1 < M ≤ Q, and M is an integer; determine a target candidate face sample from the M candidate face samples according to at least one of a first parameter and a second parameter, where the first parameter includes the similarity between each candidate face sample and a third face image, the third face image is the face image corresponding to the first target in the face library, and the second parameter is the number of effective second face samples corresponding to each candidate face sample, where the effective second face samples corresponding to the m-th candidate face sample are those among the T second face samples corresponding to the m-th candidate face sample whose presented target is identified as the second target, m ∈ [1, M], m is an integer, and T is a positive integer; and update the face image corresponding to the first target in the face library according to the target candidate face sample.
According to this scheme, the number of candidate face samples of the first target is increased, and the target candidate face sample is then selected according to at least one of the similarity and the actual endorsement count, which improves the accuracy of the face library update. For example, the higher the similarity between a target candidate face sample and the face image corresponding to the first target in the face library, and the greater the number of actual endorsements corresponding to that sample, the higher the probability that the target it presents is the first target, and the more accurately it can update the face image corresponding to the first target.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to: and clustering the M candidate face samples, and determining the target candidate face sample in a first class of candidate face samples according to at least one parameter of a first parameter and a second parameter, wherein the first class is one or more classes of which the number of the included candidate face samples is greater than or equal to a third threshold value.
Optionally, the similarity between candidate face samples corresponding to the first target may be used as the inter-sample distance for clustering.
According to the scheme of the embodiment of the application, the M candidate face samples are clustered, the classes with fewer candidate face samples are discarded, and the target candidate face samples are determined in the remaining classes according to at least one parameter of the first parameter and the second parameter, so that the updating accuracy can be further improved.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is further configured to: performing N times of identification aiming at the first target based on N third face samples, wherein N is an integer greater than 1; and determining the first target as a target to be updated according to the N recognition results.
According to this scheme, it is first judged whether a target is a target to be updated, and the face library is updated only for targets that need updating, which reduces the frequency of face library updates and thereby the amount of computation and network traffic.
With reference to the second aspect, in some implementations of the second aspect, a fifth similarity between the third face sample and the fourth face image is a maximum similarity among a plurality of sixth similarities, where the fourth face image is a face image corresponding to the first target in the face library, and the sixth similarity is a similarity between the third face sample and a face image in the face library.
With reference to the second aspect, in certain implementations of the second aspect, the processing unit is configured to perform one of the following: when the number K of successful recognitions among the N results is less than or equal to a fourth threshold, determine the first target as a target to be updated, where K is an integer; when the ratio of the number K of successful recognitions to N is less than or equal to a fifth threshold, determine the first target as a target to be updated, where K is an integer; when the number L of failed recognitions among the N results is greater than or equal to a sixth threshold, determine the first target as a target to be updated, where L is an integer; when the ratio of the number L of failed recognitions to N is greater than or equal to a seventh threshold, determine the first target as a target to be updated, where L is an integer; or when the ratio of the number K of successful recognitions to the number L of failed recognitions is less than or equal to an eighth threshold, determine the first target as a target to be updated, where K and L are integers.
In a third aspect, an apparatus for updating a face library is provided, the apparatus comprising: a memory for storing a program; a processor for executing the program stored in the memory, the processor being configured to perform the method of the first aspect when the program stored in the memory is executed.
In a fourth aspect, an electronic device is provided, where the electronic device includes the face library updating apparatus in the third aspect.
The electronic device may be a mobile terminal (e.g., a smart phone), a tablet computer, a notebook computer, an augmented reality/virtual reality device, an in-vehicle terminal device, and the like.
In a fifth aspect, a computer readable storage medium is provided, the computer readable storage medium storing program code comprising instructions for performing the steps of the method in the first aspect.
In a sixth aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.
In a seventh aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to perform the method in the first aspect.
Optionally, as an implementation manner, the chip may further include a memory, where instructions are stored in the memory, and the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in the first aspect.
It is to be understood that, in the present application, the method of the first aspect may specifically refer to the method of the first aspect as well as any one of the various implementations of the first aspect.
Drawings
Fig. 1 is a schematic diagram of a video conference communication system according to an embodiment of the present application.
Fig. 2 is a schematic flow chart of a face library updating method according to an embodiment of the present application.
Fig. 3 is a schematic flow chart of a face recognition method according to an embodiment of the present application.
Fig. 4 is a schematic flow chart of a face library updating method according to another embodiment of the present application.
Fig. 5 is a schematic flow chart of a method of determining candidate face samples in an embodiment of the present application.
Fig. 6 is a schematic block diagram of a face library updating apparatus according to an embodiment of the present application.
Fig. 7 is a schematic diagram of a hardware structure of a face library updating apparatus according to an embodiment of the present application.
Detailed Description
The technical solution in the present application will be described below with reference to the accompanying drawings.
The technical solution of the embodiment of the present application can be applied to a video conference communication system, and fig. 1 shows a schematic diagram of a video conference communication system of the embodiment of the present application, where the video conference communication system generally includes: the system comprises a video conference terminal, a face recognition server, an Artificial Intelligence (AI) gateway and an enterprise address book server. Wherein the peripheral equipment of the video conference terminal comprises an AI camera. The videoconference communication system may also include other network elements such as a Multipoint Control Unit (MCU), a voiceprint recognition server, etc. Fig. 1 is a schematic diagram of a video conference communication system according to an embodiment of the present application, and does not limit the present application. It should be noted that the technical solution of the embodiment of the present application is not limited to be applied to a video conference communication system, and may also be applied to other systems that need to update a face library.
Each video conference terminal corresponds to a conference site, and its main functions include: establishing a call; encoding, packaging, and sending local video, audio, data, and control information; and decoding received data packets to restore the video, audio, data, and control information. The video conference terminal also provides conference control and picture display, and can run various conference applications, such as face-scan meeting start, face-scan check-in, welcome messages, electronic nameplates, voice-controlled directing, and conference statistics.
The AI camera is one of conference cameras, and can perform sound-face matching according to a sound source position and a face position.
The AI gateway, which may also be called an AI server, is used for queuing and concurrency control of face recognition and address book query tasks, and serves as a bridge between the video conference terminals on one side and the face recognition server and the address book server on the other.
Video conferencing is a multimedia communication method that uses video technology and equipment to hold a conference over a communication network. During a video conference, participants at two or more different places can hear each other's voices and see each other's images, and can also see the other party's conference room and the objects, pictures, forms, files, and the like displayed in the conference. This shortens the distance between participants, enhances the conference atmosphere, makes people feel as if they were meeting in the same place, and significantly improves working efficiency.
In a video conference room, there are typically multiple participants. AI has broad application prospects in meeting rooms, for example face-scan meeting start, face-scan check-in, welcome messages, electronic nameplates, voice-controlled directing, and conference statistics. Face-scan meeting start means that when a legitimate participant arrives on site, the startup and preparation of the meeting room equipment are triggered automatically without any operation, and a reserved conference is connected. Face-scan check-in means that when a legitimate participant arrives on site, check-in is completed automatically without any operation. The welcome message is a greeting such as "Welcome!" with the participant's name appearing on the screen. The electronic nameplate automatically displays the name or title of the speaker on the close-up picture. With voice-controlled directing, when a participant says, for example, "Hello! Please give a close-up of so-and-so!", the system automatically finds that venue and that person and gives a close-up picture. Conference statistics automatically counts the time period and duration of each person's participation in the conference. However, in the video conference scene, the identities of participants cannot be identified in real time through participant login or other modes requiring active cooperation; the conference system needs to automatically acquire and screen the participants' biometric features to identify their identities, among which face recognition is widely applied. In this application scenario, the recognition result is perceived by the participants in real time; even if the probability of a recognition error is very low, once a recognition error occurs it can seriously degrade the conference experience, so continuously improving recognition accuracy is very important.
In practical applications of face recognition, a person's appearance may change; for example, as the person ages, the facial features change, reducing the accuracy of face recognition. Suppose the face recognition similarity threshold is set to 85, that is, recognition is determined to be successful if the face recognition similarity is higher than 85. With such a high threshold, when identification succeeds, the probability of a recognition error is low, possibly below 1%; but for a person whose registered photo was collected years earlier, recognition may frequently fail. Suppose instead the threshold is set to 75, that is, recognition is determined to be successful if the similarity is higher than 75. With this lower threshold, when identification succeeds, the probability of a recognition error is higher, possibly above 10%.
The above problem can be mitigated by automatically updating the face library. For example, based on an originally registered face, a Generative Adversarial Network (GAN) is used to generate virtual faces of different age groups, a face library for those age groups is built, and a face recognition neural network model adapted to those age groups is trained. This scheme expands the tolerance range of facial features for a single person, but the tolerance ranges of different persons then become close to each other or even overlap; when the number of targets in the face library is large, for example more than one hundred thousand, the face recognition accuracy drops sharply.
For another example, face image data of a user to be unlocked and authenticated is collected, and if face recognition succeeds, the face library is updated. This scheme can handle slow changes in facial features (for example, changes caused by aging or growing a beard). However, if the registered face image was collected long ago, then when face recognition is performed on a newly collected sample of that person, the recognition similarity is often low even if it is the highest among all targets, and may fall below the preset face recognition threshold; recognition is then unsuccessful, and the opportunity to automatically update the face library is hard to obtain. The higher the face recognition threshold, the higher the recognition accuracy; therefore, if the preset face recognition threshold is lowered, the recognition accuracy may decrease, and the face library may be updated with an incorrect (another person's) sample, causing subsequent recognition errors.
As another example, if identification or unlocking by another means (e.g., fingerprint) succeeds, the face library is updated. This scheme can handle sudden changes in facial features (e.g., makeup, shaving, cosmetic surgery). However, a biometric identification method with higher accuracy than face recognition, such as fingerprint recognition, requires active cooperation of the user; this is not an automatic face library update, is inconvenient to operate, and is impractical.
Fig. 2 illustrates a method 200 for face library update according to an embodiment of the present application. The method 200 includes steps S210-S240.
S210, a face library updating apparatus acquires a first face sample collected by a camera, where whether the target presented by the first face sample is a first target is to be identified.
Optionally, the face library updating apparatus obtains a first face sample collected by the camera, and may obtain the first face sample by shooting through the camera.
Optionally, the face library updating apparatus obtains a first face sample collected by the camera, and may obtain the first face sample through a memory, where the memory is used to store the first face sample collected by the camera.
S220, the face library updating apparatus determines a second face sample corresponding to the first face sample.
By way of example and not limitation, the correspondence between the second face sample and the first face sample may be determined by at least one of the following parameters.
Parameter 1: the second face sample and the first face sample may be co-located face samples.
Parameter 2: the second face sample and the first face sample may be face samples acquired at the same time period. For example, the first face sample and the second face sample are acquired within a first time period, which may be a preset time period.
Or, a time difference between the acquisition time of the second face sample and the acquisition time of the first face sample may be smaller than or equal to a preset difference. For example, the second face sample is acquired within a period of a preset difference value before or after the time of acquisition of the first face sample.
Parameter 3: the second face sample and the first face sample may be face samples collected within the same video source. The same video source may be a video taken by the same device at the same location. The same video source can also be a video that is continuously shot by the same device.
The corresponding relationship between the second face sample and the first face sample can also be determined by the two or more parameters. For example, the second face sample and the first face sample may be face samples taken at the same location during the same time period. The accuracy of updating the face library can be further improved by limiting the acquisition time interval and the acquisition place of the first face sample and the second face sample. For another example, the second face sample and the first face sample may be face samples collected in the same video source, and a time difference between a collection time of the second face sample and a collection time of the first face sample is smaller than or equal to a preset difference.
By way of example and not limitation, the first face sample may be a face sample acquired at a restricted location corresponding to the first target. Further, the second face sample and the first face sample may both be face samples collected in a restricted location corresponding to the first target.
The limited places corresponding to the first target may be preset, or may be one or more places closest to the daily activity range of the first target obtained by analyzing the daily activity range of the first target through big data.
Through the restriction to the first face sample collection place, the accuracy of discernment can be improved, the collection scope of restriction first face sample simultaneously can reduce calculated amount and network flow.
By way of example and not limitation, the second face sample may be a face sample acquired at a restricted location corresponding to the second target. Further, the second face sample and the first face sample may both be face samples collected in a restricted location corresponding to a second target.
The limited places corresponding to the second target may be preset, or may be one or more places closest to the daily activity range of the second target obtained by analyzing the daily activity range of the second target through big data.
Further, the limited place corresponding to the first target or the limited place corresponding to the second target may be combined with at least one of the parameters 1, 2, and 3 to determine the range of the face sample. For example, the second face sample and the first face sample may be face samples acquired in a restricted place corresponding to a first target in the same time period.
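By way of example and not limitation, the correspondence check combining parameters 1 and 2 may be sketched as follows; the field names, the time window, and the chosen combination of criteria are illustrative assumptions only:

```python
from dataclasses import dataclass

@dataclass
class FaceSample:
    location: str      # collection place
    timestamp: float   # collection time, in seconds
    video_id: str      # identifier of the source video (parameter 3)

def corresponds(first: FaceSample, second: FaceSample,
                max_time_diff: float = 300.0) -> bool:
    """The second sample corresponds to the first when they were collected
    at the same place (parameter 1) and within max_time_diff seconds of
    each other (parameter 2), as in the combined example in the text."""
    same_place = first.location == second.location
    close_in_time = abs(first.timestamp - second.timestamp) <= max_time_diff
    return same_place and close_in_time

a = FaceSample("room-1", 1000.0, "vid-7")
b = FaceSample("room-1", 1120.0, "vid-7")
print(corresponds(a, b))  # True: same place, 120 s apart
```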
By way of example and not limitation, the first face sample and the second face sample may be high-quality face images, such as face images with a front face and a clear image, and the number of pixels of the face images is greater than a preset value. The first face sample and the second face sample can also be preprocessed face images.
S230, the second face sample is compared with a face image corresponding to a second target in the face library to determine whether the target presented by the second face sample is identified as the second target, where the second target is an endorser of the first target. There may be multiple second targets, that is, the first target may have multiple endorsers. The endorser of the first target may be preset, or may be a target determined, by means of big data, to have the closest interpersonal relationship with the first target. It should be noted that, in some cases, the second target being an endorser of the first target may also be interpreted as the first target being an endorser of the second target.
By way of example and not limitation, prior to S230, it may be determined whether a target presented by the first face sample is identified as the first target. When the target presented by the first face sample is not recognized as the first target, S230 is performed. When the target presented by the first face sample is identified as the first target, the facial image corresponding to the first target in the facial library may be updated directly according to the first face sample without performing S230 and S240. By judging whether the target presented by the first face sample is identified as the first target or not, when the target presented by the first face sample is identified as the first target, the subsequent steps are not executed, the first face sample is directly used for updating the face library, and the calculation amount can be reduced.
As an example and not by way of limitation, determining whether the target presented by the first face sample is identified as the first target may be that if the first similarity between the first face sample and the first face image is greater than or equal to a second threshold, the target presented by the first face sample may be identified as the first target. For example, the second threshold may be 85. The first face image is a face image corresponding to a first target in the face library. When the face images corresponding to the first target in the face library are multiple face images, the first similarity may be a maximum similarity among multiple similarities between the first face sample and the multiple face images corresponding to the first target in the face library, and the first face image may be the face image corresponding to the maximum similarity.
The method for obtaining the first similarity between the first face sample and the first face image may be that the face feature vector of the first face sample is extracted and compared with the face feature vector of the first face image to obtain the first similarity between the face feature vector of the first face sample and the face feature vector of the first face image.
It should be noted that, in the embodiment of the present application, the similarity between the face sample and the face image in the face library may also be understood as a similarity (or a distance) between a face feature vector of the face sample and a face feature vector in the face library, that is, the face library may store face images of multiple targets, or may not store face images of multiple targets, but only store face feature vectors corresponding to face images of multiple targets.
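By way of example and not limitation, when the face library stores only feature vectors, the similarity between a face sample and a library entry may be computed directly on the vectors. The sketch below uses cosine similarity scaled to 0–100 to match the threshold examples in the text; the choice of metric and scaling is an assumption, as this application does not fix the similarity function:

```python
import math

def cosine_similarity(u, v):
    """Similarity between two face feature vectors, scaled to 0-100 so it
    is comparable to the example thresholds (e.g. 85) in the text.
    Cosine similarity is an assumed metric, not mandated by the application."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 100.0 * dot / (norm_u * norm_v)

print(round(cosine_similarity([0.1, 0.9, 0.2], [0.12, 0.88, 0.25]), 1))  # → 99.8
```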
Further, it may be determined whether the target presented by the first face sample is identified as the first target, and if a first similarity between the first face sample and the first face image is greater than or equal to a second threshold and the first similarity is a maximum similarity among the plurality of second similarities, the target presented by the first face sample may be identified as the first target. And the second similarity is the similarity between the first face sample and the face images in the face library. For example, when the similarity between the first face sample and the first face image is greater than the similarity between the first face sample and any other face image in the face library, and the similarity between the first face sample and the first face image is greater than or equal to a second threshold, the target presented by the first face sample may be identified as the first target. One way of determining whether the object presented by the first face sample is identified as a first object is described in detail in the face recognition method 400 below.
By way of example and not limitation, a determination may be made of a first similarity between the first face sample and a first face image prior to determining whether a target presented by the second face sample is identified as a second target. And when the first similarity between the first face sample and the first face image is the maximum similarity in the plurality of second similarities, judging whether the target presented by the second face sample is identified as a second target. The first face image is a face image corresponding to a first target in the face library, and the second similarity is a similarity between the first face sample and the face image in the face library.
Further, the determining of the first similarity between the first face sample and the first face image may be performed by determining whether a target presented by the second face sample is identified as a second target when the first similarity between the first face sample and the first face image is a maximum similarity among a plurality of second similarities, and the first similarity is greater than or equal to a first threshold. For example, the first threshold may be 65. The first face image is a face image corresponding to a first target in the face library, and the second similarity is a similarity between the first face sample and the face image in the face library.
As an example and not by way of limitation, determining whether the target presented by the second face sample is identified as the second target may be that, if the third similarity between the second face sample and the second face image is greater than or equal to a second threshold, the target presented by the second face sample may be identified as the second target. For example, the second threshold may be 85. The second face image is a face image in a face library of the second target. When the face images in the face library of the second target are multiple face images, the third similarity may be a maximum similarity among the multiple similarities between the second face sample and the multiple face images in the face library of the second target, and the second face image may be a face image corresponding to the maximum similarity.
The method for obtaining the third similarity between the second face sample and the second face image may be that the face feature vector of the second face sample is extracted and compared with the face feature vector of the second face image to obtain the third similarity between the face feature vector of the second face sample and the face feature vector of the second face image.
Whether the target presented by the face sample can be successfully identified or not is judged by setting a higher identification threshold value, so that the identification accuracy can be ensured.
Further, it may be determined whether the target presented by the second face sample is identified as the second target, where if a third similarity between the second face sample and the second face image is greater than or equal to a second threshold and the third similarity is a maximum similarity among the plurality of fourth similarities, the target presented by the second face sample may be identified as the second target. And the fourth similarity is the similarity between the second face sample and the face images in the face library. For example, when the third similarity between the second face sample and the second face image is greater than the similarity between the second face sample and any other face image in the face library, and the third similarity between the second face sample and the second face image is greater than or equal to a second threshold, the target presented by the second face sample may be identified as the second target. One way of determining whether the object represented by the second face sample is identified as a second object is described in detail in the face recognition method 400 below.
Whether the target presented by the face sample can be successfully identified or not can be judged by setting a higher identification threshold value and judging the target corresponding to the maximum similarity between the face sample and the face image in the face library, so that the identification accuracy can be further improved.
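By way of example and not limitation, the two-condition recognition decision described above — the matched similarity must be both the maximum over the face library and no less than the high threshold — may be sketched as follows. The library layout, the threshold value, and the toy similarity function are illustrative assumptions:

```python
def identify(sample_vec, face_library, threshold=85.0, sim=None):
    """Return the identified target id, or None on recognition failure.

    face_library: dict mapping target id -> list of feature vectors
    (a target may have multiple face images, as in the text).
    Recognition succeeds only when the best-matching target's similarity
    is the maximum over the whole library AND >= threshold.
    sim: similarity function; the default toy score (100 minus a scaled
    absolute difference) stands in for a real feature comparison.
    """
    if sim is None:
        sim = lambda u, v: 100.0 - 100.0 * sum(abs(a - b) for a, b in zip(u, v))
    best_target, best_score = None, float("-inf")
    for target, vectors in face_library.items():
        # per-target similarity is the maximum over that target's images
        score = max(sim(sample_vec, v) for v in vectors)
        if score > best_score:
            best_target, best_score = target, score
    return best_target if best_score >= threshold else None

lib = {"alice": [[0.1, 0.2]], "bob": [[0.9, 0.8]]}
print(identify([0.12, 0.18], lib))  # 'alice': best match and above threshold
print(identify([0.5, 0.5], lib))    # None: best match is below threshold
```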
And S240, when the target presented by the second face sample is identified as the second target, updating the face image corresponding to the first target in the face library according to the first face sample.
By way of example and not limitation, the number of the first face samples may be one or more, and if there is one first face sample, when the object represented by the second face sample is identified as the second object, the face image corresponding to the first object in the face library is updated by using the first face sample.
If the number of first face samples is Q, where Q is an integer greater than 1, each first face sample may correspond to one or more second face samples. The Q first face samples may be face samples collected at different places, face samples collected in different time periods at the same place, face samples collected in different time periods at different places, or face samples collected from different video sources. When the target presented by the second face sample is identified as the second target, updating the face image corresponding to the first target in the face library according to the first face sample can be achieved through the following steps.
A-1: when the target presented by the second face sample is identified as the second target, the first face sample corresponding to that second face sample is determined as a candidate face sample. There may be M candidate face samples; that is, among the Q first face samples there are M first face samples for which the target presented by a corresponding second face sample can be identified as a second target, where M is greater than 1 and less than or equal to Q, and M is an integer.
A-2: a target candidate face sample may be determined from the M candidate face samples according to at least one of a first parameter and a second parameter. The first parameter includes the similarity between each candidate face sample and a third face image, where the third face image is a face image corresponding to the first target in the face library. The second parameter is the number of valid second face samples corresponding to each candidate face sample. The valid second face samples corresponding to the mth candidate face sample include those face samples, among the T second face samples corresponding to the mth candidate face sample, whose presented target is identified as a second target, where T is a positive integer, m ∈ [1, M], and m is an integer. The third face image may be the same as or different from the first face image.
The second target may be plural, that is, the endorser of the first target may be plural.
Further, the second parameter may be the effective number of valid second face samples corresponding to each candidate face sample. The effective number is the number of valid second face samples whose presented targets are identified as different second targets. For example, if the number of valid second face samples corresponding to the mth candidate face sample is P, where P is a positive integer, and these P valid second face samples are identified as Y different second targets, then the effective number of valid second face samples corresponding to the mth candidate face sample is Y, where 0 < Y ≤ P and Y is a positive integer. That is, P different second face samples among the T second face samples corresponding to the mth candidate face sample can be identified as Y different second targets. The effective number of valid second face samples corresponding to a candidate face sample may also be referred to as the number of actual endorsers of the candidate face sample.
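By way of example and not limitation, the effective number (the number of actual endorsers) is simply the count of distinct identified second targets; a minimal sketch, with the function name assumed:

```python
def actual_endorsers(identified_targets):
    """Number of distinct second targets ('actual endorsers') among the
    valid second face samples of one candidate face sample.
    identified_targets: the identified target id of each valid second sample."""
    return len(set(identified_targets))

# P = 4 valid second samples identified as Y = 2 distinct endorsers
print(actual_endorsers(["bob", "carol", "bob", "carol"]))  # 2
```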
A-3: and updating the face image corresponding to the first target in the face library according to the target candidate face sample.
By increasing the number of the candidate face samples of the first target and further selecting the target candidate face samples from the candidate face samples according to at least one parameter of the similarity and the actual endorsement number, the accuracy rate of the face library updating can be improved. For example, the higher the similarity between the target candidate face sample and the face image corresponding to the first target in the face library, the more the number of actual endorsements corresponding to the target candidate face sample is, the higher the probability that the target presented by the target candidate face sample is the first target is, and the higher the accuracy of updating the face image corresponding to the first target in the face library by using the target candidate face sample is.
Further, determining a target candidate face sample from the M candidate face samples according to at least one of the first parameter and the second parameter, comprises:
according to the first parameter, determining the candidate face sample, among the M candidate face samples, having the maximum similarity to the third face image as the target candidate face sample; or
determining the candidate face sample having the largest number of actual endorsers among the M candidate face samples as the target candidate face sample according to the second parameter; or
determining a target candidate face sample from the M candidate face samples according to both the first parameter and the second parameter. Specifically, the weighted similarity corresponding to each candidate face sample is calculated according to the number of actual endorsers of each candidate face sample of the first target and the similarity between each candidate face sample and the face image corresponding to the first target in the face library, and the candidate face sample with the highest weighted similarity is selected from the M candidate face samples to update the face library.
Specifically, the weighted similarity calculation method corresponding to the ith candidate face sample in the M candidate face samples is as follows. For convenience of description, the similarity between the ith candidate face sample corresponding to the M candidate face samples and the face image corresponding to the first target in the face library is referred to as the similarity corresponding to the ith candidate face sample.
The weight of the similarity corresponding to the ith candidate face sample is determined according to the number of actual endorsers of the ith candidate face sample, and the similarity corresponding to the ith candidate face sample is multiplied by that weight to obtain the weighted similarity of the ith candidate face sample. The weight may be determined such that the more actual endorsers there are, the higher the weight. For example, when the number of actual endorsers is 1, the weight may be 1.1; when the number of actual endorsers is 2, the weight may be 1.2; when the number of actual endorsers is 3 or more, the weight may be 1.3.
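By way of example and not limitation, the weighted-similarity selection may be sketched as follows, using the example weight schedule from the text; the weight of 1.0 for zero endorsers and the function names are assumptions:

```python
def weight_for(endorsers: int) -> float:
    """Weight schedule from the example: more actual endorsers, higher weight.
    The value 1.0 for zero endorsers is an assumed default."""
    if endorsers >= 3:
        return 1.3
    return {1: 1.1, 2: 1.2}.get(endorsers, 1.0)

def pick_target_candidate(candidates):
    """candidates: list of (similarity_to_first_target_image, actual_endorsers)
    pairs, one per candidate face sample.
    Returns the index of the candidate with the highest weighted similarity."""
    scores = [sim * weight_for(n) for sim, n in candidates]
    return max(range(len(scores)), key=scores.__getitem__)

cands = [(80.0, 1), (78.0, 3), (82.0, 0)]
# weighted similarities: 88.0, 101.4, 82.0 -> index 1 is selected
print(pick_target_candidate(cands))  # 1
```

Note how the endorser count can outweigh a raw-similarity advantage: the second candidate wins despite the lowest unweighted similarity, because it has the most actual endorsers.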
Further, when a target candidate face sample is determined from the M candidate face samples according to at least one of the first parameter and the second parameter, the M candidate face samples may first be clustered, and the target candidate face sample is then determined among the first-class candidate face samples according to at least one of the first parameter and the second parameter. The first class is one or more classes whose number of candidate face samples is greater than or equal to a third threshold. This can also be understood as: the M candidate face samples are clustered, classes with a smaller number of candidate face samples are discarded, and the target candidate face sample is determined in the remaining classes according to at least one of the first parameter and the second parameter.
For example, the similarity between the candidate face samples corresponding to the first target may be used as the distance between the samples of the cluster.
The M candidate face samples are clustered, and candidate face samples with small similarity with most of the candidate face samples are discarded, so that the accuracy of face library updating can be improved.
By way of example and not limitation, before performing S210, it may be determined whether the first target is a target to be updated, and if the first target is determined to be the target to be updated, S210 is performed.
By way of example and not limitation, it may be determined whether the first target is a target to be updated in the following manner.
N recognitions are performed for the first target based on N third face samples, where N is an integer greater than 1, and the first target is determined to be a target to be updated according to the results of the N recognitions.
According to the scheme of this embodiment, whether a target is a target to be updated is judged first, and the face library is updated only for targets that need updating, which reduces the frequency of face library updates and thus the computation and network traffic.
The fifth similarity between the third face sample and a fourth face image is the largest among a plurality of sixth similarities, where the fourth face image is one of the one or more face images corresponding to the first target in the face library and each sixth similarity is a similarity between the third face sample and a face image in the face library. The fourth face image may be the same as the first face image, the same as the third face image, or different from both.
Further, determining the first target as a target to be updated according to the result of the N times of recognition may include:
determining the first target as the target to be updated when the number K of successful recognitions among the N recognition results is less than or equal to a fourth threshold, where K is an integer; for example, with N = 1000 and a fourth threshold of 200, the first target is determined to be the target to be updated when at most 200 of the 1000 recognitions succeed; or
determining the first target as the target to be updated when the ratio K/N of successful recognitions is less than or equal to a fifth threshold, where K is an integer; or
determining the first target as the target to be updated when the number L of failed recognitions among the N recognition results is greater than or equal to a sixth threshold, where L is an integer; or
determining the first target as the target to be updated when the ratio L/N of failed recognitions is greater than or equal to a seventh threshold, where L is an integer; or
determining the first target as the target to be updated when the ratio K/L of successful to failed recognitions is less than or equal to an eighth threshold, where K and L are integers.
Alternatively, performing N recognitions for the first target based on N third face samples and determining the first target as a target to be updated according to the results may also be: performing N consecutive recognitions for the first target based on N third face samples, where N is an integer greater than 1, and determining the first target as a target to be updated when all N consecutive recognitions fail.
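Two of the decision rules above, the failure-count criterion and the consecutive-failure criterion, can be sketched as follows; representing each recognition result as a boolean is an assumption for illustration:

```python
def needs_update_by_failures(results, fail_threshold):
    """Criterion: the number L of failures among the last N recognition
    results reaches the sixth threshold. results is a list of booleans,
    True meaning the recognition succeeded."""
    return results.count(False) >= fail_threshold


def needs_update_by_streak(results, streak):
    """Alternative criterion: the last `streak` recognitions all failed."""
    return len(results) >= streak and not any(results[-streak:])
```

With N = 1000 and a threshold of 200, a target with 200 or more failures in its last 1000 recognitions is marked for update; with the streak variant, 100 consecutive failures suffice.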
Fig. 3 is a schematic diagram illustrating a face recognition method 300 according to an embodiment of the present application. The method 300 includes steps S310-S340.
S310, acquire a video image.
S320, detect a face sample in the video image. The face sample may be a high-quality face image, for example a frontal, clear face image, or a face image whose number of pixels is greater than a preset threshold. Alternatively, the face samples may be randomly acquired face images.
S330, extract the face feature vector of the face sample and compare it with each of the Z face feature vectors stored in the face library, obtaining Z similarities, one for each stored face feature vector. Here Z is an integer greater than 0; for example, Z may be on the order of one hundred thousand. The face library contains face feature vectors of multiple targets, and each target may correspond to one or more face feature vectors. The highest of the Z similarities is selected, yielding the face feature vector corresponding to the highest similarity and the target corresponding to that face feature vector.
S340, determine whether the highest similarity is greater than or equal to a preset value, for example 85. If so, the target presented by the face sample can be identified as the target corresponding to the face feature vector with the highest similarity. If the highest similarity is less than 85, the face sample cannot be identified as that target, which may also be called a recognition failure.
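Steps S330 and S340 can be sketched as follows. The brute-force scan over the library and the 0-100 similarity scale (matching the preset value of 85 in the text) are assumptions; the similarity function itself is passed in:

```python
def recognise(query_vec, face_library, preset=85, similarity=None):
    """Steps S330-S340: find the best match for query_vec in the library
    and apply the preset-value test.

    face_library maps target id -> list of feature vectors (a target may
    register several). Returns (target_id, best_score) on success and
    (None, best_score) on recognition failure."""
    best_id, best_score = None, float('-inf')
    for target_id, vecs in face_library.items():
        for v in vecs:
            s = similarity(query_vec, v)
            if s > best_score:
                best_id, best_score = target_id, s
    if best_score >= preset:
        return best_id, best_score
    return None, best_score  # highest similarity below preset: failure
```

A toy similarity on 0-100 (100 minus the L1 distance) is enough to exercise both branches.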
In the embodiment of the present application, whether a target presented by a first face sample is identified as a first target and whether a target presented by a second face sample is identified as a second target may be respectively determined through the method 300.
In this embodiment, whether a target in the face library is a target to be updated may be determined by the method 300. The following takes determining whether the first target is a target to be updated as an example. Specifically, for a face sample, if the target corresponding to the face feature vector with the highest of the Z similarities is the first target, the recognition of that face sample may be counted as a recognition of the first target. The results of N recognitions of the first target are counted, and if the number of failures is greater than or equal to a preset number, the first target is determined to be the target to be updated. For example, N may be 1000 and the preset number 200. The N recognitions of the first target can be understood as N face samples such that, among the Z similarities between each face sample's feature vector and the Z face feature vectors in the face library, the similarity to the face feature vector corresponding to the first target is the largest. Alternatively, if the number of consecutive recognition failures for the first target is greater than or equal to a preset number, the first target is determined to be the target to be updated. For example, the preset number may be 100.
Fig. 4 is a diagram illustrating a face library updating method 400 according to another embodiment of the present application. The method 400 is illustrated in connection with a video conferencing system.
In the video conference system, a target in the face library may also be understood as an identity (ID) in the face library, where the ID is the personal unique identifier (PUID) of a user within the service range of the AI gateway. Users register face images in the face library in advance. Each user may register one or more face images, and each face image corresponds to a face unique identifier (FUID) and a face feature vector. For example, the face feature vector may be a 1024-dimensional vector whose components are integers from 0 to 255. One PUID may correspond to one or more FUIDs.
By way of example and not limitation, a method of user registration may include the following steps.
B-1: generate the PUID from the user's credential information, which may include a credential identification number. When registering face images in batches, an administrator can submit pre-authorized images and credential information to the AI gateway through a registration tool, and the AI gateway generates the PUID from the credential information.
B-2: generate a face feature vector from the user's image and allocate a FUID to it. The AI gateway submits the image corresponding to the PUID to the face recognition server, which generates a face feature vector from the image, allocates a FUID to the vector, and returns the FUID to the AI gateway. The face recognition server may store the FUID and the corresponding face feature vector, for example in a FUID-to-face-feature-vector lookup table.
B-3: generate the correspondence between the PUID and the FUID. In addition, a relationship between the PUID and a certificate number (CID) may be generated, where the CID may be a work card number, an identity card number, or the like. The AI gateway may maintain the PUID-FUID correspondence and the PUID-CID relationship, for example in a PUID-CID-FUID lookup table.
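The bookkeeping of steps B-1 to B-3 can be sketched with two lookup tables, one on the face recognition server side and one on the AI gateway side. The PUID and FUID generation schemes below are placeholders; the text does not specify them:

```python
import itertools


class Registry:
    """Minimal sketch of registration steps B-1..B-3: a FUID -> feature
    vector table (face recognition server) and a PUID-CID-FUID table
    (AI gateway). Identifier formats are illustrative assumptions."""

    def __init__(self):
        self.fuid_to_vec = {}   # face recognition server lookup table
        self.puid_record = {}   # AI gateway PUID-CID-FUID lookup table
        self._counter = itertools.count(1)

    def register(self, cid, feature_vec):
        # B-1: derive a PUID from the credential info (placeholder scheme).
        puid = 'P-' + cid
        # B-2: allocate a FUID and store the feature vector.
        fuid = 'F%04d' % next(self._counter)
        self.fuid_to_vec[fuid] = feature_vec
        # B-3: record the PUID-CID-FUID correspondence; one PUID may
        # accumulate several FUIDs.
        entry = self.puid_record.setdefault(puid, {'cid': cid, 'fuids': []})
        entry['fuids'].append(fuid)
        return puid, fuid
```

Registering a second image for the same credential yields a new FUID under the same PUID, matching the one-PUID-to-many-FUIDs relationship described above.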
After the user registration is completed, the AI gateway and the face recognition server can perform desensitization processing and delete unnecessary photos and information.
Taking a video conference system as an example, the face recognition method is explained in detail.
C-1: the AI camera outputs video images in which a person is active.
C-2: the conference terminal detects a face sample from the video image. The face sample may be a high-quality face image, for example a frontal, clear face image with a pixel count greater than a preset value.
C-3: the AI gateway performs queuing and concurrency management, submits a face recognition request to the face recognition server, and sends the face sample to it.
The face recognition server extracts the face feature vector of the face sample and compares it with the face feature vectors registered in advance in the face library to obtain the highest similarity score. It feeds the face recognition result back to the AI gateway, including the highest similarity score and the corresponding FUID.
C-4: the AI gateway obtains the PUID corresponding to the highest-scoring FUID from the pre-stored PUID-FUID lookup table, and sends the highest similarity score and the corresponding PUID to the conference terminal.
C-5: if the highest similarity score is greater than or equal to a preset value, for example 85, the identification is successful, that is, the target presented by the face sample is identified as the PUID corresponding to the highest similarity. Step C-5 may be performed by the conference terminal, by the AI gateway, or by both.
The method 400 includes steps S410-S440.
S410, obtain the ID to be updated, that is, the target to be updated (the first target) in the method 200. Taking the video conference system as an example, a method for judging whether a target in the face library is a target to be updated is described below.
The recognition results for each PUID are counted. When the similarity between the face feature vector of a face sample and the face feature vector corresponding to FUID1 in the face library (FUID1 being one FUID in the face library) is greater than the similarities to all the other FUIDs, the recognition of that face sample can be regarded as a recognition of the PUID corresponding to FUID1, and its result can be counted in the statistics for that PUID. If, among the latest N recognitions for the PUID, the number of failures is greater than or equal to a preset number, the PUID is an ID to be updated. A recognition failure may mean that the highest similarity is below the second threshold. For example, the second threshold may be 85, N may be 1000, and the preset number 200; that is, if the similarity is below 85 in at least 200 of the last 1000 recognitions for the PUID, the PUID is confirmed as an ID to be updated. One PUID may correspond to multiple FUIDs.
Alternatively, the PUID is an ID to be updated if N consecutive recognitions for it all fail, where a failure may mean that the highest similarity is below the second threshold and N is an integer greater than 1. For example, the second threshold may be 85 and N may be 100; that is, if the similarity scores of the last 100 consecutive recognitions for the PUID are all below 85, the PUID is confirmed as an ID to be updated.
S411, add the ID to be updated to the first list to be updated. The first list to be updated may include one or more IDs to be updated. S411 is an optional step.
S420, obtain the endorsement qualification list corresponding to the ID to be updated, which may include one or more IDs. An ID in the endorsement qualification list is an endorser of the ID to be updated, i.e., the second target in method 200. For example, the one or more IDs may be those with the closest interpersonal relationship to the ID to be updated; the endorsement qualification list may, for instance, include the 20 IDs whose interpersonal relationship to the ID to be updated is closest.
The interpersonal relationship distance can be obtained by inquiring an enterprise address book, chat records, conference notifications and the like, for example, by big data analysis.
By way of example and not limitation, the endorsement qualification list corresponding to an ID to be updated may exclude IDs that are themselves in the first list to be updated. For example, the list may include the 20 IDs with the closest interpersonal relationship to the ID to be updated, excluding any IDs in the first list to be updated.
S421, obtain the restricted place list corresponding to the ID to be updated. S421 is an optional step.
The restricted place list may include the unique identifiers (RUIDs) of one or more restricted places corresponding to the ID to be updated. For example, these may be the one or more places closest to the activity range of the ID to be updated, that is, the restricted places corresponding to the first target in the method 300. For example, the restricted place list may include the 20 RUIDs closest to the activity range of the ID to be updated.
The activity range distance may be obtained by querying the enterprise address book or the like, for example through big data analysis.
The endorsement qualification list and the restricted place list of each ID to be updated may be preset, or may be created after the ID to be updated is obtained. For example, the AI gateway can create an endorsement qualification list and a restricted place list for each ID to be updated.
From the first list to be updated and the restricted place list of each ID in it, a second list to be updated can be derived for each meeting place. The second list to be updated contains all the IDs to be updated for that meeting place, in other words, the IDs whose face library entries are to be updated using first face samples collected at that meeting place.
S430, determine candidate face samples. Specifically, judge whether any ID in the endorsement qualification list corresponding to the ID to be updated is present at the first meeting place, and determine whether the first face sample is a candidate face sample of the ID to be updated. S430 is a specific implementation of S230: a first face sample and a second face sample are obtained, and whether the first face sample is a candidate face sample is determined according to the second face sample. The first and second face samples may be collected at the same meeting place of the same meeting; for example, both may be collected at the first meeting place of the first meeting.
Specifically, whether an ID is present at the first meeting place of the first conference may be determined as follows.
A face sample is collected at the first meeting place of the first meeting and face recognition is performed. If the similarity score between the sample's face feature vector and the face feature vector corresponding to a first ID in the face library is the highest, and this highest score exceeds a first preset threshold (for example, 85), the target presented by the face sample is identified as the first ID, and the first ID is regarded as present at the first meeting place of the first meeting. That is, if the target presented by the second face sample can be identified as an ID in the endorsement qualification list, that ID can be considered present at the first meeting place. Note that in this embodiment, the face feature vector corresponding to an ID means the face feature vectors of the one or more FUIDs corresponding to that ID.
Specifically, it may be determined whether the first face sample is a candidate face sample of the ID to be updated in the following manner.
By way of example and not limitation, if at least T IDs in the endorsement qualification list of the ID to be updated are present at the first meeting place, the first face sample may be determined to be a candidate face sample, where T is a positive integer, e.g., T = 1.
By way of example and not limitation, if the similarity between the face feature vector of the first face sample and the face feature vector corresponding to the ID to be updated is greater than its similarity to the face feature vectors of all other IDs in the face library, this similarity exceeds a second preset threshold, and at least T IDs in the endorsement qualification list of the ID to be updated are present at the first meeting place, the first face sample may be determined to be a candidate face sample, where T is a positive integer. For example, the second preset threshold may be 65 and T = 1. The face library contains multiple face feature vectors.
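The second candidate test above can be sketched as a single predicate; the example thresholds (65, T = 1) come from the text, and passing in the precomputed best match is an assumption for illustration:

```python
def is_candidate(best_id, best_score, target_id, present_endorsers,
                 second_threshold=65, t=1):
    """Decide whether a face sample is a candidate for target_id.

    best_id/best_score: the sample's best match in the face library and
    its similarity. present_endorsers: the set of endorsers of target_id
    marked present at the meeting place. The sample qualifies when its
    best match is target_id, the score exceeds the second preset
    threshold, and at least T endorsers are present."""
    return (best_id == target_id
            and best_score > second_threshold
            and len(present_endorsers) >= t)
```

Each of the three conditions failing independently rules the sample out, which is why the collection range stays narrow.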
Still further, method 500 below details one method of determining candidate face samples.
S440, update the face library with the candidate face samples, that is, update the face feature vector and FUID corresponding to the ID to be updated in the face library.
By way of example and not limitation, the AI gateway may collect, across different meeting places of different conferences, the candidate face samples of each ID to be updated, the similarity between each candidate face sample's feature vector and the face feature vector corresponding to the ID to be updated, and the actual number of endorsers of each candidate face sample. When the number of candidate face samples for an ID to be updated exceeds a preset value (for example, 100), one candidate face sample is selected from them to update the face library.
By way of example and not limitation, the selection may be done as follows: a weighted similarity is computed for each candidate face sample from its actual number of endorsers and the similarity between its face feature vector and the face feature vector corresponding to the ID to be updated, and the candidate face sample with the highest weighted similarity is selected to update the face library.
Specifically, the weighted similarity for the ith candidate face sample of the ID to be updated is calculated as follows. For convenience of description, the similarity between the feature vector of the ith candidate face sample and the face feature vector corresponding to the ID to be updated is referred to as the similarity corresponding to the ith candidate face sample.
A weight is determined from the actual number of endorsers of the ith candidate face sample, and the similarity corresponding to the ith candidate face sample is multiplied by this weight to obtain its weighted similarity. For example, the weight may be set so that the more actual endorsers there are, the higher the weight: when the actual number of endorsers is 1, the weight may be 1.1; when it is 2, the weight may be 1.2; when it is 3 or more, the weight may be 1.3.
By way of example and not limitation, selecting one candidate face sample of the ID to be updated to update the face library may also be done by clustering the candidate face samples, discarding the classes in which the number of candidate face samples is below a certain value, and selecting one candidate face sample from those remaining. The clustering may use the similarity between candidate face samples as the inter-sample distance.
Fig. 5 is a diagram illustrating a method 500 for determining candidate face samples according to an embodiment of the present application. The method 500 includes the following steps.
S510, collect a face sample at the first meeting place of the first meeting.
S520, perform face recognition on the collected face sample.
S530, determine whether the similarity between the face feature vector of the face sample and the face feature vector corresponding to a first ID in the face library exceeds a second threshold, for example 85, where this similarity is greater than the similarity between the sample's feature vector and the face feature vectors of all other IDs in the face library.
If it exceeds the second threshold, perform S540; otherwise perform S550.
S540, mark the first ID as present at the first meeting place of the first meeting.
S550, judge whether the similarity between the face feature vector of the face sample and the face feature vector corresponding to a second ID in the face library exceeds a first threshold (for example, 65), and whether the second ID is in the second list to be updated of the first meeting place, where this similarity is greater than the similarity between the sample's feature vector and the face feature vectors of all other IDs in the face library. If both conditions hold, proceed to S560; otherwise the process ends.
S560, judge whether any ID in the endorsement qualification list corresponding to the second ID is present at the meeting place; if so, proceed to S570, otherwise the process ends.
S570, determine the face sample to be a candidate face sample corresponding to the first ID, and count the number of IDs marked present in the endorsement qualification list corresponding to the first ID, that is, the actual number of endorsers of the candidate face sample.
It should be noted that "first ID" and "second ID" are used only for convenience of description; they may be the same ID. The order of the above steps is not fixed. For example, face recognition may first be performed on all face samples collected at the first meeting place to determine which IDs in the face library are present, after which S550-S570 are performed. Alternatively, S530-S550 may be performed on each face sample in turn (in which case the second ID and the first ID may be the same ID), and after the IDs satisfying the condition of S550 have been marked, S560 and S570 are performed for the marked IDs (i.e., the second IDs in S560), which reduces computation and speeds up the determination.
The method 500 may be performed for each meeting place of each meeting to obtain the candidate face samples of different meeting places of different meetings, the similarity between each candidate face sample's face feature vector and the face feature vector corresponding to the first ID, and the actual number of endorsers of each candidate face sample. At the end of a meeting, all "present" flags may be cleared.
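The per-sample decision chain of S530-S570 can be sketched as one function. Representing the recognition result as a precomputed best match, and the presence state as a mapping from each to-update ID to the set of its endorsers already marked present, are assumptions for illustration:

```python
def classify_sample(best_id, best_score, to_update, endorsers_present,
                    first_threshold=65, second_threshold=85):
    """One pass of S530-S570 for a face sample whose best library match
    is best_id with score best_score.

    to_update: the second to-update list of the meeting place.
    endorsers_present: maps each to-update ID to the set of its endorsers
    marked present. Returns ('present', id), ('candidate', id, count),
    or None when the sample is discarded."""
    if best_score > second_threshold:
        return ('present', best_id)                            # S540
    if best_score > first_threshold and best_id in to_update:  # S550
        present = endorsers_present.get(best_id, set())
        if present:                                            # S560
            return ('candidate', best_id, len(present))        # S570
    return None
```

A high-confidence match marks its ID present; a medium-confidence match of a to-update ID becomes a candidate only if at least one of its endorsers is already present.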
Table 1 shows a first list to be updated.
TABLE 1

| PUID      | Pointer 1                                  | Pointer 2                         |
|-----------|--------------------------------------------|-----------------------------------|
| z00382787 | Endorsement qualification list (z00382787) | Restricted place list (z00382787) |
| w00875643 | Endorsement qualification list (w00875643) | Restricted place list (w00875643) |
| p00870865 | Endorsement qualification list (p00870865) | Restricted place list (p00870865) |
| ……        | ……                                         | ……                                |
The PUIDs in Table 1 are the IDs to be updated. For example, Table 1 includes the user ID with PUID z00382787, meaning that this ID's face library entry needs updating; its Pointer 1 points to the endorsement qualification list (z00382787) and its Pointer 2 points to the restricted place list (z00382787). Table 2 shows the endorsement qualification list (z00382787) corresponding to one ID to be updated (PUID = z00382787) in Table 1.
TABLE 2

| PUID      | Whether present |
|-----------|-----------------|
| q00874398 | 0               |
| n00897470 | 1               |
| k00975620 | 0               |
| ……        | ……              |
The PUIDs in Table 2 are the IDs of the endorsers of the ID to be updated. A "0" indicates that the corresponding ID is absent and a "1" that it is present. Table 3 shows the restricted place list (z00382787) corresponding to one ID to be updated (PUID = z00382787) in Table 1; the RUIDs in Table 3 identify the meeting places corresponding to the ID to be updated.
TABLE 3

| RUID      |
|-----------|
| J1-1-D01R |
| K3-2-A14B |
| D5-6-B12A |
| ……        |
Table 4 shows the second list to be updated (J1-1-D01R) of the meeting place with RUID J1-1-D01R. The PUIDs in the second list to be updated (J1-1-D01R) include all IDs to be updated for that meeting place, in other words, the IDs whose face library entries are to be updated using first face samples collected there.
TABLE 4

| PUID      | Pointer 1                                  |
|-----------|--------------------------------------------|
| t00438779 | Endorsement qualification list (t00438779) |
| z00382787 | Endorsement qualification list (z00382787) |
| c00987347 | Endorsement qualification list (c00987347) |
| ……        | ……                                         |
It should be noted that tables 1 to 4 are only illustrative and do not limit the examples of the present application.
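By way of illustration only, Tables 1-4 can be represented as plain mappings; the variable names are placeholders and the identifiers are copied from the tables:

```python
# Table 1: each ID to be updated points at its endorsement qualification
# list and its restricted place list (pointers modeled as names here).
first_to_update = {
    'z00382787': ('endorsement_list_z00382787', 'restricted_places_z00382787'),
}

# Table 2: endorser PUID -> presence flag (1 = present, 0 = absent).
endorsement_list_z00382787 = {
    'q00874398': 0,
    'n00897470': 1,
    'k00975620': 0,
}

# Table 3: restricted places (RUIDs) of the ID to be updated.
restricted_places_z00382787 = ['J1-1-D01R', 'K3-2-A14B', 'D5-6-B12A']

# Table 4: per-meeting-place second to-update list, keyed by RUID.
second_to_update = {
    'J1-1-D01R': ['t00438779', 'z00382787', 'c00987347'],
}

# The actual endorsers of a candidate are the endorsers marked present.
present_endorsers = [p for p, flag in endorsement_list_z00382787.items() if flag]
```

The presence flags of Table 2 feed directly into the actual-number-of-endorsers count used when weighting candidate face samples.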
According to the scheme of this embodiment, restricting the source of endorsers and updating the face library with the best of multiple candidate face samples can greatly reduce the error rate of face library updates. If the candidate face sample used for the update has 3 actual endorsers, the error rate can be reduced to one in ten thousand. Judging which IDs need updating reduces the update frequency, and restricting the source of candidate face samples through the restricted place list narrows their collection range, reducing the error rate of face library updates while also reducing computation and network traffic. This embodiment uses the historical information of conference activities, conference notification information, interpersonal relationship information, and the enterprise address book to greatly reduce the error rate of automatic face library updates and to reduce the update frequency, network traffic, and computation.
By way of example and not limitation, the face library updating method of this embodiment may be implemented by a single software module, located for example at the AI gateway. Depending on the network configuration, the module may also be deployed on the same network element as the enterprise address book or the face recognition server; in a small-scale system it may be deployed on a terminal.
By way of example and not limitation, the method may instead be implemented by multiple software modules, for example a client software module and a server software module together. The client software module may perform the steps of determining candidate face samples, and the server software module the other steps. The client software module can be deployed on R video conference terminals, each running one software instance. The server software module may be deployed at the AI gateway; depending on the network configuration it may also be deployed on the same network element as the enterprise address book or the face recognition server, and in a small-scale system on a terminal, running a single instance. The server software module can send each conference terminal its second list to be updated and the endorsement qualification lists of the IDs to be updated in it, and collect from each conference terminal the candidate face samples, the corresponding similarities, and the corresponding actual numbers of endorsers.
Fig. 6 is a schematic block diagram of a face library updating apparatus according to an embodiment of the present application. The face library updating apparatus 600 shown in fig. 6 includes an acquisition unit 601 and a processing unit 602.
The obtaining unit 601 and the processing unit 602 may be configured to execute the face library updating method according to the embodiment of the present application, specifically, the obtaining unit 601 may execute the above S210, and the processing unit 602 may execute the above S220 to S240.
The obtaining unit 601 may be configured to obtain a first face sample collected by a camera, where it is to be determined whether the target presented by the first face sample is a first target.
The processing unit 602 may be configured to determine a second face sample corresponding to the first face sample; comparing the second face sample with a face image corresponding to a second target in a face library, and judging whether the target presented by the second face sample is identified as the second target, wherein the second target is an endorser of the first target; and when the target presented by the second face sample is identified as the second target, updating the face image corresponding to the first target in the face library according to the first face sample.
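For concreteness, the endorsement-gated update performed by the processing unit can be sketched in a few lines of Python. The feature representation, the helper names (`cosine_similarity`, `maybe_update_library`), and the 0.8 threshold are assumptions made purely for illustration; the embodiment does not prescribe them.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two face feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def maybe_update_library(library, first_sample, second_sample,
                         first_target, endorser, threshold=0.8):
    """Replace first_target's stored face image with first_sample only
    when the co-occurring second_sample is recognized as the endorser."""
    if cosine_similarity(second_sample, library[endorser]) >= threshold:
        library[first_target] = first_sample  # endorsement succeeded
        return True
    return False
```

The point of the gate is that the first target's library entry is never refreshed on the strength of the first sample alone; the co-occurring endorser match supplies the confidence.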
Optionally, when obtaining the first face sample collected by the camera, the face library updating apparatus may obtain the first face sample directly from the camera that captured it.
Optionally, when obtaining the first face sample collected by the camera, the face library updating apparatus may obtain the first face sample from a memory, where the memory is used to store the first face sample collected by the camera.
The processing unit 602 may be configured to determine whether the target presented by the second face sample is identified as the second target when the target presented by the first face sample is not identified as the first target.
The processing unit 602 may be configured to, when a first similarity between the first face sample and a first face image is the largest of a plurality of second similarities, determine whether the target presented by the second face sample is identified as the second target, where the first face image is the face image corresponding to the first target in the face library, and each second similarity is a similarity between the first face sample and a face image in the face library.
The processing unit 602 may be configured to, when the first similarity between the first face sample and the first face image is the largest of the plurality of second similarities and is greater than or equal to a first threshold, determine whether the target presented by the second face sample is identified as the second target.
The processing unit 602 may be configured to, if a third similarity between the second face sample and a second face image is greater than or equal to a second threshold, identify a target presented by the second face sample as the second target, where the second face image is a face image corresponding to the second target in the face library.
The processing unit 602 may be configured to, if the third similarity is greater than or equal to the second threshold and is the largest of a plurality of fourth similarities, identify the target presented by the second face sample as the second target, where each fourth similarity is a similarity between the second face sample and a face image in the face library.
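The two conditions above — the third similarity clearing the second threshold and also being the largest of the fourth similarities — can be combined into one predicate. This is a hypothetical sketch: `sim` stands for any similarity function, and all names are illustrative rather than taken from the embodiment.

```python
def identified_as_endorser(second_sample, library, endorser, sim, second_threshold):
    """Recognize second_sample as the endorser only when its similarity to
    the endorser's image (the 'third similarity') is both at or above the
    second threshold AND the largest among its similarities to every face
    image in the library (the 'fourth similarities')."""
    third = sim(second_sample, library[endorser])
    if third < second_threshold:
        return False
    # The threshold-only variant would stop here; the stricter variant
    # also requires the endorser to be the best match in the whole library.
    return all(third >= sim(second_sample, image)
               for target, image in library.items())
```

The extra maximum test guards against the case where the second sample clears the threshold for the endorser but actually resembles some other enrolled person even more.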
The second face sample and the first face sample are face samples collected at the same place, or the second face sample and the first face sample are face samples collected at the same time interval, or the second face sample and the first face sample are face samples collected in the same video source.
The first face sample is a face sample collected in a limited place corresponding to the first target; or the second face sample is a face sample collected in a limited place corresponding to the second target.
The first face samples are Q in number, where Q is an integer greater than 1, and the processing unit 602 may be configured to: when the target presented by the second face sample is identified as the second target, determine the first face sample as a candidate face sample of the first target, where the number of candidate face samples is M, 1 < M ≤ Q, and M is an integer; determine a target candidate face sample from the M candidate face samples according to at least one of a first parameter and a second parameter, where the first parameter includes a similarity between each candidate face sample and a third face image, the third face image is the face image corresponding to the first target in the face library, and the second parameter is the number of effective second face samples corresponding to each candidate face sample, where the effective second face samples corresponding to the mth candidate face sample are those, among the T second face samples corresponding to the mth candidate face sample, whose presented targets are identified as the second target, m belongs to [1, M], m is an integer, and T is a positive integer; and update the face image corresponding to the first target in the face library according to the target candidate face sample.
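One plausible way to combine the two parameters — purely a sketch, since the embodiment leaves the exact combination open — is to rank candidates by the second parameter (effective endorsement count) and break ties with the first parameter (similarity to the stored image). All names here are illustrative assumptions.

```python
def select_target_candidate(candidates, stored_image, sim, effective_counts):
    """Among the M candidate face samples, prefer the one with the most
    effective second face samples; break ties by similarity to the stored
    third face image. Returns the winning index m."""
    return max(range(len(candidates)),
               key=lambda m: (effective_counts[m],
                              sim(candidates[m], stored_image)))
```

Python's tuple comparison does the lexicographic ranking: the endorsement count dominates, and the similarity only matters when counts are equal.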
The processing unit 602 may be configured to cluster the M candidate face samples, and determine the target candidate face sample according to at least one of a first parameter and a second parameter in a first class of candidate face samples, where the first class is one or more classes in which the number of candidate face samples included is greater than or equal to a third threshold.
Alternatively, the similarity between the candidate face samples corresponding to the first target may be used as the inter-sample distance for the clustering.
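A minimal sketch of the clustering step, assuming a greedy single-pass scheme — the embodiment does not fix a particular clustering algorithm — in which pairwise similarity plays the role of the inter-sample distance, and the "first class" keeps only clusters at or above the third threshold in size:

```python
def cluster_candidates(samples, sim, link_threshold):
    """Greedy clustering: a sample joins the first cluster whose first
    member it resembles closely enough; otherwise it starts a new cluster."""
    clusters = []
    for s in samples:
        for cluster in clusters:
            if sim(s, cluster[0]) >= link_threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters

def first_class(clusters, third_threshold):
    """Keep the one or more classes large enough to count as the first class."""
    return [c for c in clusters if len(c) >= third_threshold]
```

Restricting the search to the first class discards small, likely-spurious groups of candidates before the target candidate face sample is chosen.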
The processing unit 602 may be further configured to perform N times of recognition on the first target based on N third face samples, where N is an integer greater than 1; and determining the first target as a target to be updated according to the N recognition results.
A fifth similarity between the third face sample and the fourth face image is a maximum similarity among a plurality of sixth similarities, where the fourth face image is a face image corresponding to the first target in the face library, and the sixth similarity is a similarity between the third face sample and a face image in the face library.
The processing unit 602 may be configured to: determine the first target as a target to be updated when the number K of successful recognitions in the N recognition results is less than or equal to a fourth threshold, where K is an integer; or determine the first target as a target to be updated when the ratio of the number K of successful recognitions to N is less than or equal to a fifth threshold, where K is an integer; or determine the first target as a target to be updated when the number L of failed recognitions in the N recognition results is greater than or equal to a sixth threshold, where L is an integer; or determine the first target as a target to be updated when the ratio of the number L of failed recognitions to N is greater than or equal to a seventh threshold, where L is an integer; or determine the first target as a target to be updated when the ratio of the number K of successful recognitions to the number L of failed recognitions is less than or equal to an eighth threshold, where K and L are integers.
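The five alternative triggers can be folded into a single hypothetical predicate over the N recognition results; each keyword argument enables one criterion, and all names and threshold values in the usage below are illustrative only.

```python
def is_target_to_update(results, k_max=None, k_ratio_max=None,
                        l_min=None, l_ratio_min=None, kl_ratio_max=None):
    """results: one boolean per recognition attempt (True = success).
    The first target is flagged for update as soon as any enabled
    criterion fires."""
    n = len(results)
    k = sum(results)   # number of successful recognitions
    l = n - k          # number of failed recognitions
    if k_max is not None and k <= k_max:
        return True
    if k_ratio_max is not None and k / n <= k_ratio_max:
        return True
    if l_min is not None and l >= l_min:
        return True
    if l_ratio_min is not None and l / n >= l_ratio_min:
        return True
    if kl_ratio_max is not None and l > 0 and k / l <= kl_ratio_max:
        return True
    return False
```

For example, with results `[True, False, False, False, False]` (K = 1, L = 4, N = 5), a fourth threshold of 2 flags the target (K ≤ 2), whereas a seventh threshold of 0.9 does not (L/N = 0.8).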
It is to be understood that the obtaining unit 601 and the processing unit 602 in the apparatus 600 described above may correspond to the processor 702 in the apparatus 700 described below.
Fig. 7 is a schematic diagram of a hardware structure of a face library updating apparatus according to an embodiment of the present application. The face library updating apparatus 700 shown in fig. 7 (the apparatus 700 may specifically be a computer device) includes a memory 701, a processor 702, a communication interface 703 and a bus 704. The memory 701, the processor 702, and the communication interface 703 are communicatively connected to each other via a bus 704.
The memory 701 may be a Read Only Memory (ROM), a static memory device, a dynamic memory device, or a Random Access Memory (RAM). The memory 701 may store a program, and when the program stored in the memory 701 is executed by the processor 702, the processor 702 and the communication interface 703 are used for executing the steps of the face library updating method according to the embodiment of the present application.
The processor 702 may be a general Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), a Graphics Processing Unit (GPU), or one or more integrated circuits, and is configured to execute related programs to implement functions that need to be executed by the units in the face library updating apparatus according to the embodiment of the present application, or to execute the face library updating method according to the embodiment of the present application.
The processor 702 may also be an integrated circuit chip having signal processing capabilities. In the implementation process, each step of the face library updating method of the present application may be completed by an integrated logic circuit of hardware in the processor 702 or by instructions in the form of software. The processor 702 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 701; the processor 702 reads the information in the memory 701 and, in combination with its hardware, completes the functions to be executed by the units included in the face library updating apparatus according to the embodiment of the present application, or executes the face library updating method according to the method embodiment of the present application.
The communication interface 703 enables communication between the apparatus 700 and other devices or communication networks using transceiver means such as, but not limited to, transceivers. For example, a first face sample may be obtained via the communication interface 703.
Bus 704 may include a pathway to transfer information between various components of apparatus 700, such as memory 701, processor 702, and communication interface 703.
It should be noted that although the apparatus 700 shown in fig. 7 shows only a memory, a processor, and a communication interface, in a specific implementation, those skilled in the art will appreciate that the apparatus 700 also includes other components necessary for proper operation. Also, those skilled in the art will appreciate that the apparatus 700 may include additional hardware components for performing other functions, according to particular needs. Furthermore, those skilled in the art will appreciate that the apparatus 700 may include only those components necessary to implement the embodiments of the present application, and need not include all of the components shown in fig. 7.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the above-described apparatus embodiments are merely illustrative, for example, the division of the units is only one logical function division, and there may be other division manners in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the communication connections shown or discussed may be indirect couplings or communication connections between devices or units through interfaces, and may be electrical, mechanical or other forms.
In addition, each unit in the embodiments of the apparatus of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
It is understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), other general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general purpose processor may be a microprocessor, but may be any conventional processor.
The methods in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer program or instructions may be stored in or transmitted over a computer-readable storage medium. The computer readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server that integrates one or more available media. The usable medium may be a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape; or optical media, such as CD-ROM, DVD; it may also be a semiconductor medium, such as a Solid State Disk (SSD), a Random Access Memory (RAM), a read-only memory (ROM), a register, and the like.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. In addition, the ASIC may reside in a network device or a terminal device. Of course, the processor and the storage medium may reside as discrete components in a transmitting device or a receiving device.
In the embodiments of the present application, unless otherwise specified or conflicting with respect to logic, the terms and/or descriptions in different embodiments have consistency and may be mutually cited, and technical features in different embodiments may be combined to form a new embodiment according to their inherent logic relationship.
In the present application, "and/or" describes an association relationship of associated objects, which means that there may be three relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. In the description of the text of the present application, the character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula of the present application, the character "/" indicates that the preceding and following related objects are in a relationship of "division".
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of the present application. The sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic.

Claims (26)

1. A method for updating a face library, comprising:
the method comprises the steps that a face library updating device obtains a first face sample collected by a camera, wherein it is to be determined whether a target presented by the first face sample is identified as a first target;
determining a second face sample corresponding to the first face sample;
comparing the second face sample with a face image corresponding to a second target in a face library, and judging whether the target presented by the second face sample is identified as the second target, wherein the second target is an endorser of the first target;
and when the target presented by the second face sample is identified as the second target, updating the face image corresponding to the first target in the face library according to the first face sample.
2. The method of claim 1, wherein the determining whether the target presented by the second face sample is identified as a second target comprises:
when the target presented by the first face sample is not recognized as the first target, judging whether the target presented by the second face sample is recognized as the second target.
3. The method of claim 1, wherein the determining whether the target presented by the second face sample is identified as a second target comprises:
when the first similarity between the first face sample and the first face image is the maximum similarity among a plurality of second similarities, judging whether a target presented by the second face sample is identified as a second target, wherein the first face image is a face image corresponding to the first target in the face library, and the second similarity is the similarity between the first face sample and the face image in the face library.
4. The method according to claim 3, wherein the determining whether the target presented by the second face sample is identified as the second target when the first similarity between the first face sample and the first face image is the largest similarity among a plurality of second similarities comprises:
and when the first similarity between the first face sample and the first face image is the maximum similarity in a plurality of second similarities and is greater than or equal to a first threshold, judging whether a target presented by the second face sample is identified as a second target.
5. The method according to any one of claims 1 to 4, wherein the determining whether the target presented by the second face sample is identified as the second target comprises:
and if the third similarity between the second face sample and the second face image is greater than or equal to a second threshold value, identifying the target presented by the second face sample as the second target, wherein the second face image is a face image corresponding to the second target in the face library.
6. The method of claim 5, wherein the identifying the target presented by the second face sample as the second target when the third similarity between the second face sample and the second face image is greater than or equal to the second threshold comprises:
if the third similarity is greater than or equal to the second threshold and the third similarity is the maximum similarity among a plurality of fourth similarities, the second face sample is identified as the second target, and the fourth similarity is the similarity between the second face sample and the face images in the face library.
7. The method according to any of claims 1 to 6, wherein the second face sample is a co-located face sample taken with the first face sample, or
The second face sample and the first face sample are face samples collected at the same time period, or
The second face sample and the first face sample are face samples collected in the same video source.
8. The method according to any one of claims 1 to 7, wherein the first face samples are Q, Q being an integer greater than 1, and,
when the target presented by the second face sample is identified as the second target, updating the face image corresponding to the first target in the face library according to the first face sample, including:
when the target presented by the second face sample is identified as the second target, determining the first face sample as a candidate face sample of the first target, wherein the number of the candidate face samples is M, M is more than 1 and less than or equal to Q, and M is an integer;
determining a target candidate face sample from the M candidate face samples according to at least one of a first parameter and a second parameter, wherein the first parameter includes a similarity between each candidate face sample and a third face image, the third face image is a face image corresponding to the first target in the face library, and the second parameter is the number of effective second face samples corresponding to each candidate face sample, wherein the effective second face samples corresponding to the mth candidate face sample are those, among the T second face samples corresponding to the mth candidate face sample, whose presented targets are identified as the second target, m belongs to [1, M ], m is an integer, and T is a positive integer;
and updating the face image corresponding to the first target in the face library according to the target candidate face sample.
9. The method of claim 8, wherein determining a target candidate face sample from the M candidate face samples according to at least one of the first parameter and the second parameter comprises:
and clustering the M candidate face samples, and determining the target candidate face sample in a first class of candidate face samples according to at least one parameter of a first parameter and a second parameter, wherein the first class is one or more classes of which the number of the included candidate face samples is greater than or equal to a third threshold value.
10. The method according to any one of claims 1 to 9, further comprising:
performing N times of identification aiming at the first target based on N third face samples, wherein N is an integer greater than 1;
and determining the first target as a target to be updated according to the N recognition results.
11. The method according to claim 10, wherein a fifth similarity between the third face sample and the fourth face image is a maximum similarity among a plurality of sixth similarities, wherein the fourth face image is a face image corresponding to the first target in the face library, and the sixth similarity is a similarity between the third face sample and a face image in the face library.
12. The method according to claim 10 or 11, wherein determining the first target as the target to be updated according to the result of the N-times recognition comprises:
when the number K of successful recognition times in the N recognition results is smaller than or equal to a fourth threshold value, determining the first target as a target to be updated, wherein K is an integer; or
When the ratio of the number K of successful recognition times to the N in the N recognition results is smaller than or equal to a fifth threshold value, determining the first target as a target to be updated, wherein K is an integer; or
When the number L of times of identification failure in the N times of identification results is larger than or equal to a sixth threshold value, determining the first target as a target to be updated, wherein L is an integer; or
When the ratio of the number L of identification failures to the number N of identification failures in the result of the N times of identification is greater than or equal to a seventh threshold value, determining the first target as a target to be updated, wherein L is an integer; or
And when the ratio of the number K of successful recognition times to the number L of failed recognition times in the N recognition results is smaller than or equal to an eighth threshold value, determining the first target as a target to be updated, wherein L is an integer, and K is an integer.
13. A face library updating apparatus, comprising:
an acquisition unit, configured to acquire a first face sample collected by a camera, wherein it is to be determined whether a target presented by the first face sample is identified as a first target; and
a processing unit to:
determining a second face sample corresponding to the first face sample;
comparing the second face sample with a face image corresponding to a second target in a face library, and judging whether the target presented by the second face sample is identified as the second target, wherein the second target is an endorser of the first target;
and when the target presented by the second face sample is identified as the second target, updating the face image corresponding to the first target in the face library according to the first face sample.
14. The apparatus as recited in claim 13, said processing unit to:
when the target presented by the first face sample is not recognized as the first target, judging whether the target presented by the second face sample is recognized as the second target.
15. The apparatus as recited in claim 13, said processing unit to:
when the first similarity between the first face sample and the first face image is the maximum similarity among a plurality of second similarities, judging whether a target presented by the second face sample is identified as a second target, wherein the first face image is a face image corresponding to the first target in the face library, and the second similarity is the similarity between the first face sample and the face image in the face library.
16. The apparatus as defined in claim 15, wherein the processing unit is to:
and when the first similarity between the first face sample and the first face image is the maximum similarity in a plurality of second similarities and is greater than or equal to a first threshold, judging whether a target presented by the second face sample is identified as a second target.
17. The apparatus of any of claims 13 to 16, wherein the processing unit is to:
and if the third similarity between the second face sample and the second face image is greater than or equal to a second threshold value, identifying the target presented by the second face sample as the second target, wherein the second face image is a face image corresponding to the second target in the face library.
18. The apparatus as defined in claim 17, wherein the processing unit is to:
if the third similarity is greater than or equal to the second threshold and the third similarity is the maximum similarity among a plurality of fourth similarities, the second face sample is identified as the second target, and the fourth similarity is the similarity between the second face sample and the face images in the face library.
19. An apparatus as claimed in any one of claims 13 to 18, wherein the second face sample is a co-located face sample taken with the first face sample, or
The second face sample and the first face sample are face samples collected at the same time interval, or
The second face sample and the first face sample are face samples collected in the same video source.
20. An apparatus as claimed in any one of claims 13 to 19, wherein the first face samples are Q, Q being an integer greater than 1, and the processing unit is to:
when the target presented by the second face sample is identified as the second target, determining the first face sample as a candidate face sample of the first target, wherein the number of the candidate face samples is M, M is more than 1 and less than or equal to Q, and M is an integer;
determining a target candidate face sample from the M candidate face samples according to at least one of a first parameter and a second parameter, wherein the first parameter includes a similarity between each candidate face sample and a third face image, the third face image is a face image corresponding to the first target in the face library, and the second parameter is the number of effective second face samples corresponding to each candidate face sample, wherein the effective second face samples corresponding to the mth candidate face sample are those, among the T second face samples corresponding to the mth candidate face sample, whose presented targets are identified as the second target, m belongs to [1, M ], m is an integer, and T is a positive integer;
and updating the face image corresponding to the first target in the face library according to the target candidate face sample.
21. The apparatus as recited in claim 20, said processing unit to:
and clustering the M candidate face samples, and determining the target candidate face sample in a first class of candidate face samples according to at least one parameter of a first parameter and a second parameter, wherein the first class is one or more classes of which the number of the included candidate face samples is greater than or equal to a third threshold value.
22. The apparatus of any of claims 13 to 21, wherein the processing unit is further to:
performing N times of identification aiming at the first target based on N third face samples, wherein N is an integer greater than 1;
and determining the first target as a target to be updated according to the N recognition results.
23. The apparatus according to claim 22, wherein a fifth similarity between the third face sample and the fourth face image is a maximum similarity among a plurality of sixth similarities, wherein the fourth face image is a face image corresponding to the first target in the face library, and the sixth similarity is a similarity between the third face sample and a face image in the face library.
24. The apparatus of claim 22 or 23, wherein the processing unit is to:
determining the first target as a target to be updated when the number K of successful recognitions among the N recognition results is less than or equal to a fourth threshold, wherein K is an integer; or
determining the first target as a target to be updated when the ratio of the number K of successful recognitions to N is less than or equal to a fifth threshold, wherein K is an integer; or
determining the first target as a target to be updated when the number L of failed recognitions among the N recognition results is greater than or equal to a sixth threshold, wherein L is an integer; or
determining the first target as a target to be updated when the ratio of the number L of failed recognitions to N is greater than or equal to a seventh threshold, wherein L is an integer; or
determining the first target as a target to be updated when the ratio of the number K of successful recognitions to the number L of failed recognitions is less than or equal to an eighth threshold, wherein K and L are integers.
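The alternative criteria of claim 24 can be sketched as a single decision function over the N recognition results. The claim lists the criteria as alternatives ("or"); the sketch below checks all of them and flags the target if any one fires. The concrete threshold values are placeholders, not values given in the patent.

```python
def needs_update(results,
                 fourth=3, fifth=0.3, sixth=7, seventh=0.7, eighth=0.5):
    """results: one boolean per recognition attempt (True = success).
    Returns True if any of claim 24's alternative criteria marks the
    first target as a target to be updated."""
    N = len(results)
    K = sum(results)   # number of successful recognitions
    L = N - K          # number of failed recognitions
    if K <= fourth:                 # success count at or below fourth threshold
        return True
    if K / N <= fifth:              # success ratio at or below fifth threshold
        return True
    if L >= sixth:                  # failure count at or above sixth threshold
        return True
    if L / N >= seventh:            # failure ratio at or above seventh threshold
        return True
    if L > 0 and K / L <= eighth:   # success/failure ratio at or below eighth
        return True
    return False
```

A target that is consistently recognized (all N attempts succeed) trips none of the criteria, while one that fails most attempts is flagged for an update of its library image.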
25. A computer-readable storage medium storing program code for execution by a device, the program code comprising instructions for performing the method of any one of claims 1 to 12.
26. A chip comprising a processor and a data interface, the processor reading, through the data interface, instructions stored in a memory to perform the method of any one of claims 1 to 12.
CN201910917079.7A 2019-09-26 2019-09-26 Method and device for updating face library Pending CN112560559A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910917079.7A CN112560559A (en) 2019-09-26 2019-09-26 Method and device for updating face library

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910917079.7A CN112560559A (en) 2019-09-26 2019-09-26 Method and device for updating face library

Publications (1)

Publication Number Publication Date
CN112560559A true CN112560559A (en) 2021-03-26

Family

ID=75030048

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910917079.7A Pending CN112560559A (en) 2019-09-26 2019-09-26 Method and device for updating face library

Country Status (1)

Country Link
CN (1) CN112560559A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114333039A (en) * 2022-03-03 2022-04-12 济南博观智能科技有限公司 Portrait clustering method, device and medium


Similar Documents

Publication Publication Date Title
US11443551B2 (en) Facial recognitions based on contextual information
US10904483B2 (en) System and methods for automatic call initiation based on biometric data
CN110853646B (en) Conference speaking role distinguishing method, device, equipment and readable storage medium
US20190190908A1 (en) Systems and methods for automatic meeting management using identity database
CN110022454B (en) Method for identifying identity in video conference and related equipment
US8311292B2 (en) Context aware, multiple target image recognition
CN110232925A (en) Generate the method, apparatus and conference terminal of minutes
CN112148922A (en) Conference recording method, conference recording device, data processing device and readable storage medium
CN108920640B (en) Context obtaining method and device based on voice interaction
US20140133757A1 (en) Creating social network groups
CN109005104B (en) Instant messaging method, device, server and storage medium
CN108446681B (en) Pedestrian analysis method, device, terminal and storage medium
WO2010010736A1 (en) Conference image creating method, conference system, server device, conference device, and so forth
CN109560941A (en) Minutes method, apparatus, intelligent terminal and storage medium
CN109902681B (en) User group relation determining method, device, equipment and storage medium
JP2020148931A (en) Discussion analysis device and discussion analysis method
JP2021520015A (en) Image processing methods, devices, terminal equipment, servers and systems
US20240048572A1 (en) Digital media authentication
CN103609098B (en) Method and apparatus for being registered in telepresence system
CN112487381A (en) Identity authentication method and device, electronic equipment and readable storage medium
CN111427990A (en) Intelligent examination control system and method assisted by intelligent campus teaching
CN112560559A (en) Method and device for updating face library
CN109150837A (en) Meeting bootstrap technique, conference control system and machine readable storage medium
CN112041852A (en) Neural network identification of objects in a 360 degree image
CN115116467A (en) Audio marking method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination