WO2020024737A1 - Method, apparatus and computer device for generating negative samples for face recognition - Google Patents

Method, apparatus and computer device for generating negative samples for face recognition

Info

Publication number
WO2020024737A1
WO2020024737A1 (PCT/CN2019/093273; CN2019093273W)
Authority
WO
WIPO (PCT)
Prior art keywords
sample
samples
negative
scene
face recognition
Prior art date
Application number
PCT/CN2019/093273
Other languages
English (en)
French (fr)
Inventor
罗文寒
暴林超
高源
刘威
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to EP19845530.5A priority Critical patent/EP3751450A4/en
Publication of WO2020024737A1 publication Critical patent/WO2020024737A1/zh
Priority to US17/016,162 priority patent/US11302118B2/en

Classifications

    • G06V40/161: Human faces — Detection; Localisation; Normalisation
    • G06F18/214: Pattern recognition — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V40/172: Human faces — Classification, e.g. identification
    • G06V40/40: Spoof detection, e.g. liveness detection
    • G06V40/50: Maintenance of biometric data or enrolment thereof

Definitions

  • the present application relates to the field of image processing technology, and in particular, to a method, a device, and a computer device for generating a negative sample for face recognition.
  • face recognition is mainly performed through a face recognition model.
  • the face recognition model is obtained by training and learning a large number of training samples using a machine learning method.
  • the training samples used to participate in machine learning can be divided into two categories, positive samples and negative samples.
  • the division into positive and negative samples is determined by the actual content to be verified: a positive sample is one that leads to the correct (passing) conclusion, and a negative sample is the opposite.
  • the number of positive and negative samples is often unbalanced.
  • the number of positive samples is large and the number of negative samples is small.
  • the embodiments of the present application provide a method, a device, and a computer device for generating a negative sample of face recognition.
  • a method for generating a negative sample for face recognition includes:
  • For a selected negative sample template, nesting the obtained positive sample into the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template;
  • the intermediate samples are fused into the scene samples to obtain negative samples required for machine learning for face recognition.
  • a method for training a face recognition model which includes training the face recognition model using positive samples and negative samples, the negative samples including: negative samples obtained by using the method described in the first aspect.
  • a face liveness authentication method, which includes: using a face recognition model for face liveness authentication, wherein the face recognition model is obtained by training based on the training method of the second aspect.
  • an apparatus for generating a negative sample of face recognition includes:
  • An obtaining unit configured to obtain a positive sample from a training sample library required for machine learning for face recognition
  • a nesting unit configured to nest the obtained positive sample into the negative sample template for the selected negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template;
  • a scene fusion unit is configured to fuse the intermediate samples into the scene samples for the selected scene samples to obtain the negative samples required for machine learning for face recognition.
  • a computer device including:
  • At least one processor; and
  • a memory connected in communication with the at least one processor; wherein,
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the method of the first, second, or third aspect.
  • a computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to cause a computer to execute the method according to the first aspect.
  • FIG. 1A is a schematic diagram of a scenario in an embodiment of the present application.
  • FIG. 1B is a schematic diagram of another scenario in the embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a method for generating a negative sample for face recognition according to an embodiment of the present application
  • FIG. 3 is another schematic flowchart of a method for generating a negative sample for face recognition according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of a first terminal taking a face picture in a second terminal according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of a first terminal taking a picture of a hand-held face according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of a face picture placed in a face photo frame and a halo according to an embodiment of the present application
  • FIG. 7 is a schematic diagram of a nesting process in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a process of synthesizing with a reflection picture in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a process of merging with a scene picture according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an apparatus for generating a negative sample for face recognition according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • Positive samples and negative samples: in face recognition, positive and negative samples are relative. A positive sample is a sample that passes verification after the face recognition model recognizes it, while a negative sample is one that fails verification after recognition. Whether a sample passes verification depends on the specific scenario. For example, if the task is to verify whether the picture to be verified contains a human face, then a picture of a human face passes verification, while a picture that does not include a face, or in which the face occupies only a small part, fails.
  • the payment system usually requires the user to perform face authentication to verify that it is the user himself or herself who is performing the operation.
  • the user (hereinafter the first user) usually turns on the camera on the terminal and takes a picture of his or her face for authentication.
  • the picture of the face of the first user taken by the first user can be regarded as a positive sample.
  • in an attack scenario, however, a second user operates the first user's account to make a payment in place of the first user.
  • the second user usually performs authentication by using a photo of the first user, or a picture or video of the first user's face stored in a terminal. This situation is obviously dangerous, and the picture taken in this case can be considered a negative sample, or an attack sample.
  • the function of face authentication is based on a face recognition model.
  • the face recognition model needs to take positive samples and negative samples as inputs and learn the features in them, so that the final face recognition model can correctly distinguish between positive samples and negative samples.
  • the majority of users perform face authentication normally, so the number of positive samples can be guaranteed.
  • a second user operating the first user's account by performing face authentication with a photo of the first user, or with a picture or video of the first user stored in a terminal, is something a normal user will generally not do. Therefore, the number of negative samples the authentication system can obtain is extremely small, making the numbers of positive and negative samples extremely unbalanced, and the accuracy of a face recognition model trained on such samples cannot be guaranteed.
  • the present inventor has found that face recognition is often used for identity authentication, and that in actual application scenarios the most common attack is to use a legitimate person's photo or video to try to deceive the face liveness authentication system.
  • Photos can be carried on photo paper or displayed on the attacker's handheld terminal. Videos are usually played on the attacker's handheld terminal.
  • the inventor designed a negative sample template by simulating such a frame structure in order to obtain a negative sample.
  • the negative sample template is used to simulate a photo or a handheld terminal device with a display screen.
  • an embodiment of the present application provides a method for generating negative samples for face recognition. In this method, the negative samples required for machine learning for face recognition are obtained by nesting positive samples into negative sample templates and fusing them into common scene samples, thereby simulating the scenes in which negative samples are used for attacks in daily life.
  • the inventor also considered that, in real life, when for example a second user uses a photo of the first user stored in a terminal for face authentication, reflections of light will inevitably cast shadows of other objects onto the display unit of the terminal, and these shadows will fall within the displayed photo of the first user. Therefore, to make the final negative samples more authentic, in the method of the embodiments of the present application, after the obtained positive sample is nested into the negative sample template to obtain a first intermediate sample, an element of a reflection picture is added to the first intermediate sample to simulate a scene in which an object is reflected in it, forming a second intermediate sample.
  • the inventor also considers that when an attack is performed during the face recognition process, the camera of the general authentication system does not directly face the attack sample, but has a certain spatial position relationship with it. Therefore, the second intermediate sample can also be fused into the scene sample after being subjected to a certain geometric deformation. In this way, the final negative samples are more in line with the real situation and have higher authenticity.
  • FIG. 1A is an application scenario to which the technical solution in the embodiment of the invention is applicable.
  • the terminal 101 and the server 102 may be included.
  • the terminal 101 includes a camera 1011.
  • the terminal 101 may be a personal terminal, and the personal terminal may be, for example, a user's personal mobile phone or tablet computer (PAD).
  • the camera in the terminal 101 is turned on, an image including the face of the user is taken, and sent to the server 102.
  • the terminal 101 and the server 102 can communicate through a network.
  • the network may be a wired network or a wireless network.
  • the wireless network may be a mobile cellular network or a wireless local area network (WLAN).
  • the network may also be any other network capable of communication, which is not limited in the embodiments of the present application.
  • after receiving the image including the user's face sent by the terminal 101, the server 102 can recognize the image through the face recognition model in the server 102, determine whether it is the user himself or herself performing the face authentication operation, and feed the determination result back to the terminal 101.
  • the face recognition model in the server 102 is obtained through training and learning from multiple positive samples and negative samples, and the negative samples may be negative samples obtained according to the technical solution provided in the embodiments of the present application.
  • the terminal 101 may be a terminal device in an enterprise or a public institution, for example, it may be a government office lobby, a bank counter, or a computer device at a hotel front desk.
  • the computer device may include a camera 1011.
  • the camera 1011 may be a camera included in the terminal 101 itself, or may be a camera 1011 external to the terminal 101.
  • the terminal 101 can collect the credential information (generally an ID card) of the user handling the service, take a picture of that user's face through the camera 1011, and then send both to the server 102.
  • the server 102 recognizes the picture of the user's face through the face recognition model, and then determines whether the user handling the service matches the provided identity information, and feeds back the determination result to the terminal 101.
  • the face recognition model in the server 102 is obtained through training and learning from multiple positive samples and negative samples, and the negative samples may include negative samples obtained according to the technical solution provided in the embodiments of the present application.
  • FIG. 1B is another application scenario that can be used by the technical solution in the embodiment of the present application.
  • This application scenario is, for example, a security inspection system.
  • a gate 103 and a server 104 may be included.
  • the gate 103 may be, for example, a gate at an airport security entrance, a gate at a railway station ticket entrance, or a subway security entrance.
  • the gate 103 includes one or more cameras, through which an image 106 of the user's face can be captured, and the user's credential information is collected and sent to the server 104 for verification.
  • the server 104 may be an authentication server in a corresponding security inspection system. For example, when the gate 103 is a gate at an airport entrance, the server 104 is an authentication server in the airport security system.
  • the server 104 recognizes the image including the user's face through the face recognition model in the server 104, then determines whether the user undergoing the security check matches the information in the ID card, and feeds the determination result back to the gate 103.
  • the face recognition model in the server 104 is obtained through training and learning based on multiple positive samples and negative samples, and the negative samples may be the negative samples obtained according to the technical solution provided in the embodiments of the present application.
  • the method provided by the embodiments of the present application is not limited to the application scenarios shown in FIG. 1A and FIG. 1B, and may also be used in other possible application scenarios, which are not limited by the embodiments of the present application.
  • the embodiments of the present application provide the method operation steps shown in the following embodiments or the accompanying drawings, but the method may include more or fewer steps. For steps with no necessary logical causality between them, the execution order is not limited to that provided in the embodiments of the present application.
  • during actual processing, or when executed by a device, the steps may be executed sequentially or in parallel in the order shown in the embodiments or the accompanying drawings.
  • FIG. 2 is a flowchart of a method for generating a negative sample for face recognition in an embodiment of the present application. As shown in Figure 2, the method includes the following steps:
  • Step 201 Obtain a positive sample from a training sample database required for machine learning for face recognition.
  • Step 202 For the selected negative sample template, the obtained positive sample is nested in the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in a display area of the negative sample template.
  • Step 203 For the selected scene samples, the intermediate samples are fused into the scene samples to obtain negative samples required for machine learning for face recognition.
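  • The three steps above can be sketched end to end. The following is an illustrative outline only, not the patented implementation: the helper names (`nest_into_template`, `fuse_into_scene`), the fixed rectangular display region, and the use of NumPy arrays as images are all assumptions made for the example.

```python
import numpy as np

def nest_into_template(positive, template, region):
    """Step 202: place the positive sample into the template's display area."""
    y, x, h, w = region
    out = template.copy()
    # Nearest-neighbour resize of the positive sample to the display-area size.
    ys = np.arange(h) * positive.shape[0] // h
    xs = np.arange(w) * positive.shape[1] // w
    out[y:y + h, x:x + w] = positive[ys][:, xs]
    return out

def fuse_into_scene(intermediate, scene, offset):
    """Step 203: overlay the intermediate sample onto the scene sample."""
    y, x = offset
    h, w = intermediate.shape[:2]
    out = scene.copy()
    out[y:y + h, x:x + w] = intermediate
    return out

def generate_negative_sample(positive, template, region, scene, offset):
    intermediate = nest_into_template(positive, template, region)  # step 202
    return fuse_into_scene(intermediate, scene, offset)            # step 203
```

A plain paste stands in here for the masked, geometrically deformed fusion that the document describes later.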
  • a positive sample is nested into a negative sample template, and a certain background is added to simulate a scenario in which a negative sample is used for an attack in a real-life scene, thereby obtaining the negative samples required for machine learning for face recognition.
  • a large number of negative samples can be generated from the positive samples, which effectively solves the technical problem of the small number of real-life negative samples in face recognition, thereby improving the performance of the face recognition model obtained through training.
  • step 202 for the selected negative sample template, embedding the obtained positive sample into the negative sample template may include:
  • Pre-processing the positive sample so that the pre-processed positive sample fits the size of the display area in the negative sample template;
  • the pre-processed positive samples are nested into the negative sample template.
  • when a positive sample is nested into a negative sample template, the positive sample needs to be pre-processed so that its size matches the size of the display area in the negative sample template, so that the obtained intermediate sample is closer to real-life scenes, improving the authenticity of the resulting negative samples.
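  • This pre-processing can be sketched as an aspect-ratio-preserving scale followed by a centre crop, so the positive sample exactly covers the display area. This is one plausible reading of "cropped and/or scaled", not the patented implementation; the nearest-neighbour scaling and the NumPy image representation are assumptions for illustration.

```python
import numpy as np

def preprocess_positive(sample, target_h, target_w):
    """Scale the sample (nearest-neighbour) until it covers the display area
    in both dimensions, then centre-crop to the exact display-area size."""
    h, w = sample.shape[:2]
    scale = max(target_h / h, target_w / w)  # cover the area, never letterbox
    new_h = max(int(round(h * scale)), target_h)
    new_w = max(int(round(w * scale)), target_w)
    ys = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    xs = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    scaled = sample[ys][:, xs]
    # Centre crop to the display-area size.
    top = (new_h - target_h) // 2
    left = (new_w - target_w) // 2
    return scaled[top:top + target_h, left:left + target_w]
```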
  • before step 203 of FIG. 2 (for the selected scene samples, fusing the intermediate samples into the scene samples to obtain the negative samples required for machine learning for face recognition), the method of generating negative samples for face recognition may further include:
  • For a selected reflection picture, pre-processing the reflection picture based on the size of the intermediate sample;
  • processing the intermediate sample as follows: using the intermediate sample as the foreground, synthesizing it with the pre-processed reflection picture to simulate the reflection picture being reflected in the intermediate sample.
  • in actual life, due to the reflection of light, a negative sample will inevitably include reflection effects formed by the reflection of other objects. Adding a reflection-picture element to the intermediate sample therefore improves the authenticity of the negative sample.
  • using the intermediate sample as the foreground and synthesizing it with the pre-processed reflection picture may include: performing a weighted synthesis in which the weight of the reflection picture is smaller than the weight of the intermediate sample.
  • in this way, the content of the intermediate sample after synthesis remains dominated by the content of the intermediate sample before synthesis, the reflected image does not affect the visual effect too much, and the result is more consistent with the real scene, thereby improving the authenticity of the negative samples.
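  • A minimal sketch of such a weighted synthesis (alpha blending) follows. The default weights (0.85 for the intermediate sample, 0.15 for the reflection picture) are illustrative assumptions; the document only requires the reflection weight to be the smaller of the two.

```python
import numpy as np

def add_reflection(intermediate, reflection, reflection_weight=0.15):
    """Blend a (pre-resized) reflection picture over the intermediate sample.
    The reflection weight is kept well below the intermediate sample's weight
    so the face content still dominates visually."""
    assert reflection.shape == intermediate.shape
    assert reflection_weight < 0.5  # reflection must be the weaker component
    blended = ((1.0 - reflection_weight) * intermediate.astype(np.float32)
               + reflection_weight * reflection.astype(np.float32))
    return np.clip(blended, 0, 255).astype(np.uint8)
```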
  • the step 203 of fusing the intermediate sample into the scene sample may specifically include:
  • performing geometric deformation on the intermediate sample, and fusing the geometrically deformed intermediate sample into the scene sample according to a mask.
  • in a face recognition attack, the camera generally does not directly face the attack sample; there is a certain spatial angle between them. It is therefore also possible to apply a certain geometric deformation to the intermediate samples before fusing them into the scene samples. In this way, the final negative samples better match the real situation and have higher authenticity.
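  • The deformation-plus-mask fusion can be sketched as follows. A simple horizontal shear stands in for the perspective warp implied by the camera angle, and a binary mask selects which pixels of the warped sample replace the scene. The shear, the mask shape, and the function names are assumptions for illustration, not the patented transform.

```python
import numpy as np

def shear_horizontal(img, max_shift):
    """Cheap stand-in for a perspective warp: shift each row horizontally
    by an amount growing linearly from 0 (top) to max_shift (bottom).
    Pixels wrap at the border for simplicity."""
    h = img.shape[0]
    out = np.zeros_like(img)
    for row in range(h):
        shift = int(round(max_shift * row / max(h - 1, 1)))
        out[row] = np.roll(img[row], shift, axis=0)
    return out

def fuse_with_mask(warped, scene, mask, offset):
    """Composite: inside the mask take the warped sample, elsewhere the scene."""
    y, x = offset
    h, w = warped.shape[:2]
    out = scene.copy()
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = np.where(mask[..., None], warped, region)
    return out
```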
  • the negative sample template may be a template of a terminal having a display function, and the display area may be the display screen area of the terminal; and/or the scene sample may include a scene photo or a scene video.
  • FIG. 3 is another schematic flowchart of a method for generating a negative sample for face recognition according to an embodiment of the present application.
  • an embodiment of the present application provides a method for generating a negative sample for face recognition. The following takes a synthesis of a negative sample as an example, and the flow of the method is as follows.
  • Step 301 Obtain a positive sample from a training sample database required for machine learning for face recognition.
  • a sample database for machine learning needs to be prepared in advance, and the sample database can be divided into a positive sample database and a negative sample database.
  • the samples in the sample library may be pictures or videos taken by the user during the face authentication process, or may be pictures or videos of human faces obtained from the network.
  • the method for obtaining samples is not limited in the embodiments of the present application.
  • many current mobile phones can support the face unlock function, so when the first user wants to unlock the phone through the face, he can take a picture of his own face through the phone camera to perform the unlock operation.
  • the obtained face picture of the first user can be used as a positive sample.
  • the second user may take a picture of the first user's face using the first user's mobile phone (the first user's face picture may, for example, be displayed on the second user's mobile phone) in order to unlock the first user's mobile phone. This is obviously not allowed, so the picture taken in this case can be used as a negative sample.
  • when a first user handles business at a bank smart counter, the counter not only verifies whether the photographed face of the first user matches the picture in the credential information provided by the first user, but also verifies whether the first user currently being photographed is a living person. The bank smart counter therefore usually requires the first user to complete a specified action, such as blinking or nodding: it records a video while the first user performs the specified action and then performs verification based on that video. The video shot by the bank smart counter of the first user completing the specified action can be used as a positive sample.
  • correspondingly, the camera of the bank smart counter may capture a second user using a terminal to play the video in which the first user completes the specified action. This situation is obviously not allowed, so the video captured by the camera in this case can also be used as a negative sample.
  • negative samples can be synthesized from positive samples, balancing the numbers of positive and negative samples and thus improving the performance of the trained model.
  • a positive sample needs to be selected from the positive sample library as a basis for the synthesis.
  • the positive sample may be selected at random or in a certain order, which is not limited in the embodiments of the present application.
  • Step 302 Select a negative sample template, and nest the positive sample in the negative sample template to obtain an intermediate sample.
  • in actual life, the second user generally performs face authentication through pictures or videos of the first user on a terminal, or the second user may hold a printed photo of the first user's face for face authentication. The pictures or videos taken by the authentication terminal that opens the face authentication page will therefore also include elements such as frames.
  • the authentication terminal will inevitably capture the outline of the second user's terminal.
  • the authentication terminal will inevitably capture the hand of the second user. Therefore, a picture including such factors as the contour of the terminal, the hand, and the like can be used as a negative sample template, and subsequently, it can be nested with the positive sample to obtain a negative sample that simulates the actual scene.
  • the first terminal 401 is an authentication device that opens a face authentication page
  • the second terminal 402 is a device that displays or plays a picture or video of the first user.
  • the second user plays the picture or video 403 of the first user through the display screen of the second terminal 402 in front of the camera of the first terminal 401
  • the first terminal 401 performs face authentication by shooting the picture or video 403 played by the second terminal
  • the second terminal 402 falls into the shooting area 404 of the camera of the first terminal 401, so that the picture or video captured by the first terminal 401 includes the image of the second terminal 402.
  • the image of the terminal can be included in the negative sample accordingly.
  • the image of the second terminal can be used as a negative sample template, and the positive sample is then nested into the image of the second terminal to simulate a scene in which the positive sample is displayed on the display screen of the second terminal.
  • the second user may also directly use the printed face image of the first user for face authentication.
  • the first terminal 501 is an authentication device for opening a face authentication page.
  • the picture captured by the first terminal 501 may also include an image of a hand of a second user who holds a picture of the face of the first user.
  • the negative sample can also include the image of the hand; the image of the hand can then be used as a negative sample template, with the positive sample nested into it to simulate a scene of a hand holding the positive sample.
  • the face picture 601 of the first user may also be placed in the photo frame 602 and held by the second user in front of the first terminal for verification.
  • the picture taken by the first terminal also includes a photo frame, and the image of the photo frame can also be used as a negative sample template for synthesizing negative samples.
  • the second user may directly place the face picture 603 of the first user on a plane, and then shoot through the first terminal.
  • there may be a certain halo 604 around the face image in the picture taken by the first terminal, so the halo image can also be used as a negative sample template for synthesizing negative samples.
  • images of commonly used terminals on the market, multiple images of hands in different poses, images of different photo frames, halo images, and the like can be obtained in advance as negative sample templates and added to a negative sample template library. During the synthesis of negative samples, one negative sample template can be selected from this library.
  • the negative sample template may take the form of a picture, such as a terminal picture, or of a video, in which case each frame of the negative sample template can have the same content, for example a terminal picture.
  • the negative sample template may also be the object itself, such as a terminal, a hand, or a photo frame.
  • the selection can be performed by random selection, or the selection can be performed according to a certain sequence, which is not limited in the embodiment of the present application.
  • the positive sample after obtaining the positive sample and the negative sample template, the positive sample may be nested in the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template.
  • when the negative sample template is an image of a terminal, the display area refers to the display screen area of the terminal; when the negative sample template is an image of a photo frame, the display area refers to the region inside the frame.
  • in order to make the positive sample fit the size of the display area in the negative sample template, the positive sample may be pre-processed according to the size of the terminal display area before the nesting operation, so that the intermediate sample appears more realistic.
  • the following description takes a face picture as the positive sample and a mobile phone as the terminal as an example.
  • the face image may be cropped and/or scaled according to the size of the display screen 703 in the mobile phone image 702 serving as the negative sample template.
  • a pre-processed face image 704 is obtained.
  • the scale transformation is a way of scaling a face picture to change the size of the face picture.
  • the mobile phone shown in FIG. 7 is a black Apple (iPhone) mobile phone. Of course, it can also be a mobile phone of another color or a mobile phone of another brand.
  • The pre-processed face picture 704 can then be nested in the mobile phone image 702 to obtain a first intermediate sample 705. For example, as shown in FIG. 7, the pre-processed face picture 704 is placed in the display area 703 of the mobile phone picture to simulate a mobile phone displaying a face picture.
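As an illustration of the nesting step just described, the following Python sketch rescales a positive sample and pastes it into the display area of a template image. This is a minimal sketch, not the patent's own implementation: the function name, the `region` layout, and the nearest-neighbour rescaling are assumptions made for illustration only.

```python
import numpy as np

def nest_positive_sample(template, face, region):
    """Place a (rescaled) face image into the template's display region.

    template : H x W x 3 uint8 array, e.g. a picture of a mobile phone
    face     : h x w x 3 uint8 array, the positive sample
    region   : (top, left, height, width) of the display area in the template
    """
    top, left, rh, rw = region
    # Nearest-neighbour rescale of the face to the display-area size
    # (a stand-in for the cropping / scale transformation described above).
    rows = np.arange(rh) * face.shape[0] // rh
    cols = np.arange(rw) * face.shape[1] // rw
    resized = face[rows][:, cols]
    out = template.copy()
    out[top:top + rh, left:left + rw] = resized  # nest into the display area
    return out
```

In practice the display-area rectangle would come with the template (e.g. the screen region 703 annotated in the phone image 702), and a real pipeline would use a proper interpolating resize.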
  • As noted above, the negative sample template may be a picture or a video, and the video may show only a terminal, a player's hand, a photo frame, or a halo.
  • A positive sample in the form of a video can likewise be pre-processed and then nested into the negative sample template.
  • For example, the video may be scaled to fit the size of the mobile phone's display screen, and the adjusted video is then nested into the mobile phone image to simulate the video being played on the phone.
  • Step 303: Select a reflection picture, and pre-process the reflection picture according to the size of the intermediate sample.
  • The technical solution of the embodiments of the present application also simulates reflection.
  • Outdoors, the reflected object may be, for example, a building or a trademark on a building; indoors, it may be an indoor facility such as wallpaper, an air conditioner, or a television.
  • Pictures of objects in multiple indoor and outdoor scenes can be collected and added to a reflection picture library, from which a reflection picture is selected during negative-sample synthesis. The selection may be performed randomly or in a certain order, which is not limited in the embodiments of the present application.
  • The reflection picture needs to be pre-processed so that its size is consistent with the size of the intermediate sample.
  • When the negative sample template is a terminal picture, the size of the reflection picture should match the outline of the terminal in that picture; when the negative sample template is a photo-frame picture, the size of the reflection picture should match the frame border in that picture.
  • Specifically, the reflection picture can be cropped so that the cropped reflection picture matches the size of the phone in the phone picture, or it can be scaled, horizontally or vertically, so that the scaled reflection picture matches the phone's external dimensions. Since in a real scene there may be a particular spatial relationship between the reflected object and the phone's display screen, the reflection picture may additionally be geometrically deformed so that it looks more realistic in the intermediate sample.
  • Step 304: Use the intermediate sample as the foreground and synthesize it with the pre-processed reflection picture, so that the intermediate sample simulates a reflection picture displayed as a reflection.
  • That is, the intermediate sample may serve as the foreground picture and be synthesized with the pre-processed reflection picture, so that the reflection picture appears as a simulated reflection in the intermediate sample.
  • When the intermediate sample is a picture, it can be directly synthesized with the pre-processed reflection picture, and the synthesized intermediate sample is also a picture.
  • When the intermediate sample is a video, each frame of the video may be synthesized with the reflection picture, and the synthesized intermediate sample is also a video.
  • The pre-processed reflection pictures synthesized with different frames may be the same or different. For example, the reflection picture synthesized with the first frame may be a first part of the original reflection picture, and the one synthesized with the second frame may be a second part of it; the first part differs from the second part, though the two parts may or may not intersect.
  • FIG. 8 is a schematic diagram of synthesizing an intermediate sample (for example, the first intermediate sample 705 obtained as shown in FIG. 7) with a reflection picture 801, taking a face picture as the positive sample.
  • As can be seen from FIG. 8, in the synthesized second intermediate sample 802 the content of the pre-synthesis intermediate sample (the first intermediate sample 705) remains dominant, while the buildings in the reflection picture are only faintly perceptible at 803; the perceptual effect of the reflection is deliberately weak.
  • To achieve this, the weight of the pre-synthesis intermediate sample (the first intermediate sample 705) is the complement of the weight of the reflection picture 801, and the two are synthesized according to their respective weights.
  • The weight of the pre-synthesis intermediate sample is called the first weight value, and the weight of the reflection picture is called the second weight value; the two weight values are complementary.
  • The first weight value is greater than a preset weight threshold, and the second weight value is less than or equal to the preset weight threshold.
  • The preset weight threshold may be a value set according to experience or a value obtained through experiments; it may be, for example, 0.2.
  • The synthesis may follow the formula S = (1 − a) × I + a × R, where S represents the synthesized intermediate sample (the second intermediate sample), I represents the pre-synthesis intermediate sample (the first intermediate sample), R represents the pre-processed reflection picture, and a is the second weight value.
  • The value of a may be selected randomly from the values meeting the requirement; of course, the value of a may also be fixed.
  • The synthesis based on the above formula multiplies the pixel value of each pixel in the pre-synthesis intermediate sample and in the pre-processed reflection picture by the respective weight and superimposes the results.
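The per-pixel weighted synthesis just described can be sketched as follows. This assumes the formula referred to above has the common alpha-blending form S = (1 − a)·I + a·R, with a being the second weight value kept at or below the preset threshold; the function and parameter names are illustrative, not from the patent.

```python
import numpy as np

def synthesize_reflection(intermediate, reflection, a=0.15):
    """Blend a pre-processed reflection picture into the intermediate sample.

    Implements S = (1 - a) * I + a * R per pixel. Keeping a small
    (e.g. at or below a threshold such as 0.2) leaves the intermediate
    sample dominant and the reflection only faintly visible.
    """
    assert intermediate.shape == reflection.shape, "reflection must be pre-sized"
    s = (1.0 - a) * intermediate.astype(np.float64) + a * reflection.astype(np.float64)
    # Clamp back to the valid pixel range before converting to uint8.
    return np.clip(s, 0, 255).astype(np.uint8)
```

For a video intermediate sample, the same blend would be applied frame by frame, possibly with a different crop of the reflection picture per frame.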
  • Step 305: Perform geometric deformation on the intermediate sample at least once to obtain a geometrically deformed intermediate sample.
  • The intermediate sample before geometric deformation may be the synthesized intermediate sample (second intermediate sample) obtained in step 304, or the intermediate sample (first intermediate sample) obtained in step 302.
  • The geometric deformation may be a perspective transformation or an affine transformation, or another transformation type, which is not limited in the embodiments of the present application.
  • The transformation parameters of the perspective transformation may be fixed, or they may be randomly selected each time the transformation is performed; the parameters may include, for example, a rotation angle or a stretching ratio.
  • The third intermediate sample 901 includes a mask 904 indicating the position of the pre-deformation intermediate sample within the geometrically deformed one; that is, the image area occupied by the mobile phone in the third intermediate sample 901 shown in FIG. 9 is the mask 904. Only the pixels located inside the mask 904 carry values; pixels outside it carry none, shown as the pure black area 905 in FIG. 9.
  • The mask 904 helps fuse the geometrically deformed intermediate sample with the scene sample; the specific details are described later and not repeated here.
  • Step 306: Fuse the intermediate sample into the scene sample to obtain a negative sample.
  • That is, for the selected scene sample, the intermediate sample can be fused into it to obtain the negative sample required for machine learning for face recognition.
  • The intermediate sample referred to here may be the one obtained in step 302 (first intermediate sample), the synthesized one obtained in step 304 (second intermediate sample), or the one obtained in step 305 (third intermediate sample). In other words, steps 303 to 305 are not mandatory; in a specific implementation, some or all of them may be performed flexibly according to actual requirements.
  • Scenes can be divided into indoor scenes and outdoor scenes.
  • Pictures of multiple indoor and outdoor scenes can be collected and added to a scene sample library, from which a scene sample is selected during negative-sample synthesis. The selection may be performed randomly or in a certain order, which is not limited in the embodiments of the present application.
  • The scene sample may be a scene photo or a scene video, and the scene photo may even be a solid-colored background, such as a white wall or a blue sky.
  • The scene sample may also be pre-processed so that its size is consistent with the size of the intermediate sample.
  • The fusion may follow the formula F = M × I + (1 − M) × B, where F represents the final negative sample, M represents the mask, I represents the intermediate sample, and B represents the pre-processed scene sample.
  • Taking the intermediate sample obtained in step 305, i.e., the geometrically deformed intermediate sample (third intermediate sample 901), as an example, the third intermediate sample 901 is fused with the scene sample (scene picture 902): the region of the mask 904 in the pre-processed scene sample is essentially replaced by the masked region of the intermediate sample, while the region outside the mask 904 remains unchanged.
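Assuming the fusion has the form F = M × I + (1 − M) × B, consistent with the variable definitions above (inside the mask the scene is replaced by the intermediate sample, outside it the scene is kept), the step can be sketched as follows; the function name is illustrative only.

```python
import numpy as np

def fuse_into_scene(intermediate, mask, scene):
    """Fuse the (geometrically deformed) intermediate sample into a scene.

    Implements F = M * I + (1 - M) * B with a boolean mask M: pixels
    inside the mask come from the intermediate sample, all others keep
    the scene sample's values.
    """
    m = mask.astype(bool)
    out = scene.copy()
    out[m] = intermediate[m]
    return out
```

When the intermediate sample is a video, the same fusion would be applied to every frame against a matching scene frame or a repeated scene photo.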
  • When the intermediate sample is a picture, the obtained negative sample is a picture; when the intermediate sample is a video, the obtained negative sample is a video.
  • After the negative samples are obtained, they may be added to the negative sample library for training the face recognition model. Accordingly, the embodiments of the present application also provide a method for training a face recognition model, in which existing positive samples and the negative samples obtained by the method for generating negative samples for face recognition are used to train the model and obtain the final face recognition model.
  • The embodiments of the present application do not limit the type of the model; it may be, for example, a neural network model, a genetic-algorithm-based model, or another possible model.
  • An embodiment of the present application further provides a face liveness authentication method, in which liveness authentication is performed with the face recognition model trained by the training method described above. The method can be applied to various application scenarios, including but not limited to those shown in FIG. 1A and FIG. 1B.
  • In summary, the negative samples required for machine learning for face recognition can be obtained by simulating real-life attack scenarios, so that a large number of negative samples can be generated from positive samples. This effectively solves the technical problem that attacks on face recognition are rare in real life and negative samples are therefore too few, thereby improving the performance of the trained face recognition model.
  • Moreover, many random factors can be introduced: the negative sample template, reflection picture, and scene sample can be selected randomly, the weight of the reflection picture can be chosen randomly, and the transformation parameters of the perspective transformation can also be randomized. In theory, an unlimited number of negative samples can thus be generated, which in turn greatly improves the performance of the model.
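The randomized factors just listed can be sketched as a configuration sampler: each call draws an independent template, reflection picture, scene, reflection weight, and rotation angle, so the number of distinct negative samples grows combinatorially. All names, the weight threshold, and the angle range here are illustrative assumptions, not values from the patent.

```python
import random

def sample_negative_config(templates, reflections, scenes, weight_threshold=0.2):
    """Draw one random configuration for synthesizing a negative sample.

    Every factor is sampled independently: which negative sample template,
    reflection picture, and scene sample to use, the reflection weight a
    (kept at or below the preset threshold), and a perspective rotation angle.
    """
    return {
        "template": random.choice(templates),
        "reflection": random.choice(reflections),
        "scene": random.choice(scenes),
        "reflection_weight": random.uniform(0.0, weight_threshold),
        "rotation_deg": random.uniform(-15.0, 15.0),
    }
```

A synthesis loop would call this once per negative sample and feed the drawn configuration through the nesting, reflection, deformation, and fusion steps.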
  • Based on the same inventive concept, an embodiment of the present application further provides an apparatus for generating negative samples for face recognition. The apparatus may include:
  • an obtaining unit 1001, configured to obtain a positive sample from the training sample library required for machine learning for face recognition;
  • a nesting unit 1002, configured to nest, for a selected negative sample template, the obtained positive sample into the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template; and
  • a scene fusion unit 1003, configured to fuse, for a selected scene sample, the intermediate sample into the scene sample to obtain the negative sample required for machine learning for face recognition.
  • The nesting unit 1002 may specifically be configured to pre-process the positive sample so that it fits the size of the display area, and to nest the pre-processed positive sample into the negative sample template.
  • The apparatus may further include a reflection synthesis unit 1004, configured to pre-process a selected reflection picture based on the size of the intermediate sample, and to synthesize the intermediate sample, as foreground, with the pre-processed reflection picture, so as to simulate the reflection picture being reflected in the intermediate sample.
  • The scene fusion unit 1003 may further specifically be configured to perform at least one geometric deformation on the intermediate sample and to fuse the geometrically deformed intermediate sample into the scene sample according to the mask.
  • The negative sample template is, for example, a template of a terminal having a display function, and the display area is the display screen area of the terminal; and/or the scene sample includes a scene photo or a scene video.
  • This apparatus can be used to execute the methods provided in the embodiments shown in FIG. 3 to FIG. 8; for the functions that can be implemented by its functional modules, refer to the description of those embodiments, which is not repeated here.
  • Although the reflection synthesis unit 1004 is shown in FIG. 10, it should be understood that it is not a mandatory functional unit, which is why it is drawn with a dashed line in FIG. 10.
  • Based on the same inventive concept, an embodiment of the present application further provides a computer device, which may include a memory 1101 and a processor 1102.
  • The memory 1101 is configured to store the computer program executed by the processor 1102. The memory 1101 may mainly include a program storage area and a data storage area: the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created according to the use of the computer device, and the like.
  • The processor 1102 may be a central processing unit (CPU) or a digital processing unit.
  • The specific connection medium between the memory 1101 and the processor 1102 is not limited in the embodiments of the present application. In FIG. 11 they are connected by a bus 1103, which is indicated by a thick line; the connection modes of the other components are merely illustrative and not limiting.
  • The bus 1103 may be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is drawn in FIG. 11, but this does not mean that there is only one bus or one type of bus.
  • The memory 1101 may be a volatile memory such as a random-access memory (RAM), or a non-volatile memory such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); it may also be any other medium that can carry or store the desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto. The memory 1101 may also be a combination of the above memories.
  • The processor 1102 is configured to, when invoking the computer program stored in the memory 1101, execute the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method shown in the embodiments of FIG. 3 to FIG. 8.
  • An embodiment of the present application further provides a computer storage medium storing the computer-executable instructions required by the above processor, including the program to be executed by the processor.
  • In some possible implementations, the aspects of the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method provided in this application may also be implemented as a program product including program code. When the program product runs on a computer device, the program code causes the computer device to execute the steps of these methods according to the various exemplary embodiments described above in this specification; for example, the computer device may execute the method for generating negative samples for face recognition provided by the embodiments shown in FIG. 3 to FIG. 8.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • The program product for the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method according to the embodiments of the present application may use a portable compact disk read-only memory (CD-ROM), include program code, and run on a computing device; however, the program product of the present application is not limited thereto.
  • the readable storage medium may be any tangible medium containing or storing a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • the readable signal medium may include a data signal that is borne in baseband or propagated as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • The program code for performing the operations of this application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ as well as conventional procedural programming languages such as "C" or similar languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • The remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
  • This application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce a manufactured article including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

This application discloses a method, an apparatus, and a computer device for generating negative samples for face recognition. The method includes: obtaining a positive sample from a training sample library required for machine learning for face recognition; for a selected negative sample template, nesting the obtained positive sample into the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template; and, for a selected scene sample, fusing the intermediate sample into the scene sample to obtain the negative sample required for machine learning for face recognition.

Description

Method, apparatus, and computer device for generating negative samples for face recognition
This application claims priority to Chinese Patent Application No. 201810869295.4, filed on August 2, 2018 and entitled "Method, apparatus, and computer device for generating negative samples for face recognition", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of image processing technologies, and in particular to a method, an apparatus, and a computer device for generating negative samples for face recognition.
Background
At present, face recognition is mainly performed by a face recognition model, which is obtained by training on a large number of training samples using machine learning. The training samples used for machine learning can be divided into two classes, positive samples and negative samples; the division depends on what actually needs to be verified. A positive sample is a sample from which the correct conclusion can be drawn, and a negative sample is the opposite.
In practice, however, the numbers of positive and negative samples are often unbalanced; for example, there may be many positive samples but few negative samples.
Summary
The embodiments of this application provide a method, an apparatus, and a computer device for generating negative samples for face recognition.
According to a first aspect, a method for generating negative samples for face recognition is provided, the method including:
obtaining a positive sample from a training sample library required for machine learning for face recognition;
for a selected negative sample template, nesting the obtained positive sample into the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template; and
for a selected scene sample, fusing the intermediate sample into the scene sample to obtain the negative sample required for machine learning for face recognition.
According to a second aspect, a method for training a face recognition model is provided, including: training the face recognition model with positive samples and negative samples, the negative samples including negative samples obtained by the method of the first aspect.
According to a third aspect, a face liveness authentication method is provided, including: performing face liveness authentication with a face recognition model trained by the training method of the second aspect.
According to a fourth aspect, an apparatus for generating negative samples for face recognition is provided, the apparatus including:
an obtaining unit, configured to obtain a positive sample from a training sample library required for machine learning for face recognition;
a nesting unit, configured to nest, for a selected negative sample template, the obtained positive sample into the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template; and
a scene fusion unit, configured to fuse, for a selected scene sample, the intermediate sample into the scene sample to obtain the negative sample required for machine learning for face recognition.
According to a fifth aspect, a computer device is provided, including:
at least one processor; and
a memory communicatively connected to the at least one processor, where
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the methods of the first aspect, the second aspect, and the third aspect.
According to a sixth aspect, a computer storage medium is provided, storing computer-executable instructions for causing a computer to perform the method of the first aspect.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the accompanying drawings required in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application.
FIG. 1A is a schematic diagram of a scenario in an embodiment of this application;
FIG. 1B is a schematic diagram of another scenario in an embodiment of this application;
FIG. 2 is a schematic flowchart of a method for generating negative samples for face recognition in an embodiment of this application;
FIG. 3 is a schematic flowchart of a method for generating negative samples for face recognition in an embodiment of this application;
FIG. 4 is a schematic diagram of a first terminal photographing a face picture on a second terminal in an embodiment of this application;
FIG. 5 is a schematic diagram of a first terminal photographing a hand-held face picture in an embodiment of this application;
FIG. 6 is a schematic diagram of a face picture placed in a photo frame and in a halo, respectively, in an embodiment of this application;
FIG. 7 is a schematic diagram of the nesting process in an embodiment of this application;
FIG. 8 is a schematic diagram of the process of synthesis with a reflection picture in an embodiment of this application;
FIG. 9 is a schematic diagram of the process of fusion with a scene picture in an embodiment of this application;
FIG. 10 is a schematic structural diagram of an apparatus for generating negative samples for face recognition in an embodiment of this application;
FIG. 11 is a schematic structural diagram of a computer device in an embodiment of this application.
Mode for Carrying Out the Invention
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. Where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another arbitrarily. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
To facilitate understanding of the technical solutions provided by the embodiments of this application, some key terms used herein are first explained:
Positive sample and negative sample: in face recognition, positive and negative samples are relative concepts. A positive sample is a sample that can pass verification after being recognized by the face recognition model, while a negative sample is a sample that must not pass verification after recognition. Whether verification passes depends on the specific scenario. For example, if what needs to be verified is whether a picture to be verified contains a face, a face picture can pass verification, while a picture that contains no face, or in which the face occupies only a very small part, cannot.
Alternatively, for example, when a user makes a payment with a terminal, the payment system usually requires face authentication to verify that the account owner is operating in person. The user, referred to as the first user, usually turns on the camera of the terminal to photograph his or her own face for authentication. In this case, the picture of the first user's face taken by the first user can be regarded as a positive sample. However, there are also cases in which it is not the first user but a second user who operates the first user's account to make a payment. In such a case, the second user usually attempts authentication with a photo of the first user, or with a picture or video of the first user's face stored on the second user's terminal. This is clearly dangerous, and a picture captured in this situation can be regarded as a negative sample, also called an attack sample.
In addition, the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" herein, unless otherwise specified, generally indicates an "or" relationship between the associated objects.
In practice, in situations such as payment where it must be verified that the account owner is operating in person, if a negative sample were allowed to pass verification, the first user's account would be at risk and the safety of the first user's property could not be guaranteed; therefore, verification with a negative sample must normally fail. Face authentication is based on a face recognition model, which takes positive and negative samples as input and learns the features in them through the established model, so that the final model can correctly distinguish positive from negative samples. Therefore, to prevent negative samples from passing face authentication in such situations, negative samples must be included in the training of the face authentication model so that the model can learn their features; only then can the model accurately identify a negative sample when the above situation occurs, cause its face authentication to fail, and keep the first user's account safe.
In real life, most users perform face authentication normally, so the number of positive samples is assured. However, hardly any normal user would, like the second user above, authenticate with a photo of the first user or with a picture or video of the first user stored on a terminal in order to operate the first user's account. The authentication system can therefore obtain only very few negative samples, making the numbers of positive and negative samples extremely unbalanced, and the accuracy of a face recognition model trained on such positive and negative samples cannot be guaranteed.
The inventors found that face recognition is in many cases used for identity authentication, and that in practical application scenarios the most common attacker uses a legitimate person's photo or video in an attempt to deceive the face liveness authentication system. A photo may be carried on photographic paper or displayed on the attacker's handheld terminal, and a video is generally played on the attacker's handheld terminal. In such cases, in the image obtained by the liveness authentication system, something like a frame surrounds the photo or video, and a background appears behind the photo or terminal. Based on this, to obtain negative samples, the inventors designed negative sample templates that simulate such frame structures; a negative sample template simulates a photo or a handheld terminal device with a display screen. Further, since positive samples are easy to obtain, i.e., their number is assured, a positive sample can be nested into a negative sample template and combined with a certain background, so that negative samples are generated from positive samples. This greatly increases the number of negative samples and resolves the extreme imbalance between positive and negative samples. In view of this, the embodiments of this application provide a method for generating negative samples for face recognition. In this method, a positive sample is nested into a negative sample template and the result is fused into common scene samples, thereby simulating real-life attack scenarios that use negative samples and obtaining the negative samples required for machine learning for face recognition. In this way, many negative samples can be generated from positive samples, effectively remedying the scarcity of negative samples caused by the rarity of attacks in real-life face recognition scenarios, and improving the performance of the trained face recognition model. Common scene samples may be static photos of everyday venues, such as a shopping-mall cashier counter or airport transit gates, or videos with moving crowds.
In addition, the inventors also considered that in real life, for example when a second user performs face authentication with a photo of the first user stored on a terminal, light reflection inevitably casts the shadows of other objects onto the terminal's display unit, and these shadows fall on the displayed photo of the first user. To make the final negative sample more realistic, in the method of the embodiments of this application, after the obtained positive sample is nested into the negative sample template to obtain a first intermediate sample, elements of a reflection picture may be added to the first intermediate sample to simulate objects being reflected onto it in a real scene, forming a second intermediate sample.
Further, the inventors also considered that during an attack in the face recognition process, the camera of the authentication system generally does not face the attack sample head-on but has a certain spatial relationship with it. The second intermediate sample can therefore be geometrically deformed before being fused into the scene sample, so that the final negative sample better matches the real situation and is more realistic.
Having introduced the design ideas of the embodiments of this application, the application scenarios to which the technical solutions are applicable are briefly introduced below. It should be noted that the scenarios described below are merely illustrative and not limiting; in a specific implementation, the technical solutions provided by the embodiments of this application can be applied flexibly according to actual needs.
Referring to FIG. 1A, an application scenario to which the technical solutions of the embodiments are applicable may include a terminal 101 and a server 102.
The terminal 101 includes a camera 1011. In one possible scenario, the terminal 101 may be a personal terminal, such as a user's own mobile phone or tablet computer (PAD). For example, when an application account on the user's terminal 101 requires face authentication, the camera of the terminal 101 is turned on to capture an image including the user's face, which is sent to the server 102. The terminal 101 and the server 102 may communicate over a network, which may be wired or wireless; the wireless network may be, for example, a mobile cellular network or a wireless local area network (WLAN). Of course, the network may also be any other network capable of communication, which is not limited in the embodiments of this application.
After receiving the image including the user's face from the terminal 101, the server 102 can recognize it with the face recognition model in the server 102, determine whether the person performing the face authentication operation is the user in person, and feed the determination result back to the terminal 101. The face recognition model in the server 102 is obtained by training on multiple positive and negative samples, and the negative samples may be obtained according to the technical solutions provided by the embodiments of this application.
In another possible scenario, the terminal 101 may be a terminal device of an enterprise or public institution, for example a computer device in a government service hall, at a bank counter, or at a hotel front desk. The computer device may include a camera 1011, which may be built into the terminal 101 or externally connected to it. The terminal 101 can collect the certificate information (generally an identity card) of a user handling business, photograph the user's face with the camera 1011, and send both to the server 102. The server 102 recognizes the picture of the user's face with the face recognition model, determines whether the user handling business matches the provided identity information, and feeds the determination result back to the terminal 101. The face recognition model in the server 102 is obtained by training on multiple positive and negative samples, which may include negative samples obtained according to the technical solutions provided by the embodiments of this application.
Referring to FIG. 1B, another application scenario in which the technical solutions of the embodiments can be used is, for example, a security check system. This scenario may include a gate 103 and a server 104.
The gate 103 may be, for example, a gate at an airport security entrance, a ticket-check entrance of a railway station, or a subway security entrance. The gate 103 includes one or more cameras, which can capture an image 106 including the user's face and collect the user's certificate information, both of which are sent to the server 104 for verification. The server 104 may be the verification server of the corresponding security system; for example, when the gate 103 is an airport entrance gate, the server 104 is the verification server of the airport security system. The server 104 recognizes the image including the user's face with its face recognition model, determines whether the user undergoing the security check matches the information on the identity card, and feeds the determination result back to the gate 103. The face recognition model in the server 104 is obtained by training on multiple positive and negative samples, and the negative samples may be obtained according to the technical solutions provided by the embodiments of this application.
Of course, the method provided by the embodiments of this application is not limited to the application scenarios shown in FIG. 1A and FIG. 1B and can also be used in other possible scenarios, which are not limited by the embodiments of this application.
To further illustrate the technical solutions provided by the embodiments of this application, they are described in detail below with reference to the accompanying drawings and specific implementations. Although the embodiments of this application provide the method operation steps shown in the following embodiments or drawings, the methods may include more or fewer steps. Among steps with no necessary causal relationship in logic, the execution order of these steps is not limited to that provided in the embodiments. In an actual process, or when executed by an apparatus, the methods may be performed sequentially or in parallel according to the order shown in the embodiments or drawings.
FIG. 2 is a flowchart of the method for generating negative samples for face recognition in an embodiment of this application. As shown in FIG. 2, the method includes the following steps:
Step 201: Obtain a positive sample from the training sample library required for machine learning for face recognition.
Step 202: For a selected negative sample template, nest the obtained positive sample into the negative sample template to obtain an intermediate sample that simulates displaying the positive sample in the display area of the negative sample template.
Step 203: For a selected scene sample, fuse the intermediate sample into the scene sample to obtain the negative sample required for machine learning for face recognition.
In the embodiments of this application, a positive sample is nested into a negative sample template and a certain background is added, thereby simulating real-life attack scenarios that use negative samples and obtaining the negative samples required for machine learning for face recognition. Since the number of positive samples is assured, many negative samples can be generated from them, effectively solving the technical problem that attacks on face recognition are rare in real life and negative samples are too few, and thereby improving the performance of the trained face recognition model.
According to an embodiment of this application, in step 202, nesting the obtained positive sample into the selected negative sample template may include:
pre-processing the positive sample so that the pre-processed positive sample fits the size of the display area of the negative sample template within the negative sample template; and
nesting the pre-processed positive sample into the negative sample template.
In the embodiments of this application, when the positive sample is nested into the negative sample template, it needs to be pre-processed so that its size fits the size of the display area in the template; the resulting intermediate sample thus comes closer to a real-life scene, improving the realism of the obtained negative sample.
According to an embodiment of this application, before step 203 of FIG. 2 (fusing the intermediate sample into the selected scene sample to obtain the negative sample required for machine learning for face recognition), the method for generating negative samples for face recognition may further include:
pre-processing a selected reflection picture based on the size of the intermediate sample; and
processing the intermediate sample as follows: synthesizing the intermediate sample, as foreground, with the pre-processed reflection picture, so as to simulate the reflection picture being reflected in the intermediate sample.
In real life, because of light reflection, a negative sample inevitably contains reflections formed by other objects; elements of a reflection picture can therefore be added to the intermediate sample, improving the realism of the obtained negative sample.
According to an embodiment of this application, synthesizing the intermediate sample, as foreground, with the pre-processed reflection picture may include:
synthesizing the intermediate sample with the pre-processed reflection picture according to a first weight value of the intermediate sample and a second weight value of the reflection picture, where the first weight value is greater than a preset weight threshold and the second weight value is less than or equal to the preset weight threshold.
In the embodiments of this application, when the reflection picture is synthesized with the intermediate sample, the weight of the reflection picture is smaller than that of the intermediate sample; the synthesized intermediate sample is thus still dominated by the content of the pre-synthesis intermediate sample, and the content of the reflection picture does not overly affect the visual effect, which better matches real scenes and improves the realism of the negative sample.
According to an embodiment of this application, fusing the intermediate sample into the scene sample in step 203 may specifically include:
performing at least one geometric deformation on the intermediate sample, the geometrically deformed intermediate sample including a mask indicating the position of the pre-deformation intermediate sample within the deformed one; and
fusing the geometrically deformed intermediate sample into the scene sample according to the mask.
In the embodiments of this application, during an attack on face recognition the camera generally does not face the attack sample head-on but at a certain spatial angle; the intermediate sample can therefore be geometrically deformed before being fused into the scene sample, so that the final negative sample better matches the real situation and is more realistic.
In the foregoing embodiments, the negative sample template may be a template of a terminal having a display function, and the display area may be the display screen area of the terminal; and/or the scene sample may include a scene photo or a scene video.
图3为本申请实施例中的生成人脸识别的负样本的方法的另一个流程示意图。请参见图3所示,本申请实施例提供一种生成人脸识别的负样本的方法,下面以一个负样本的合成为例,该方法的流程如下。
步骤301:从用于人脸识别的机器学习所需的训练样本库中,获取一个正样本。
本申请实施例中,在进行人脸识别模型的训练之前,都需要预先准备好用于机器学习的样本库,样本库可以划分为正样本库和负样本库。样本库中的样本可以是用户在人脸认证过程中拍摄的图片或者视频,或者也可以是从网络获取的人脸的图片或者视频,当然,对于样本的获取方式本申请实施例并不进行限制。
例如,目前许多的手机都可以支持人脸解锁的功能,那么第一用户想要通过人脸解锁手机时,则可以通过手机摄像头拍摄自身的人脸图片,以进行解锁操作,这种情况下拍摄得到的第一用户的人脸图片则可以作为正样本。而若是第二用户获取了第一用户的手机,且获取到第一用户的人脸图片,第二用户有可能通过用第一用户的手机拍摄第一用户的人脸图片(该第一用户的人脸图片例如可以是显示在第二用户的手机上),对第一用户的手机进行解锁操作。这种情况显然是不能够被允许的,因而这种情况下拍摄得到的图片则可以作为负样本。
再例如,第一用户在银行智能柜台办理业务时,银行智能柜台除了验证第一用户的人脸是否与第一用户提供的证件信息吻合之外,即除了验证拍摄的第一用户的人脸图片是否与证件信息中图片是否匹配之外,还会验证当前拍摄的第一用户是否为活体。那么银行智能柜台通常会要求第一用户完成指定的动作,例如眨眼或者点头等。即银行智能柜台会录制第一用户执行指定的动作时的视频,进而根据该视频进行验证。那么银行智能柜台拍摄的第一用户自身完 成上述指定动作的视频既可以作为正样本。而若是第二用户获取到第一用户完成上述指定动作的视频之后,通过在银行智能柜台的摄像头前播放该段视频,以期望通过该视频完成验证时,这时候银行智能柜台的摄像头则可以拍摄到第二用户使用终端播放第一用户完成指定动作的视频的这一段视频。这种情况显然也是不能够被允许的,因而这种情况下摄像头拍摄得到的视频也可以作为负样本。
In real life, most users follow the normal procedure and capture pictures or videos of themselves for face authentication, so positive samples are easy to obtain and can be added to the sample library for machine-learning-based face recognition. Negative samples, by contrast, are rare; negative samples can therefore be synthesized from the positive samples to balance the numbers of positive and negative samples, thereby improving the performance of the trained model.
Specifically, when a negative sample is synthesized from a positive sample, one positive sample needs to be selected from the positive sample library as the basis of the synthesis. The positive sample may be selected at random or in a certain order; the embodiments of this application place no restriction on this.
Step 302: Select a negative sample template and nest the positive sample in the negative sample template to obtain an intermediate sample.
In this embodiment, consider that in an actual scene the second user generally performs face authentication with a picture or video of the first user shown on the second user's terminal, or the second user may hold a printed photo of the first user's face. Consequently, the picture or video captured by the authentication terminal on which the face authentication page is opened also contains frame-like objects. For example, when the second user performs face authentication with a picture displayed on his or her terminal, the authentication terminal inevitably captures the outline of the second user's terminal; when the second user holds a face picture, the authentication terminal inevitably captures the second user's hand. Pictures containing such elements, such as terminal outlines and hands, can therefore serve as negative sample templates into which positive samples are subsequently nested, yielding negative samples that simulate actual scenes.
For example, referring to FIG. 4, a first terminal 401 is the authentication device on which the face authentication page is opened, and a second terminal 402 is the device that displays or plays the first user's picture or video. The second user plays the first user's picture or video 403 on the display screen of the second terminal 402 in front of the camera of the first terminal 401. When the first terminal 401 performs face authentication by capturing the picture or video 403 played on the second terminal, the second terminal 402 falls within the shooting region 404 of the camera of the first terminal 401, so the picture or video captured by the first terminal 401 contains an image of the second terminal 402. In this case, the negative sample can correspondingly contain an image of a terminal. To simulate real-world negative samples, in the process of synthesizing a negative sample from a positive sample the image of the second terminal can be used as the negative sample template, and the positive sample can be nested in the image of the second terminal to simulate the positive sample being displayed on the display screen of the second terminal.
As another example, the second user may perform face authentication directly with a printed picture of the first user's face. Referring to FIG. 5, a first terminal 501 is the authentication device on which the face authentication page is opened. When the first terminal 501 captures the first user's face picture 502, the second user's hand 503 falls within the shooting region 504 of the first terminal's camera, so the picture captured by the first terminal 501 may also contain an image of the second user's hand holding the first user's face picture. In this case, the negative sample can correspondingly contain an image of a hand, so the image of a hand can be used as the negative sample template, and the positive sample can be nested in the image of the hand to simulate a hand holding the positive sample.
As another example, as shown on the left of FIG. 6, the first user's face picture 601 may also be placed in a photo frame 602 and held by the second user in front of the first terminal for verification. In this case, the picture captured by the first terminal contains the photo frame, so the image of a photo frame can also serve as a negative sample template for synthesizing negative samples. Alternatively, as shown on the right of FIG. 6, the second user may simply place the first user's face picture 603 on a flat surface and photograph it with the first terminal. In this case, because of ambient light, a halo 604 may appear around the face image in the picture captured by the first terminal, so a halo image can also serve as a negative sample template for synthesizing negative samples.
In a specific implementation, images of terminals commonly found on the market, images of hands in various poses, images of photo frames of various styles, halo images, and the like can be obtained in advance as negative sample templates and added to a negative sample template library. During synthesis of a negative sample, one negative sample template can be selected from the library. A negative sample template may take the form of a picture, such as a picture of a terminal. A negative sample template may also take the form of a video, in which case every frame of the negative sample template may have the same content, for example a picture of the terminal. A negative sample template may also be the object itself, such as a terminal, a hand, or a photo frame.
Specifically, the negative sample template may be selected at random or in a certain order; the embodiments of this application place no restriction on this.
In this embodiment, after the positive sample and the negative sample template are obtained, the positive sample can be nested in the negative sample template to obtain an intermediate sample that simulates the positive sample being displayed in the display region of the negative sample template.
Specifically, when the negative sample template is an image of a terminal with a display function, the display region is the display screen region of the terminal; when the negative sample template is an image of a photo frame, the display region is the inner-frame region of the photo frame.
In this embodiment, so that the positive sample fits the size of the display region in the negative sample template, the positive sample can first be preprocessed according to the size of the terminal's display region before the nesting operation, making the intermediate sample obtained after nesting look more realistic.
The following description takes a face picture as the positive sample and a mobile phone as the terminal.
Referring to FIG. 7, after a face picture 701 serving as the positive sample is obtained, the face picture can undergo image preprocessing such as cropping and/or scale transformation according to the size of the display screen 703 in a phone image 702 serving as the negative sample template, yielding a preprocessed face picture 704. Scale transformation means zooming the face picture to change its size. The phone shown in FIG. 7 is a black Apple iPhone; of course, it may also be a phone of another color or another brand.
After the face picture is preprocessed, the preprocessed face picture 704 can be nested in the phone image 702 to obtain a first intermediate sample 705, as shown in FIG. 7: the preprocessed face picture 704 is placed in the display screen region 703 of the phone picture, simulating the effect of the phone displaying the face picture.
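For illustration only (not part of the original disclosure), the nesting step can be sketched as pasting the size-matched face into the template's display region; the toy array sizes, the (top, left) screen origin, and the helper name are assumptions:

```python
import numpy as np

def nest_into_template(template, face, top, left):
    """Return a copy of the template with the (already resized) face pasted
    into its display region, whose top-left corner is (top, left)."""
    out = template.copy()
    h, w = face.shape[:2]
    out[top:top + h, left:left + w] = face
    return out

# toy 10x8 "phone" template with a 6x4 screen region starting at (2, 2)
template = np.zeros((10, 8, 3), dtype=np.uint8)
face = np.full((6, 4, 3), 200, dtype=np.uint8)
intermediate = nest_into_template(template, face, 2, 2)
print(intermediate[2, 2, 0], intermediate[0, 0, 0])  # 200 0
```

The template itself is left untouched so the same template can be reused for many positive samples.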
In this embodiment, when the positive sample is a video, the negative sample template may be a picture or a video, for example a video containing only a terminal, a hand, a photo frame, a halo, or the like. Likewise, a video-form positive sample can be preprocessed and then nested in the negative sample template. For example, the aspect ratio of the video can be adjusted to fit the size of the phone's display screen, and the adjusted video can then be nested in the phone to simulate the video being played on the phone.
Step 303: Select a reflection picture and preprocess it according to the size of the intermediate sample.
In this embodiment, consider that when the second user performs face authentication with the first user's face picture or video on a terminal, or holds a printed picture of the first user's face, light reflection inevitably makes the background of the scene form a reflection on the terminal or on the printed face picture. The technical solution of this embodiment therefore also simulates reflection.
Specifically, because users may perform face authentication in many kinds of scenes, the objects a face picture can reflect are numerous. In an outdoor scene, a reflected object may, for example, be a building or a logo on a building; in an indoor scene, it may be an indoor fixture such as wallpaper, an air conditioner, or a television. Pictures of objects in multiple indoor scenes and multiple outdoor scenes can be collected and added to a reflection picture library as reflection pictures, and a reflection picture can then be selected from the library while synthesizing a negative sample. The selection may be random or in a certain order; the embodiments of this application place no restriction on this.
Because the size of a collected reflection picture usually does not directly match the intermediate sample, the reflection picture also needs to be preprocessed so that its size matches the size of the intermediate sample. For example, when the negative sample template is a picture of a terminal, the reflection picture's size is made to match the outline of the terminal in the terminal picture; when the negative sample template is a picture of a photo frame, the reflection picture's size is made to match the frame border in the photo frame picture.
Continuing the example in which the negative sample template is a phone picture, the reflection picture can be cropped so that the cropped reflection picture matches the outline dimensions of the phone in the phone picture; or the reflection picture can be scaled, horizontally or vertically, so that the scaled reflection picture matches the phone's outline dimensions. In a real scene, a certain spatial relationship may exist between the reflected object and the phone's display screen; with this in mind, in addition to the above operations the reflection picture can also be geometrically deformed so that it appears more realistic in the intermediate sample.
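As an illustrative sketch (not part of the original disclosure), the crop-or-scale preprocessing of the reflection picture can be written as follows; the centre-crop policy, nearest-neighbour scaling, and helper name are assumptions:

```python
import numpy as np

def fit_reflection(reflection, out_h, out_w):
    """Centre-crop when the reflection is larger than the terminal outline,
    otherwise rescale it (nearest neighbour) to the outline size."""
    h, w = reflection.shape[:2]
    if h >= out_h and w >= out_w:
        t, l = (h - out_h) // 2, (w - out_w) // 2
        return reflection[t:t + out_h, l:l + out_w]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return reflection[rows][:, cols]

big = np.zeros((20, 30, 3), dtype=np.uint8)
small = np.zeros((5, 5, 3), dtype=np.uint8)
print(fit_reflection(big, 10, 8).shape, fit_reflection(small, 10, 8).shape)
# (10, 8, 3) (10, 8, 3)
```

A geometric deformation of the reflection, as mentioned above, could be applied on top of this with the same warp used later in step 305.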
Step 304: Composite the intermediate sample, as the foreground, with the preprocessed reflection picture, so that the intermediate sample simulates the reflection picture being reflected in it.
In this embodiment, after the reflection picture is preprocessed, the intermediate sample, as the foreground picture, can be composited with the preprocessed reflection picture, so that the intermediate sample simulates the reflection picture being reflected in it.
Specifically, when the intermediate sample is a picture, it can be composited directly with the preprocessed reflection picture, and the composited intermediate sample is likewise a picture. When the intermediate sample is a video, every frame of the video can be composited with the reflection picture, and the composited intermediate sample is likewise a video. The preprocessed reflection pictures composited with different frames may be the same or different. For example, the preprocessed reflection picture composited with the first frame may be a first part of the original reflection picture, and the one composited with the second frame may be a second part; the first part and the second part differ, but they may or may not intersect.
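For illustration only (not part of the original disclosure), the per-frame case can be sketched by blending each frame with a different horizontal crop of the reflection picture, so that neighbouring crops overlap; the fixed weight, the stride, and the helper name are all assumptions:

```python
import numpy as np

def composite_video(frames, reflection, a=0.1, stride=3):
    """Blend each frame with a sliding horizontal crop of the reflection image;
    crops for adjacent frames differ but may intersect."""
    h, w = frames[0].shape[:2]
    out = []
    for i, frame in enumerate(frames):
        left = (i * stride) % (reflection.shape[1] - w + 1)
        crop = reflection[:h, left:left + w]
        blended = (1 - a) * frame.astype(np.float64) + a * crop.astype(np.float64)
        out.append(blended.astype(np.uint8))
    return out

frames = [np.full((4, 4, 3), 100, dtype=np.uint8) for _ in range(3)]
reflection = np.full((4, 16, 3), 200, dtype=np.uint8)
result = composite_video(frames, reflection)
print(result[0][0, 0, 0])  # 110
```

The weighted blend here anticipates the formula given for step 304 below; the sliding crop is one simple way to realize "different parts of the original reflection picture" per frame.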
Specifically, the reflection effect formed by compositing a reflection picture onto the terminal's display screen in a terminal picture is usually faint; that is, in the composited intermediate sample, the dominant visual content is still that of the pre-composition intermediate sample, while the content of the reflection picture has a weaker visual effect. Referring to FIG. 8, which takes a face picture as the positive sample, FIG. 8 is a schematic diagram of compositing an intermediate sample (for example, the first intermediate sample 705 obtained as in FIG. 7) with a pre-composition reflection picture 801. As FIG. 8 shows, in the composited intermediate sample (called a second intermediate sample 802), the content of the pre-composition intermediate sample (first intermediate sample 705) still dominates, while the building 803 in the reflection picture, although perceptible, produces a weaker perceptual effect.
To achieve the effect of the composited intermediate sample (second intermediate sample 802) shown in FIG. 8, a weight can be set for the reflection picture 801 during compositing; correspondingly, the weight of the pre-composition intermediate sample (first intermediate sample 705) is the complement of the weight of the reflection picture 801, and the pre-composition intermediate sample and the reflection picture can then be composited according to their respective weights.
Specifically, the weight of the pre-composition intermediate sample (first intermediate sample) is called the first weight value, and the weight of the reflection picture is called the second weight value; the first weight value and the second weight value are complements of each other, the first weight value is greater than a preset weight threshold, and the second weight value is less than or equal to the preset weight threshold. For example, when the second weight value is 0.1, the first weight value is 0.9 (that is, 1 - 0.1). The preset weight threshold may be a value set from experience or derived from specific experiments; illustratively, the preset weight threshold may be 0.2.
When the first intermediate sample is composited with the preprocessed reflection picture, the result can be computed by the following formula:
S = (1 - a) * I + a * R
where S denotes the composited intermediate sample (second intermediate sample), I denotes the pre-composition intermediate sample (first intermediate sample), R denotes the preprocessed reflection picture, and a is the second weight value. The value of a may be selected at random from values that satisfy the requirement; of course, the value of a may also be fixed.
The compositing process based on the above formula multiplies the pixel value of each pixel of the pre-composition intermediate sample (first intermediate sample) and of the preprocessed reflection picture by its respective weight and then sums the results.
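The formula above can be sketched directly in numpy (for illustration only, not part of the original disclosure); the 0.2 threshold is the illustrative value named in the description, and drawing a at random from values at or below it is one of the two options the text allows:

```python
import numpy as np

WEIGHT_THRESHOLD = 0.2  # illustrative preset weight threshold from the description

def blend_reflection(intermediate, reflection, rng=None):
    """S = (1 - a) * I + a * R, with the reflection weight a drawn at random
    from values less than or equal to the preset threshold."""
    if rng is None:
        rng = np.random.default_rng(0)
    a = rng.uniform(0.0, WEIGHT_THRESHOLD)
    s = (1 - a) * intermediate.astype(np.float64) + a * reflection.astype(np.float64)
    return s.astype(np.uint8), a

i_img = np.full((2, 2, 3), 100, dtype=np.uint8)
r_img = np.full((2, 2, 3), 200, dtype=np.uint8)
s_img, a = blend_reflection(i_img, r_img)
```

With I = 100 and R = 200 everywhere, the result lies between 100 (a = 0) and 120 (a = 0.2), so the intermediate sample dominates as intended.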
Step 305: Apply at least one geometric deformation to the intermediate sample to obtain a geometrically deformed intermediate sample.
In an actual scene, such as the one shown in FIG. 4, it is difficult to keep the first terminal and the second terminal perfectly parallel; that is, some spatial relationship more or less always exists between them. Likewise, in the scene shown in FIG. 5, the face picture held by the second user can hardly stay perfectly parallel to the first terminal. Therefore, to make the final negative sample more realistic, the intermediate sample can also undergo one or more geometric deformations to obtain a geometrically deformed intermediate sample before the subsequent compositing process.
It should be stated here that the intermediate sample before geometric deformation may be the composited intermediate sample (second intermediate sample) obtained in step 304, or the intermediate sample (first intermediate sample) obtained in step 302.
The geometric deformation may be a perspective transformation or an affine transformation; of course, it may also be another type of transformation, which the embodiments of this application do not restrict.
The following description takes a perspective transformation as the geometric deformation.
Specifically, the transformation parameters of the perspective transformation may be fixed, or may be selected at random for each perspective transformation; the transformation parameters may include, for example, a rotation angle or a stretch ratio.
Referring to FIG. 9, applying a perspective transformation to the second intermediate sample 802 yields a geometrically deformed third intermediate sample 901. The third intermediate sample 901 includes a mask 904 indicating the position of the pre-deformation intermediate sample within the deformed intermediate sample; that is, the image region occupied by the phone in the third intermediate sample 901 shown in FIG. 9 is the mask 904. Only pixels inside the mask 904 have values; pixels outside the mask 904 have none, which appears as the pure black region 905 shown in FIG. 9. The mask 904 helps fuse the geometrically deformed intermediate sample with the scene sample, which will be described later and is not elaborated here.
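For illustration only (not part of the original disclosure), a perspective warp that also produces the mask can be sketched by inverse-mapping every output pixel through a homography; the nearest-neighbour sampling and the helper name are assumptions, and a library such as OpenCV would normally be used instead:

```python
import numpy as np

def warp_with_mask(img, h_inv, out_h, out_w):
    """Inverse-map each output pixel through the homography h_inv; output
    pixels that map outside the source image stay black and get mask = 0."""
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    pts = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1).astype(np.float64)
    src = h_inv @ pts
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    mask = valid.reshape(out_h, out_w)
    out = np.zeros((out_h, out_w) + img.shape[2:], dtype=img.dtype)
    out[mask] = img[sy[valid], sx[valid]]
    return out, mask.astype(np.uint8)

# identity homography leaves the sample unchanged and the mask fully set
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
warped, mask = warp_with_mask(img, np.eye(3), 2, 3)
```

A non-identity homography shrinks or tilts the phone region, and the zero-valued pixels outside the mask correspond to the pure black region 905 of FIG. 9.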
Step 306: Fuse the intermediate sample into a scene sample to obtain a negative sample.
In this embodiment, when a picture or video is captured during face authentication, the surrounding environment is unavoidably captured as well. Therefore, to make the final negative sample fit the real situation better, the intermediate sample can be fused into a scene to finally obtain a negative sample. That is, for a selected scene sample, the intermediate sample can be fused into the scene sample to obtain a negative sample required for machine-learning-based face recognition.
It should be stated that the intermediate sample referred to here may be the intermediate sample obtained in step 302 (first intermediate sample), the composited intermediate sample obtained in step 304 (second intermediate sample), or the intermediate sample obtained in step 305 (third intermediate sample). In other words, in this embodiment steps 303 to 305 are not mandatory; in a specific implementation, some or all of steps 303 to 305 can be flexibly chosen for execution according to actual needs.
Specifically, because users may perform face authentication in many kinds of scenes, scenes can broadly be divided into indoor scenes and outdoor scenes. In a specific implementation, pictures of multiple indoor scenes and multiple outdoor scenes can be collected and added to a scene sample library, and a scene sample can then be selected from the library while synthesizing a negative sample. The selection may be random or in a certain order; the embodiments of this application place no restriction on this. A scene sample may be a scene photo or a scene video. In addition to the indoor and outdoor scene pictures described above, a scene photo may also be a solid-color background, such as a white wall or a blue sky.
According to an embodiment of this application, the scene sample can also be preprocessed so that its size matches the size of the intermediate sample.
When the intermediate sample is fused with the scene sample, the result can be computed by the following formula:
F = S' * M + B * (1 - M)
where F denotes the final negative sample, S' denotes the intermediate sample, M denotes the mask, and B denotes the preprocessed scene sample.
For example, as shown in FIG. 9, taking the intermediate sample obtained in step 305, namely the geometrically deformed intermediate sample (third intermediate sample 901), as an example, fusing the third intermediate sample 901 with a scene sample (scene picture 902) essentially replaces the region of the preprocessed scene sample where the mask 904 lies with the content of the intermediate sample inside the mask 904, while the region of the preprocessed scene sample outside the mask 904 remains unchanged, yielding a negative sample 903.
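The fusion formula above can be sketched directly (for illustration only, not part of the original disclosure); the toy mask region and array sizes are assumptions:

```python
import numpy as np

def fuse_into_scene(deformed, mask, scene):
    """F = S' * M + B * (1 - M): keep the intermediate sample inside the mask
    and the (size-matched) scene sample everywhere else."""
    m = mask.astype(bool)
    out = scene.copy()
    out[m] = deformed[m]
    return out

deformed = np.full((4, 4, 3), 50, dtype=np.uint8)   # deformed intermediate sample
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1                                   # region occupied by the phone
scene = np.full((4, 4, 3), 255, dtype=np.uint8)      # preprocessed scene sample
negative = fuse_into_scene(deformed, mask, scene)
print(negative[1, 1, 0], negative[0, 0, 0])  # 50 255
```

Because M is binary, multiplying by M and 1 - M is equivalent to the boolean-index replacement used here.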
In this embodiment, when the positive sample is a picture, the resulting negative sample is correspondingly a picture; when the positive sample is a video, the resulting negative sample is correspondingly a video.
In this embodiment, after the negative sample is obtained, it can be added to the negative sample library for the training of a face recognition model. Accordingly, an embodiment of this application further provides a method for training a face recognition model, in which the model can be trained with existing positive samples together with negative samples obtained by the method for generating negative samples for face recognition of the embodiments of this application, to obtain the final face recognition model. The embodiments of this application are not limited to any model type; the model may, for example, be a neural network model, a genetic algorithm model, or another possible model.
Correspondingly, an embodiment of this application further provides a face authentication method, in which face liveness authentication can be performed with a face recognition model trained by the above method for training a face recognition model. The method is applicable to a variety of application scenarios, including but not limited to those shown in FIG. 1A and FIG. 1B.
In the embodiments of this application, the negative samples required for machine-learning-based face recognition can be obtained by simulating real-life scenarios in which negative samples are used to mount attacks. Many negative samples can thus be generated from positive samples, effectively resolving the technical problem that attacks on face recognition are rare in real life and negative samples are too few, and in turn improving the performance of the trained face recognition model. Moreover, many random factors can be introduced in the above steps: the negative sample template, the reflection picture, and the scene sample can each be selected at random, the weight of the reflection picture can be chosen at random, the transformation parameters of the perspective transformation can be chosen at random, and so on. In theory, an unlimited number of negative samples can therefore be produced, greatly improving the model's performance.
Referring to FIG. 10, based on the same inventive concept, an embodiment of this application further provides an apparatus for generating negative samples for face recognition. The apparatus may include:
an obtaining unit 1001, configured to obtain a positive sample from a training sample library required for machine-learning-based face recognition;
a nesting unit 1002, configured to nest, for a selected negative sample template, the obtained positive sample into the negative sample template to obtain an intermediate sample simulating the positive sample being displayed in the display region of the negative sample template; and
a scene fusion unit 1003, configured to fuse, for a selected scene sample, the intermediate sample into the scene sample to obtain a negative sample required for machine-learning-based face recognition.
According to an embodiment of this application, the nesting unit 1002 may specifically be configured to:
preprocess the positive sample so that the preprocessed positive sample fits the size of the display region of the negative sample template; and
nest the preprocessed positive sample into the display region of the negative sample template.
According to an embodiment of this application, the apparatus may further include a reflection compositing unit 1004.
The reflection compositing unit 1004 is configured to preprocess a selected reflection picture based on the size of the intermediate sample, and to process the intermediate sample as follows: compositing the intermediate sample, as the foreground, with the preprocessed reflection picture, so as to simulate the reflection picture being reflected in the intermediate sample.
According to an embodiment of this application, the reflection compositing unit 1004 may specifically be configured to:
composite the intermediate sample with the preprocessed reflection picture according to a first weight value of the intermediate sample and a second weight value of the reflection picture, where the first weight value is greater than a preset weight threshold and the second weight value is less than or equal to the preset weight threshold.
According to an embodiment of this application, the scene fusion unit 1003 may further specifically be configured to:
apply at least one geometric deformation to the intermediate sample, the geometrically deformed intermediate sample including a mask indicating the position of the pre-deformation intermediate sample within the deformed intermediate sample; and
fuse the geometrically deformed intermediate sample into the scene sample according to the mask.
According to an embodiment of this application, the negative sample template is, for example, a template of a terminal with a display function, and the display region is the display screen region of the terminal; and/or the scene sample includes a scene photo or a scene video.
The apparatus can be used to perform the methods provided by the embodiments shown in FIG. 3 to FIG. 8; therefore, for the functions that the functional modules of the apparatus can implement, refer to the descriptions of the embodiments shown in FIG. 3 to FIG. 8, which are not repeated here. Although the reflection compositing unit 1004 is also shown in FIG. 10, it should be understood that the reflection compositing unit 1004 is not a mandatory functional unit, and it is therefore drawn with dashed lines in FIG. 10.
Referring to FIG. 11, based on the same technical concept, an embodiment of this application further provides a computer device, which may include a memory 1101 and a processor 1102.
The memory 1101 is configured to store a computer program executed by the processor 1102. The memory 1101 may mainly include a program storage area and a data storage area; the program storage area may store an operating system, application programs required by at least one function, and the like, and the data storage area may store data created through the use of the computer device, and the like. The processor 1102 may be a central processing unit (CPU), a digital processing unit, or the like. The embodiments of this application do not limit the specific connection medium between the memory 1101 and the processor 1102. In FIG. 11 of the embodiments of this application, the memory 1101 and the processor 1102 are connected by a bus 1103, which is drawn as a thick line in FIG. 11; the way other components are connected is only illustrative and not limiting. The bus 1103 may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in FIG. 11, but this does not mean there is only one bus or one type of bus.
The memory 1101 may be a volatile memory, such as a random-access memory (RAM); or the memory 1101 may be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 1101 may be any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1101 may also be a combination of the above memories.
The processor 1102 is configured to, when invoking the computer program stored in the memory 1101, perform the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method provided by the embodiments shown in FIG. 3 to FIG. 8.
An embodiment of this application further provides a computer storage medium storing the computer-executable instructions required for execution by the above processor, which contains a program for execution by the above processor.
In some possible implementations, aspects of the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method provided by this application may also be implemented in the form of a program product including program code. When the program product runs on a computer device, the program code causes the computer device to perform the steps of the methods according to the various exemplary implementations of this application described above in this specification; for example, the computer device can perform the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method provided by the embodiments shown in FIG. 3 to FIG. 8.
The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The program product for the method for generating negative samples for face recognition, the method for training a face recognition model, and the face liveness authentication method of the implementations of this application may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a computing device. However, the program product of this application is not limited thereto; in this document, a readable storage medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
Program code contained on a readable medium may be transmitted by any appropriate medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Program code for performing the operations of this application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a standalone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. In cases involving a remote computing device, the remote computing device may connect to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may connect to an external computing device (for example, through the Internet using an Internet service provider).
It should be noted that although several units or subunits of the apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to the implementations of this application, the features and functions of two or more units described above may be embodied in one unit; conversely, the features and functions of one unit described above may be further divided and embodied by multiple units.
Furthermore, although the operations of the methods of this application are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the depicted operations must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be merged into one step for execution, and/or one step may be decomposed into multiple steps for execution.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of this application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications falling within the scope of this application.
Those skilled in the art can make various changes and variations to this application without departing from the spirit and scope of this application. Thus, if these modifications and variations of this application fall within the scope of the claims of this application and their technical equivalents, this application is also intended to cover these changes and variations.

Claims (10)

  1. A method for generating negative samples for face recognition, wherein the method comprises:
    obtaining a positive sample from a training sample library required for machine-learning-based face recognition;
    nesting, for a selected negative sample template, the obtained positive sample into the negative sample template to obtain an intermediate sample simulating the positive sample being displayed in a display region of the negative sample template; and
    fusing, for a selected scene sample, the intermediate sample into the scene sample to obtain a negative sample required for machine-learning-based face recognition.
  2. The method according to claim 1, wherein the nesting, for a selected negative sample template, the obtained positive sample into the negative sample template comprises:
    preprocessing the positive sample so that the preprocessed positive sample fits the size of the display region of the negative sample template; and
    nesting the preprocessed positive sample into the display region of the negative sample template.
  3. The method according to claim 1, wherein
    before the fusing, for a selected scene sample, the intermediate sample into the scene sample to obtain a negative sample required for machine-learning-based face recognition, the method further comprises:
    preprocessing a selected reflection picture based on the size of the intermediate sample; and
    processing the intermediate sample as follows: compositing the intermediate sample, as a foreground, with the preprocessed reflection picture, so as to simulate the reflection picture being reflected in the intermediate sample.
  4. The method according to claim 3, wherein the compositing the intermediate sample, as a foreground, with the preprocessed reflection picture comprises:
    compositing the intermediate sample with the preprocessed reflection picture according to a first weight value of the intermediate sample and a second weight value of the reflection picture, wherein the first weight value is greater than a preset weight threshold and the second weight value is less than or equal to the preset weight threshold.
  5. The method according to claim 3, wherein the fusing the intermediate sample into the scene sample specifically comprises:
    applying at least one geometric deformation to the intermediate sample, the geometrically deformed intermediate sample comprising a mask indicating the position of the pre-deformation intermediate sample within the deformed intermediate sample; and
    fusing the geometrically deformed intermediate sample into the scene sample according to the mask.
  6. The method according to any one of claims 1 to 5, wherein the negative sample template is a template of a terminal with a display function, and the display region is a display screen region of the terminal; and/or
    the scene sample comprises a scene photo or a scene video.
  7. A method for training a face recognition model, comprising: training a face recognition model with positive samples and negative samples, wherein the negative samples comprise negative samples obtained by the method according to any one of claims 1 to 5.
  8. An apparatus for generating negative samples for face recognition, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a positive sample from a training sample library required for machine-learning-based face recognition;
    a nesting unit, configured to nest, for a selected negative sample template, the obtained positive sample into the negative sample template to obtain an intermediate sample simulating the positive sample being displayed in a display region of the negative sample template; and
    a scene fusion unit, configured to fuse, for a selected scene sample, the intermediate sample into the scene sample to obtain a negative sample required for machine-learning-based face recognition.
  9. A computer device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1 to 5.
  10. A computer storage medium, wherein
    the computer storage medium stores computer-executable instructions for causing a computer to perform the method according to any one of claims 1 to 5.
PCT/CN2019/093273 2018-08-02 2019-06-27 Method and apparatus for generating negative sample of face recognition, and computer device WO2020024737A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19845530.5A EP3751450A4 (en) 2018-08-02 2019-06-27 METHOD AND DEVICE FOR GENERATING A NEGATIVE SAMPLE FACIAL RECOGNITION AND COMPUTER DEVICE
US17/016,162 US11302118B2 (en) 2018-08-02 2020-09-09 Method and apparatus for generating negative sample of face recognition, and computer device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810869295.4A CN110163053B (zh) 2018-08-02 2018-08-02 Method and apparatus for generating negative sample of face recognition, and computer device
CN201810869295.4 2018-08-02

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/016,162 Continuation US11302118B2 (en) 2018-08-02 2020-09-09 Method and apparatus for generating negative sample of face recognition, and computer device

Publications (1)

Publication Number Publication Date
WO2020024737A1 true WO2020024737A1 (zh) 2020-02-06

Family

ID=67645051

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/093273 WO2020024737A1 (zh) 2018-08-02 2019-06-27 Method and apparatus for generating negative sample of face recognition, and computer device

Country Status (4)

Country Link
US (1) US11302118B2 (zh)
EP (1) EP3751450A4 (zh)
CN (1) CN110163053B (zh)
WO (1) WO2020024737A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3100070B1 (fr) * 2019-08-21 2021-09-24 Idemia Identity & Security France Biometric recognition method with drift control and associated installation
CN111008564B (zh) * 2019-11-01 2023-05-09 南京航空航天大学 Non-cooperative face image recognition method and system
CN110991242B (zh) * 2019-11-01 2023-02-21 武汉纺织大学 Deep-learning smoke recognition method with negative sample mining
CN112819173A (zh) * 2019-11-18 2021-05-18 上海光启智城网络科技有限公司 Sample generation method and apparatus, and computer-readable storage medium
JP2021135552A (ja) * 2020-02-21 2021-09-13 株式会社ビットキー Usage management device, usage management method, and program
CN111444490A (zh) * 2020-03-24 2020-07-24 中国南方电网有限责任公司 Identity recognition method and apparatus, computer device, and storage medium
CN113111966B (zh) * 2021-04-29 2022-04-26 北京九章云极科技有限公司 Image processing method and image processing system
CN113537374B (zh) * 2021-07-26 2023-09-08 百度在线网络技术(北京)有限公司 Adversarial sample generation method
CN113673541B (zh) * 2021-10-21 2022-02-11 广州微林软件有限公司 Image sample generation method for object detection and application thereof

Citations (5)

Publication number Priority date Publication date Assignee Title
US20050220336A1 (en) * 2004-03-26 2005-10-06 Kohtaro Sabe Information processing apparatus and method, recording medium, and program
CN103679158A (zh) * 2013-12-31 2014-03-26 北京天诚盛业科技有限公司 Face authentication method and apparatus
CN106503617A (zh) * 2016-09-21 2017-03-15 北京小米移动软件有限公司 Model training method and apparatus
CN107798390A (zh) * 2017-11-22 2018-03-13 阿里巴巴集团控股有限公司 Machine learning model training method and apparatus, and electronic device
CN108229555A (zh) * 2017-12-29 2018-06-29 深圳云天励飞技术有限公司 Sample weight assignment method, model training method, electronic device, and storage medium

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
CN103824055B (zh) 2014-02-17 2018-03-02 北京旷视科技有限公司 Face recognition method based on cascaded neural networks
CN105096354A (zh) * 2014-05-05 2015-11-25 腾讯科技(深圳)有限公司 Image processing method and apparatus
CN108229326A (zh) * 2017-03-16 2018-06-29 北京市商汤科技开发有限公司 Face anti-spoofing detection method and system, electronic device, program, and medium
CN107609462A (zh) * 2017-07-20 2018-01-19 北京百度网讯科技有限公司 Method, apparatus, device, and storage medium for generating information to be detected and for liveness detection
CN108229344A (zh) * 2017-12-19 2018-06-29 深圳市商汤科技有限公司 Image processing method and apparatus, electronic device, computer program, and storage medium
CN107958236B (zh) * 2017-12-28 2021-03-19 深圳市金立通信设备有限公司 Method and terminal for generating face recognition sample images
CN108197279B (zh) * 2018-01-09 2020-08-07 北京旷视科技有限公司 Attack data generation method, apparatus, system, and computer-readable storage medium


Non-Patent Citations (1)

Title
See also references of EP3751450A4 *

Also Published As

Publication number Publication date
CN110163053A (zh) 2019-08-23
US11302118B2 (en) 2022-04-12
CN110163053B (zh) 2021-07-13
EP3751450A1 (en) 2020-12-16
US20200410266A1 (en) 2020-12-31
EP3751450A4 (en) 2021-04-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19845530; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2019845530; Country of ref document: EP; Effective date: 20200909)
NENP Non-entry into the national phase (Ref country code: DE)