CN115082990A - Living body detection method and device for human face - Google Patents

Living body detection method and device for human face

Info

Publication number
CN115082990A
CN115082990A (Application CN202210741588.0A)
Authority
CN
China
Prior art keywords
face
image
weight
target
living body
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210741588.0A
Other languages
Chinese (zh)
Inventor
周军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202210741588.0A
Publication of CN115082990A
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31User authentication
    • G06F21/32User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V10/431Frequency domain transformation; Autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of image recognition, and provides a living body detection method and a living body detection device for a human face. The method comprises the following steps: extracting, from an acquired face image to be detected, a target area corresponding to the target part specified by a living body verification instruction; weighting each first pixel value of the target area by a first weight and each second pixel value of the remaining area of the face image to be detected by a second weight to determine a target face image; and inputting the target face image into a trained face recognition model to determine a detection result. The remaining area is the area of the face image to be detected other than the target area, and the first weight is greater than the second weight. The living body detection method for the human face can effectively defend against attacks using local face stickers and improve the accuracy of living body detection of the face.

Description

Living body detection method and device for human face
Technical Field
The application relates to the technical field of image recognition, in particular to a living body detection method and a living body detection device for a human face.
Background
With the development of the mobile internet, identity verification through human faces is used in more and more scenarios, and the security problem of attackers passing face recognition with a dummy face has become increasingly common. To prevent an attacker from completing face recognition with a dummy face, living body verification is generally performed by requiring the user to perform combined actions such as blinking, opening the mouth, shaking the head and nodding.
However, in practical applications it has been found that if an attacker attaches to his or her own face a local face sticker of the impersonated person, covering for example the forehead, the nose or other areas that are not required to perform any action during the living body verification, the attacker can exploit the overall high similarity to pass the living body verification. The accuracy of living body detection of the face is therefore not ideal, and the security is poor.
Disclosure of Invention
The present application aims to solve at least one of the technical problems existing in the related art. To this end, the present application provides a living body detection method for a human face which can effectively defend against attacks using local face stickers and improve the accuracy of living body detection of the face.
The application also provides a living body detection device for the human face.
The application also provides an electronic device.
The present application also provides a computer-readable storage medium.
The living body detection method of a human face according to an embodiment of the first aspect of the application comprises the following steps:
extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
according to the first weight, weighting each first pixel value of the target area, and according to the second weight, weighting each second pixel value of the rest area in the face image to be detected, so as to determine a target face image;
inputting the target face image into a trained face recognition model, and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
According to the living body detection method of a human face, the target area specified by the living body verification instruction is determined from the face image to be detected, the first pixel values of the target area are given a high weight and the second pixel values of the remaining area of the face image to be detected are given a low weight, and after the weighted target face image is obtained, living body detection is performed by the trained face recognition model. In this way, the detection requirement on the local face area that performs the operation required by the living body detection is raised while the interference of the other parts of the face on the overall judgment is weakened, so that attacks using local face stickers can be effectively defended against and the accuracy of living body detection of the face is improved.
According to an embodiment of the present application, further comprising:
sampling a face image from a face video to obtain an initial face image;
and carrying out frequency domain conversion on the initial face image to obtain the face image to be detected.
According to an embodiment of the present application, the performing frequency domain conversion on the initial face image to obtain the face image to be detected includes:
carrying out spatial domain conversion on the initial face image to obtain a face spatial image;
carrying out Fourier transform and ambient light filtering on the face space image in sequence to obtain a face frequency domain image;
performing inverse Fourier transform on the face frequency domain image to obtain a face space image to be detected;
and performing channel superposition on the spatial image of the face to be detected and the frequency domain image of the face to obtain the image of the face to be detected.
According to an embodiment of the present application, the performing fourier transform and ambient light filtering on the face space image in sequence to obtain a face frequency domain image includes:
carrying out Fourier transform on the face space image to obtain an initial frequency domain image;
carrying out ambient light sampling on the initial frequency domain image to obtain a current ambient light sampling value;
determining the current filtering output value of the initial frequency domain image according to the current ambient light sampling value and the historical filtering output value of the historical face image in the frame adjacent to the initial face image in the face video;
and according to the current filtering output value, carrying out ambient light filtering on the initial frequency domain image to obtain a human face frequency domain image.
According to an embodiment of the present application, further comprising:
and performing differentiation adjustment on the initial weight of the target area and the initial weight of the residual area, and determining a first weight of the target area and a second weight of the residual area.
According to an embodiment of the present application, the differentially adjusting the initial weight of the target region and the initial weight of the remaining region to determine the first weight of the target region and the second weight of the remaining region includes:
according to a first preset multiple, expanding the initial weight of the target region to determine the first weight, and determining the initial weight of the remaining region as the second weight; or,
according to a second preset multiple, reducing the initial weight of the residual region to determine the second weight, and determining the initial weight of the target region as the first weight; or,
and according to a first preset multiple, expanding the initial weight of the target region to determine the first weight, and according to a second preset multiple, reducing the initial weight of the residual region to determine the second weight.
According to an embodiment of the present application, inputting a target face image into a trained face recognition model, and determining a detection result, includes:
inputting the target face image into a trained face recognition model to carry out depth separable convolution to obtain a characteristic image;
inputting the characteristic image into an attention module in the face recognition model, and determining the living body probability of the characteristic image;
and carrying out secondary classification on the living body probability according to a preset value, and determining a detection result.
According to the living body detection device of the human face in the embodiment of the second aspect of the application, the living body detection device comprises:
the target area extraction module is used for extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
the face image determining module is used for weighting each first pixel value of the target area according to the first weight, and weighting each second pixel value of the rest area in the face image to be detected according to the second weight to determine a target face image;
the human face living body detection module is used for inputting a target human face image into the trained human face recognition model and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
The electronic device according to the third aspect of the present application includes a processor and a memory storing a computer program, and the processor implements the living body detection method of a human face according to any one of the above embodiments when executing the computer program.
A computer-readable storage medium according to an embodiment of the fourth aspect of the present application, on which a computer program is stored, which, when executed by a processor, implements the method for detecting a living body of a human face according to any of the embodiments described above.
A computer program product according to an embodiment of the fifth aspect of the application comprises a computer program which, when executed by a processor, implements the living body detection method of a human face according to any of the above embodiments.
One or more technical solutions in the embodiments of the present application have at least one of the following technical effects:
the method comprises the steps of determining a target area formulated by a living body verification instruction from a face image to be detected, giving high weight to a first pixel value of the target area, giving low weight to a second pixel value of the rest area of the face image to be detected, weighting, carrying out living body detection through a trained face recognition model after the target face image is obtained, and accordingly improving the detection requirement of a local area for executing relevant operation of the living body detection in the face, weakening the interference of other parts of the face on overall judgment, effectively defending the attack of local face stickers, and improving the accuracy of the living body detection of the face.
Furthermore, the human face image to be detected is obtained by performing frequency domain conversion on the human face image, so that the anti-interference performance of the human face image to be detected on ambient light is improved, and the human face living body detection can be performed better in the follow-up process.
Furthermore, the channels of the face frequency domain image obtained after the frequency domain conversion and the face space image to be detected obtained by performing the inverse Fourier transform on the face frequency domain image are overlapped, so that more available features of the image can be explored from more dimensional visual angles, and the accuracy of the living body detection of the face is further improved.
Furthermore, the current ambient light sampling value and the historical filtering output value of the historical face image of the frame adjacent to the initial face image are weighted to obtain an effective filtering value, so that the output has a feedback effect on the input, and the ambient light separation effect of the obtained face frequency domain image is better.
Furthermore, the initial weights of the target area and the remaining area are adjusted in a differentiated manner so that the weight of the target area is higher than that of the remaining area. After the first weight and the second weight are determined, each first pixel value of the target area is weighted by the first weight and each second pixel value of the remaining area of the face image to be detected is weighted by the second weight, so that the first pixel values of the target area of the resulting target face image are larger than the pixel values of the remaining area, which further strengthens the detection proportion of the target area and weakens the detection proportion of the remaining area of the face.
Drawings
In order to more clearly illustrate the technical solutions in the present application or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a living body detection method for a human face according to an embodiment of the present application;
fig. 2 is a flowchart for further detailing the acquisition of a face image to be detected in the living body detection method of the face in fig. 1 in the embodiment of the present application;
FIG. 3 is a schematic flow chart of image frequency domain conversion in the embodiment of the present application;
FIG. 4 is a schematic diagram of a model structure of a face recognition model in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a living body detection apparatus for a human face according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Hereinafter, the living body detection method and apparatus for human face provided by the embodiments of the present application will be described and explained in detail through several specific embodiments.
In one embodiment, a living body detection method of a human face is provided, which is applied to a server to perform living body detection of the human face. The server may be an independent server or a server cluster composed of a plurality of servers, or may be a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (content delivery network), big data, and artificial intelligence platforms.
As shown in fig. 1, the living body detection method for a human face provided by this embodiment includes:
step 101, extracting a target area corresponding to a target part from an acquired human face image to be detected according to the target part specified by a living body verification instruction;
102, weighting each first pixel value of a target area according to a first weight, and weighting each second pixel value of the rest area in the face image to be detected according to a second weight to determine a target face image;
step 103, inputting the target face image into the trained face recognition model, and determining a detection result;
the remaining region is a region except the target region in the face image to be detected;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
The method determines the target area specified by the living body verification instruction from the face image to be detected, gives a high weight to the first pixel values of the target area and a low weight to the second pixel values of the remaining area, and, after the weighted target face image is obtained, performs living body detection with the trained face recognition model. The detection requirement on the local face area that performs the operation required by the living body detection is thereby raised, the interference of the other parts of the face on the overall judgment is weakened, attacks using local face stickers are effectively defended against, and the accuracy of living body detection of the face is improved.
In one embodiment, the multi-channel face image to be detected is extracted in advance from the acquired face video by a face detector such as MediaPipe. The face video may be a face reflection video. Specifically, a color sequence composed of multiple colors may be selected in advance, for example four colors randomly drawn from nine colors such as red, orange, yellow, green, cyan, blue, purple, black and white; the colors of the sequence are then displayed in turn by an illumination device, for example the screen of a terminal device, so that the emitted light is reflected off the face to obtain the face reflection video. A face image is then captured at a random moment of the face reflection video by the face detector, which reduces the possibility of the face image being stolen. After the face image is captured, the face image with RGB channels can be used as the face image to be detected.
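For ease of understanding, a minimal, non-limiting sketch of this frame-sampling and face-cropping step is given below. It assumes MediaPipe's face detection API and OpenCV for video decoding as one possible implementation; the random frame index, the confidence threshold and the function name sample_face_image are illustrative only and are not prescribed by this embodiment.

```python
# Illustrative sketch only: sample a random frame from a face (reflection) video
# and crop the detected face as an RGB image. MediaPipe is used here as one
# possible face detector; the embodiment does not prescribe a specific library.
import random
import cv2
import mediapipe as mp

def sample_face_image(video_path: str):
    cap = cv2.VideoCapture(video_path)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Randomly pick a frame so the captured image is harder to predict or replay.
    cap.set(cv2.CAP_PROP_POS_FRAMES, random.randrange(max(n_frames, 1)))
    ok, frame_bgr = cap.read()
    cap.release()
    if not ok:
        return None

    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    detector = mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5)
    result = detector.process(frame_rgb)
    if not result.detections:
        return None

    # Convert the first detection's relative bounding box to pixel coordinates.
    box = result.detections[0].location_data.relative_bounding_box
    h, w, _ = frame_rgb.shape
    x, y = max(0, int(box.xmin * w)), max(0, int(box.ymin * h))
    bw, bh = int(box.width * w), int(box.height * h)
    return frame_rgb[y:y + bh, x:x + bw]  # RGB face image used as the image to be detected
```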
In order to highlight local features of the face and further improve the accuracy of living body detection, in an embodiment, the RGB face image captured from the face video may be subjected to spatial domain conversion, for example converting the captured RGB face image into UV space coordinates to obtain a UV face space image as the face image to be detected. The actual size of the UV face space image may be determined according to the actual situation, for example 256 × 256 × 3.
The face image to be detected obtained in the above manner has poor immunity to interference from ambient light, which tends to affect the accuracy of living body detection of the face. In order to improve the robustness of the face image to be detected against ambient light and better perform living body detection of the face, in an embodiment, as shown in fig. 2, the method for acquiring the face image to be detected includes:
step 201, sampling a face image from a face video to obtain an initial face image;
step 202, performing frequency domain conversion on the initial face image to obtain a face image to be detected.
The initial face image is an RGB face image captured from the face video. After the RGB face image is obtained, it can be transferred to the frequency domain through a Fourier transform, the low frequency band occupied by the ambient light in the image frequency domain is analyzed, and a low-pass filter is used to filter the ambient light out of the RGB face image to obtain the face image to be detected.
The order of the low-pass filter can be selected according to actual conditions, and for example, the low-pass filter is a first-order low-pass filter.
Since more usable image features can be explored from additional dimensions to further improve the accuracy of living body detection of the face, in an embodiment, performing the frequency domain conversion on the initial face image to obtain the face image to be detected includes:
carrying out spatial domain conversion on the initial face image to obtain a face spatial image;
carrying out Fourier transform and ambient light filtering on the face space image in sequence to obtain a face frequency domain image;
performing inverse Fourier transform on the face frequency domain image to obtain a face space image to be detected;
and performing channel superposition on the face space image to be detected and the face frequency domain image to obtain the face image to be detected.
In an embodiment, as shown in fig. 3, after the initial face image is acquired from the face video, the initial face image is converted into UV space coordinates, and the face space image is then extracted from the UV-space face image according to a mask set in UV space and a pixel value weighting manner. The face space image is then transferred from the spatial domain to the frequency domain through a Fourier transform, and the low frequency band occupied by the ambient light in the image frequency domain is determined. The Fourier transform formula is:
F(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt
where F(ω) is the face frequency domain image function and f(t) is the face space image function.
After determining that the ambient light is located in a low frequency band of the image frequency domain, the ambient light is filtered out with a low-pass filter to obtain the face frequency domain image. An inverse Fourier transform is then performed on the face frequency domain image to recover it from the frequency domain to the spatial domain, and the resulting new spatial face image is used as the face space image to be detected. After the face space image to be detected is obtained, channel superposition is performed on the face space image to be detected and the face frequency domain image, thereby obtaining the face image to be detected.
The formula of the inverse Fourier transform is as follows:
f(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} F(\omega)\, e^{i\omega t}\, d\omega
where F(ω) is the face frequency domain image function and f(t) is the face space image function to be detected.
For example, the channel stacking mode may be sequential stacking, and if the channel of the spatial image of the face to be detected is an RGB channel and the frequency domain channel of the frequency domain image of the face is an F channel, the face image to be detected obtained after stacking is the face image to be detected of the RGBF channel.
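As a non-limiting illustration of the frequency domain conversion described above, the following numpy sketch applies a Fourier transform to each channel of the face crop (operating directly on the RGB crop for simplicity, without the UV-space step), suppresses the low-frequency band attributed to the ambient light with a Gaussian low-pass mask, performs the inverse Fourier transform, and stacks the result with a single frequency channel into an RGBF image. The Gaussian mask, the cutoff parameter sigma and the log-magnitude normalization are assumptions made for the sketch; the embodiment itself only requires that the ambient light be characterized as a low-frequency band and filtered out.

```python
# Minimal numpy sketch of the frequency-domain step, under assumptions: the
# ambient-light component is taken to be the low-frequency band of the spectrum
# isolated by a Gaussian low-pass mask and is suppressed before the inverse
# transform; the cutoff sigma and the RGBF stacking order are illustrative only.
import numpy as np

def frequency_domain_conversion(face_rgb: np.ndarray, sigma: float = 8.0) -> np.ndarray:
    face = face_rgb.astype(np.float32) / 255.0
    h, w, _ = face.shape
    yy, xx = np.mgrid[:h, :w]
    dist2 = (yy - h / 2) ** 2 + (xx - w / 2) ** 2
    lowpass = np.exp(-dist2 / (2 * sigma ** 2))      # Gaussian low-pass mask (ambient band)

    filtered_channels, freq_mag = [], np.zeros((h, w), np.float32)
    for c in range(3):
        spectrum = np.fft.fftshift(np.fft.fft2(face[..., c]))   # Fourier transform, DC at centre
        face_freq = spectrum * (1.0 - lowpass)                   # suppress the ambient-light band
        freq_mag += np.log1p(np.abs(face_freq)) / 3.0            # accumulate the "F" channel
        # Inverse Fourier transform recovers the ambient-filtered spatial channel.
        filtered_channels.append(np.abs(np.fft.ifft2(np.fft.ifftshift(face_freq))))

    face_space = np.dstack(filtered_channels)         # face space image to be detected (RGB)
    f_channel = freq_mag / (freq_mag.max() + 1e-8)    # face frequency domain image as one channel
    return np.dstack([face_space, f_channel]).astype(np.float32)  # RGBF face image to be detected
```

The four-channel RGBF output of this sketch corresponds to the multi-channel face image assumed by the model sketch later in this description.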
In order to achieve a better ambient light filtering effect, in an embodiment, after the Fourier transform is performed on the face space image to obtain the initial frequency domain image, ambient light sampling is performed on the initial frequency domain image, and the ambient light sampling value of the initial frequency domain image is extracted as the current ambient light sampling value. After the current ambient light sampling value is extracted, the current ambient light sampling value and the historical filtering output value of the historical face image in the frame adjacent to the initial face image in the face video are input into the low-pass filter, and the current filtering output value of the initial frequency domain image is determined through a preset filtering model Y(n) = αX(n) + (1-α)Y(n-1) in the low-pass filter, where Y(n) represents the current filtering output value of the initial frequency domain image, X(n) represents the current ambient light sampling value, Y(n-1) represents the historical filtering output value of the historical face image, and α represents the filtering coefficient.
And after the current filtering output value is determined, filtering the initial frequency domain image according to the current filtering output value to obtain a face frequency domain image. After the face frequency domain image is obtained, the face frequency domain image and the spatial image of the face to be detected can be subjected to channel superposition, so that the face image to be detected is obtained.
The current ambient light sampling value and the historical filtering output value of the historical face image of the frame which is adjacent to the initial face image are weighted to obtain an effective filtering value, so that the output has a feedback effect on the input, and the ambient light separation effect of the obtained face frequency domain image is better.
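A minimal sketch of this first-order recursive filter, assuming only the formula Y(n) = αX(n) + (1-α)Y(n-1) given above, is as follows; the filtering coefficient 0.3 and the handling of the first frame are illustrative.

```python
# Sketch of the first-order recursive filter Y(n) = a*X(n) + (1-a)*Y(n-1): the
# current ambient-light sample is blended with the previous frame's filter output,
# so the output feeds back into the next input. The coefficient 0.3 is illustrative.
def ambient_light_filter(current_sample, previous_output, alpha: float = 0.3):
    """Return the current filter output Y(n) for ambient-light sample X(n)."""
    if previous_output is None:          # first frame: no history to feed back yet
        return current_sample
    return alpha * current_sample + (1.0 - alpha) * previous_output
```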
In an embodiment, after the face image to be detected is acquired, a target area corresponding to a target part can be determined from the face image to be detected according to the target part specified by the living body verification instruction. The living body verification instruction is used for indicating a certain part of a human face to execute corresponding operation in the living body verification process, and the appointed part is a target part. If the living body verification instruction is to require the user to execute blinking operation, the target part is the eyes of the user; and if the living body verification instruction is to require the user to perform mouth opening operation, the target part is the mouth of the user. The living body verification instruction may be an instruction that requires a plurality of parts of the face of the user to perform corresponding operations, in addition to a single part, and the target part includes the eyes and the mouth of the user if the living body verification instruction requires the user to perform a blinking operation and a mouth opening operation. After the target part is determined, a corresponding target area can be extracted from the human face image to be detected through a detection tool such as a human face detector, and other areas of the human face image to be detected except the target area are marked as residual areas. If the target parts are eyes, a nose and a mouth, the target region extracted from the human face image to be detected comprises an eye image, a nose image and a mouth image, and the other regions except the eye image, the nose image and the mouth image in the human face image to be detected, such as the forehead, the eyebrows, the ears and the like, are the residual regions.
In an embodiment, after the target area and the residual area are determined, each first pixel value of the target area is multiplied by a first weight to be weighted, and each second pixel value of the residual area is multiplied by a second weight smaller than the first weight to be weighted to obtain the target face image.
The determining of the first weight and the second weight may be, after the target area is determined, performing differentiation adjustment on the initial weight of the target area and the initial weight of the remaining area to determine the first weight of the target area and the second weight of the remaining area.
It will be appreciated that the initial weights of the target region and the remaining region may be the same.
For example, the server may prestore a first preset weight and a second preset weight smaller than the first preset weight, and after the target region is determined, the first preset weight may be used as the first weight to replace the initial weight of the target region, and the second preset weight may be used as the second weight to replace the initial weight of the remaining region, thereby implementing the differential adjustment.
In an embodiment, the difference between the initial weights of the target region and the remaining region may be adjusted by expanding the initial weight of the target region according to a first preset multiple to determine a first weight, and determining the initial weight of the remaining region as a second weight; or, according to a second preset multiple, reducing the initial weight of the remaining region to determine a second weight, and determining the initial weight of the target region as the first weight; or, according to a first preset multiple, expanding the initial weight of the target region to determine a first weight, and according to a second preset multiple, reducing the initial weight of the remaining region to determine a second weight.
The first preset multiple and the second preset multiple can be determined according to actual conditions. In one embodiment, the differentiation adjustment may be achieved by adjusting only the initial weight of the target region or the remaining region. For example, if the first preset multiple is 1.2 times, the initial weight of the target region may be multiplied by 1.2 to obtain a first weight, and the initial weight of the remaining region may be directly determined as the second weight. Or, if the second preset multiple is 0.9 times, the initial weight of the target region may be directly used as the first weight, and the initial weight of the remaining region may be multiplied by 0.9 to obtain the second weight.
In an embodiment, the initial weights of the target region and the remaining region may be adjusted simultaneously to achieve the differential adjustment. For example, if the first preset multiple is 1.2 times and the second preset multiple is 0.9 times, the initial weight of the target region may be multiplied by 1.2 to obtain a first weight, and the initial weight of the remaining region may be multiplied by 0.9 to obtain a second weight.
The initial weight is adjusted in a differentiated mode, the weight of the target area is higher than that of the rest area, after the first weight and the second weight are determined, each first pixel value of the target area is weighted through the first weight, each second pixel value of the rest area in the face image to be detected is weighted according to the second weight, and therefore the first pixel value of the target area of the obtained target face image is larger than that of the rest area, the detection proportion of the target area is enhanced, and the detection proportion of the face rest area is weakened.
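The weighting step can be illustrated by the following sketch, in which the target region is assumed to be available as a boolean mask from the face detector and the 1.2x and 0.9x factors mirror the example multiples above; the initial weight of 1.0 is an assumption made for illustration.

```python
# Illustrative weighting step: pixels inside the target region (e.g. eyes/mouth for a
# blink or mouth-opening instruction) are scaled by the first weight and the remaining
# pixels by the smaller second weight. The target mask and the multiples are assumptions.
import numpy as np

def weight_face_image(face_image: np.ndarray,
                      target_mask: np.ndarray,
                      initial_weight: float = 1.0,
                      first_multiple: float = 1.2,
                      second_multiple: float = 0.9) -> np.ndarray:
    first_weight = initial_weight * first_multiple     # enlarged weight for the target region
    second_weight = initial_weight * second_multiple   # reduced weight for the remaining region
    weights = np.where(target_mask[..., None], first_weight, second_weight)
    return face_image * weights                         # target face image fed to the model
```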
After the target face image is determined, it can be input into the trained face recognition model to perform living body detection of the face. In an embodiment, the face recognition model is shown in fig. 4; the backbone network of the face recognition model may be DenseNet201, and an attention module is introduced into the backbone network, which strengthens the learning capability of the model and further improves its generalization performance. In the feature map (h × w × c) obtained from a deep convolutional neural network, each channel c is by default weighted equally; the attention module assigns different weighting parameters to the channels to represent the importance of the different channels c.
The ordinary convolution kernels in DenseNet201 can be replaced with depthwise separable convolutions, which reduces the model parameters and speeds up inference while preserving accuracy. A depthwise separable convolution is a convolution composed of a channel-by-channel convolution (depthwise convolution) followed by a point-by-point convolution (pointwise convolution). Compared with ordinary convolution, it reduces the parameter count and computation of the model while maintaining the model's accuracy.
Meanwhile, the face recognition model is constructed with a binary classification loss function; BCE (binary cross-entropy) can be used as the loss function, so that binary classification of the target face image is realized.
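The two building blocks named above, the depthwise separable convolution and the channel attention module, together with a BCE-trained binary head, can be sketched in PyTorch as follows. This is only an illustrative fragment, not the full DenseNet201-based model; the layer widths, the reduction ratio and the class names are assumptions.

```python
# Minimal PyTorch sketch of a depthwise separable convolution and an SE-style
# channel-attention block, plus a BCE-trained binary head. Not the full model;
# the channel counts and class names are placeholders for illustration.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)  # channel-by-channel conv
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)                          # point-by-point (1x1) conv

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class ChannelAttention(nn.Module):
    """Re-weights feature channels instead of treating them as equally important."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x).view(x.size(0), -1, 1, 1)   # one learned weight per channel c
        return x * w

class LivenessHead(nn.Module):
    def __init__(self, in_ch: int = 4, feat_ch: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            DepthwiseSeparableConv(in_ch, feat_ch), nn.ReLU(),
            ChannelAttention(feat_ch),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(feat_ch, 1)    # single logit for the two-class decision

    def forward(self, x):
        return self.classifier(self.features(x))  # train with nn.BCEWithLogitsLoss
```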
Besides the binary classification of the target face image, the face recognition model can also be trained to classify the reflection color of the face image.
In one embodiment, the face recognition model can be trained from a plurality of multi-channel face sample images. The multi-channel face sample images may include positive samples formed by normal face images and negative samples formed by cut face stickers. Specifically, a multi-channel face sample image is input into the pre-constructed face recognition model, and feature extraction is performed on the face sample image through the depthwise separable convolutions in the face recognition model to obtain a feature sample image. The feature sample image is then input into the attention module of the face recognition model to determine a loss value, the loss value is compared with a preset threshold, the loss function of the face recognition model is adjusted according to the comparison result, and the next multi-channel face sample image is input into the face recognition model for training. The training is finished when the loss value is smaller than the preset threshold, and the trained face recognition model is obtained.
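A single illustrative training step under these assumptions might look as follows; it reuses the LivenessHead sketch from the previous block, and the batch size, learning rate and random tensors are placeholders rather than the actual training configuration.

```python
# One illustrative training step: multi-channel (RGBF) face samples with label 1
# for live faces and 0 for cut face-sticker attacks, optimised with a binary
# cross-entropy loss. LivenessHead is the sketch class defined in the previous block.
import torch
import torch.nn as nn

model = LivenessHead(in_ch=4)                       # sketch model from the previous block
criterion = nn.BCEWithLogitsLoss()                  # two-class (live / attack) loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.randn(8, 4, 256, 256)                # placeholder batch of weighted RGBF images
labels = torch.tensor([1., 0., 1., 1., 0., 0., 1., 0.]).unsqueeze(1)

logits = model(images)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```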
In an embodiment, after the trained face recognition model is obtained, the trained face recognition model can be used to recognize the target face image, so as to realize the living body detection of the face. Specifically, a target face image is input into a trained face recognition model to be subjected to depth separable convolution, and a characteristic image is obtained; inputting the characteristic image into an attention module in a face recognition model, and determining the living body probability of the characteristic image; and performing secondary classification on the living body probability according to a preset value to determine a detection result.
In an embodiment, after a target face image is input into a trained face recognition model, a corresponding living body probability can be determined through a depth separable convolution and attention module of the face recognition model, then the living body probability is compared with a preset value, if the living body probability is smaller than or equal to the preset value, the target face image is judged to be a non-living body, and if not, the target face image is judged to be a living body.
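A corresponding inference sketch, with 0.5 assumed as the preset value, is given below; the trained model is expected to output a single logit that is converted to a living body probability by a sigmoid.

```python
# Inference sketch: the model output is turned into a living body probability and
# compared against a preset value (0.5 here is an illustrative threshold only).
import torch

def detect_liveness(model, target_face_image: torch.Tensor, preset: float = 0.5) -> bool:
    """target_face_image: weighted RGBF tensor of shape (4, H, W)."""
    model.eval()
    with torch.no_grad():
        probability = torch.sigmoid(model(target_face_image.unsqueeze(0))).item()
    return probability > preset        # True -> living body, False -> non-living body
```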
The living body detection device of the human face provided by the present application is described below, and the living body detection device of the human face described below and the living body detection method of the human face described above may be referred to in correspondence with each other.
In one embodiment, as shown in fig. 5, there is provided a living body detecting apparatus of a human face, including:
the target area extraction module 210 is configured to extract a target area corresponding to a target portion from the acquired face image to be detected according to the target portion specified by the living body verification instruction;
the face image determining module 220 is configured to weight each first pixel value of the target region according to the first weight, and weight each second pixel value of the remaining region in the face image to be detected according to the second weight, so as to determine a target face image;
the face living body detection module 230 is used for inputting the target face image into the trained face recognition model and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
The device determines the target area specified by the living body verification instruction from the face image to be detected, gives a high weight to the first pixel values of the target area and a low weight to the second pixel values of the remaining area, and, after the weighted target face image is obtained, performs living body detection with the trained face recognition model. The detection requirement on the local face area that performs the operation required by the living body detection is thereby raised, the interference of the other parts of the face on the overall judgment is weakened, attacks using local face stickers are effectively defended against, and the accuracy of living body detection of the face is improved.
In an embodiment, the target region extraction module 210 is further configured to:
sampling a face image from a face video to obtain an initial face image;
and carrying out frequency domain conversion on the initial face image to obtain the face image to be detected.
In an embodiment, the target region extracting module 210 is specifically configured to:
carrying out spatial domain conversion on the initial face image to obtain a face spatial image;
carrying out Fourier transform and ambient light filtering on the face space image in sequence to obtain a face frequency domain image;
performing inverse Fourier transform on the face frequency domain image to obtain a face space image to be detected;
and performing channel superposition on the spatial image of the face to be detected and the frequency domain image of the face to obtain the image of the face to be detected.
In an embodiment, the target region extracting module 210 is specifically configured to:
carrying out Fourier transform on the face space image to obtain an initial frequency domain image;
carrying out ambient light sampling on the initial frequency domain image to obtain a current ambient light sampling value;
determining the current filtering output value of the initial frequency domain image according to the current ambient light sampling value and the historical filtering output value of the historical face image in the frame adjacent to the initial face image in the face video;
and according to the current filtering output value, filtering the ambient light of the initial frequency domain image to obtain a face frequency domain image.
In one embodiment, the face image determination module 220 is further configured to:
and carrying out differentiation adjustment on the initial weight of the target area and the initial weight of the residual area, and determining a first weight of the target area and a second weight of the residual area.
In an embodiment, the facial image determination module 220 is specifically configured to:
according to a first preset multiple, expanding the initial weight of the target region to determine the first weight, and determining the initial weight of the remaining region as the second weight; or,
according to a second preset multiple, reducing the initial weight of the residual region to determine the second weight, and determining the initial weight of the target region as the first weight; or,
and according to a first preset multiple, expanding the initial weight of the target region to determine the first weight, and according to a second preset multiple, reducing the initial weight of the residual region to determine the second weight.
In an embodiment, the face liveness detection module 230 is specifically configured to:
inputting the target face image into a trained face recognition model to carry out depth separable convolution to obtain a characteristic image;
inputting the characteristic image into an attention module in the face recognition model, and determining the living body probability of the characteristic image;
and carrying out secondary classification on the living body probability according to a preset value, and determining a detection result.
Fig. 6 illustrates a schematic diagram of the physical structure of an electronic device. As shown in fig. 6, the electronic device may include: a processor 810, a communication interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke a computer program in the memory 830 to perform the living body detection method of a human face, for example including:
extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
weighting each first pixel value of the target area according to the first weight, and weighting each second pixel value of the rest area in the face image to be detected according to the second weight to determine a target face image;
inputting the target face image into a trained face recognition model, and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that substantially contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, an embodiment of the present application further provides a storage medium, where the storage medium includes a computer program, where the computer program is stored on a non-transitory computer-readable storage medium, and when the computer program is executed by a processor, a computer is capable of executing the living body detection method for a human face provided in the foregoing embodiments, for example, including:
extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
according to the first weight, weighting each first pixel value of the target area, and according to the second weight, weighting each second pixel value of the rest area in the face image to be detected, so as to determine a target face image;
inputting the target face image into a trained face recognition model, and determining a detection result;
the remaining region is a region except the target region in the face image to be detected;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
On the other hand, embodiments of the present application further provide a processor-readable storage medium, where a computer program is stored in the processor-readable storage medium, where the computer program is configured to cause a processor to execute the method provided in each of the above embodiments, for example, the method includes:
extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
according to the first weight, weighting each first pixel value of the target area, and according to the second weight, weighting each second pixel value of the rest area in the face image to be detected, so as to determine a target face image;
inputting the target face image into a trained face recognition model, and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
The processor-readable storage medium can be any available medium or data storage device that can be accessed by a processor, including but not limited to magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs)), etc.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A living body detection method of a human face is characterized by comprising the following steps:
extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
according to the first weight, weighting each first pixel value of the target area, and according to the second weight, weighting each second pixel value of the rest area in the face image to be detected, so as to determine a target face image;
inputting the target face image into a trained face recognition model, and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
2. The living body detection method of a human face according to claim 1, characterized by further comprising:
sampling a face image of a face video to obtain an initial face image;
and carrying out frequency domain conversion on the initial face image to obtain the face image to be detected.
3. The living body detection method of a human face according to claim 2, wherein the frequency domain conversion of the initial human face image to obtain the human face image to be detected comprises:
carrying out spatial domain conversion on the initial face image to obtain a face spatial image;
carrying out Fourier transform and ambient light filtering on the face space image in sequence to obtain a face frequency domain image;
performing inverse Fourier transform on the face frequency domain image to obtain a face space image to be detected;
and performing channel superposition on the spatial image of the face to be detected and the frequency domain image of the face to obtain the image of the face to be detected.
4. The living body detection method of a human face according to claim 3, wherein the performing Fourier transform and ambient light filtering on the human face space image in sequence to obtain a human face frequency domain image comprises:
carrying out Fourier transform on the face space image to obtain an initial frequency domain image;
carrying out ambient light sampling on the initial frequency domain image to obtain a current ambient light sampling value;
determining the current filtering output value of the initial frequency domain image according to the current environment light sampling value and the historical filtering output value of the historical face image of the frame which is adjacent to the initial face image in the face video;
and according to the current filtering output value, carrying out ambient light filtering on the initial frequency domain image to obtain a human face frequency domain image.
5. The live body detection method of a human face according to any one of claims 1 to 4, characterized by further comprising:
and performing differentiation adjustment on the initial weight of the target area and the initial weight of the residual area, and determining a first weight of the target area and a second weight of the residual area.
6. The method of claim 5, wherein the differentiating the initial weight of the target region and the initial weight of the remaining region to determine the first weight of the target region and the second weight of the remaining region comprises:
according to a first preset multiple, expanding the initial weight of the target region to determine the first weight, and determining the initial weight of the remaining region as the second weight; or,
according to a second preset multiple, reducing the initial weight of the residual region to determine the second weight, and determining the initial weight of the target region as the first weight; or,
and according to a first preset multiple, expanding the initial weight of the target region to determine the first weight, and according to a second preset multiple, reducing the initial weight of the residual region to determine the second weight.
7. The living body detection method of human face according to claim 1, wherein inputting the target human face image into the trained human face recognition model, and determining the detection result comprises:
inputting the target face image into a trained face recognition model to carry out depth separable convolution to obtain a characteristic image;
inputting the characteristic image into an attention module in the face recognition model, and determining the living body probability of the characteristic image;
and carrying out secondary classification on the living body probability according to a preset value, and determining a detection result.
8. A living body detecting apparatus of a human face, characterized by comprising:
the target area extraction module is used for extracting a target area corresponding to a target part from the acquired human face image to be detected according to the target part specified by the living body verification instruction;
the face image determining module is used for weighting each first pixel value of the target area according to the first weight, and weighting each second pixel value of the rest area in the face image to be detected according to the second weight to determine a target face image;
the human face living body detection module is used for inputting a target human face image into the trained human face recognition model and determining a detection result;
the residual region is a region of the face image to be detected except the target region;
the first weight is greater than the second weight;
the face recognition model is obtained by training a plurality of multi-channel face sample images.
9. An electronic device comprising a processor and a memory storing a computer program, wherein the processor implements the method of detecting a living body of a human face according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing the living body detection method of a human face according to any one of claims 1 to 7.
CN202210741588.0A 2022-06-27 2022-06-27 Living body detection method and device for human face Pending CN115082990A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210741588.0A CN115082990A (en) 2022-06-27 2022-06-27 Living body detection method and device for human face

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210741588.0A CN115082990A (en) 2022-06-27 2022-06-27 Living body detection method and device for human face

Publications (1)

Publication Number Publication Date
CN115082990A true CN115082990A (en) 2022-09-20

Family

ID=83255254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210741588.0A Pending CN115082990A (en) 2022-06-27 2022-06-27 Living body detection method and device for human face

Country Status (1)

Country Link
CN (1) CN115082990A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111881846A (en) * 2020-07-30 2020-11-03 北京市商汤科技开发有限公司 Image processing method and related device, equipment and storage medium
US20210019503A1 (en) * 2018-09-30 2021-01-21 Tencent Technology (Shenzhen) Company Limited Face detection method and apparatus, service processing method, terminal device, and storage medium
CN112507922A (en) * 2020-12-16 2021-03-16 平安银行股份有限公司 Face living body detection method and device, electronic equipment and storage medium
CN114581709A (en) * 2022-03-02 2022-06-03 深圳硅基智能科技有限公司 Model training, method, apparatus, and medium for recognizing target in medical image

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210019503A1 (en) * 2018-09-30 2021-01-21 Tencent Technology (Shenzhen) Company Limited Face detection method and apparatus, service processing method, terminal device, and storage medium
CN111881846A (en) * 2020-07-30 2020-11-03 北京市商汤科技开发有限公司 Image processing method and related device, equipment and storage medium
CN112507922A (en) * 2020-12-16 2021-03-16 平安银行股份有限公司 Face living body detection method and device, electronic equipment and storage medium
CN114581709A (en) * 2022-03-02 2022-06-03 深圳硅基智能科技有限公司 Model training, method, apparatus, and medium for recognizing target in medical image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HE Guanghui et al., "Application of image segmentation methods in face recognition", Computer Engineering and Applications, vol. 46, no. 28, 31 December 2010 (2010-12-31), pages 196 - 198 *

Similar Documents

Publication Publication Date Title
CN111340008B (en) Method and system for generation of counterpatch, training of detection model and defense of counterpatch
WO2020258667A1 (en) Image recognition method and apparatus, and non-volatile readable storage medium and computer device
CN109657554B (en) Image identification method and device based on micro expression and related equipment
US20230081645A1 (en) Detecting forged facial images using frequency domain information and local correlation
CN111611873B (en) Face replacement detection method and device, electronic equipment and computer storage medium
CN111738735B (en) Image data processing method and device and related equipment
Fang et al. Learnable multi-level frequency decomposition and hierarchical attention mechanism for generalized face presentation attack detection
CN111476200A (en) Face de-identification generation method based on generation of confrontation network
CN111783085B (en) Defense method and device for resisting sample attack and electronic equipment
WO2021249006A1 (en) Method and apparatus for identifying authenticity of facial image, and medium and program product
Bharadi et al. Multi-instance iris recognition
CN112818774A (en) Living body detection method and device
CN111639537A (en) Face action unit identification method and device, electronic equipment and storage medium
CN115936961B (en) Steganalysis method, equipment and medium based on few-sample comparison learning network
CN116109505A (en) Image deblurring method and device, electronic equipment and storage medium
CN115082990A (en) Living body detection method and device for human face
CN115082992A (en) Face living body detection method and device, electronic equipment and readable storage medium
CN116229528A (en) Living body palm vein detection method, device, equipment and storage medium
CN117975519A (en) Model training and image generating method and device, electronic equipment and storage medium
CN112733635B (en) Object identification method and device and electronic equipment
Alharbi et al. Spoofing Face Detection Using Novel Edge-Net Autoencoder for Security.
US10438061B2 (en) Adaptive quantization method for iris image encoding
Bharadwaj et al. Reliable human authentication using AI-based multibiometric image sensor fusion: Assessment of performance in information security
CN117593679B (en) Fake video detection method, fake video detection device, electronic equipment and storage medium
CN110084147B (en) Gender privacy protection method and system for face recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination