CN111209863A - Living body model training and human face living body detection method, device and electronic equipment


Info

Publication number
CN111209863A
Authority
CN
China
Prior art keywords
living body
score
face
video
silent
Prior art date
Legal status
Granted
Application number
CN202010012081.2A
Other languages
Chinese (zh)
Other versions
CN111209863B (en)
Inventor
王鹏
姚聪
陈坤鹏
周争光
Current Assignee
Beijing Kuangshi Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd
Priority to CN202010012081.2A
Publication of CN111209863A
Application granted
Publication of CN111209863B
Status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention provides a living body model training method, a face liveness detection method, corresponding devices, and electronic equipment. The living body model training method comprises the following steps: acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each; inputting these frames into a preset living body model to obtain an output result corresponding to the silent video; calculating the value of a loss function according to the output result and the annotation result of the silent video; and adjusting the parameters of the living body model according to the value of the loss function until it converges. In this way, multiple single-frame images can be extracted from the silent video for training and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.

Description

Living body model training and human face living body detection method, device and electronic equipment
Technical Field
The invention relates to the technical field of face recognition, and in particular to a living body model training method, a face liveness detection method, corresponding devices, and electronic equipment.
Background
With the wide application of face recognition technology, its security is gradually receiving attention. Face liveness judgment refers to technology that automatically judges whether the face in a given image or video comes from a real person or from a spoofed face; common spoofed faces include masks, printed photos, photos displayed on a screen, played video clips, and the like. Face liveness judgment is an important technical means for preventing attacks and fraud, and is widely applied in industries and scenarios involving remote identity authentication, such as banking, insurance, internet finance, and e-commerce.
Existing face liveness judgment techniques fall roughly into two types: static methods and dynamic methods. Static methods mainly judge the authenticity of a given face through features such as color, texture, and background objects in an image; they are simple and efficient, but their security level is not high, because a static face image is easily forged by photo editing (PS), synthesis software, or a photo displayed on a high-definition screen, and the difficulty and cost of such forgery will keep falling as technology develops. Dynamic methods judge single-frame images on the basis of actions, requiring the user to complete specified facial actions such as opening the mouth or blinking before capture; however, these facial actions increase the difficulty of technical implementation while degrading the user experience.
Therefore, a face liveness recognition technology with a high security level and simple actions is needed to improve the user experience.
Disclosure of Invention
The invention addresses the problem of how to achieve face liveness detection that combines a high security level with simple, easily performed user actions.
In order to solve the above problems, the present invention provides a living body model training method, which includes:
acquiring a plurality of silent videos, and extracting a plurality of frame images to be trained from each silent video;
inputting the plurality of frame images to be trained into a preset living body model to obtain an output result corresponding to the silent video;
calculating the value of a loss function according to the output result and the labeling result of the silent video;
adjusting parameters of the living body model according to the value of the loss function until the value of the loss function converges.
In this way, multiple single-frame images can be extracted from the silent video and input into the living body model for training and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Optionally, the living body model is a neural network model.
Optionally, after acquiring the plurality of silent videos and extracting the frame images to be trained from each, the method further includes:
acquiring an annotation result according to the silent video.
In this way, the annotation result obtained from the silent video provides the true result for the frames to be trained, so that subsequent comparison improves recognition accuracy and yields a more accurate living body model.
Optionally, the acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each includes:
acquiring the silent video and dividing it into a plurality of intervals;
extracting one frame image in each interval, the frame image being a frame image to be trained;
and traversing all the silent videos to obtain the frame images to be trained.
In this way, extracting one frame per interval after dividing the video avoids the loss or excessive repetition of effective information that direct extraction can cause when frames are distributed too unevenly in the silent video (too long a time interval between two frames loses effective information; too short an interval yields repeated content).
Optionally, the output result of the living body model at least includes: face presence score, liveness score, and attack score.
Optionally, the attack score includes: a screen replay score, a print paper score, a paper-cut score, a cutout mask score, and a 3D model score.
In this way, by finely classifying the attack types, the performance of the detection method and the model on each attack type can be measured, their weaknesses on particular attack types can be remedied in time, and the goals of simple use, high precision, and high security are achieved.
Optionally, in the annotation result of the silent video, when the face existence score represents that no face exists, the loss function is determined by the face existence score in the annotation result of the silent video and the face existence score in the output result.
Optionally, in the annotation result of the silent video, when the face existence score represents that a face exists, the loss function is determined by the face existence score, the living body score, and the attack scores in the annotation result of the silent video together with the face existence score, the living body score, and the attack scores in the output result.
In this way, by setting the loss function differently according to whether a face is present, a correspondence is established between the output result and the actual classification, so that the loss function reflects how accurate the classification is; adjusting the parameters of the living body model through this loss function therefore improves the accuracy and discrimination of the model and achieves a better training effect.
Second, a face liveness detection method is provided, which comprises the following steps:
shooting a silent video of a human face, and extracting a plurality of frame images to be evaluated from the silent video;
inputting the frames to be evaluated into a preset living body model to obtain an evaluation result, the living body model being trained with the living body model training method described above;
and judging the detection result of the silent video according to the evaluation result.
In this way, multiple single-frame images can be extracted from the silent video and input into the living body model for evaluation and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Optionally, the duration of the silent video is 1-3 s. This shortens the time the user needs to shoot, reduces the difficulty of shooting, and improves the experience.
Optionally, the judging the detection result of the silent video according to the evaluation result includes:
judging whether the face existence score in the evaluation result is smaller than a preset face threshold;
if it is smaller than the face threshold, the liveness detection of the silent video fails;
if it is not smaller than the face threshold, judging whether the living body score is the highest among the living body score and the attack scores;
if the living body score is the highest, the liveness detection of the silent video passes;
and if it is not the highest, the liveness detection of the silent video fails.
In this way, the attack types are finely classified; by measuring the performance of the detection method and the model on each attack type, their weaknesses can be remedied in time, the attack type is obtained directly, and the method achieves simple use, high precision, and high security.
Further, there is provided a living body model training apparatus comprising:
an acquisition unit, used for acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each silent video;
the model unit is used for inputting the plurality of frame images to be trained into a preset living body model to obtain an output result corresponding to the silent video;
the computing unit is used for computing the value of the loss function according to the output result and the labeling result of the silent video;
an adjusting unit for adjusting the parameters of the living body model according to the value of the loss function until the value of the loss function converges.
In this way, multiple single-frame images can be extracted from the silent video and input into the living body model for training and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
In another aspect, there is provided a face liveness detection device, comprising:
the shooting unit is used for shooting a silent video of a human face and extracting a plurality of frame images to be evaluated from the silent video;
the evaluation unit is used for inputting the frames to be evaluated into a preset living body model to obtain an evaluation result, the living body model being trained with the living body model training method described above;
and the judging unit is used for judging the detection result of the silent video according to the evaluation result.
In this way, multiple single-frame images can be extracted from the silent video and input into the living body model for evaluation and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Finally, an electronic device is provided, comprising a processor and a memory, the memory storing a control program which, when executed by the processor, implements the living body model training method or the face liveness detection method.
In addition, a computer-readable storage medium is provided, storing instructions that, when loaded and executed by a processor, implement the living body model training method or the face liveness detection method.
Drawings
FIG. 1 is a flowchart of a living body model training method according to one embodiment of the invention;
FIG. 2 is a flowchart of a living body model training method according to another embodiment of the invention;
FIG. 3 is a flowchart of step 10 of a living body model training method according to an embodiment of the invention;
FIG. 4 is a flowchart of a face liveness detection method according to an embodiment of the invention;
FIG. 5 is a flowchart of step 300 of a face liveness detection method according to an embodiment of the invention;
FIG. 6 is a block diagram of a living body model training device according to an embodiment of the invention;
FIG. 7 is a block diagram of a face liveness detection device according to an embodiment of the invention;
FIG. 8 is a block diagram of an electronic device according to an embodiment of the invention;
FIG. 9 is a block diagram of another electronic device according to an embodiment of the invention.
Description of reference numerals:
1-acquisition unit, 2-model unit, 3-calculation unit, 4-adjustment unit, 5-shooting unit, 6-evaluation unit, 7-determination unit, 12-electronic device, 14-external device, 16-processing unit, 18-bus, 20-network adapter, 22-input/output (I/O) interface, 24-display, 28-system memory, 30-random access memory, 32-cache memory, 34-storage system, 40-utility, 42-program module.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For ease of understanding, the technical problem addressed by the invention and its underlying principles are first set forth in detail.
Face recognition technology is now widely applied, and whether face recognition is secure depends on face liveness judgment. Face liveness judgment refers to technology for automatically judging whether the face in a given image or video comes from a real person or from a spoofed face (a mask, a printed photo, a photo displayed on a screen, a played video clip, etc.).
That is, face liveness judgment determines whether an image or video acquired by a camera or similar device shows a real face or a spoofed face made with a mask, a printed photo, a photo displayed on a screen, a played video clip, or the like. If it is a real face, its real identity is subsequently judged so that the face can be recognized; if it is a spoofed face, judgment of its real identity is stopped, preventing recognition errors and other serious consequences.
The existing face living body judgment technology can be roughly divided into two types: static methods and dynamic methods.
The static method judges the authenticity of a given face through features such as color, texture, and background objects after a single-frame image is obtained. It is simple and efficient, but its security level is not high, because a static face image is easily forged by photo editing (PS), synthesis software, or a photo displayed on a high-definition screen, and such forgery is hard to detect from color, texture, or background features. In fact, given a sufficiently high-definition face image, it is difficult even for a human to tell whether it shows a real face or one obtained by other means such as a mask, a printed photo, a photo displayed on a screen, or a played video clip. Moreover, as technology develops, the technical difficulty and cost of these face-forging methods keep falling, the failure rate of recognition keeps rising, and the drawbacks of current static methods grow ever more obvious.
The dynamic method judges single-frame images on the basis of actions, requiring the user to complete specified facial actions such as opening the mouth or blinking before capture. That is, the user performs the actions in front of the lens as instructed; after the action video is obtained, a single frame showing the specific action is captured and then judged to determine whether it is a live face or a spoofed face. Because the judged frame is extracted from a video, a counterfeiter must forge the entire video to cheat, and since these specific actions involve multiple facial nerves, forging them is difficult and the security level is high. However, for the same reason it is very complicated for the judging side to decide by what means and by what standard to judge, so the approach is hard to implement; moreover, users find the prescribed actions hard to perform correctly and must repeat them as required, so the experience is poor.
Existing liveness detection methods and their models are all static or dynamic recognition approaches applied to single-frame images, so they generally suffer from either a low security level or excessive action difficulty.
The disclosed embodiments provide a living body model training method, which can be executed by a living body model training device; the device can be integrated in electronic equipment such as a computer or a server. FIG. 1 is a flowchart of a living body model training method according to an embodiment of the present invention; the living body model training method comprises the following steps:
step 10, acquiring a plurality of silent videos, and extracting a plurality of frame images to be trained from each silent video;
the silent video can be obtained after the user watches the camera or the screen for a period of time, and the user does not need to do special actions or specific actions, so that the experience degree of the client is not reduced.
It should be noted that the silent video (or each video) is formed by sequentially arranging multiple frames of images, and each frame of image may be an RGB image or an infrared image. And extracting a plurality of frame images to be trained from each silent video, namely extracting a plurality of frame images to be trained from the images of the plurality of frames in each silent video which are sequentially arranged.
Optionally, the same number of frame images to be trained is extracted from each silent video, for example N frames per video; unifying the extraction standard in this way greatly reduces recognition errors caused by inconsistent standards and improves the accuracy of model recognition.
Optionally, the extraction time intervals between adjacent frames to be trained extracted from a silent video are equal. The frames are thus extracted uniformly, avoiding recognition errors caused by uneven extraction and yielding a living body model with higher recognition accuracy.
Step 30, inputting the plurality of frame images to be trained into a preset living body model to obtain an output result corresponding to the silent video;
the living body model may be a model such as VGG16, Alexnet, Resnet, etc., a model modified based on the above, or a customized network structure or a customized neural network.
After the living body model is determined, various parameters in the living body model are preset, and then the parameters are gradually adjusted through a loss function in the later period, so that the purpose of obtaining the optimal living body model is achieved. Each parameter in the living body model may be preset empirically, or may be a default value set by other means such as actual needs or usual settings.
Inputting the frames to be trained into the preset living body model yields an output result corresponding to the silent video; this output is the living body model's prediction for the frames/silent video, not their true result.
Optionally, the living body model is a neural network model. Therefore, the neural network model can be judged more accurately through training, and the accuracy of human face living body identification is improved.
It should be noted that, in this step, inputting the frame images to be trained into the living body model means that all the frame images to be trained extracted from a silent video are input into the living body model, so that it can be determined whether the face in the silent video is a real face.
That is, the plurality of frame images to be trained extracted from one silent video are input into the living body model, which then outputs a result; this output result is the output result of the silent video, i.e., the living body model's prediction/recognition result for that video.
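For concreteness, the following is a minimal sketch of such a grouped forward pass in Python, assuming a per-frame CNN backbone whose features are averaged before a seven-way score head matching the (1, 7) output vector defined later in the description. The ResNet-18 backbone and the mean pooling over frames are illustrative assumptions: the description names VGG16, AlexNet, and ResNet as candidate models but does not fix how the frames of a group are aggregated.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class LivenessModel(nn.Module):
    """Maps the group of N frames extracted from one silent video to a single
    (1, 7) output (p, c0, ..., c5): face presence, liveness, five attack scores."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)  # assumed backbone; any CNN works
        backbone.fc = nn.Identity()               # expose the 512-d frame features
        self.backbone = backbone
        self.head = nn.Linear(512, 7)             # seven-way score head

    def forward(self, frames):                    # frames: (N, 3, H, W), one video
        feats = self.backbone(frames)             # (N, 512) per-frame features
        video_feat = feats.mean(dim=0)            # assumed aggregation: mean over frames
        return torch.sigmoid(self.head(video_feat))  # (7,) scores in [0, 1]
```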
It should be noted that the frame images to be trained are not input frame by frame: the frames extracted from the same silent video are treated as one group, and the whole group is input into the living body model together to obtain one output result, which corresponds both to that group of frames and to the silent video.
Step 40, calculating the value of the loss function according to the output result and the annotation result of the silent video;
the annotation result of the silent video refers to a real result of the frame image to be trained, and the obtaining mode of the result can be synchronously obtained when the frame image to be trained or the silent video is obtained, or can be obtained through manual identification. For example, it can be identified whether the obtained silence video is a real face or a deceptive face by a manual method, and the identification result of the silence video is the identification result of a plurality of frame images to be trained extracted from the silence video; for example, when recording a silent video, the user can simultaneously provide the recognition result of the silent video, and further obtain the recognition results of the extracted frame images to be trained.
Step 50, adjusting the parameters of the living body model according to the value of the loss function until the value of the loss function converges.
When the parameters of the neural network model (the living body model) are adjusted according to the value of the loss function, iteration can proceed with small update batches. For example, given 10,000 frame images to be trained, 100 are selected each time and input into the model, the value of the loss function over those 100 samples is calculated, and the model parameters are adjusted accordingly; then another 100 frames are input into the adjusted model, and the loop iterates until the value of the loss function converges.
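A sketch of this mini-batch update is given below, assuming the LivenessModel sketched earlier and a liveness_loss function implementing the loss defined later in this description; the batch size of 100 follows the example above, and the optimizer choice is an assumption.

```python
import torch

def train(model, videos, labels, epochs=10, batch_size=100, lr=1e-3):
    """videos: list of per-video frame tensors (N, 3, H, W);
    labels: list of (7,) annotation vectors (p, c0, ..., c5)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):                 # in practice, loop until the loss converges
        for start in range(0, len(videos), batch_size):
            batch = range(start, min(start + batch_size, len(videos)))
            # mean loss over the mini-batch of (up to) 100 videos
            loss = torch.stack(
                [liveness_loss(model(videos[i]), labels[i]) for i in batch]
            ).mean()
            opt.zero_grad()
            loss.backward()                 # gradients of the loss w.r.t. the parameters
            opt.step()                      # adjust the living body model's parameters
```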
Thus, through steps 10-50, multiple single-frame images can be extracted from the silent video and input into the living body model for training and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Optionally, as shown in fig. 2, after step 10 of acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each, the method further includes:
Step 20, acquiring an annotation result according to the silent video.
The annotation result is the true result for the frames to be trained. It can be obtained at the same time as the frames or the silent video, pre-stored with the silent video or elsewhere, or obtained through manual identification. For example, whether an obtained silent video shows a real face or a spoofed face can be identified manually, and that identification also serves as the result for the frames extracted from it; alternatively, the user can provide the identification result while recording the silent video, from which the results for the extracted frames follow.
In this way, the annotation result obtained from the silent video provides the true result for the frames to be trained, so that subsequent comparison improves recognition accuracy and yields a more accurate living body model.
Optionally, as shown in fig. 3, step 10 of acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each includes:
Step 11, acquiring a silent video and dividing it into a plurality of intervals;
When frames need to be extracted from several silent videos, one video is selected and processed, then another, until all silent videos have been processed.
Dividing the silent video into intervals means setting a number of time nodes within it; the span between adjacent nodes is one interval. Since the endpoints also include time 0 and the final time, the number of intervals is one more than the number of interior time nodes (excluding time 0 and the final time).
Optionally, the silent video is divided into equal intervals, improving their uniformity.
Optionally, the duration of each interval is a fixed value (constant); that is, the silent video is divided interval by interval using a preset constant duration, any remaining duration shorter than that forming its own interval (alternatively, division can proceed from the final time back toward time 0, or in other ways). Bounding the interval duration prevents both the omission of effective information caused by intervals that are too long and the excessive interval count (and data volume) caused by intervals that are too short.
Step 12, extracting one frame image in each interval, the frame image being a frame image to be trained;
Each interval contains multiple frames (e.g., RGB images); extracting one frame image in each interval means selecting one of these frames as the extracted frame.
In this way, extracting one frame per interval after division avoids the loss or excessive repetition of effective information that direct extraction can cause when frames are distributed too unevenly in the silent video (too long a time interval between two frames loses effective information; too short an interval yields repeated content).
Step 13, traversing all the silent videos to obtain the frame images to be trained.
One silent video is selected and processed, then another, until all have been processed; traversal thus ensures that every silent video is extracted in turn, improving extraction efficiency.
In this way, through steps 11-13, one frame is extracted per interval after division, avoiding the loss or excessive repetition of effective information caused by unevenly distributed frames in the silent video.
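By way of illustration, a minimal sketch of steps 11-13 using OpenCV follows; the interval count of 8 and the choice of each interval's middle frame are assumptions, since the description fixes neither.

```python
import cv2

def extract_frames(video_path, num_intervals=8):
    """Divide a silent video into equal intervals and extract one frame per interval."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in range(num_intervals):
        # take the middle frame of the i-th interval (one frame per interval)
        idx = int((i + 0.5) * total / num_intervals)
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames  # the frame images to be trained for this video

# Step 13: traverse all silent videos
# all_frames = [extract_frames(path) for path in video_paths]
```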
Optionally, the output result of the living body model at least includes: face presence score, liveness score, and attack score.
The face existence score is used to judge whether a face is present; the living body score is used to judge whether that face is a living body (a real person, a real face); the attack score is used to judge whether the face is an attack (a spoofed face).
In this way, by outputting three kinds of scores in one output result, face detection and liveness detection are integrated into a single model, so the face and liveness can be detected simultaneously by one liveness detection model: simple to use, high precision, and high security.
Optionally, the attack score includes: a screen replay score, a print paper score, a paper-cut score, a cutout mask score, and a 3D model score.
The attack scores correspond to specific attack types used to judge five kinds of attacks seen in practice: screen replay attacks, print paper attacks, paper-cut attacks, cutout mask attacks, and 3D model attacks. These five major categories were obtained by extensive practice and summarization of real-life attack types, specifically:
Screen replay attack: the shooting subject is a screen (mobile phone, computer, iPad, display, etc.), and the material includes pictures, videos, and the like;
Print paper attack: the shooting subject is complete printing paper of various sizes, with materials including black-and-white paper, color paper, glossy paper, matte paper, art paper, and the like;
Paper-cut attack: the shooting subject is printing paper cut along the contour of a person, with the same range of paper materials;
Cutout mask attack: the shooting subject combines cut printing paper with a real person, the cuts including eyes, mouth, nose, etc., with the same range of paper materials;
3D model attack: the shooting subject is a 3D model, made of materials such as silicone, plastic, or graphite.
In this way, by finely classifying the attack types, the performance of the detection method and the model on each attack type can be measured, their weaknesses on particular attack types can be remedied in time, and the goals of simple use, high precision, and high security are achieved.
It should be noted that this five-category division is only one way of classifying existing attack types; other classification standards may yield four, three, seven, eight, or some other number of categories, and the specific number and types may be adjusted to the actual situation.
When attacks are classified into five types, the output result and the annotation result are preferably set as vectors of shape (1, 7), specifically:
y=(p,c0,c1,c2,c3,c4,c5)
where y denotes the annotation result; p denotes whether a face exists in the video, p = 1 indicating a face and p = 0 indicating no face (the output result may also contain intermediate values between 0 and 1, so a threshold may be set, above which a face is considered present and below which it is not); c0 denotes the liveness score, c1 the screen replay attack score, c2 the print paper attack score, c3 the paper-cut attack score, c4 the cutout mask attack score, and c5 the 3D model attack score.
Thus, the seven types of videos collected in the data collection and annotation stage (the living body model training stage) and their corresponding annotation results are:
no face video- (0, 0, 0, 0, 0, 0, 0, 0)
Human video (live video) — (1, 1, 0, 0, 0)
Screen-flipping attack video- (1, 0, 1, 0, 0, 0, 0)
Printing paper attack video (1, 0, 0, 1, 0, 0, 0)
Paper-cut attack video- (1, 0, 0, 0, 1, 0, 0)
Matting mask attack video (1, 0, 0, 0, 0, 1, 0)
3D model attack video- (1, 0, 0, 0, 0, 0, 1)
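These annotation vectors can be written directly as a lookup table for the annotation stage; the type names below are illustrative labels, not terms from the description.

```python
# Annotation vectors y = (p, c0, c1, c2, c3, c4, c5) for the seven video types
LABELS = {
    "no_face":       (0, 0, 0, 0, 0, 0, 0),
    "live":          (1, 1, 0, 0, 0, 0, 0),
    "screen_replay": (1, 0, 1, 0, 0, 0, 0),
    "print_paper":   (1, 0, 0, 1, 0, 0, 0),
    "paper_cut":     (1, 0, 0, 0, 1, 0, 0),
    "cutout_mask":   (1, 0, 0, 0, 0, 1, 0),
    "3d_model":      (1, 0, 0, 0, 0, 0, 1),
}
```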
Optionally, in the annotation result of the silent video, when the face existence score represents that no face exists, the loss function is determined by the face existence score in the annotation result of the silent video and the face existence score in the output result.
That is, when the face existence score represents that no face exists (the score is 0), the loss function is calculated only from the face existence score; the other scores in the annotation result and the output result are not considered.
When the face existence score represents that no face exists (i.e., p = 0 in the annotation result), the loss function is:
L = (p - p̂)²
where L is the value of the loss function, p is the face existence score in the annotation result, and p̂ is the face existence score in the output result.
Optionally, in the annotation result of the silent video, when the face existence score represents that a face exists, the loss function is determined by the face existence score, the living body score, and the attack scores in the annotation result of the silent video together with the face existence score, the living body score, and the attack scores in the output result.
That is, when the face existence score represents that a face exists, the loss function is calculated from all scores in the annotation result and the output result.
When the face existence score represents that a face exists (i.e., p = 1 in the annotation result), the loss function is:
L = (p - p̂)² + Σᵢ₌₀⁵ (cᵢ - ĉᵢ)²
where L is the value of the loss function, p is the face existence score in the annotation result, p̂ is the face existence score in the output result, cᵢ is the living body score or attack score in the annotation result, and ĉᵢ is the corresponding living body score or attack score in the output result.
In this way, by setting the loss function differently according to whether a face is present, a correspondence is established between the output result and the actual classification, so that the loss function reflects how accurate the classification is; adjusting the parameters of the living body model through this loss function therefore improves the accuracy and discrimination of the model and achieves a better training effect.
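A minimal sketch of this two-case loss, treating the annotation and output results as the (7,)-dimensional vectors defined above:

```python
import torch

def liveness_loss(y_hat, y):
    """y and y_hat are (7,) tensors ordered (p, c0, ..., c5);
    y is the annotation result, y_hat the model's output result."""
    p, p_hat = y[0], y_hat[0]
    if p == 0:                    # no face in the annotation result
        return (p - p_hat) ** 2   # only the face-presence term contributes
    # face present: all seven components contribute to the loss
    return (p - p_hat) ** 2 + ((y[1:] - y_hat[1:]) ** 2).sum()
```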
The embodiment of the disclosure provides a face liveness detection method, which can be executed by a face liveness detection device; the device can be integrated in electronic equipment such as a computer or a server. Fig. 4 is a flowchart of a face liveness detection method according to an embodiment of the present invention; the face liveness detection method comprises the following steps:
step 100, shooting a silent video of a human face, and extracting a plurality of frame images to be evaluated from the silent video;
in the human face living body detection method, specific contents of step 100 may refer to specific descriptions of step 10 and steps 11 to 13 in the living body model training method, and are not described herein again.
Optionally, the duration of the silent video is 1-3 s. This shortens the time the user needs to shoot, reduces the difficulty of shooting, and improves the experience.
Step 200, inputting the frame image to be evaluated into a preset living body model to obtain an evaluation result; the living body model is obtained by training by the living body model training method;
in the human face living body detection method, specific contents of step 200 may refer to specific descriptions of step 30 in the living body model training method, and are not described herein again.
The living body model is obtained with the living body model training method described above; the model first undergoes learning and training, and the trained model then judges and identifies the frames to be evaluated, so that they can be classified accurately.
Step 300, judging the detection result of the silent video according to the evaluation result.
Thus, through steps 100-300, multiple single-frame images can be extracted from the silent video and input into the living body model for evaluation and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Optionally, as shown in fig. 5, step 300 of judging the detection result of the silent video according to the evaluation result includes:
Step 310, judging whether the face existence score in the evaluation result is smaller than a preset face threshold;
A face existence score above the face threshold means the probability that a face exists is high, and a face can be considered present in the silent video; a score below the threshold means the probability is very low, and the silent video can be considered to contain no face.
Optionally, the face threshold ranges from 0.3 to 0.5. This allows the face existence score to be judged accurately, determining whether a face exists in the silent video.
Step 320, if the face existence score is smaller than the face threshold, the liveness detection of the silent video fails;
The silent video is shot by the user; if no face appears in it, the shooting was problematic and no real face can exist in the video, so the liveness detection is judged not to pass.
Step 330, if the face existence score is not smaller than the face threshold, judging whether the living body score is the highest among the living body score and the attack scores;
Step 340, if the living body score is the highest score, the liveness detection of the silent video passes;
Step 350, if the living body score is not the highest score, the liveness detection of the silent video fails.
Only once a face is determined to exist in the silent video can it be judged whether that face is a real face or a spoofed face.
It is therefore judged whether the living body score or an attack score is the highest: if the living body score is the highest, the face in the silent video is a real face and the liveness detection passes; if an attack score is the highest, the face is a spoofed face and the liveness detection fails.
Optionally, as mentioned above, the attack score includes: a screen replay score, a print paper score, a paper-cut score, a cutout mask score, and a 3D model score.
It should be noted that this five-category division is only one way of classifying existing attack types; other classification standards may yield four, three, seven, eight, or some other number of categories, and the specific number and types may be adjusted to the actual situation.
As mentioned above, the evaluation result is:
y=(p,c0,c1,c2,c3,c4,c5)
The liveness score and the attack scores are c0 and c1-c5. The judging step finds the highest among c0-c5: if the liveness score c0 is the highest, the face in the silent video is a real face and the liveness detection passes; if one of the attack scores c1-c5 is the highest, the face is a spoofed face and the liveness detection fails. Among the attack scores, c1 denotes the screen replay attack, c2 the print paper attack, c3 the paper-cut attack, c4 the cutout mask attack, and c5 the 3D model attack; whichever of them is the highest indicates the attack type of the spoofed face in the silent video.
Therefore, through steps 310-350, the attack types are finely classified; by measuring the performance of the detection method and the model on each attack type, their weaknesses can be remedied in time, the attack type is obtained directly, and the goals of simple use, high precision, and high security are achieved.
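A compact sketch of the decision rule in steps 310-350; the threshold of 0.4 is one value from the 0.3-0.5 range given above, and the returned strings are illustrative.

```python
def judge(scores, face_threshold=0.4):
    """scores: (p, c0, c1, c2, c3, c4, c5) output by the living body model."""
    p, c = scores[0], scores[1:]
    if p < face_threshold:                      # steps 310/320: no face detected
        return "fail: no face"
    best = max(range(6), key=lambda i: c[i])    # steps 330-350: highest of c0..c5
    if best == 0:                               # the liveness score c0 is the highest
        return "pass: real face"
    attacks = ["screen replay", "print paper", "paper cut", "cutout mask", "3D model"]
    return f"fail: {attacks[best - 1]} attack"  # the attack type is obtained directly
```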
The embodiment of the present disclosure provides a living body model training device for executing the living body model training method described above; the living body model training device is described in detail below.
FIG. 6 is a block diagram of a living body model training device according to an embodiment of the present invention; the living body model training device includes:
an acquisition unit 1, used for acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each silent video;
the model unit 2 is used for inputting the plurality of frame images to be trained into a preset living body model to obtain an output result corresponding to the silent video;
the calculating unit 3 is used for calculating the value of the loss function according to the output result and the labeling result of the silent video;
an adjusting unit 4 for adjusting the parameters of the living body model according to the value of the loss function until the value of the loss function converges.
In this way, multiple single-frame images can be extracted from the silent video and input into the living body model for training and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Optionally, the living body model is a neural network model.
Optionally, the obtaining unit 1 is further configured to: acquire an annotation result according to the silent video.
Optionally, the obtaining unit 1 is further configured to: acquire the silent video and divide it into a plurality of intervals; extract one frame image in each interval, the frame image being a frame image to be trained; and traverse all the silent videos to obtain the frame images to be trained.
Optionally, the output result of the living body model at least includes: face presence score, liveness score, and attack score.
Optionally, the attack score includes: a screen replay score, a print paper score, a paper-cut score, a cutout mask score, and a 3D model score.
Optionally, in the annotation result of the silent video, when the face existence score represents that no face exists, the loss function is determined by the face existence score in the annotation result of the silent video and the face existence score in the output result.
Optionally, in the annotation result of the silent video, when the face existence score represents that a face exists, the loss function is determined by the face existence score, the living body score, and the attack scores in the annotation result of the silent video together with the face existence score, the living body score, and the attack scores in the output result.
The embodiment of the present disclosure provides a face liveness detection device for executing the face liveness detection method described above; the device is described in detail below.
Fig. 7 is a block diagram of a face liveness detection device according to an embodiment of the present invention; the face liveness detection device includes:
the shooting unit 5 is used for shooting a silent video of a human face and extracting a plurality of frame images to be evaluated from the silent video;
the evaluation unit 6 is used for inputting the frames to be evaluated into a preset living body model to obtain an evaluation result, the living body model being trained with the living body model training method described above;
and the judging unit 7 is used for judging the detection result of the silent video according to the evaluation result.
In this way, multiple single-frame images can be extracted from the silent video and input into the living body model for evaluation and output. Judging with multiple frames and a multi-frame model avoids the low security level of single-frame judgment, and because the silent video requires no specified actions, the excessive action difficulty of dynamic methods is also avoided: the security level is high, the actions are simple and easy to perform, and the user experience is improved.
Optionally, the living body model is a neural network model.
Optionally, the output result of the living body model at least includes: face presence score, liveness score, and attack score.
Optionally, the attack score includes: a screen replay score, a print paper score, a paper-cut score, a cutout mask score, and a 3D model score.
Optionally, the judging unit 7 is further configured to: judge whether the face existence score in the evaluation result is smaller than a preset face threshold; if it is smaller than the face threshold, the liveness detection of the silent video fails; if not, judge whether the living body score is the highest among the living body score and the attack scores; if the living body score is the highest, the liveness detection of the silent video passes; and if it is not the highest, the liveness detection fails.
It should be noted that the above-described device embodiments are merely illustrative, for example, the division of the units is only one logical function division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
Having described the internal functions and structures of the living body model training device and the face liveness detection device, as shown in fig. 8, in practice they may be implemented as an electronic device comprising a processor and a memory, the memory storing a control program which, when executed by the processor, implements the living body model training method or the face liveness detection method.
Fig. 9 is a block diagram illustrating another electronic device according to an embodiment of the invention. For example, the electronic device 12 may be a computer, a server, a terminal, a digital broadcast terminal, a messaging device, etc.
The electronic device 12 shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 9, the electronic device 12 may be implemented in the form of a general-purpose electronic device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. Such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer-readable storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown, but commonly referred to as a "hard drive"). Although not shown in Fig. 9, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media), may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the application.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may comprise an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described herein.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the electronic device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the electronic device 12 to communicate with one or more other electronic devices. Such communication may occur through an input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the electronic device 12 via the bus 18. It should be noted that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 16 executes various functional applications and data processing, for example, implementing the methods mentioned in the foregoing embodiments, by executing programs stored in the system memory 28.
The electronic device of the invention may be a server, or a terminal device with limited computing power; the lightweight network structure of the invention is particularly suitable for the latter. Physical implementations of the terminal device include, but are not limited to: intelligent mobile communication terminals, unmanned aerial vehicles, robots, portable image processing devices, security devices, and the like.
The embodiment of the present disclosure provides a computer-readable storage medium, which stores instructions, and when the instructions are loaded and executed by a processor, the living body model training method described above is implemented, or the human face living body detection method described above is implemented.
The technical solution of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
Although the present disclosure has been described above, the scope of the present disclosure is not limited thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present disclosure, and these changes and modifications are intended to be within the scope of the present disclosure.

Claims (15)

1. A method for training a living body model, comprising:
acquiring a plurality of silent videos, and extracting a plurality of frame images to be trained from each silent video;
inputting the plurality of frame images to be trained into a preset living body model to obtain an output result corresponding to the silent video;
calculating the value of a loss function according to the output result and the labeling result of the silent video;
adjusting parameters of the living body model according to the value of the loss function until the value of the loss function converges.
2. The living body model training method according to claim 1, wherein the living body model is a neural network model.
3. The living body model training method according to claim 1, wherein the acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each silent video further comprises:
acquiring a labeling result according to the silent video.
4. The living body model training method according to claim 1, wherein the acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each silent video comprises:
acquiring the silent video, and dividing the silent video into a plurality of intervals;
extracting one frame image in each interval, the frame image being a frame image to be trained;
and traversing all the silent videos to obtain the frame images to be trained.
5. The living body model training method as claimed in any one of claims 1 to 4, wherein the output result of the living body model at least comprises: a face existence score, a living body score, and attack scores.
6. The living body model training method as claimed in claim 5, wherein the attack scores comprise: a screen flip score, a print paper score, a paper cut score, a matte mask score, and a 3D model score.
7. The living body model training method according to claim 5, wherein, when the face existence score in the labeling result of the silent video indicates that no face exists, the loss function is determined by the face existence score in the labeling result of the silent video and the face existence score in the output result.
8. The living body model training method according to claim 5, wherein, when the face existence score in the labeling result of the silent video indicates that a face exists, the loss function is determined by the face existence score, the living body score and the attack scores in the labeling result of the silent video and the face existence score, the living body score and the attack scores in the output result.
9. A face living body detection method is characterized by comprising the following steps:
shooting a silent video of a human face, and extracting a plurality of frame images to be evaluated from the silent video;
inputting the frame images to be evaluated into a preset living body model to obtain an evaluation result; the living body model being obtained by training using the living body model training method according to any one of claims 1 to 8;
and determining the detection result of the silent video according to the evaluation result.
10. The human face living body detection method according to claim 9, wherein the duration of the silent video is 1-3 s.
11. The human face living body detection method according to claim 9, wherein the determining the detection result of the silent video according to the evaluation result comprises:
judging whether the face existence score in the evaluation result is smaller than a preset face threshold;
if the face existence score is smaller than the face threshold, the living body detection of the silent video fails;
if the face existence score is not smaller than the face threshold, judging whether the living body score is the highest score among the living body score and the attack scores;
if the living body score is the highest score, the living body detection of the silent video passes;
and if the living body score is not the highest score, the living body detection of the silent video fails.
12. A training apparatus for a living body model, comprising:
an acquisition unit (1) for acquiring a plurality of silent videos and extracting a plurality of frame images to be trained from each silent video;
a model unit (2) for inputting the plurality of frame images to be trained into a preset living body model to obtain an output result corresponding to the silent video;
a calculation unit (3) for calculating the value of a loss function according to the output result and the labeling result of the silent video;
and an adjusting unit (4) for adjusting the parameters of the living body model according to the value of the loss function until the value of the loss function converges.
13. A human face living body detection device, comprising:
a shooting unit (5) for shooting a silent video of a human face and extracting a plurality of frame images to be evaluated from the silent video;
an evaluation unit (6) for inputting the frame images to be evaluated into a preset living body model to obtain an evaluation result, the living body model being obtained by training using the living body model training method according to any one of claims 1 to 8;
and a judging unit (7) for determining the detection result of the silent video according to the evaluation result.
14. An electronic device comprising a processor and a memory, wherein the memory stores a control program which, when executed by the processor, implements the living body model training method as recited in any one of claims 1 to 8, or implements the human face living body detection method as recited in any one of claims 9 to 11.
15. A computer-readable storage medium storing instructions which, when loaded and executed by a processor, implement the living body model training method as claimed in any one of claims 1 to 8, or implement the human face living body detection method as claimed in any one of claims 9 to 11.
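As an illustrative, non-limiting sketch of the training step defined by claims 1, 7 and 8 (PyTorch, binary cross-entropy terms, and the masking scheme are assumptions of this sketch; the claims only fix which scores enter the loss in each case):

```python
# Sketch of one training step per claims 1, 7 and 8: when the labeling
# result says no face exists, only the face-existence score enters the
# loss; when a face exists, the living-body and attack scores are
# supervised as well. PyTorch and BCE are assumptions; predictions are
# expected in [0, 1] (e.g., from a sigmoid output head).
import torch
import torch.nn.functional as F


def training_step(model, frames, labels, optimizer):
    """frames: (B, T, C, H, W) clip tensor; labels: (B, 7) target scores."""
    preds = model(frames)                 # (B, 7) predicted scores
    has_face = labels[:, 0] > 0.5         # face-existence label per sample
    # The face-existence term is always supervised (claims 7 and 8).
    loss = F.binary_cross_entropy(preds[:, 0], labels[:, 0])
    if has_face.any():
        # Living-body and attack terms only where a face is labeled (claim 8).
        loss = loss + F.binary_cross_entropy(
            preds[has_face, 1:], labels[has_face, 1:])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```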
CN202010012081.2A 2020-01-07 2020-01-07 Living model training and human face living body detection method and device and electronic equipment Active CN111209863B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010012081.2A CN111209863B (en) 2020-01-07 2020-01-07 Living model training and human face living body detection method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111209863A true CN111209863A (en) 2020-05-29
CN111209863B CN111209863B (en) 2023-12-15

Family

ID=70789573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010012081.2A Active CN111209863B (en) 2020-01-07 2020-01-07 Living model training and human face living body detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111209863B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190362171A1 (en) * 2018-05-25 2019-11-28 Beijing Kuangshi Technology Co., Ltd. Living body detection method, electronic device and computer readable medium
CN109670413A (en) * 2018-11-30 2019-04-23 腾讯科技(深圳)有限公司 Face living body verification method and device
CN110298230A (en) * 2019-05-06 2019-10-01 深圳市华付信息技术有限公司 Silent biopsy method, device, computer equipment and storage medium
CN110378219A (en) * 2019-06-13 2019-10-25 北京迈格威科技有限公司 Biopsy method, device, electronic equipment and readable storage medium storing program for executing
CN110427899A (en) * 2019-08-07 2019-11-08 网易(杭州)网络有限公司 Video estimation method and device, medium, electronic equipment based on face segmentation
CN110633647A (en) * 2019-08-21 2019-12-31 阿里巴巴集团控股有限公司 Living body detection method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MOSTAFA PARCHAMI et al.: "Video-Based Face Recognition Using Ensemble of Haar-Like Deep Convolutional Neural Networks", 2017 International Joint Conference on Neural Networks (IJCNN) *
WU Jinghe: "Research on Face Detection and Recognition Methods in Video", China Master's Theses Full-text Database, Information Science and Technology, No. 08, 2019 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884001A (en) * 2021-01-15 2021-06-01 广东省特种设备检测研究院珠海检测院 Carbon steel graphitization automatic grading method and system
CN112884001B (en) * 2021-01-15 2024-03-05 广东省特种设备检测研究院珠海检测院 Automatic grading method and system for graphitization of carbon steel

Also Published As

Publication number Publication date
CN111209863B (en) 2023-12-15

Similar Documents

Publication Publication Date Title
EP3564854B1 (en) Facial expression recognition method, apparatus, electronic device, and storage medium
Korshunov et al. Vulnerability assessment and detection of deepfake videos
CN109359548B (en) Multi-face recognition monitoring method and device, electronic equipment and storage medium
EP3916627A1 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN108805047B (en) Living body detection method and device, electronic equipment and computer readable medium
CN109858371B (en) Face recognition method and device
WO2018166524A1 (en) Face detection method and system, electronic device, program, and medium
CN109657554B (en) Image identification method and device based on micro expression and related equipment
WO2020258667A1 (en) Image recognition method and apparatus, and non-volatile readable storage medium and computer device
CN108549886A (en) A kind of human face in-vivo detection method and device
CN109325933A (en) A kind of reproduction image-recognizing method and device
TW202036463A (en) Living body detection method, device, apparatus, and storage medium
CN110851835A (en) Image model detection method and device, electronic equipment and storage medium
CN111222433B (en) Automatic face auditing method, system, equipment and readable storage medium
CN112464822B (en) Helmet wearing detection method and system based on feature enhancement
CN108805005A (en) Auth method and device, electronic equipment, computer program and storage medium
CN110059607B (en) Living body multiplex detection method, living body multiplex detection device, computer equipment and storage medium
CN109657627A (en) Auth method, device and electronic equipment
CN112633221A (en) Face direction detection method and related device
CN113468954B (en) Face counterfeiting detection method based on local area features under multiple channels
CN112488137A (en) Sample acquisition method and device, electronic equipment and machine-readable storage medium
CN111209863B (en) Living model training and human face living body detection method and device and electronic equipment
CN113723310B (en) Image recognition method and related device based on neural network
CN108764033A (en) Auth method and device, electronic equipment, computer program and storage medium
CN111126283B (en) Rapid living body detection method and system for automatically filtering fuzzy human face

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant