CN110674730A - Monocular-based face silence living body detection method - Google Patents


Info

Publication number
CN110674730A
CN110674730A (application CN201910893676.0A)
Authority
CN
China
Prior art keywords
layer
neural network
convolutional neural
output
convolutional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910893676.0A
Other languages
Chinese (zh)
Inventor
谢巍
周延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201910893676.0A priority Critical patent/CN110674730A/en
Publication of CN110674730A publication Critical patent/CN110674730A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a monocular-based face silence living body detection method, which comprises the following steps: S1, obtaining a training data set and applying data enhancement to it; S2, training the improved convolutional neural network on the images, and saving the convolutional neural network model obtained after training; S3, capturing a single frame containing a face with a camera and invoking the trained model to perform liveness detection, realizing real-time, high-accuracy face liveness recognition. The method builds a multilayer convolutional neural network, trains it on genuine and fake face samples to obtain a classification model, and thereby determines whether an image comes from a live face.

Description

Monocular-based face silence living body detection method
Technical Field
The invention relates to the field of image processing technology, computer vision and pattern recognition, in particular to a monocular-based face silence living body detection method.
Background
With the increasing maturity of image processing and computer vision algorithms, face recognition technology has developed rapidly, and face anti-spoofing has become an important research topic. Living body (liveness) detection is a method for confirming the real physiological presence of a subject in identity-verification scenarios. In face recognition applications, liveness detection can verify whether the user is a real, live person by combining actions such as blinking, mouth opening, head shaking and nodding with technologies such as facial key-point localization and face tracking. It can effectively resist common attacks such as photos, face swapping, masks, occlusion and screen replay, thereby helping users screen out fraudulent behavior and safeguarding their interests. Existing liveness detection methods include:
Silent liveness detection: compared with action-based (dynamic) liveness detection, silent liveness detection requires no user action; the user simply faces the camera naturally for 3 to 4 seconds. Because a real face is never absolutely still, it exhibits micro-movements such as eyelid and eyeball motion, blinking, and slight stretching of the lips and surrounding cheeks, and these cues can be used for anti-spoofing.
Infrared liveness detection: this approach requires an additional infrared camera. Both visible light and infrared light are electromagnetic waves in nature, and the final appearance of the image depends on the reflective properties of the material surface. Real faces and attack media such as paper, screens and 3D masks have different reflectance characteristics and therefore image differently, and the difference is even more pronounced under infrared reflection.
Optical flow method: the temporal variation and correlation of pixel intensities in an image sequence are used to determine the motion at each pixel position, extracting motion information for every pixel from the sequence; a difference-of-Gaussians filter, LBP features and a support vector machine are then used for statistical analysis of the data. The optical flow field is sensitive to object motion, so eye movement and blinking can both be detected with it. This form of liveness detection can be performed unobtrusively, without user cooperation.
3D camera: a face is captured to obtain 3D data of the corresponding face region, which is further analyzed to decide whether the face comes from a living body or a non-living source. Non-living sources are diverse, including photos and videos displayed on media such as phones and tablets, and printed photos on various materials (including bent, folded, cut or hollowed-out variants). The key is to select the most discriminative features from the 3D face data of living and non-living samples to train a classifier, which is then used to distinguish the two.
Disclosure of Invention
In order to solve the above problems, the present invention provides a monocular-based face silence living body detection method. A convolutional neural network is a feed-forward neural network that contains convolution operations and has a deep structure; it can perform translation-invariant classification of input information and is widely used in image recognition, natural language processing, audio processing and other fields. The proposed algorithm comprises three steps: first, a data-rich training set is obtained through data enhancement; then an improved deep neural network consisting of several convolutional layers, pooling layers and BN layers is trained on the images and the model is saved; finally, a single frame containing a face is captured with a camera and the trained model is invoked to perform liveness detection, achieving real-time, high-accuracy face liveness recognition.
The invention is realized by at least one of the following technical schemes.
A monocular-based face silence living body detection method comprises the following steps:
S1, obtaining a training data set and applying data enhancement to it;
S2, training the improved convolutional neural network on the images, and saving the convolutional neural network model obtained after training;
and S3, capturing a single frame containing a face with a camera and performing liveness detection with the convolutional neural network model, realizing real-time, high-accuracy face liveness recognition.
Further, the training set of step S1 is obtained by:
according to the videos in the CASIA-FASD data set, faces are cropped from the images using a Haar classifier, and these images form one part of the training data set; sample pictures of genuine and fake faces shot in different scenes form the other part, and the training images undergo data enhancement consisting of random adjustment of brightness and saturation and random rotation. A genuine face is a real, physical face; a fake face is a face in a photo or a face image displayed on the screen of some device.
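For illustration, the face-cropping step can be sketched in Python with OpenCV's Haar cascade as follows; the 200 × 200 output size follows the embodiment described later, while the cascade file and detector parameters are illustrative assumptions rather than values taken from the patent.

```python
# Sketch of the Haar-cascade face-cropping step (illustrative, not the
# patent's own code). Requires opencv-python.
import cv2

# Frontal-face Haar classifier shipped with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_faces(image_path, out_size=200):
    """Detect faces in one image and return out_size x out_size crops."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    crops = []
    for (x, y, w, h) in faces:
        face = img[y:y + h, x:x + w]
        # Resize to the 200x200x3 input size used by the network.
        crops.append(cv2.resize(face, (out_size, out_size)))
    return crops
```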
Further, the improved convolutional neural network is an improved VGG11 network comprising eight convolutional layers and three fully connected layers (eleven weight layers in total). Each convolutional layer is followed by a ReLU layer (convolutional layer + ReLU layer); every two convolutional layer + ReLU pairs are followed by a max pooling layer and a random deactivation (dropout) layer; the last three dropout layers are each followed by a fully connected layer; each fully connected layer is followed by a ReLU layer; and the last ReLU layer is connected to a softmax layer. In the output of the first two convolutional layers, each convolutional layer is connected to a BN layer, which is connected to a max pooling layer, which in turn is connected to a dropout layer.
Further, the training of the improved VGG11 network is specifically as follows:
1) Batch Normalization is applied to the outputs of the first two convolutional layers so that the intermediate output values of the convolutional neural network remain stable and gradients do not vanish; the batch normalization principle formula is:

\hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}   (1)

wherein x^{(k)} is the k-th dimension of the input, E[x^{(k)}] is the mean of x^{(k)}, and Var[x^{(k)}] is the variance of x^{(k)};
2) dropout is applied to the output of each convolutional layer, so that the activation value of a neuron stops working with a certain probability during forward propagation, preventing overfitting;
3) the learning rate is a decayed learning rate, and the learning rate is used to control the parameter update speed when training the improved convolutional neural network.
Further, the softmax layer is expressed as:

y_j = \frac{e^{v_j}}{\sum_k e^{v_k}}   (2)

where v_j is the j-th output of the layer preceding the last layer of the improved convolutional neural network, j represents the class index, and y_j represents the ratio of the exponential of the current element to the sum of the exponentials of all elements; the layer contains two neurons, corresponding to the probability distribution of the binary classification into genuine-face and fake-face images.
Further, the formulas of the VGG11 network structure adopting dropout are as follows:

r_j^{(l)} \sim \mathrm{Bernoulli}(p)   (3)
\tilde{y}^{(l)} = r^{(l)} \ast y^{(l)}   (4)
z_i^{(l+1)} = w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)}   (5)
y_i^{(l+1)} = f(z_i^{(l+1)})   (6)

wherein z_i^{(l+1)} is the output of layer l+1 of the improved convolutional neural network, y_i^{(l+1)} is its activated output, \tilde{y}^{(l)} is the output of the layer-l neurons after the dropout operation, the Bernoulli function randomly generates a vector r_j^{(l)} of 0s and 1s, y^{(l)} is the output of layer l of the improved convolutional neural network, w_i^{(l+1)} is the weight of layer l+1, b_i^{(l+1)} is the bias of layer l+1, \ast denotes the dropout (element-wise masking) operation, r^{(l)} is a vector of 0s and 1s, i denotes the i-th dimension, and p is the activation probability of a neuron.
Further, the BN (batch normalization) layer normalizes the data to a standard Gaussian distribution with mean 0 and variance 1, as follows:
consider a mini-batch B of size m, B = {x_1, ..., x_m}, where x_i is an element of the batch, and let γ and β be the two parameters to be learned, used to maintain the expressive power of the improved convolutional neural network; the output after the BN layer is y_i = BN_{γ,β}(x_i):

\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i   (7)
\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \mu_B)^2   (8)
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}   (9)
y_i = \gamma\hat{x}_i + \beta   (10)

wherein μ_B is the mini-batch mean, σ_B^2 is the mini-batch variance, x̂_i is the normalized x_i, ε is a constant set to 1, y_i is the output of the BN layer, and BN_{γ,β} is the BN normalization function with parameters γ and β.
Compared with the prior art, the invention has the following beneficial effects: existing technologies are limited to anti-spoofing detection of faces displayed on a screen or of printed faces, whereas the invention can perform liveness detection against both video replays and printed photos, and achieves more accurate liveness detection without requiring user cooperation.
Drawings
Fig. 1 is a flowchart of a monocular-based face silence live detection method according to the present embodiment;
FIG. 2 is a block diagram of a convolutional neural network of the present embodiment;
FIG. 3 is a block diagram of a convolutional neural network without dropout in the present embodiment;
FIG. 4 is a block diagram of a convolutional neural network using dropout in the present embodiment;
fig. 5 is a network structure diagram of the convolutional neural network test of the embodiment.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments.
As shown in fig. 1, a monocular-based face silence live-body detection method includes the following steps:
s1, obtaining a training data set in a data enhancement mode, wherein the specific obtaining process is as follows:
Using the videos in the face anti-spoofing database of the Chinese Academy of Sciences (CASIA-FASD), a cascade classifier is used to crop faces from the images, and these images form one part of the training data set; this embodiment also uses images from the face anti-spoofing database of Nanjing University of Aeronautics and Astronautics. Sample pictures of genuine and fake faces in different real scenarios (actual scenario) are shot as additional training samples, and the training data set undergoes data enhancement through random adjustment of image brightness and saturation and random rotation. The CASIA-FASD data set consists of videos, each containing 100 to 200 frames; 30 frames (equally spaced) are captured from each video.
The face images in the Nanjing University of Aeronautics and Astronautics face anti-spoofing database (NUAA database) can also be used as training data; the images in the NUAA database were shot of different people under different illumination conditions. Random brightness adjustment, random saturation adjustment, random contrast adjustment and random flipping are applied to the face images to increase the generalization capability of the model, as sketched below;
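A minimal sketch of the frame sampling and augmentation just described, assuming OpenCV and NumPy; the jitter ranges are illustrative assumptions (saturation jitter would follow the same pattern in HSV space):

```python
# Sketch: 30 evenly spaced frames per video, then random brightness/
# contrast jitter and horizontal flip (parameter ranges are assumptions).
import cv2
import numpy as np

def sample_frames(video_path, n_frames=30):
    """Grab n_frames at equal intervals from a CASIA-FASD video."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in np.linspace(0, total - 1, n_frames, dtype=int):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames

def augment(img, rng=np.random):
    """Random brightness/contrast jitter and random horizontal flip."""
    img = img.astype(np.float32)
    img *= rng.uniform(0.8, 1.2)                                   # brightness
    img = (img - img.mean()) * rng.uniform(0.8, 1.2) + img.mean()  # contrast
    if rng.rand() < 0.5:
        img = img[:, ::-1]                                         # flip
    return np.clip(img, 0, 255).astype(np.uint8)
```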
s2, training the image by using the improved convolutional neural network, and storing the convolutional neural network model obtained after training;
as shown in fig. 2, the convolutional neural network is a modified VGG11 structure, and the modified VGG11 network is used to classify genuine and fake faces. Based on the original VGG11 network, the improved structure comprises eight convolutional layers and three fully connected layers (eleven weight layers in total): a ReLU layer is added after each convolutional layer (Conv) (convolutional layer + ReLU layer); a max pooling layer (Max pooling) and a random deactivation layer (dropout) are added after every two convolutional layer + ReLU pairs; a fully connected layer is added after each of the last three dropout layers; a linear rectification function (ReLU) layer follows each fully connected layer; and the last ReLU layer is connected to a softmax layer. In the output of the first two convolutional layers, each convolutional layer is connected to a BN layer (batch normalization layer), which is connected to a max pooling layer, which in turn is connected to a dropout layer.
The pooling kernels in the convolutional neural network are 2 × 2 with stride 2. The network comprises an input layer, eight convolutional layers, three fully connected layers and a normalized exponential function (softmax) layer. The first and second convolutional layers contain 64 and 128 convolution kernels respectively, with kernel sizes 7 × 7 and 5 × 5, and each is followed by a 2 × 2 max pooling layer. The third and fourth convolutional layers share weights and each contains 256 kernels of size 3 × 3; the fifth and sixth convolutional layers share weights and each contains 512 kernels of size 3 × 3; the seventh and eighth convolutional layers share weights and each contains 512 kernels of size 3 × 3; the fully connected layers are fully connected to the eighth convolutional layer. The input image is 200 × 200 × 3 pixels with three RGB channels; after preprocessing, it can be fed to the convolutional neural network.
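A PyTorch sketch of this architecture follows, assuming the layer sequence described above. The fully connected widths and the exact ReLU/BN ordering in the first two blocks are assumptions, the dropout rate is derived from the 95% retention rate given later in the text, and the weight sharing between paired layers is omitted for simplicity.

```python
# Sketch of the modified VGG11-style network: eight conv layers, BN only
# after the first two, max-pool + dropout after each block, three FC
# layers; softmax is applied at the loss. Illustrative, not the patent's code.
import torch
import torch.nn as nn

def block(in_ch, out_ch, k, use_bn, n_convs=1, p=0.05):
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, k, padding=k // 2)]
        if use_bn:
            layers += [nn.BatchNorm2d(out_ch)]   # BN only in the first two blocks
        layers += [nn.ReLU(inplace=True)]
    layers += [nn.MaxPool2d(2, 2), nn.Dropout(p)]  # p = 0.05 (95% retention)
    return layers

class LivenessVGG(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            *block(3, 64, 7, use_bn=True),                  # conv1, 64 @ 7x7
            *block(64, 128, 5, use_bn=True),                # conv2, 128 @ 5x5
            *block(128, 256, 3, use_bn=False, n_convs=2),   # conv3-4, 256 @ 3x3
            *block(256, 512, 3, use_bn=False, n_convs=2),   # conv5-6, 512 @ 3x3
            *block(512, 512, 3, use_bn=False, n_convs=2),   # conv7-8, 512 @ 3x3
        )
        # 200x200 input -> five 2x2 poolings -> 6x6 feature maps.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * 6 * 6, 512), nn.ReLU(inplace=True), nn.Dropout(0.05),
            nn.Linear(512, 512), nn.ReLU(inplace=True), nn.Dropout(0.05),
            nn.Linear(512, 2),   # genuine vs. fake face logits
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```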
The convolutional neural network structure adopted by the invention is shown in Table 3: eight convolutional layers, three fully connected layers and one softmax layer; the intermediate activation functions are ReLU, and the pooling layers use max pooling. Training this network on the face images yields a genuine/fake face discrimination model, realizing monocular silent liveness detection.
TABLE 3 network architecture
wherein Conv denotes a convolutional layer, Pool denotes a pooling layer, and Fully connected denotes a fully connected layer.
The last layer is the softmax layer, which is expressed as:

y_j = \frac{e^{v_j}}{\sum_k e^{v_k}}   (2)

where v_j is the j-th output of the layer preceding the last layer of the network, j represents the class index, and y_j represents the ratio of the exponential of the current element to the sum of the exponentials of all elements; the layer contains two neurons, corresponding to the probability distribution of the binary classification into genuine-face and fake-face images.
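As a numerical illustration of equation (2), a small sketch follows (the logits are made-up example values):

```python
# Numerically stable softmax of equation (2): each score v_j is
# exponentiated and divided by the sum over both classes.
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())      # subtract max for numerical stability
    return e / e.sum()

logits = np.array([2.3, -1.1])   # example outputs for [genuine, fake]
print(softmax(logits))           # probabilities summing to 1
```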
The training of the improved VGG11 network is specifically as follows:
1) the outputs of the first two convolutional layers are subjected to Batch Normalization, which normalizes the input data so that the intermediate output values of the convolutional neural network remain stable and gradients do not vanish; the batch normalization principle formula is:

\hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}   (1)

wherein x^{(k)} is the k-th dimension of the input, E[x^{(k)}] is the mean of x^{(k)}, and Var[x^{(k)}] is the variance of x^{(k)}.
2) Dropout is applied to the output of each convolutional layer, so that the activation value of a neuron stops working with a certain probability during forward propagation, preventing overfitting;
3) a decayed learning rate is adopted; the learning rate controls the parameter update speed during training. When the learning rate is small, parameter updates slow down considerably; when it is large, oscillation occurs during the search and the parameters linger around an extremum. Adopting a decayed learning rate alleviates both problems.
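A sketch of such a staircase decay schedule follows; the concrete factors (decay to 90% every 800 steps) are taken from the embodiment described later, while the base learning rate is an illustrative assumption:

```python
# Staircase-decayed learning rate: multiply by decay_rate every decay_steps.
def decayed_lr(base_lr, step, decay_steps=800, decay_rate=0.9):
    """Learning rate after `step` training steps."""
    return base_lr * decay_rate ** (step // decay_steps)

# Large steps early for fast training, small steps later so the global
# optimum is not overshot:
for step in (0, 800, 1600, 8000):
    print(step, decayed_lr(0.01, step))
```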
The random deactivation (dropout) method randomly selects part of the network's nodes to "forget". No model can separate the data 100% perfectly; when anomalous samples of some class appear, the network may learn them as a rule, which also leads to overfitting. Because anomalous data occur with much lower probability than mainstream data, actively ignoring the data of some nodes during each optimization step further reduces the probability of learning from anomalies and thus strengthens the generalization capability of the network.
FIG. 3 shows the workflow of dropout; without dropout, the forward computation is:

z_i^{(l+1)} = w_i^{(l+1)} y^{(l)} + b_i^{(l+1)}   (3)
y_i^{(l+1)} = f(z_i^{(l+1)})   (4)
as shown in fig. 4, the computation of the VGG11 network structure adopting dropout is as follows:

r_j^{(l)} \sim \mathrm{Bernoulli}(p)   (5)
\tilde{y}^{(l)} = r^{(l)} \ast y^{(l)}   (6)
z_i^{(l+1)} = w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)}   (7)
y_i^{(l+1)} = f(z_i^{(l+1)})   (8)

wherein z_i^{(l+1)} is the output of layer l+1 of the improved convolutional neural network, y_i^{(l+1)} is its activated output, \tilde{y}^{(l)} is the output of the layer-l neurons after the dropout operation, the Bernoulli function randomly generates a vector r_j^{(l)} of 0s and 1s, y^{(l)} is the output of layer l of the improved convolutional neural network, w_i^{(l+1)} is the weight of layer l+1, b_i^{(l+1)} is the bias of layer l+1, and p is the activation probability of a neuron.
It is worth noting that dropout is used only during training and does not need to be added at test time. Therefore keep_prob is set to 1 during testing, i.e. the activation rate of the neurons is one hundred percent, meaning no neuron is discarded. The structure of the network under test is shown in fig. 5; compared with the network structure of fig. 2, it lacks the dropout (random deactivation) layers.
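The dropout forward pass of equations (5)-(8) and the test-time behaviour can be sketched as follows. The 1/p rescale during training is not part of equations (5)-(7); it is added here (the inverted-dropout convention) so that no scaling is needed at test time, matching the keep_prob = 1 behaviour described above.

```python
# Sketch of one dropout layer per equations (5)-(8), with a train/test flag.
import numpy as np

def dropout_layer(y, W, b, p=0.95, train=True, rng=np.random):
    """z = W @ (r * y) + b with r_j ~ Bernoulli(p); f = ReLU."""
    if train:
        r = rng.binomial(1, p, size=y.shape)  # eq. (5): r_j ~ Bernoulli(p)
        y_tilde = r * y / p                   # eq. (6), plus inverted-dropout rescale
    else:
        y_tilde = y                           # test time: keep_prob = 1, no mask
    z = W @ y_tilde + b                       # eq. (7)
    return np.maximum(z, 0.0)                 # eq. (8) with f = ReLU
```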
The BN (batch normalization) layer is arranged so that, as far as possible, the outputs of each forward pass follow the same distribution, preventing gradient dispersion. Data passing through the BN layer are normalized to a standard Gaussian distribution with mean 0 and variance 1; the batch normalization principle is as follows.
Consider a mini-batch B of size m, B = {x_1, ..., x_m}, where x_i is an element of the batch, and let γ and β be the two parameters to be learned, used to maintain the expressive power of the model; the output after the BN layer is y_i = BN_{γ,β}(x_i):

\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i   (9)
\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \mu_B)^2   (10)
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}   (11)
y_i = \gamma\hat{x}_i + \beta   (12)

wherein μ_B is the mini-batch mean, σ_B^2 is the mini-batch variance, x̂_i is the normalized x_i, and ε is a constant set to 1 in this embodiment.
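Equations (9)-(12) translate directly into the following sketch; γ and β would be learned parameters in practice, and ε is kept as an argument (defaulted to 1 as in this embodiment, though common practice uses a much smaller value):

```python
# Batch normalization over one mini-batch, per equations (9)-(12).
import numpy as np

def batch_norm(x, gamma, beta, eps=1.0):
    """BN for x of shape (m, features); gamma/beta are learned elsewhere."""
    mu = x.mean(axis=0)                      # eq. (9): mini-batch mean
    var = x.var(axis=0)                      # eq. (10): mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # eq. (11): normalize
    return gamma * x_hat + beta              # eq. (12): scale and shift
```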
After the BN layers are added, a higher learning rate can be used, and dropout can be removed or used at a lower rate, which increases the training speed.
Adding BN layers improves the training speed and the generalization capability of the model to a certain extent, but different placements of the BN layer perform differently. Under identical hardware conditions, experiments were run on the CASIA data set with (a) no BN layer, (b) a BN layer on the output of every convolutional layer, and (c) BN layers on the outputs of only some convolutional layers, and the equal error rate and training time of each network were recorded, as shown in Table 1.

Table 1. Performance of different BN-layer placements on the CASIA-FASD data set

As can be seen from Table 1, the position at which BN layers are added affects the performance of the convolutional neural network model: adding a BN layer to the output of every convolutional layer seriously degrades performance, with a higher equal error rate and slower training than the model without BN layers, whereas adding BN layers to only some of the convolutional layers (the first two) reduces both the equal error rate and the training time.
Dropout improves the generalization capability of the model. The table below shows the behaviour of the three models (a), (b) and (c) after dropout is added between the pooling-layer outputs and the fully connected layers.

Table 2. Performance of the three classes of models after dropout processing

As can be seen from Table 2, the equal error rate is further reduced compared with Table 1, but dropout increases the training time of the convolutional neural network model; in combination with batch normalization, the retention rate can therefore be set high (95%) to mitigate the increase in training time.
The genuine face samples, fake face samples and data-enhanced samples are labelled and then used for training. The loss function is the cross-entropy function (cross entropy); the learning rate is a decayed learning rate, decaying to ninety percent of its value every 800 steps, so that training proceeds quickly with large steps at the beginning and the global optimum is not easily missed with small steps later. The liveness detection method has been successfully integrated into a face recognition system.
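These training details can be sketched as follows (PyTorch, assuming the LivenessVGG sketch above). The optimizer choice, base learning rate and batch handling are assumptions; the cross-entropy loss and the decay to 90% every 800 steps come from the text.

```python
# Training-loop sketch: cross-entropy loss with a x0.9-every-800-steps
# staircase learning-rate decay (illustrative, not the patent's code).
import torch
import torch.nn as nn

model = LivenessVGG()
criterion = nn.CrossEntropyLoss()               # cross-entropy on two classes
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=800, gamma=0.9)

def train(loader, epochs=10):
    model.train()                               # dropout/BN in training mode
    for _ in range(epochs):
        for images, labels in loader:           # images: Nx3x200x200, labels: 0/1
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            scheduler.step()                    # decay every 800 steps
```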
And S3, a single frame containing a face is captured with a camera, and the trained model is invoked to perform liveness detection, realizing real-time, high-accuracy face liveness recognition.
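Step S3 can be sketched as follows; the model file name, the class ordering (index 0 = genuine) and the 0.5 decision threshold are illustrative assumptions, and LivenessVGG is the sketch shown earlier.

```python
# Inference sketch: grab one camera frame, crop the face with the Haar
# cascade, classify with the saved model (illustrative assumptions).
import cv2
import torch

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

model = LivenessVGG()
model.load_state_dict(torch.load("liveness_vgg.pth"))  # assumed file name
model.eval()                                   # dropout off (keep_prob = 1)

cap = cv2.VideoCapture(0)                      # monocular camera
ok, frame = cap.read()                         # capture a single frame
cap.release()

if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        face = cv2.resize(frame[y:y + h, x:x + w], (200, 200))
        face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
        t = torch.from_numpy(face).permute(2, 0, 1).float().unsqueeze(0) / 255
        with torch.no_grad():
            prob_real = torch.softmax(model(t), dim=1)[0, 0].item()
        print("live" if prob_real > 0.5 else "spoof", prob_real)
```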
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (7)

1. A monocular-based face silence living body detection method is characterized by comprising the following steps:
s1, obtaining a training data set and applying data enhancement to it;
s2, training the improved convolutional neural network on the images, and saving the convolutional neural network model obtained after training;
and S3, capturing a single frame containing a face with a camera, and performing living body detection with the convolutional neural network model to realize real-time face living body identification.
2. The monocular based face silence live detection method of claim 1, wherein the training set of step S1 is obtained by:
according to the videos in the CASIA-FASD data set, faces are cropped from the images using a Haar classifier, and these images form one part of the training data set; sample pictures of genuine and fake faces shot in different scenes form the other part, and the training data set undergoes data enhancement through random adjustment of image brightness, contrast and saturation and random rotation.
3. The monocular-based face silence live detection method of claim 1, wherein the improved convolutional neural network is an improved VGG11 network comprising eight convolutional layers and three fully connected layers (eleven weight layers in total); each convolutional layer is followed by a ReLU layer, namely convolutional layer + ReLU layer; every two convolutional layer + ReLU pairs are followed by a max pooling layer and a random deactivation layer, namely dropout; the last three dropout layers are each followed by a fully connected layer; each fully connected layer is followed by a ReLU layer; and the last ReLU layer is connected to a softmax layer; in the output of the first two convolutional layers, each convolutional layer is connected to a BN (Batch Normalization) layer, which is connected to a max pooling layer, which in turn is connected to a random deactivation layer.
4. The monocular based face silence live detection method of claim 3, wherein the training of the improved VGG11 network is specifically as follows:
1) performing Batch Normalization on the outputs of the first two convolutional layers so that the intermediate output values of the convolutional neural network remain stable and gradients do not vanish, the batch normalization principle formula being:

\hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}   (1)

wherein x^{(k)} is the k-th dimension of the input, E[x^{(k)}] is the mean of x^{(k)}, and Var[x^{(k)}] is the variance of x^{(k)};
2) applying dropout to the output of each convolutional layer, so that the activation value of a neuron stops working with a certain probability during forward propagation, preventing overfitting;
3) the learning rate is a decayed learning rate, and the learning rate is used to control the parameter update speed when training the improved convolutional neural network.
5. The monocular-based face silence live detection method of claim 3, wherein the softmax layer is expressed as:

y_j = \frac{e^{v_j}}{\sum_k e^{v_k}}   (2)

wherein v_j is the j-th output of the layer preceding the last layer of the convolutional neural network, j represents the class index, y_j represents the ratio of the exponential of the current element to the sum of the exponentials of all elements, and the layer contains two neurons corresponding to the probability distribution of the binary classification into genuine-face and fake-face images.
6. The monocular-based face silence living body detection method of claim 3, wherein the formulas of the VGG11 network structure adopting dropout are as follows:

r_j^{(l)} \sim \mathrm{Bernoulli}(p)   (3)
\tilde{y}^{(l)} = r^{(l)} \ast y^{(l)}   (4)
z_i^{(l+1)} = w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)}   (5)
y_i^{(l+1)} = f(z_i^{(l+1)})   (6)

wherein z_i^{(l+1)} is the output of layer l+1 of the improved convolutional neural network, y_i^{(l+1)} is its activated output, \tilde{y}^{(l)} is the output of the layer-l neurons after the dropout operation, the Bernoulli function randomly generates a vector r_j^{(l)} of 0s and 1s, y^{(l)} is the output of layer l of the improved convolutional neural network, w_i^{(l+1)} is the weight of layer l+1, b_i^{(l+1)} is the bias of layer l+1; \ast represents the dropout processing operation, r^{(l)} is a vector of 0s and 1s, i denotes the i-th dimension, and p is the activation probability of a neuron.
7. The monocular-based face silence live detection method of claim 3, wherein the BN layer normalizes the data to a standard Gaussian distribution with mean 0 and variance 1, as follows:
consider a mini-batch B of size m, B = {x_1, ..., x_m}, where x_i is an element of the batch, and let γ and β be the two parameters to be learned, used to maintain the expressive power of the improved convolutional neural network; the output after the BN layer is y_i = BN_{γ,β}(x_i):

\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i   (7)
\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \mu_B)^2   (8)
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}   (9)
y_i = \gamma\hat{x}_i + \beta   (10)

wherein μ_B is the mini-batch mean, σ_B^2 is the mini-batch variance, x̂_i is the normalized x_i, ε is a constant set to 1, y_i is the output of the BN layer, and BN_{γ,β} is the BN layer normalization function with parameters γ and β.
CN201910893676.0A 2019-09-20 2019-09-20 Monocular-based face silence living body detection method Pending CN110674730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910893676.0A CN110674730A (en) 2019-09-20 2019-09-20 Monocular-based face silence living body detection method


Publications (1)

Publication Number Publication Date
CN110674730A true CN110674730A (en) 2020-01-10

Family

ID=69077027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910893676.0A Pending CN110674730A (en) 2019-09-20 2019-09-20 Monocular-based face silence living body detection method

Country Status (1)

Country Link
CN (1) CN110674730A (en)



Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066942A (en) * 2017-03-03 2017-08-18 上海斐讯数据通信技术有限公司 A kind of living body faces recognition methods and system
CN107194376A (en) * 2017-06-21 2017-09-22 北京市威富安防科技有限公司 Mask fraud convolutional neural networks training method and human face in-vivo detection method
CN107220635A (en) * 2017-06-21 2017-09-29 北京市威富安防科技有限公司 Human face in-vivo detection method based on many fraud modes
CN107292267A (en) * 2017-06-21 2017-10-24 北京市威富安防科技有限公司 Photo fraud convolutional neural networks training method and human face in-vivo detection method
CN107301396A (en) * 2017-06-21 2017-10-27 北京市威富安防科技有限公司 Video fraud convolutional neural networks training method and human face in-vivo detection method
CN107944416A (en) * 2017-12-06 2018-04-20 成都睿码科技有限责任公司 A kind of method that true man's verification is carried out by video
CN108549854A (en) * 2018-03-28 2018-09-18 中科博宏(北京)科技有限公司 A kind of human face in-vivo detection method
CN109886087A (en) * 2019-01-04 2019-06-14 平安科技(深圳)有限公司 A kind of biopsy method neural network based and terminal device
CN109886121A (en) * 2019-01-23 2019-06-14 浙江大学 A kind of face key independent positioning method blocking robust

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
M. Grochowski et al., "Selected technical issues of deep neural networks for image classification purposes", Bulletin of the Polish Academy of Sciences: Technical Sciences *
Shuren Zhou et al., "Improved VGG Model for Road Traffic Sign Recognition", Computers, Materials & Continua *
Tong Yueyang, "Research on Live Face Detection Algorithms Based on Convolutional Neural Networks", China Master's Theses Full-text Database, Information Science and Technology Series *
Jiang Xinkui, "Research on Liveness Recognition Methods Combined with Face Detection", China Master's Theses Full-text Database, Information Science and Technology Series *
Zhang Huichu et al., "Artificial Intelligence in Practice: Build Your Own AI", Shanghai Science and Technology Education Press, 31 August 2019 *
Hu Yubing, "Research on Tomato Disease Identification Based on Convolutional Neural Networks", China Master's Theses Full-text Database, Agricultural Science and Technology Series *
Xu Xiao, "Research on Live Face Detection Algorithms Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology Series *
Chen Yingyi et al., "Fish Species Recognition Method Based on the FTVGG16 Convolutional Neural Network", Transactions of the Chinese Society for Agricultural Machinery *
Long Min et al., "Research on Face Liveness Detection Algorithms Using Convolutional Neural Networks", Journal of Frontiers of Computer Science and Technology *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368731A (en) * 2020-03-04 2020-07-03 上海东普信息科技有限公司 Silent in-vivo detection method, silent in-vivo detection device, silent in-vivo detection equipment and storage medium
CN111368731B (en) * 2020-03-04 2023-06-09 上海东普信息科技有限公司 Silence living body detection method, silence living body detection device, silence living body detection equipment and storage medium
CN112001240A (en) * 2020-07-15 2020-11-27 浙江大华技术股份有限公司 Living body detection method, living body detection device, computer equipment and storage medium
CN112464864A (en) * 2020-12-08 2021-03-09 上海交通大学 Face living body detection method based on tree-shaped neural network structure
CN112818782A (en) * 2021-01-22 2021-05-18 电子科技大学 Generalized silence living body detection method based on medium sensing
CN112906508A (en) * 2021-02-01 2021-06-04 四川观想科技股份有限公司 Face living body detection method based on convolutional neural network
CN112906508B (en) * 2021-02-01 2024-05-28 四川观想科技股份有限公司 Face living body detection method based on convolutional neural network
CN113033487A (en) * 2021-04-21 2021-06-25 南方电网科学研究院有限责任公司 Unstructured data detection method based on sensitive data fingerprint feature library
CN115439691A (en) * 2022-09-05 2022-12-06 哈尔滨市科佳通用机电股份有限公司 TVDS fault automatic identification system
CN115439691B (en) * 2022-09-05 2023-04-21 哈尔滨市科佳通用机电股份有限公司 TVDS fault automatic identification system

Similar Documents

Publication Publication Date Title
CN110674730A (en) Monocular-based face silence living body detection method
CN106951867B (en) Face identification method, device, system and equipment based on convolutional neural networks
Han et al. Two-stage learning to predict human eye fixations via SDAEs
Chakka et al. Competition on counter measures to 2-d facial spoofing attacks
US8995725B2 (en) On-site composition and aesthetics feedback through exemplars for photographers
Negi et al. Face mask detection classifier and model pruning with keras-surgeon
Do et al. Deep neural network-based fusion model for emotion recognition using visual data
CN108563999A (en) A kind of piece identity's recognition methods and device towards low quality video image
WO2021196721A1 (en) Cabin interior environment adjustment method and apparatus
Jiang et al. Multilevel fusing paired visible light and near-infrared spectral images for face anti-spoofing
Damer et al. Deep learning-based face recognition and the robustness to perspective distortion
Afifi et al. Can we boost the power of the Viola–Jones face detector using preprocessing? An empirical study
CN112464864A (en) Face living body detection method based on tree-shaped neural network structure
Hashemi A survey of visual attention models
Mr et al. Developing a novel technique to match composite sketches with images captured by unmanned aerial vehicle
Bonetto et al. Image processing issues in a social assistive system for the blind
Singla et al. Age and gender detection using Deep Learning
Unnikrishnan et al. Texture-based estimation of age and gender from wild conditions
Wang et al. A Learning Analytics Model Based on Expression Recognition and Affective Computing: Review of Techniques and Survey of Acceptance
Cöster et al. Human Attention: The possibility of measuring human attention using OpenCV and the Viola-Jones face detection algorithm
CN112906668B (en) Face information identification method based on convolutional neural network
TWI844284B (en) Method and electrical device for training cross-domain classifier
Tupe et al. Diabetic retinopathy detection using image processing techniques: a study
Una et al. Classification technique for face-spoof detection in artificial neural networks using concepts of machine learning
Vinay et al. Unconstrained face recognition using ASURF and cloud-forest classifier optimized with VLAD

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110