CN112183213A - Facial expression recognition method based on Intra-Class Gap GAN - Google Patents
Facial expression recognition method based on Intra-Class Gap GAN
- Publication number: CN112183213A
- Application number: CN202010905875.1A
- Authority
- CN
- China
- Prior art keywords
- output
- convolution
- facial expression
- image
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V40/174 — Facial expression recognition
- G06V40/161 — Human faces: detection; localisation; normalisation
- G06V40/168 — Human faces: feature extraction; face representation
- G06V40/172 — Human faces: classification, e.g. identification
- G06F18/214 — Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Neural networks: combinations of networks
Abstract
A facial expression recognition method based on an Intra-Class Gap GAN, in which a recognition model is constructed through the following steps: (1) acquire real-time facial images of different sources and different expressions; (2) input the images into the Intra-Class Gap GAN neural network model for recognition; (3) output the recognition result. Compared with traditional methods that extract expression features by hand, this generative-adversarial method extracts facial expression features automatically; compared with earlier neural-network approaches to facial expression recognition, it improves the recognition rate and therefore recognizes expressions more accurately.
Description
Technical Field
The invention relates to the fields of image processing and deep learning for facial expression recognition, and in particular to a facial expression recognition method based on a generative adversarial network.
Background
China's huge floating population places great pressure on urban infrastructure and public services. With frequent violent incidents in recent years and growing concern over public security, urban management and service systems lag seriously behind and urgently need improvement; strengthening urban surveillance and recognizing the facial expressions of potential offenders has therefore become increasingly important. An expression is an emotional state conveyed by changes in the facial muscles. By recognizing facial expressions, abnormal psychological states can be judged and extreme emotions inferred; observing the facial expressions of pedestrians in complex environments provides technical support for further assessing their state of mind, roughly identifying suspicious persons, and stopping certain criminal activities in time. Traditional facial expression recognition relies mainly on template matching and neural networks. Moreover, traditional methods require human intervention in feature selection: the feature-extraction algorithm must be finely hand-crafted, sufficient computing power is lacking, training is difficult, accuracy is low, and original expression information is easily lost.
Disclosure of Invention
The purpose of the invention is as follows:
In view of the intra-class differences of facial expression recognition under real-world conditions, and of the technical problems that security inspection in complex environments is difficult and that intra-class differences prevent the required facial expression recognition rate from being met, a facial expression recognition method based on a generative adversarial network is provided.
The technical scheme is as follows:
a facial expression recognition method based on Intra-Class Gap GAN,
the identification model construction comprises the following steps:
(1) acquiring real-time images of different sources and different expressions of the human face;
(2) inputting the image into an Intra-Class Gap GAN neural network model for identification;
(3) outputting the identification result;
the method for constructing the Intra-Class Gap GAN neural network model in the step (2) is as follows:
(2.1) acquiring historical images of different sources and different expressions of the human face;
(2.2) preprocessing the collected face image to construct a facial expression data set;
(2.3) constructing an Intra-Class Gap GAN neural network model aiming at the problem of facial expression recognition of intra-Class differences in the data set in the step (2.2);
(2.4) training the generator and the discriminator of the network simultaneously by combining the pixel difference and the latent-vector difference between the input image and the reconstructed image, ensuring that the difference between the reconstructed image and the input image is minimized.
(2.2) the method for constructing the human face expression data set in the step is as follows:
S11: based on the Multi-PIE and JAFFE expression data sets, and on facial expression pictures downloaded from the network in step (2.1), a self-made facial expression data set is built. Five facial expressions — disgust, happy, neutral, anxious, and surprise-and-fear — of people of different countries, age groups, and professions are selected for the experiment, adding a large number of facial expression features with intra-class differences and increasing the complexity of the data set, which serves as the input image x for network training;
S12: geometrically normalizing the input image, and carrying out face detection on the normalized image;
s13: the images after the processing in step S12 are scale-normalized to unify the sizes of the images.
The step (2.4) is specifically as follows:
S14: training a facial expression recognition network model based on the proposed generative adversarial IC-GAN (Intra-Class Gap GAN) neural network, using the images processed in step S13;
s15: carrying out data enhancement and data expansion processing on the image;
S16: training the network model and storing the trained network model.
The step S12 includes the following steps:
S121: determining feature points [x, y] from the collected image and calibrating the feature points of the two eyes and the nose to obtain their coordinate values;
S122: rotating the image according to the coordinates of the eyes on the face to keep the face direction consistent, where the distance between the eyes is denoted d and the midpoint of the two eyes is denoted O;
S123: determining a frame containing the face from the calibrated feature points and the geometric model, cropping a distance d to the left and to the right of O, and cropping 0.5d upward and 1.5d downward.
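The alignment and cropping geometry of steps S121–S123 can be sketched as follows. This is a minimal NumPy sketch, in which `align_and_crop_box` is a hypothetical helper (not named in the patent) that returns the rotation angle that levels the eyes and the crop box of d to each side of the eye midpoint O, with 0.5d above and 1.5d below, assuming image coordinates with y increasing downward:

```python
import numpy as np

def align_and_crop_box(left_eye, right_eye):
    """Given the two calibrated eye coordinates, return the rotation angle
    (degrees) that levels the eyes and the face crop box per S121-S123."""
    left_eye = np.asarray(left_eye, dtype=float)
    right_eye = np.asarray(right_eye, dtype=float)
    delta = right_eye - left_eye
    # Angle of the eye line; rotating the image by -angle makes the eyes level.
    angle = np.degrees(np.arctan2(delta[1], delta[0]))
    d = np.linalg.norm(delta)            # inter-ocular distance d
    O = (left_eye + right_eye) / 2.0     # midpoint O of the two eyes
    x0, x1 = O[0] - d, O[0] + d          # distance d to the left and right of O
    y0, y1 = O[1] - 0.5 * d, O[1] + 1.5 * d  # 0.5d upward, 1.5d downward
    return angle, (x0, y0, x1, y1)
```

The resulting box is 2d × 2d, which is then rescaled to 256 × 256 in step S131.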
The step S13 includes the following steps:
S131: scale-normalizing the images cropped in step S123 and resizing them uniformly to 256 × 256 pixels, completing the geometric normalization of the images.
The step S14 includes the following steps:
S141: constructing the proposed IC-GAN (Intra-Class Gap GAN) neural network with the PyTorch deep learning framework. First, the picture processed in step S13 is input into the first convolution layer, where the input image is convolved with a 4 × 4 convolution kernel; the output is 128 × 64. A LeakyReLU activation function then performs a nonlinear operation on the convolution output, giving an output of 128 × 64. The LeakyReLU activation function is f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where α is a small positive slope;
S142: continuing to convolve the output of the previous layer with a 4 × 4 kernel, with output 64 × 128; then normalizing the output with a batchnorm layer and applying a LeakyReLU nonlinear operation, with output 64 × 128;
S143: continuing to perform convolution, batchnorm and LeakyReLu operations on the output of the previous layer by using the method in the step S142, wherein the output is 4 × 100;
S144: performing a transposed (reverse) convolution with a 4 × 4 kernel on the output of S143 to obtain an output of 29 × 1, applying batch normalization with batchnorm, and applying a ReLU activation to obtain an output of 32 × 128. The ReLU activation function is f(x) = max(0, x);
S145: performing the convolution, batchnorm and ReLU operations of step S144 again on the output of the previous layer, outputting 64 × 64;
S146: applying a ReLU nonlinear operation to the output of the previous layer, performing a reverse convolution with a 4 × 4 kernel, and applying a Tanh activation, outputting 128 × 128. The Tanh activation function is tanh(x) = (e^x − e^{−x}) / (e^x + e^{−x});
s147: performing the operation in the S141-S143 process again on the output of the previous layer, wherein the output is 1 x 5;
s148: inputting the image subjected to scale normalization in the step S13 and the output of the step S147 into a 4 × 4 convolution layer, performing convolution operation, and then performing nonlinear activation by using a nonlinear activation function LeakyReLu to output 128 × 64;
s149: performing convolution operation on the output of the previous layer by using 4 × 4 convolution kernel, performing batch normalization operation by using batchnorm, and performing LeakyReLu nonlinear activation;
s1491: continuing to perform convolution, batchnorm and nonlinear operation on the output of the previous layer by adopting the process of S142, wherein the output is 4 x 1;
S1492: finally, applying Softmax to the output of the previous layer and outputting the probability that the input is judged to be real;
S1493: performing a fully connected operation on the output of the S147 process, and finally training the 5 expressions with a Softmax classifier, where the 5 expressions are 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, 5 = surprise-and-fear, thereby realizing facial expression recognition.
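The activation functions named in steps S141–S146 can be written out directly. A small NumPy sketch follows; the LeakyReLU slope α = 0.2 is an assumed value (the patent does not state it):

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    """LeakyReLU: passes positive values, scales negatives by a small slope.
    alpha=0.2 is a common GAN choice; the patent leaves it unspecified."""
    return np.where(x > 0, x, alpha * x)

def relu(x):
    """ReLU: f(x) = max(0, x), used in the decoder layers S144-S146."""
    return np.maximum(0.0, x)

def tanh(x):
    """Tanh: (e^x - e^-x) / (e^x + e^-x), used for the final generator output."""
    return np.tanh(x)
```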
Step S15 includes
S151: dividing the network loss function into four parts, and reducing the difference between an original image and a reconstructed image on a pixel level for the generated network of the first part, wherein the reconstruction error loss is as follows:
Lcon=Ex~pX||x-G(x)||1;
pX represents data allocation; x is the input image g (X) is the image generated by the generator in the network;
The feature matching method proposed by Salimans et al. is used to reduce training instability and optimize at the image feature level; the feature matching error of the second part, the discriminator network, is:

L_adv = E_{x~pX} ||f(x) − f(G(x))||_2

where f(·) denotes the discriminator model transformation;
The third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which prevents facial-expression information from being disturbed by picture-independent information during network decoding:

L_p = E_{x~pX} ||z − ẑ||_2, with z = h(x) and ẑ = h(G(x))

where h(·) denotes the encoding transformation;
The network loss of the fourth part is the cross-entropy loss of the Softmax layer:

L_s = k(y, ŷ)

where k(·) denotes the Softmax cross-entropy loss, y the true label, and ŷ the recognition result;
The overall network loss function is:

L = ω_adv·L_adv + ω_con·L_con + ω_p·L_p + ω_s·L_s

where ω_adv, ω_con, ω_p and ω_s are parameters that weight the individual losses;
S152: the Adam optimizer is selected, the learning rate is set to 0.0002, and the training samples are trained in batches of 16 pictures, with the number of epochs set to 100, 200, 300 and 400 respectively;
S153: in each round of training, 1 epoch of pictures is first obtained, the loss value is then calculated, and the Adam optimizer continually updates the network parameters to minimize the network loss.
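The four-part loss of S151 and its weighted sum can be sketched as follows. This is a NumPy sketch under stated assumptions: `f` and `h` stand in for the discriminator feature transform and the encoder, and the ω weights are illustrative placeholders, since the patent leaves their values unspecified:

```python
import numpy as np

def total_loss(x, Gx, f, h, y_true_onehot, y_pred_prob,
               w_adv=1.0, w_con=50.0, w_p=1.0, w_s=1.0):
    """Weighted sum L = w_adv*L_adv + w_con*L_con + w_p*L_p + w_s*L_s.
    The weight values here are assumptions, not taken from the patent."""
    L_con = np.abs(x - Gx).mean()            # pixel-level L1 reconstruction loss
    L_adv = np.square(f(x) - f(Gx)).mean()   # discriminator feature-matching loss
    L_p = np.square(h(x) - h(Gx)).mean()     # latent-vector (encoding) loss
    # Softmax cross entropy between the true one-hot label and the prediction.
    L_s = -(y_true_onehot * np.log(y_pred_prob + 1e-12)).sum()
    return w_adv * L_adv + w_con * L_con + w_p * L_p + w_s * L_s
```

When the reconstruction is perfect (Gx = x) every term but the cross entropy vanishes, matching the training criterion of step (2.4).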
In step (3), the pictures are input into the trained IC-GAN network model for recognition, the probability of each facial expression is output, and the expression category with the maximum output probability is the classification result. The probability calculation formula is:

z_i = Σ_j ω_ij·s_j + b,    y_i = e^{z_i} / Σ_k e^{z_k}

where z_i denotes the ith output of the network, ω_ij is the jth weight of the ith neuron, b is the bias, s_j is the output of the jth neuron of the previous layer, and y_i is the ith output value of the Softmax layer.
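The Softmax probability computation of step (3) can be sketched as below; a numerically stable NumPy version, with `classify` a hypothetical wrapper returning the class of maximum probability:

```python
import numpy as np

def softmax(z):
    """y_i = e^{z_i} / sum_k e^{z_k}, shifted by max(z) for numerical stability."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(z):
    """Return (class probabilities, index of the most probable expression)."""
    p = softmax(z)
    return p, int(np.argmax(p))
```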
The advantages and effects are as follows:
the invention designs a facial expression recognition method based on generation confrontation, which comprises a network training process and an off-line recognition process of facial expression recognition with intra-class difference; the offline identification process should include the following steps:
S11: collecting an input image x by downloading from the network, frame-skipping, and parsing video;
s12: geometrically normalizing the input image x and detecting an image x' after normalization processing;
s13: processing the detected and cut image x' to a uniform size;
s14: constructing a network model based on the generated confrontation facial expression recognition;
s15: carrying out data enhancement and data expansion processing on the image x' and unifying the image size;
s16: training the network model and storing the trained network model;
for the identification process, the following steps should be included:
S21: collecting an input image I by downloading from the network, frame-skipping, and parsing video;
S22: inputting the image I into the trained network model;
s23: and obtaining a recognition result.
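The recognition steps S21–S23 amount to preprocess → forward pass → argmax. In the hedged Python sketch below, `preprocess`, `model`, and the `EXPRESSIONS` label order (following the 1–5 numbering of S1493, with "disgust" and "anxious" as readings of the translated class names) are illustrative placeholders rather than names from the patent:

```python
import numpy as np

# Assumed label order per S1493 (1=happy ... 5=surprise-and-fear).
EXPRESSIONS = ["happy", "disgust", "neutral", "anxious", "surprise-and-fear"]

def recognize(image, model, preprocess):
    """S21-S23: preprocess the collected image, run the trained model on it,
    and return the expression label with the maximum output probability."""
    x = preprocess(image)   # geometric + scale normalization to 256 x 256
    probs = model(x)        # trained network outputs one probability per class
    return EXPRESSIONS[int(np.argmax(probs))]
```

With a real system, `model` would wrap the trained IC-GAN classifier and `preprocess` the normalization pipeline of S12–S13.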
The step S12 further includes the following steps:
S121: performing geometric normalization on the input image; the geometric normalization includes scale normalization, head-pose correction and face-tilt correction;
s122: performing face detection on the image after geometric normalization by using a face detection method in an OpenCV open source library, and then performing noise reduction on the detected image;
S123: obtaining the geometrically normalized image x'.
The step S13 further includes:
s131: determining the position of an image according to the coordinates of the face;
s132: using OpenCV to detect and obtain a face image;
s133: and adjusting the cut face image into a uniform size, and changing the cut face image into 256 × 256 size.
Further, step S14 should also include: S141: building the IC-GAN neural network with the PyTorch deep learning framework. First the picture is input into the conv_1 layer for convolution: the input image is convolved with a 4 × 4 kernel, and the output is 128 × 64; a LeakyReLU activation then performs a nonlinear operation, with output 128 × 64. The LeakyReLU activation function is f(x) = x for x > 0 and f(x) = αx for x ≤ 0;
S142: continuing to convolve the output of the previous layer with a 4 × 4 kernel, with output 64 × 128; then normalizing the output with the batchnorm layer and applying a LeakyReLU nonlinear operation, with output 64 × 128;
s143: continuing to perform convolution, batchnorm and LeakyReLu operations on the output of the previous layer by using the method of S142, wherein the output is 4 x 100;
s144: performing reverse convolution operation of a convolution kernel 4 × 4 on the output of the S143 to obtain an output of 29 × 1, performing batch normalization operation by using a batchnorm, and performing nonlinear operation on the output by using a ReLu activation function to obtain an output of 32 × 128;
S145: performing the convolution, batchnorm and ReLU operations described in S144 again on the output of the previous layer, outputting 64 × 64;
s146: performing nonlinear operation on the output of the previous layer by using a ReLu activation function, performing convolution operation on the previous layer by using reverse convolution with a convolution kernel of 4 × 4, and performing nonlinear operation by using a Tanh activation function to output 128 × 128;
s147: performing the operation in the S141-S143 process again on the output of the previous layer, wherein the output is 1 x 5;
s148: inputting the original image and the output of S147 into a 4 × 4 convolution layer, performing convolution operation, and then performing nonlinear activation by using a nonlinear activation function LeakyReLu, wherein the output is 128 × 64;
s149: performing convolution operation on the output of the previous layer by using 4 × 4 convolution kernel, performing batch normalization operation by using batchnorm, and performing LeakyReLu nonlinear activation;
S1491: continuing the convolution, batchnorm and nonlinear operations of the S142 process on the output of the previous layer, with output 4 × 1;
S1492: finally, applying Softmax to the output of the previous layer and outputting the probability that the input is judged to be real.
S1493: performing full-connection operation on the output of the S147 process, and finally realizing training of 5 expressions through a Softmax classifier, where the 5 expressions are 1 ═ happy, 2 ═ inhibition, 3 ═ neutral, 4 ═ excitation, and 5 ═ surpride and fear, so as to realize recognition of facial expressions;
Step S15 should also include: S151: according to the network structure and experimental characteristics, the network loss is likewise divided into four parts. For the generator network of the first part, the difference between the original image and the reconstructed image is reduced at the pixel level, and the reconstruction error loss is:

L_con = E_{x~pX} ||x − G(x)||_1;
the feature matching method proposed by Salimans et al is used herein to reduce training instability, the image feature level is optimized, and one feature matching error of the discriminator of the second partial network is:
Ladv=Ex~pX||f(x)-f(G(x))||2
where f (-) represents the discriminator model transformation.
The third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which prevents facial-expression information from being disturbed by picture-independent information during network decoding:

L_p = E_{x~pX} ||z − ẑ||_2, with z = h(x) and ẑ = h(G(x))

where h(·) denotes the encoding transformation.
The network loss of the fourth part is the cross-entropy loss of the Softmax layer:

L_s = k(y, ŷ)

where k(·) denotes the Softmax cross-entropy loss, y the true label, and ŷ the recognition result.
The overall network loss function is:

L = ω_adv·L_adv + ω_con·L_con + ω_p·L_p + ω_s·L_s

where ω_adv, ω_con, ω_p and ω_s are parameters that weight the individual losses.
S152: the Adam optimizer is selected, the learning rate is set to 0.0002, and the training samples are trained in batches of 16 pictures, with the number of epochs set to 100, 200, 300 and 400 respectively.
S153: in each round of training, 1 epoch of pictures is first obtained, the loss value is then calculated, and the Adam optimizer continually updates the network parameters to minimize the network loss.
Further, step S16 also includes: S161: collecting input images by downloading from the network, frame-skipping, and parsing video;
S162: performing geometric normalization, face detection, OpenCV processing and size unification on the input image;
S163: inputting the processed image into the trained IC-GAN network model for recognition and finally outputting the probability of each expression; the expression with the maximum probability is the expression recognized by the network.
Compared with the prior art, the invention has the advantages that:
compared with the traditional method for manually extracting expression characteristics, the facial expression recognition method based on the generation countermeasure realizes the automatic extraction of the facial expression characteristics, and compared with the slightly early neural network facial expression recognition, the facial expression recognition method based on the generation countermeasure realizes the improvement of the recognition rate, thereby accurately recognizing the expression.
Drawings
To illustrate the embodiments of the invention or the prior-art solutions more clearly, the figures needed to describe the embodiments or the prior art are briefly introduced below. The following figures show some embodiments of the invention; those skilled in the field can derive other figures from them.
FIG. 1 is an overall flow chart of the present invention.
FIG. 2 is a schematic diagram of the IC-GAN network model of the present invention.
Detailed Description
A facial expression recognition method based on Intra-Class Gap GAN,
the identification model construction comprises the following steps:
(1) acquiring real-time images of different sources and different expressions of the human face;
(2) inputting the image into an Intra-Class Gap GAN neural network model for identification;
(3) outputting the identification result;
the method for constructing the Intra-Class Gap GAN neural network model in the step (2) is as follows:
(2.1) acquiring historical images of different sources and different expressions of the human face;
(2.2) preprocessing the collected face image to construct a facial expression data set;
(2.3) constructing an Intra-Class Gap GAN neural network model for the facial recognition problem in which the data set of step (2.2) exhibits intra-class differences (differences within the same expression class are called intra-class differences: the same expression can take different forms, and a collected image can be affected by the external environment, occlusions and the shooting angle, so that, for example, the same smiling expression may show particularly large feature differences in a complex surrounding environment and be misrecognized as another class of expression, which ultimately affects recognition accuracy);
(2.4) training the generator and the discriminator of the network simultaneously by combining the pixel differences and the latent-vector differences between the input image (the training sample fed to the network during training) and the reconstructed image (an image generated during training and matched against the original; when the reconstructed image shows no difference from the input image, the network is considered trained to extract image features correctly), ensuring that the difference between the reconstructed image and the input image is minimized. (During training, the original input picture is compared with the picture generated by the network; when the generated picture is consistent with the input picture, the trained network has the strongest recognition capability.)
(2.2) the method for constructing the human face expression data set in the step is as follows:
S11: based on the Multi-PIE and JAFFE expression data sets, and on facial expression pictures downloaded from the network in step (2.1), the self-made facial expression data set required herein is built (sample expansion). Five facial expressions — disgust, happy, neutral, anxious, and surprise-and-fear — of people of different countries, age groups and professions are selected for the experiment, and a large number of similar expressions with large intra-class differences are added. (Forms of one expression class, such as smiles and laughter, presented by the same person under the same background environment are called intra-class; whenever these conditions are not met — different backgrounds, different expression forms, or different persons — intra-class differences exist or are large.) The resulting complexity of facial expression features makes this data set the input image x for network training;
S12: geometrically normalizing the input image and performing face detection on the normalized image (obtaining a suitable face image as in claim 3, obtaining sample data suitable for network training by processing, such as possibly requiring rotation to ensure consistency of face direction, etc.);
s13: the images after the processing in the step S12 are scale-normalized, unifying the sizes of the images (S12 and S13 are preprocessing procedures).
The step (2.4) is specifically as follows:
s14: based on the image processed in step S13, train a facial expression recognition network model built on the generative adversarial IC-GAN (Intra-Class Gap GAN) neural network;
s15: carrying out data enhancement and data expansion processing on the image;
s16: and training the network model and storing the trained network model.
The step S12 includes the following steps:
s121: determine the feature points [x, y] from the collected image, calibrating the feature points of the two eyes and the nose to obtain their coordinate values;
s122: rotate the image according to the coordinates of the eyes on the face to ensure a consistent face orientation (this step of face image preprocessing reflects the rotation invariance of the face in the image plane); denote the distance between the person's eyes by d and the midpoint of the two eyes by O;
s123: determine a frame containing the face from the calibrated feature points and the geometric model: cut a distance of d to the left and right of O, and cut 0.5d upward and 1.5d downward.
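The cropping geometry of steps S121 to S123 can be sketched in a few lines; this is a minimal illustration of the stated rule (rotate by the eye angle, then cut d left and right of O, 0.5d up and 1.5d down), and the function name is ours, not the patent's:

```python
import math

def align_crop_box(left_eye, right_eye):
    """Given calibrated eye coordinates, return the rotation angle (radians)
    that levels the eyes, plus the face crop box per steps S122-S123."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = math.atan2(ry - ly, rx - lx)          # rotate by -angle to level the eyes
    d = math.dist(left_eye, right_eye)            # inter-ocular distance d
    ox, oy = (lx + rx) / 2.0, (ly + ry) / 2.0     # midpoint O of the two eyes
    # cut d to the left/right of O, 0.5d above and 1.5d below: a 2d x 2d square
    box = (ox - d, oy - 0.5 * d, ox + d, oy + 1.5 * d)
    return angle, box

# example: level eyes at (100, 120) and (160, 120), so d = 60 and O = (130, 120)
angle, box = align_crop_box((100, 120), (160, 120))
```

Note that the crop is 2d wide and 2d tall, so a single resize to 256 × 256 in the later scale normalization step preserves the aspect ratio.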
The step S13 includes the following steps:
s131: scale-normalize the images cut in step S123 and resize them uniformly to 256 × 256 pixel images, completing the geometric normalization of the images.
The step S14 includes the following steps:
s141: construct the proposed IC-GAN (Intra-Class Gap GAN) neural network with the PyTorch deep learning framework. First input the picture processed in step S13 into the first convolution layer for a convolution operation, convolving the input image with a 4 × 4 convolution kernel, with output 128 × 64; apply the LeakyReLU activation function to the convolution as a nonlinear operation, with output 128 × 64; the LeakyReLU activation function is f(x) = x for x ≥ 0 and f(x) = αx for x < 0, where α is a small positive slope (for example 0.2);
S142: continue to convolve the output of the previous layer (the first convolution layer) with a 4 × 4 convolution kernel, with output 64 × 128; then normalize the output with a batchnorm layer and apply the LeakyReLU activation function as a nonlinear operation, with output 64 × 128;
S143: continuing to perform convolution, batchnorm and LeakyReLu operations on the output of the previous layer by using the method in the step S142, wherein the output is 4 × 100;
s144: perform a transposed convolution (deconvolution) with a 4 × 4 kernel on the output of S143 to obtain an output of 29 × 1, apply a batchnorm batch normalization operation, and apply the ReLU activation function as a nonlinear operation to obtain an output of 32 × 128; the ReLU activation function is f(x) = max(0, x);
s145: perform the convolution, batchnorm and ReLU operations of step S144 again on the output of the previous layer, with output 64 × 64;
s146: apply the ReLU activation function to the output of the previous layer as a nonlinear operation, perform a transposed convolution with a 4 × 4 kernel, and apply the Tanh activation function as a nonlinear operation, with output 128 × 128; the Tanh activation function is tanh(x) = (e^x - e^-x) / (e^x + e^-x);
s147: performing the operation in the S141-S143 process again on the output of the previous layer, wherein the output is 1 x 5;
s148: input the image scale-normalized in step S13, together with the output of step S147, into a 4 × 4 convolution layer for a convolution operation, then apply the nonlinear activation function LeakyReLU, with output 128 × 64;
s149: performing convolution operation on the output of the previous layer by using 4 × 4 convolution kernel, performing batch normalization operation by using batchnorm, and performing LeakyReLu nonlinear activation;
s1491: continuing to perform convolution, batchnorm and nonlinear operation on the output of the previous layer by adopting the process of S142, wherein the output is 4 x 1;
s1492: finally apply Softmax to the output of the previous layer and output the probability of the input being judged true;
s1493: perform a fully connected operation on the output of the S147 process, and finally train the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, and 5 = surprise and fear, thereby realizing recognition of the facial expressions.
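A compact PyTorch sketch of the encoder-decoder-encoder generator and the discriminator described in S141 to S1492 is given below. It is an illustrative reduction, not the patented implementation: the working resolution is scaled down to 64 × 64, the channel counts and the latent size nz = 100 are assumptions, and the helper names are ours:

```python
import torch
import torch.nn as nn

def down(cin, cout, bn=True):
    """Conv 4x4 stride 2 + (batchnorm) + LeakyReLU, as in S141-S143."""
    layers = [nn.Conv2d(cin, cout, 4, stride=2, padding=1, bias=False)]
    if bn:
        layers.append(nn.BatchNorm2d(cout))
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return layers

def up(cin, cout):
    """Transposed conv 4x4 stride 2 + batchnorm + ReLU, as in S144-S146."""
    return [nn.ConvTranspose2d(cin, cout, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]

class Encoder(nn.Module):
    """64x64x3 image -> 1x1xnz latent vector."""
    def __init__(self, nz=100):
        super().__init__()
        self.net = nn.Sequential(*down(3, 64, bn=False), *down(64, 128),
                                 *down(128, 256), *down(256, 512),
                                 nn.Conv2d(512, nz, 4, stride=1, padding=0))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """1x1xnz latent vector -> 64x64x3 image, Tanh output as in S146."""
    def __init__(self, nz=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(nz, 512, 4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(512), nn.ReLU(inplace=True),
            *up(512, 256), *up(256, 128), *up(128, 64),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh())
    def forward(self, z):
        return self.net(z)

class Generator(nn.Module):
    """Encoder-decoder-encoder: returns reconstruction, latent z, re-encoded z."""
    def __init__(self, nz=100):
        super().__init__()
        self.enc1, self.dec, self.enc2 = Encoder(nz), Decoder(nz), Encoder(nz)
    def forward(self, x):
        z = self.enc1(x)
        x_rec = self.dec(z)
        return x_rec, z, self.enc2(x_rec)

class Discriminator(nn.Module):
    """Returns the real/fake probability (S1492) and the feature map f(x)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(*down(3, 64, bn=False), *down(64, 128),
                                      *down(128, 256), *down(256, 512))
        self.head = nn.Conv2d(512, 1, 4, stride=1, padding=0)
    def forward(self, x):
        f = self.features(x)
        return torch.sigmoid(self.head(f)).view(-1), f
```

The second encoder re-encodes the reconstructed image so that the latent vectors z and ẑ can be compared in the encoding loss of S151.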
Step S15 includes
S151: according to the constructed IC-GAN network structure, the network loss function is divided into four parts. For the generator network of the first part, the difference between the original image and the reconstructed image is reduced at the pixel level; the reconstruction error loss is:
Lcon = E_{x~pX} ||x - G(x)||_1;
where pX denotes the data distribution, x is the input image, and G(x) is the image produced by the generator in the network;
the feature matching method proposed by Salimans et al. is used here to reduce training instability and to optimize at the image feature level; the feature matching error of the second part, the discriminator of the network, is:
Ladv = E_{x~pX} ||f(x) - f(G(x))||_2
where f (-) represents the discriminator model transformation.
The third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which prevents the facial expression information from being disturbed by picture-independent information during network decoding:
Lp = E_{x~pX} ||z - ẑ||_2, with z = h(x) and ẑ = h(G(x)),
where h(·) represents the encoding transformation.
The network loss of the fourth part is the cross-entropy loss of the Softmax layer:
Ls = k(y, ŷ),
where k(·) represents the Softmax cross-entropy loss, y represents the true result, and ŷ represents the recognition result.
The overall network loss function is as follows:
L = ω_adv·Ladv + ω_con·Lcon + ω_p·Lp + ω_s·Ls
where ω_adv, ω_con, ω_p and ω_s are parameters for weighting the losses.
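The four-part loss of S151 can be sketched as a single PyTorch function; the weight values are illustrative placeholders, not the ω values used in this application:

```python
import torch
import torch.nn.functional as F

def icgan_loss(x, x_rec, z, z_rec, f_real, f_fake, logits, labels,
               w_adv=1.0, w_con=50.0, w_p=1.0, w_s=1.0):
    """Weighted sum of the four losses of S151; the w_* weights are guesses."""
    l_con = F.l1_loss(x_rec, x)              # part 1: pixel reconstruction ||x - G(x)||_1
    l_adv = F.mse_loss(f_fake, f_real)       # part 2: discriminator feature matching
    l_p = F.mse_loss(z_rec, z)               # part 3: latent encoding loss ||z - z_hat||
    l_s = F.cross_entropy(logits, labels)    # part 4: Softmax cross entropy
    return w_adv * l_adv + w_con * l_con + w_p * l_p + w_s * l_s
```

The mean-reduced `l1_loss` / `mse_loss` calls stand in for the L1 and L2 norms of the formulas above.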
S152: the Optimizer selects an Adam Optimizer, the learning rate is set to 0.0002, training samples are trained in batches, 16 pictures are selected for each batch to be trained, and the epoch is set to 100, 200, 300 and 400 respectively.
S153: in each training, 1 epoch picture is firstly obtained, then the loss value is calculated, and then the Adam optimizer is used for continuously updating the parameters of the network to minimize the loss value of the network.
In the step (3), the picture is input into the trained IC-GAN network model for recognition, the probability of each facial expression is finally output, and the expression category with the maximum probability is output as the classification result.
In order to make the technical solution of the present invention more clearly understood, the technical solution will be described in detail and completely below with reference to the accompanying drawings of the embodiments; only some embodiments of the present invention are given herein. All other embodiments obtained by researchers on the basis of the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description and claims of the present invention are used to distinguish similar objects and do not describe a particular sequence or precedence. Data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated. In addition, the terms "comprising" and "having" and their variants are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of steps is not necessarily limited to the steps expressly listed.
As shown in FIGS. 1 and 2, the invention provides a generative-adversarial-based facial expression recognition method, which comprises a network training process and an offline recognition process for facial expression recognition with intra-class gaps.
As an embodiment, the offline identification process should include the following steps:
step S11: collect the input image x by downloading over the network, frame skipping, and parsing video;
step S12: geometrically normalizing the input image x and detecting an image x' after normalization processing;
step S13: processing the detected and cut image x' to a uniform size;
step S14: construct a network model for generative adversarial facial expression recognition;
step S15: carrying out data enhancement and data expansion processing on the image x' and unifying the image size;
step S16: training the network model and storing the trained network model;
in a specific embodiment, step S12 should further include the steps of:
step S121: perform geometric normalization processing on the input image; the geometric normalization comprises scale normalization, head-tilt correction and face-distortion correction;
step S122: performing face detection on the image after geometric normalization by using a face detection method in an OpenCV open source library, and then performing noise reduction on the detected image;
s23: obtain the geometrically normalized image x'.
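Step S122 relies on OpenCV face detection; when `cv2.CascadeClassifier.detectMultiScale` returns several candidate boxes, the usual heuristic is to keep the largest one. A small helper for that follow-up step can be sketched as below; the helper name and padding behaviour are our own assumptions:

```python
def pick_face(boxes, pad=0.1):
    """From (x, y, w, h) detections, keep the largest face and pad it by `pad`
    on each side, yielding the crop region for geometric normalization."""
    if not boxes:
        return None
    x, y, w, h = max(boxes, key=lambda b: b[2] * b[3])  # largest area wins
    dx, dy = int(w * pad), int(h * pad)
    return (x - dx, y - dy, w + 2 * dx, h + 2 * dy)

# e.g. two candidate detections; the 100x100 one is kept and padded by 10%
region = pick_face([(10, 10, 40, 40), (60, 50, 100, 100)])
```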
As a preferred embodiment, step S23 further includes:
s131: determining the position of an image according to the coordinates of the face;
s132: using OpenCV to detect and obtain a face image;
s133: adjust the cut face image to a uniform size of 256 × 256.
Further, step S14 should further include: S141: build the IC-GAN neural network with the PyTorch deep learning framework; first input the picture into the conv_1 layer for a convolution operation, convolving the input image with a 4 × 4 convolution kernel, with output 128 × 64; apply the LeakyReLU activation function to the convolution as a nonlinear operation, with output 128 × 64; the LeakyReLU activation function is f(x) = x for x ≥ 0 and f(x) = αx for x < 0, where α is a small positive slope (for example 0.2);
s142: continue to convolve the output of the previous layer with a 4 × 4 convolution kernel, with output 64 × 128; then normalize the output with the batchnorm layer and apply the LeakyReLU activation function as a nonlinear operation, with output 64 × 128;
s143: continuing to perform convolution, batchnorm and LeakyReLu operations on the output of the previous layer by using the method of S142, wherein the output is 4 x 100;
s144: performing reverse convolution operation of a convolution kernel 4 × 4 on the output of the S143 to obtain an output of 29 × 1, performing batch normalization operation by using a batchnorm, and performing nonlinear operation on the output by using a ReLu activation function to obtain an output of 32 × 128;
s145: perform the convolution, batchnorm and ReLU operations described in S144 again on the output of the previous layer, with output 64 × 64;
s146: performing nonlinear operation on the output of the previous layer by using a ReLu activation function, performing convolution operation on the previous layer by using reverse convolution with a convolution kernel of 4 × 4, and performing nonlinear operation by using a Tanh activation function to output 128 × 128;
s147: performing the operation in the S141-S143 process again on the output of the previous layer, wherein the output is 1 x 5;
s148: inputting the original image and the output of S147 into a 4 × 4 convolution layer, performing convolution operation, and then performing nonlinear activation by using a nonlinear activation function LeakyReLu, wherein the output is 128 × 64;
s149: performing convolution operation on the output of the previous layer by using 4 × 4 convolution kernel, performing batch normalization operation by using batchnorm, and performing LeakyReLu nonlinear activation;
s1491: continue to perform convolution, batchnorm and nonlinear operations on the output of the previous layer by the process of S142, with output 4 × 1;
s1492: and finally, adopting Softmax to the output of the previous layer, and outputting the probability of judging to be true.
S1493: performing full-connection operation on the output of the S147 process, and finally realizing training of 5 expressions through a Softmax classifier, where the 5 expressions are 1 ═ happy, 2 ═ inhibition, 3 ═ neutral, 4 ═ excitation, and 5 ═ surpride and fear, so as to realize recognition of facial expressions;
as a preferred embodiment, the IC-GAN network is built with PyTorch and comprises an input layer, convolution layers, activation functions, pooling layers, a fully connected layer, BN layers and an output layer.
As a preferred embodiment, the sizes before and after a convolution layer can be described by the following formulas:
the input size of the convolution layer is W1 × H1 × D1, and the output size W2 × H2 × D2 satisfies:
W2 = (W1 - F + 2P)/S + 1
H2 = (H1 - F + 2P)/S + 1
D2 = K
in the above formulas, K is the number of convolution kernels, F is the size of the convolution kernels, S is the step size, and P is the boundary padding.
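These size formulas can be checked with a few lines of code; the function name is ours:

```python
def conv_output_size(w1, h1, d1, k, f, s, p):
    """Output size W2 x H2 x D2 of a convolution layer with K kernels of size F,
    stride S and padding P, applied to an input of size W1 x H1 x D1."""
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k  # the output depth D2 equals the number of kernels K

# first IC-GAN layer: 256x256x3 input, 64 kernels of size 4, stride 2, padding 1
size = conv_output_size(256, 256, 3, 64, 4, 2, 1)
```

This yields 128 × 128 × 64, i.e. each such stride-2 layer halves the spatial resolution.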
In a preferred embodiment, the mixed expression dataset of the present application contains 4455 images in total with 5 expression labels, 1 being happy, 2 being disgust, 3 being neutral, 4 being anxious, and 5 being surprise and fear. Since the distribution of this dataset is not uniform, the dataset is expanded using image affine transformation, image mirror transformation, contrast adjustment, brightness adjustment, and the like; the number of images in the expanded mixed expression dataset is shown in Table 1:
TABLE 1 number of expressions in the post-augmentation Mixed dataset
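The expansion operations just listed can be sketched with NumPy; these minimal versions (mirror, brightness, contrast; an affine warp would normally be done with OpenCV) are illustrations with made-up parameter values, not the application's exact settings:

```python
import numpy as np

def mirror(img):
    """Horizontal mirror transformation."""
    return img[:, ::-1]

def brightness(img, delta=20):
    """Brightness adjustment by an additive offset, clipped to [0, 255]."""
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)

def contrast(img, gain=1.2):
    """Contrast adjustment: scale pixel values around the image mean."""
    m = img.mean()
    return np.clip((img.astype(np.float32) - m) * gain + m, 0, 255).astype(np.uint8)

img = np.full((4, 4), 100, dtype=np.uint8)  # toy grayscale patch
augmented = [mirror(img), brightness(img), contrast(img)]
```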
As a most preferred method of the present application, step S15 further includes: s151: defining the loss of the network as 4 parts according to the network structure and the experimental characteristics;
s152: the Optimizer selects an Adam Optimizer, the learning rate is set to be 0.0002, training samples are trained in batches, 16 pictures are selected for each batch to be trained, and the epoch is respectively set to be 100, 200, 300 and 400;
s153: in each round of training, one epoch of pictures is first passed through the network, the loss value is calculated, and the Adam optimizer then updates the network parameters continuously so as to minimize the network loss.
Further, the step S16 further includes: S161: collect the input images by downloading over the network, frame skipping, and parsing video;
s162: perform geometric normalization, face detection, OpenCV processing and size unification on the input image;
s163: input the processed image into the trained IC-GAN network model for recognition, and finally output the probability of each expression; the expression with the maximum probability is the recognized expression.
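The final decision rule of S163 (softmax over the five expression scores, then take the most probable label) can be sketched as follows; the label table follows the numbering used in this application, and the logit values are made up:

```python
import math

LABELS = {1: "happy", 2: "disgust", 3: "neutral", 4: "anxious", 5: "surprise/fear"}

def predict(logits):
    """Softmax over the 5 expression scores; return (label, probability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]   # numerically stable softmax
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best + 1], probs[best]

label, prob = predict([0.1, 0.2, 0.3, 2.5, 0.4])  # hypothetical network output
```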
Compared with the prior art, the invention has the advantages that:
compared with traditional methods that extract expression features manually, the generative-adversarial facial expression recognition method of the present invention extracts facial expression features automatically; and compared with earlier neural-network facial expression recognition, it improves the recognition rate, so that expressions are recognized accurately.
The idea of model training is to crop the images with OpenCV open-source code before they are input into the network, unify them to a size of 256 × 256, and then train the IC-GAN network model with the preprocessed pictures as network input. The Softmax loss adopts the cross-entropy loss function; the Adam optimizer is adopted with the learning rate set to 0.0002; the training samples are trained in batches of 16 pictures; and the number of epochs is set to 100, 200, 300 and 400, respectively.
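The training recipe just described (Adam, learning rate 0.0002, batches of 16) can be sketched as a minimal loop; the stand-in linear model and the random data are placeholders for the IC-GAN and the expression dataset:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))  # stand-in for the IC-GAN
opt = torch.optim.Adam(model.parameters(), lr=0.0002)          # lr = 0.0002 as stated
loss_fn = nn.CrossEntropyLoss()                                # Softmax cross entropy

# toy dataset: 32 random 8x8 "images" with labels in {0..4}
x = torch.randn(32, 3, 8, 8)
y = torch.randint(0, 5, (32,))

for epoch in range(3):                   # the application trains 100-400 epochs
    for i in range(0, len(x), 16):       # batches of 16 pictures
        opt.zero_grad()
        loss = loss_fn(model(x[i:i + 16]), y[i:i + 16])
        loss.backward()                  # compute gradients of the loss
        opt.step()                       # Adam update minimizing the loss
```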
As a preferred embodiment of the present application, the identification process should comprise the following steps:
s21: collect the input image I by downloading over the network, frame skipping, and parsing video;
s22: input the image I into the trained network model;
s23: and obtaining a recognition result.
The above serial numbers of the embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
In the embodiments of the present invention, the descriptions of the respective embodiments have their respective emphases; where a part of one embodiment is not described in detail, reference may be made to the corresponding descriptions in other embodiments;
in several embodiments provided in the present application, the technical contents described can be implemented in other ways. All of the above description is intended to be illustrative only.
Claims (8)
1. A facial expression recognition method based on Intra-Class Gap GAN is characterized by comprising the following steps:
the identification model construction comprises the following steps:
(1) acquiring real-time images of different sources and different expressions of the human face;
(2) inputting the image into an Intra-Class Gap GAN neural network model for identification;
(3) outputting the identification result;
the method for constructing the Intra-Class Gap GAN neural network model in the step (2) is as follows:
(2.1) acquiring historical images of different sources and different expressions of the human face;
(2.2) preprocessing the collected face image to construct a facial expression data set;
(2.3) aiming at the facial expression recognition problem with the intra-Class Gap in the data set in the step (2.2), constructing an Intra-Class Gap GAN neural network model;
(2.4) training the generator and the discriminator of the network simultaneously by combining the pixel difference and the latent-vector difference between the input image and the reconstructed image, ensuring that the difference between the reconstructed image and the input image is minimized.
2. The method of claim 1, wherein the facial expression recognition method based on Intra-Class Gap GAN comprises:
(2.2) the method for constructing the human face expression data set in the step is as follows:
s11: based on the Multi-PIE and JAFFE expression datasets and facial expression pictures downloaded from the network in step (2.1), a self-made facial expression dataset is constructed; 5 facial expressions (disgust, happy, neutral, anxious, and surprise-and-fear) of people from different countries, age groups, professions and the like are selected for the experiment, and a large number of facial expressions with intra-class gaps are added; the resulting dataset, with its complex facial expression features, serves as the input image x of network training
S12: geometrically normalizing the input image, and carrying out face detection on the normalized image;
s13: the images after the processing in step S12 are scale-normalized to unify the sizes of the images.
3. The method of claim 2, wherein the facial expression recognition method based on Intra-Class Gap GAN comprises: the step (2.4) is specifically as follows:
s14: based on the image processed in step S13, train a facial expression recognition network model built on the generative adversarial IC-GAN (Intra-Class Gap GAN) neural network;
s15: carrying out data enhancement and data expansion processing on the image;
s16: and training the network model and storing the trained network model.
4. The method of claim 2, wherein the facial expression recognition method based on Intra-Class Gap GAN comprises: the step S12 includes the following steps:
s121: determining a characteristic point [ x, y ] according to the collected image, and calibrating the characteristic points of the two eyes and the nose to obtain coordinate values of the characteristic points;
s122: rotate the image according to the coordinates of the eyes on the face to ensure a consistent face orientation, where the distance between the person's eyes is d and the midpoint of the two eyes is O;
s123: determine a frame containing the face from the calibrated feature points and the geometric model: cut a distance of d to the left and right of O, and cut 0.5d upward and 1.5d downward.
5. The method of claim 4, wherein the facial expression recognition method based on Intra-Class Gap GAN comprises:
the step S13 includes the following steps:
s131: scale-normalize the images cut in step S123 and resize them uniformly to 256 × 256 pixel images, completing the geometric normalization of the images.
6. The method of claim 3, wherein the facial expression recognition method based on Intra-Class Gap GAN comprises:
the step S14 includes the following steps:
s141: construct the proposed IC-GAN (Intra-Class Gap GAN) neural network with the PyTorch deep learning framework; first input the picture processed in step S13 into the first convolution layer for a convolution operation, convolving the input image with a 4 × 4 convolution kernel, with output 128 × 64; apply the LeakyReLU activation function to the convolution as a nonlinear operation, with output 128 × 64; the LeakyReLU activation function is f(x) = x for x ≥ 0 and f(x) = αx for x < 0, where α is a small positive slope;
s142: continue to convolve the output of the previous layer with a 4 × 4 convolution kernel, with output 64 × 128; then normalize the output with a batchnorm layer and apply the LeakyReLU activation function as a nonlinear operation, with output 64 × 128;
S143: continuing to perform convolution, batchnorm and LeakyReLu operations on the output of the previous layer by using the method in the step S142, wherein the output is 4 × 100;
s144: perform a transposed convolution (deconvolution) with a 4 × 4 kernel on the output of S143 to obtain an output of 29 × 1, apply a batchnorm batch normalization operation, and apply the ReLU activation function as a nonlinear operation to obtain an output of 32 × 128; the ReLU activation function is f(x) = max(0, x);
s145: perform the convolution, batchnorm and ReLU operations in step S144 again on the output of the previous layer, with output 64 × 64;
s146: apply the ReLU activation function to the output of the previous layer as a nonlinear operation, perform a transposed convolution with a 4 × 4 kernel, and apply the Tanh activation function as a nonlinear operation, with output 128 × 128; the Tanh activation function is tanh(x) = (e^x - e^-x) / (e^x + e^-x);
s147: performing the operation in the S141-S143 process again on the output of the previous layer, wherein the output is 1 x 5;
s148: inputting the image subjected to scale normalization in the step S13 and the output of the step S147 into a 4 × 4 convolution layer, performing convolution operation, and then performing nonlinear activation by using a nonlinear activation function LeakyReLu to output 128 × 64;
s149: performing convolution operation on the output of the previous layer by using 4 × 4 convolution kernel, performing batch normalization operation by using batchnorm, and performing LeakyReLu nonlinear activation;
s1491: continuing to perform convolution, batchnorm and nonlinear operation on the output of the previous layer by adopting the process of S142, wherein the output is 4 x 1;
s1492: finally, adopting Softmax to the output of the upper layer, and outputting the probability of judging the output to be true;
s1493: perform a fully connected operation on the output of the S147 process, and finally train the 5 expressions through a Softmax classifier, the 5 expressions being 1 = happy, 2 = disgust, 3 = neutral, 4 = anxious, and 5 = surprise and fear, thereby realizing recognition of the facial expressions.
7. The method of claim 3, wherein the facial expression recognition method based on Intra-Class Gap GAN comprises: step S15 includes
S151: the network loss function is divided into four parts; for the generator network of the first part, the difference between the original image and the reconstructed image is reduced at the pixel level, with the reconstruction error loss being:
Lcon = E_{x~pX} ||x - G(x)||_1;
where pX denotes the data distribution, x is the input image, and G(x) is the image produced by the generator in the network;
using the feature matching method proposed by Salimans et al. to reduce training instability and to optimize at the image feature level, the feature matching error of the discriminator of the second part of the network is:
Ladv = E_{x~pX} ||f(x) - f(G(x))||_2
wherein f (-) represents the discriminator model transformation;
the third part is the encoding loss between the latent vector z and the reconstructed latent vector ẑ, which prevents the facial expression information from being disturbed by picture-independent information during network decoding:
Lp = E_{x~pX} ||z - ẑ||_2, with z = h(x) and ẑ = h(G(x)),
where h(·) represents the encoding transformation;
the network loss of the fourth part is the cross-entropy loss of the Softmax layer:
Ls = k(y, ŷ),
where k(·) represents the Softmax cross-entropy loss, y represents the true result, and ŷ represents the recognition result;
the overall network loss function is as follows:
L = ω_adv·Ladv + ω_con·Lcon + ω_p·Lp + ω_s·Ls
where ω_adv, ω_con, ω_p and ω_s are parameters for weighting the losses;
s152: the Optimizer selects an Adam Optimizer, the learning rate is set to be 0.0002, training samples are trained in batches, 16 pictures are selected for each batch to be trained, and the epoch is respectively set to be 100, 200, 300 and 400;
s153: in each round of training, one epoch of pictures is first passed through the network, the loss value is calculated, and the Adam optimizer then updates the network parameters continuously so as to minimize the network loss.
8. The facial expression recognition method based on Intra-Class Gap GAN of claim 1, wherein:
and (3) inputting the picture into the trained IC-GAN network model for recognition, finally outputting the probability of each facial expression, and outputting the expression class with the maximum probability as the classification result.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2019108222525 | 2019-09-02 | ||
CN201910822252 | 2019-09-02 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112183213A true CN112183213A (en) | 2021-01-05 |
CN112183213B CN112183213B (en) | 2024-02-02 |
Family
ID=73924606
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010905875.1A Active CN112183213B (en) | 2019-09-02 | 2020-09-01 | Facial expression recognition method based on Intra-Class Gap GAN
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112183213B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113688799A (en) * | 2021-09-30 | 2021-11-23 | 合肥工业大学 | Facial expression recognition method for generating confrontation network based on improved deep convolution |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106778506A (en) * | 2016-11-24 | 2017-05-31 | 重庆邮电大学 | A kind of expression recognition method for merging depth image and multi-channel feature |
US20180068198A1 (en) * | 2016-09-06 | 2018-03-08 | Carnegie Mellon University | Methods and Software for Detecting Objects in an Image Using Contextual Multiscale Fast Region-Based Convolutional Neural Network |
WO2018054283A1 (en) * | 2016-09-23 | 2018-03-29 | 北京眼神科技有限公司 | Face model training method and device, and face authentication method and device |
CN108304826A (en) * | 2018-03-01 | 2018-07-20 | 河海大学 | Facial expression recognizing method based on convolutional neural networks |
CN108615010A (en) * | 2018-04-24 | 2018-10-02 | 重庆邮电大学 | Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern |
CN109376625A (en) * | 2018-10-10 | 2019-02-22 | 东北大学 | A kind of human facial expression recognition method based on convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
Ye Fangfang; Xu Li: "Facial expression recognition based on CMAC neural network", Computer Simulation, no. 08 *
Hu Min; Yu Shengnan; Wang Xiaohua: "Facial expression recognition method based on constrained cycle-consistent generative adversarial networks", Journal of Electronic Measurement and Instrumentation, no. 04 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||