CN106127139B - Dynamic recognition method for facial expressions of students in MOOC courses - Google Patents

Dynamic recognition method for facial expressions of students in MOOC courses

Info

Publication number
CN106127139B
CN106127139B (application CN201610453639.4A)
Authority
CN
China
Prior art keywords
mouth
characteristic
eye
points
students
Prior art date
Legal status
Active
Application number
CN201610453639.4A
Other languages
Chinese (zh)
Other versions
CN106127139A (en)
Inventor
郦泽坤
苏航
陈美月
赵长宽
高克宁
Current Assignee
Northeastern University China
Original Assignee
Northeastern University China
Priority date
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201610453639.4A
Publication of CN106127139A
Application granted
Publication of CN106127139B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 - Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A dynamic recognition method for the facial expressions of students in MOOC courses, belonging to the field of image processing. The method splits video shot during an MOOC course into frames, extracts facial feature points from each frame, groups the feature points of the key facial regions into feature vectors, and trains expression-pattern classifiers; the classifiers are then used to classify the facial expressions of students in real-time video shot during the MOOC course. The invention redefines the six basic expressions around the facial-expression characteristics of students in the MOOC classroom, making the recognition results more useful for analysing the students' classroom state; expression patterns are obtained by combining action patterns, making expression-pattern recognition more accurate; and the action-pattern feature vectors built from Euclidean distances are low-dimensional and few in number, making expression recognition faster.

Description

Dynamic recognition method for facial expressions of students in MOOC courses
Technical Field
The invention belongs to the field of image processing, and particularly relates to a dynamic recognition method for facial expressions of students in MOOC courses.
Background
Massive Open Online Courses (MOOCs) are a course format that has developed rapidly in recent years; they have attracted strong attention from governments, universities and enterprises worldwide and have become an important force in promoting the transformation of higher education. MOOCs exploit the speed and convenience of video delivery to distribute teaching at scale, and introduce interactive exercises to compensate for the lack of teaching feedback caused by one-way video transmission. The feedback provided by interactive exercises is nevertheless still limited compared with traditional offline teaching: in an offline course the lecturer can read the students' facial expressions and ask questions to obtain feedback and adjust the teaching in time, which a MOOC course cannot yet do.
Expression recognition is regarded as a key technology for future affective human-computer interaction and has attracted research from many universities and research institutes at home and abroad. Many expression recognition methods aimed at standard expression databases such as JAFFE have already achieved high recognition rates. Applying expression recognition to MOOC courses would make it possible to obtain the students' classroom state in real time and adjust the class according to their reactions, but existing methods have the following shortcomings when applied to MOOC courses: 1. they assume that a posed expression is present on the face and ignore the diversity of expression components, so their practical value is limited; 2. the expression categories are not fine-grained enough: facial expressions are not limited to the 6 basic expressions; 3. the expressions are not scenario-specific: in a particular scenario facial expressions follow a tendency, i.e. some expressions appear with higher probability while others rarely appear.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a dynamic recognition method for facial expressions of students in MOOC courses.
The technical scheme of the invention is as follows:
A dynamic recognition method for facial expressions of students in MOOC courses comprises the following steps:
Step 1: shooting the facial expressions of students during the learning process of an MOOC course to obtain videos of the students' facial expressions;
Step 2: acquiring a historical video of the students' facial expressions in the MOOC course;
Step 3: splitting the historical video of the students' facial expressions into frames to convert it into n historical still images containing the students' facial features;
Step 4: using the Face++ face recognition system to extract the feature points of the students' facial features from each historical still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 5: preprocessing the feature points of the students' facial features in each historical still image, specifically: converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates;
Step 6: extracting the feature points of the key facial regions (the eyes, the mouth and the eyebrows) from the preprocessed feature points of each image;
Step 7: building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each image from the feature points of the eyes, mouth and eyebrows in that image;
Step 8: defining the eye action patterns as glaring, normal and closed eyes; the mouth action patterns as normal, lips pursed in thinking or sadness, lips pursed in anger, and grinning; and the eyebrow action patterns as frowning, normal and raised eyebrows;
Step 9: according to the action-pattern definitions, labelling the action pattern to which the eye feature vector, the mouth feature vector and the eyebrow feature vector of each image belong;
Step 10: storing the eye, mouth and eyebrow feature vectors of each image in the feature-vector set of the action pattern with which each is labelled;
Step 11: training a Support Vector Machine (SVM) on the feature-vector set of each action pattern to obtain a classifier for each action pattern;
Step 12: acquiring a real-time video of the students' facial expressions in the MOOC course;
Step 13: splitting the real-time video of the students' facial expressions into frames to convert it into m real-time still images containing the students' facial features;
Step 14: using the Face++ API to extract the feature points of the students' facial features from each real-time still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 15: preprocessing the feature points of the students' facial features in each real-time image, specifically: converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates;
Step 16: extracting the feature points of the eyes, mouth and eyebrows from each preprocessed real-time image, and building an eye feature vector, a mouth feature vector and an eyebrow feature vector;
Step 17: classifying the mouth, eye and eyebrow feature vectors of each real-time image with the action-pattern classifiers to obtain the action patterns of the mouth, eyes and eyebrows in that image;
Step 18: determining the expression pattern to which the student's expression in each real-time image belongs from the combination of the action patterns of the mouth, eyes and eyebrows in that image.
the method for establishing the eye feature vector, the mouth feature vector and the eyebrow feature vector in the steps 7 and 16 comprises the following steps:
respectively calculating Euclidean distances between every two points in the eye characteristic points, namely Euclidean distance values of the eye characteristic points, Euclidean distances between every two points in the mouth characteristic points, namely Euclidean distance values of the mouth characteristic points, and Euclidean distances between every two points in the eyebrow characteristic points, namely Euclidean distance values of the eyebrow characteristic points, and respectively forming an eye characteristic vector, a mouth characteristic vector and an eyebrow characteristic vector by using the Euclidean distance values of the eye characteristic points, the Euclidean distance values of the mouth characteristic points and the Euclidean distance values of the eyebrow characteristic points.
In Step 18, the expression pattern is determined from the mapping relation in Table 1:
Table 1: Mapping between action-pattern combinations and expression patterns
Advantageous effects: the dynamic recognition method for facial expressions of students in MOOC courses has the following advantages:
1. The six basic expressions are happiness, sadness, fear, anger, surprise and disgust, but fear and disgust appear with low probability in the MOOC classroom, and even when they do appear they are mostly unrelated to the course content; fear and disgust are therefore removed from the expression patterns, and thinking, chatting and normal, which occur more frequently, are added, making the recognition results more useful for the MOOC classroom;
2. The expression pattern is obtained from the combination of action patterns, which makes expression-pattern recognition more accurate;
3. The action-pattern feature vectors built from Euclidean distances are low-dimensional and few in number, which makes expression recognition faster.
Drawings
FIG. 1 is a flow chart of a method for dynamically identifying facial expressions of students in MOOC courses;
FIG. 2 is a schematic diagram of an expression recognition process in the MOOC course;
FIG. 3(a) is a schematic diagram of glaring in the eye action patterns;
FIG. 3(b) is a schematic diagram of normal eyes in the eye action patterns;
FIG. 3(c) is a schematic diagram of closed eyes in the eye action patterns;
FIG. 4(a) is a schematic diagram of a normal mouth in the mouth action patterns;
FIG. 4(b) is a schematic diagram of lips pursed in thinking or sadness in the mouth action patterns;
FIG. 4(c) is a schematic diagram of lips pursed in anger in the mouth action patterns;
FIG. 4(d) is a schematic diagram of grinning in the mouth action patterns;
FIG. 5(a) is a schematic diagram of frowning in the eyebrow action patterns;
FIG. 5(b) is a schematic diagram of normal eyebrows in the eyebrow action patterns;
FIG. 5(c) is a schematic diagram of raised eyebrows in the eyebrow action patterns;
FIG. 6 shows some of the still images captured from the video;
FIG. 7 is a schematic view of facial feature points;
FIG. 8 is a schematic diagram of facial feature point coordinates;
FIG. 9 is a schematic diagram of the rotation reference;
FIG. 10 is a schematic diagram of the translation and scaling reference;
FIG. 11 is a schematic view of eye feature points;
FIG. 12 is a schematic view of feature points of the mouth;
FIG. 13 is a schematic view of eyebrow feature points;
FIG. 14 is a schematic diagram of the classification-tree model, where E denotes the eyes and E1, E2 and E3 denote normal, closed and glaring eyes respectively; M denotes the mouth and M1, M2, M3 and M4 denote a normal mouth, lips pursed in thinking or sadness, lips pursed in anger, and a grinning mouth respectively; B denotes the eyebrows and B1, B2 and B3 denote normal, frowning and raised eyebrows respectively.
Detailed Description
An embodiment of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, the dynamic recognition method for facial expressions of students in an MOOC course according to this embodiment comprises the following steps:
Step 1: shooting the facial expressions of students during the learning process of an MOOC course to obtain videos of the students' facial expressions;
Step 2: acquiring a historical video of the students' facial expressions in the MOOC course;
Step 3: splitting the historical video of the students' facial expressions into frames to convert it into n historical still images containing the students' facial features. Starting from the first still frame, one image is kept every 5 frames until the whole historical video has been processed; this sampling loses no expressions and effectively avoids data redundancy. The historical images referred to below in this embodiment are the historical still images captured from the historical video.
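For illustration, the frame sampling of Step 3 can be sketched as follows; OpenCV is an assumption here, since the patent does not name a particular video library:

    import cv2

    def sample_frames(video_path, step=5):
        # Split a lecture video into still images, keeping one frame out of every
        # `step` frames, as in Step 3 of the embodiment (every 5th frame).
        capture = cv2.VideoCapture(video_path)
        frames, index = [], 0
        while True:
            ok, frame = capture.read()
            if not ok:                      # end of the video
                break
            if index % step == 0:           # keep the 1st, 6th, 11th, ... frame
                frames.append(frame)
            index += 1
        capture.release()
        return frames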
Step 4: using the Face++ face recognition system to extract the 83 feature points of the students' facial features from each historical image in turn, and storing the extracted feature points as pixel-coordinate vectors;
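The patent names Face++ as the landmark extractor but does not show the call itself; the sketch below assumes the public Face++ v3 detect endpoint and its return_landmark parameter (return_landmark=1 requests the 83-point landmark set) and flattens the returned landmark dictionary into a pixel-coordinate array. The endpoint, parameter names and response layout are assumptions about the Face++ API, not text from the patent.

    import requests
    import numpy as np

    FACEPP_DETECT_URL = "https://api-us.faceplusplus.com/facepp/v3/detect"  # assumed endpoint

    def extract_landmarks(image_path, api_key, api_secret):
        # Ask Face++ for the 83-point landmark set of the first detected face
        # and return it as an (83, 2) array of pixel coordinates.
        with open(image_path, "rb") as f:
            response = requests.post(
                FACEPP_DETECT_URL,
                data={"api_key": api_key, "api_secret": api_secret, "return_landmark": 1},
                files={"image_file": f},
            )
        faces = response.json().get("faces", [])
        if not faces:
            return None                               # no face found in this frame
        landmark = faces[0]["landmark"]               # dict: point name -> {"x": ..., "y": ...}
        names = sorted(landmark)                      # fix an ordering so vectors are comparable
        return np.array([[landmark[n]["x"], landmark[n]["y"]] for n in names])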
Step 5: because the faces in the historical images differ in size and coordinate scale, the action patterns cannot be described directly by the Euclidean distances between feature points; the feature points of the students' facial features are therefore preprocessed so that the faces are kept horizontal in the images and can be described in a uniform coordinate system, specifically:
Step 5.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C′_{i′k′} = (x′_{i′k′}, y′_{i′k′}) is converted into M′_{i′k′} = (x′_{i′k′}, y′_{i′k′}, 1)  (1),
where C′_{i′k′} is the pixel coordinate of the k′-th feature point in the i′-th historical image, M′_{i′k′} is the homogeneous coordinate of the k′-th feature point in the i′-th historical image, and k′ ∈ [1, 83];
Step 5.2: rotating the homogeneous coordinates by an angle θ′ so that the line through the left and right eye corners becomes horizontal. Let the right corner point of the left eye in the i′-th historical still image be the first feature point (k′ = 1) with homogeneous coordinate M′_{i′1} = (x′_{i′1}, y′_{i′1}, 1), and the left corner point of the right eye be the second feature point (k′ = 2) with homogeneous coordinate M′_{i′2} = (x′_{i′2}, y′_{i′2}, 1); the rotation matrix N′_1 is then defined as: (formula not reproduced here);
translating the rotated coordinates, with the translation matrix N′_2 defined as: (formula not reproduced here),
where p′_1 = (x′_{i′2} - x′_{i′1}) + x′_{i′2} and q′_1 = (y′_{i′2} - y′_{i′1}) + y′_{i′2};
scaling the translated coordinates, with L′ the distance between the left and right sides of the nose bridge and the scaling matrix N′_3 defined as: (formula not reproduced here);
Step 5.3: transforming the homogeneous coordinates M′_{i′k′} to obtain the transformed homogeneous coordinates M′*_{i′k′}, the transformation formula being M′*_{i′k′} = M′_{i′k′} × N′_1 × N′_2 × N′_3  (5);
Step 5.4: converting the transformed homogeneous coordinates M′*_{i′k′} back into the transformed pixel coordinates C′*_{i′k′} by the inverse process of formula (1).
Step 6: because the only facial regions that visibly change with expression are the mouth, the eyes and the eyebrows, while the nose does not change with expression, the invention takes the mouth, eyes and eyebrows as the key facial regions and extracts their feature points from the preprocessed feature points of each historical image;
Step 7: building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each historical image from the feature points of the eyes, mouth and eyebrows in that image, as follows:
computing the Euclidean distance between every pair of eye feature points (the eye Euclidean distance values), between every pair of mouth feature points (the mouth Euclidean distance values), and between every pair of eyebrow feature points (the eyebrow Euclidean distance values), and forming the eye feature vector, the mouth feature vector and the eyebrow feature vector from the eye, mouth and eyebrow Euclidean distance values respectively;
and 8: predefining an eye action mode, a mouth action mode and an eyebrow action mode; as shown in fig. 3, the eye movement patterns are the glaring eye shown in fig. 3(a), the normal eye shown in fig. 3(b), and the closed eye shown in fig. 3(c), respectively; as shown in fig. 4, the mouth action patterns are normal as shown in fig. 4(a), sipping biased to thinking and sadness as shown in fig. 4(b), sipping biased to anger as shown in fig. 4(c), and grinning as shown in fig. 4(d), respectively; as shown in FIG. 5, the eyebrow movement patterns are frown as shown in FIG. 5(a), normal as shown in FIG. 5(b), and eyebrow plucking as shown in FIG. 5(c), respectively.
Step 9: according to the action-pattern definitions, labelling the action pattern to which the eye feature vector, the mouth feature vector and the eyebrow feature vector of each historical image belong;
Step 10: storing the eye, mouth and eyebrow feature vectors of each historical image in the feature-vector set of the action pattern with which each is labelled;
Step 11: programming with the LIBSVM toolbox in MATLAB, training a Support Vector Machine (SVM) on the feature-vector set of each action pattern to obtain the action-pattern classifiers (an illustrative sketch follows);
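The embodiment trains its classifiers with the LIBSVM toolbox in MATLAB; as an illustrative substitute, the sketch below uses scikit-learn's SVC in Python to train one classifier per facial region on the labelled feature-vector sets of Steps 7-10. The RBF kernel and parameters are assumptions, not values taken from the patent.

    import numpy as np
    from sklearn.svm import SVC

    def train_action_classifier(feature_vectors, labels):
        # feature_vectors: one Euclidean-distance vector per frame (Steps 7-10).
        # labels: the action-pattern label of each frame, e.g. "E1", "E2", "E3".
        X = np.asarray(feature_vectors)
        y = np.asarray(labels)
        clf = SVC(kernel="rbf", C=1.0)   # stand-in for the MATLAB LIBSVM training
        clf.fit(X, y)
        return clf

    # One classifier per facial region, as in Step 11:
    # eye_clf = train_action_classifier(eye_vectors, eye_labels)
    # mouth_clf = train_action_classifier(mouth_vectors, mouth_labels)
    # brow_clf = train_action_classifier(brow_vectors, brow_labels)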
Step 12: acquiring a real-time video of the students' facial expressions in the MOOC course;
Step 13: splitting the real-time video of the students' facial expressions into frames to convert it into m real-time still images containing the students' facial features. Starting from the first real-time still frame, one image is kept every 5 frames until the whole real-time video has been processed; this sampling loses no expressions and effectively avoids data redundancy. FIG. 6 shows some of the real-time still images captured from the real-time video in this embodiment.
Step 14: as shown in FIG. 7, using the Face++ face recognition system to extract the 83 feature points of the students' facial features from each real-time image in turn, and storing the extracted feature points as pixel-coordinate vectors, as shown in FIG. 8;
Step 15: because the faces in the real-time images differ in size and coordinate scale, the action patterns cannot be described directly by the Euclidean distances between feature points; the feature points of the students' facial features are therefore preprocessed so that the faces are kept horizontal in the images and can be described in a uniform coordinate system, specifically:
Step 15.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C_{ik} = (x_{ik}, y_{ik}) is converted into M_{ik} = (x_{ik}, y_{ik}, 1)  (6),
where C_{ik} is the pixel coordinate of the k-th feature point in the i-th real-time image, M_{ik} is the homogeneous coordinate of the k-th feature point in the i-th real-time image, and k ∈ [1, 83];
Step 15.2: as shown in FIG. 9, rotating the homogeneous coordinates by an angle θ so that the line through the left and right eye corners becomes horizontal. Let the right corner point of the left eye in the i-th real-time image be the first feature point (k = 1) with homogeneous coordinate M_{i1} = (x_{i1}, y_{i1}, 1), and the left corner point of the right eye be the second feature point (k = 2) with homogeneous coordinate M_{i2} = (x_{i2}, y_{i2}, 1); the rotation matrix N_1 is then defined as: (formula not reproduced here);
translating the rotated coordinates, with the translated origin and coordinate axes as shown in FIG. 10, and defining the translation matrix N_2 as: (formula not reproduced here),
where p_1 = (x_{i2} - x_{i1}) + x_{i2} and q_1 = (y_{i2} - y_{i1}) + y_{i2};
scaling the translated coordinates, with L the distance between the left and right sides of the nose bridge (as shown in FIG. 10) and the scaling matrix N_3 defined as: (formula not reproduced here);
Step 15.3: transforming the homogeneous coordinates M_{ik} to obtain the transformed homogeneous coordinates M*_{ik}, the transformation formula being M*_{ik} = M_{ik} × N_1 × N_2 × N_3;
Step 15.4: converting the transformed homogeneous coordinates M*_{ik} back into the transformed pixel coordinates C*_{ik} by the inverse process of formula (6).
Step 16: extracting the feature points of the key facial regions (the eyes, the mouth and the eyebrows) from the preprocessed feature points of each real-time image, and building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each real-time image from those feature points.
The feature vectors in this embodiment are generated as follows:
(1) As shown in FIG. 11, the two pupil points are removed from the feature points in the image, leaving 8 feature points for the eye region; the Euclidean distance between every pair of these points (the eye Euclidean distance values) is computed, and the 28 resulting values form the eye feature vector;
(2) As shown in FIG. 12, the two points to the left and right of the nose and the 4 inner-lip points on the left and right are removed, leaving 15 feature points for the mouth region, of which 14 come from the mouth and one from below the nose tip; the Euclidean distance between every pair of these points (the mouth Euclidean distance values) is computed, and the 105 resulting values form the mouth feature vector;
(3) As shown in FIG. 13, all eye points are removed, leaving 9 feature points for the eyebrow region: the 8 eyebrow-contour points plus one eye-corner point chosen as a reference; the Euclidean distance between every pair of these points (the eyebrow Euclidean distance values) is computed, and the 36 resulting values form the eyebrow feature vector.
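A compact way to read items (1) to (3): each region's feature vector is simply the Euclidean distance between every unordered pair of its points, so 8 eye points give 28 values, 15 mouth points give 105, and 9 eyebrow points give 36. A minimal sketch, assuming the points are already normalized (x, y) pairs:

    from itertools import combinations
    import numpy as np

    def pairwise_distance_vector(points):
        # Euclidean distance between every pair of feature points:
        # 8 eye points -> 28 values, 15 mouth points -> 105, 9 eyebrow points -> 36.
        pts = np.asarray(points, dtype=float)
        return np.array([np.linalg.norm(pts[a] - pts[b])
                         for a, b in combinations(range(len(pts)), 2)])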
Step 17: classifying the mouth, eye and eyebrow feature vectors of each real-time image with the action-pattern classifiers to obtain the action patterns of the mouth, eyes and eyebrows in that image. In this embodiment the accuracy of action-pattern recognition was tested; the results are shown in Table 2:
Table 2: Action-pattern recognition accuracy
Step 18: determining the expression pattern to which the student's expression in each real-time image belongs from the combination of the action patterns of the mouth, eyes and eyebrows in that image, using the mapping relation in Table 3:
Table 3: Mapping between action-pattern combinations and expression patterns
The expression patterns in Table 3 are defined following the idea of a classification-tree model, as follows:
As shown in FIG. 14, E denotes the eyes, with E1, E2 and E3 denoting normal, closed and glaring eyes respectively; M denotes the mouth, with M1, M2, M3 and M4 denoting a normal mouth, lips pursed in thinking or sadness, lips pursed in anger, and a grinning mouth respectively; B denotes the eyebrows, with B1, B2 and B3 denoting normal, frowning and raised eyebrows respectively. The eyes, whose action-pattern recognition rate is highest, form the first layer of the tree, the mouth forms the second layer, and the eyebrows are the leaves; the 36 leaves correspond to the 36 combinations of action patterns. When the 36 combinations are grouped into 7 expression patterns, the eye action pattern is given the greatest weight, i.e. the eye recognition result has the strongest influence on the expression-pattern result, the mouth action pattern comes second, and the eyebrow action pattern comes last; defining the expression patterns in this way improves the accuracy of expression recognition.
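Once the three action patterns of a frame are known, Step 18 reduces to a lookup over the 36 (eye, mouth, eyebrow) combinations. Table 3 itself is not reproduced above, so the entries below are purely hypothetical placeholders; only the lookup structure reflects the method.

    # Hypothetical fragment of the Table 3 mapping: keys are (eye, mouth, eyebrow)
    # action-pattern codes as in FIG. 14, values are expression patterns.  The real
    # assignments are those of Table 3 in the patent and are not reproduced here.
    EXPRESSION_TABLE = {
        ("E1", "M1", "B1"): "normal",      # placeholder entry
        ("E1", "M2", "B2"): "thinking",    # placeholder entry
        # ... remaining combinations as defined by Table 3
    }

    def classify_expression(eye_pattern, mouth_pattern, brow_pattern):
        # Map one frame's action-pattern combination to an expression pattern.
        return EXPRESSION_TABLE.get((eye_pattern, mouth_pattern, brow_pattern), "unknown")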
In this embodiment the expression-pattern recognition accuracy is shown in Table 4:
Table 4: Expression-pattern recognition accuracy

Claims (1)

1. A dynamic recognition method for facial expressions of students in MOOC courses, characterized by comprising the following steps:
Step 1: shooting the facial expressions of students during the learning process of an MOOC course to obtain videos of the students' facial expressions;
Step 2: acquiring a historical video of the students' facial expressions in the MOOC course;
Step 3: splitting the historical video of the students' facial expressions into frames to convert it into n historical still images containing the students' facial features;
Step 4: using the Face++ face recognition system to extract the feature points of the students' facial features from each historical still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 5: preprocessing the feature points of the students' facial features in each still image;
converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates, specifically:
Step 5.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C′_{i′k′} = (x′_{i′k′}, y′_{i′k′}) is converted into M′_{i′k′} = (x′_{i′k′}, y′_{i′k′}, 1)  (1),
where C′_{i′k′} is the pixel coordinate of the k′-th feature point in the i′-th historical image, M′_{i′k′} is the homogeneous coordinate of the k′-th feature point in the i′-th historical image, and k′ ∈ [1, 83];
Step 5.2: rotating the homogeneous coordinates by an angle θ′ so that the line through the left and right eye corners becomes horizontal, where the right corner point of the left eye in the i′-th historical still image is the first feature point, i.e. k′ = 1, with homogeneous coordinate M′_{i′1} = (x′_{i′1}, y′_{i′1}, 1), and the left corner point of the right eye is the second feature point, i.e. k′ = 2, with homogeneous coordinate M′_{i′2} = (x′_{i′2}, y′_{i′2}, 1), and defining the rotation matrix N′_1 as: (formula not reproduced here);
translating the rotated coordinates and defining the translation matrix N′_2 as: (formula not reproduced here),
where p′_1 = (x′_{i′2} - x′_{i′1}) + x′_{i′2} and q′_1 = (y′_{i′2} - y′_{i′1}) + y′_{i′2};
scaling the translated coordinates, letting L′ be the distance between the left and right sides of the nose bridge, and defining the scaling matrix N′_3 as: (formula not reproduced here);
Step 5.3: transforming the homogeneous coordinates M′_{i′k′} to obtain the transformed homogeneous coordinates M′*_{i′k′}, the transformation formula being:
M′*_{i′k′} = M′_{i′k′} × N′_1 × N′_2 × N′_3  (5)
Step 5.4: converting the transformed homogeneous coordinates M′*_{i′k′} back into the transformed pixel coordinates C′*_{i′k′} by the inverse process of formula (1);
Step 6: extracting the feature points of the key facial regions (the eyes, the mouth and the eyebrows) from the preprocessed feature points of each still image;
Step 7: building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each image from the feature points of the eyes, mouth and eyebrows in that image;
Step 8: predefining the eye action patterns, the mouth action patterns and the eyebrow action patterns;
Step 9: according to the action-pattern definitions, labelling the action pattern to which the eye feature vector, the mouth feature vector and the eyebrow feature vector of each still image belong;
Step 10: storing the eye, mouth and eyebrow feature vectors of each image in the feature-vector set of the action pattern with which each is labelled;
Step 11: training a Support Vector Machine (SVM) on the feature-vector set of each action pattern to obtain a classifier for each action pattern;
Step 12: acquiring a real-time video of the students' facial expressions in the MOOC course;
Step 13: splitting the real-time video of the students' facial expressions into frames to convert it into m real-time still images containing the students' facial features;
Step 14: using the Face++ API to extract the feature points of the students' facial features from each real-time still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 15: preprocessing the feature points of the students' facial features in each real-time still image;
converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates, specifically:
Step 15.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C_{ik} = (x_{ik}, y_{ik}) is converted into M_{ik} = (x_{ik}, y_{ik}, 1)  (6),
where C_{ik} is the pixel coordinate of the k-th feature point in the i-th real-time image, M_{ik} is the homogeneous coordinate of the k-th feature point in the i-th real-time image, and k ∈ [1, 83];
Step 15.2: rotating the obtained homogeneous coordinates by an angle θ so that the line through the left and right eye corners becomes horizontal, where the right corner point of the left eye in the i-th real-time image is the first feature point, i.e. k = 1, with homogeneous coordinate M_{i1} = (x_{i1}, y_{i1}, 1), and the left corner point of the right eye is the second feature point, i.e. k = 2, with homogeneous coordinate M_{i2} = (x_{i2}, y_{i2}, 1), and defining the rotation matrix N_1 as: (formula not reproduced here);
translating the rotated coordinates and defining the translation matrix N_2 as: (formula not reproduced here),
where p_1 = (x_{i2} - x_{i1}) + x_{i2} and q_1 = (y_{i2} - y_{i1}) + y_{i2};
scaling the translated coordinates, where L is the distance between the left and right sides of the nose bridge, and defining the scaling matrix N_3 as: (formula not reproduced here);
Step 15.3: transforming the homogeneous coordinates M_{ik} to obtain the transformed homogeneous coordinates M*_{ik}, the transformation formula being M*_{ik} = M_{ik} × N_1 × N_2 × N_3;
Step 15.4: converting the transformed homogeneous coordinates M*_{ik} back into the transformed pixel coordinates C*_{ik} by the inverse process of formula (6);
Step 16: extracting the feature points of the eyes, mouth and eyebrows from each preprocessed real-time image, and building an eye feature vector, a mouth feature vector and an eyebrow feature vector;
Step 17: classifying the mouth, eye and eyebrow feature vectors of each real-time image with the action-pattern classifiers to obtain the action patterns of the mouth, eyes and eyebrows in that image;
Step 18: determining the expression pattern to which the student's expression in each real-time image belongs from the combination of the action patterns of the mouth, eyes and eyebrows in that image;
the expression pattern being determined from the mapping relation in Table 1:
Table 1: Mapping between action-pattern combinations and expression patterns;
the eye, mouth and eyebrow feature vectors in Steps 7 and 16 being built by: computing the Euclidean distance between every pair of eye feature points (the eye Euclidean distance values), between every pair of mouth feature points (the mouth Euclidean distance values), and between every pair of eyebrow feature points (the eyebrow Euclidean distance values), and forming the eye feature vector, the mouth feature vector and the eyebrow feature vector from the eye, mouth and eyebrow Euclidean distance values respectively;
the eye action patterns being glaring, normal and closed eyes; the mouth action patterns being normal, lips pursed in thinking or sadness, lips pursed in anger, and grinning; and the eyebrow action patterns being frowning, normal and raised eyebrows.
CN201610453639.4A 2016-06-21 2016-06-21 Dynamic recognition method for facial expressions of students in MOOC courses Active CN106127139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610453639.4A CN106127139B (en) 2016-06-21 2016-06-21 Dynamic recognition method for facial expressions of students in MOOC courses


Publications (2)

Publication Number Publication Date
CN106127139A CN106127139A (en) 2016-11-16
CN106127139B true CN106127139B (en) 2019-06-25

Family

ID=57470165

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610453639.4A Active CN106127139B (en) 2016-06-21 2016-06-21 Dynamic recognition method for facial expressions of students in MOOC courses

Country Status (1)

Country Link
CN (1) CN106127139B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846949A (en) * 2017-03-07 2017-06-13 佛山市金蓝领教育科技有限公司 A kind of long-range Emotional Teaching system
CN106652605A (en) * 2017-03-07 2017-05-10 佛山市金蓝领教育科技有限公司 Remote emotion teaching method
CN107145326B (en) * 2017-03-28 2020-07-28 浙江大学 Music automatic playing system and method based on target facial expression collection
CN107133593A (en) * 2017-05-08 2017-09-05 湖南科乐坊教育科技股份有限公司 A kind of child's mood acquisition methods and system
CN107292778A (en) * 2017-05-19 2017-10-24 华中师范大学 A kind of cloud classroom learning evaluation method and its device based on cognitive emotion perception
CN107169902A (en) * 2017-06-02 2017-09-15 武汉纺织大学 The classroom teaching appraisal system of micro- Expression analysis based on artificial intelligence
CN109101103B (en) * 2017-06-21 2022-04-12 阿里巴巴集团控股有限公司 Blink detection method and device
CN107292289A (en) * 2017-07-17 2017-10-24 东北大学 Facial expression recognizing method based on video time sequence
CN108216254B (en) * 2018-01-10 2020-03-10 山东大学 Road anger emotion recognition method based on fusion of facial image and pulse information
CN108647710B (en) * 2018-04-28 2022-10-18 山东影响力智能科技有限公司 Video processing method and device, computer and storage medium
CN109344723A (en) * 2018-09-04 2019-02-15 四川文轩教育科技有限公司 A kind of student's monitoring method based on sighting distance algorithm
CN111382648A (en) * 2018-12-30 2020-07-07 广州市百果园信息技术有限公司 Method, device and equipment for detecting dynamic facial expression and storage medium
CN109961054A (en) * 2019-03-29 2019-07-02 山东大学 It is a kind of based on area-of-interest characteristic point movement anxiety, depression, angry facial expression recognition methods
CN110206330B (en) * 2019-06-10 2020-03-03 广东叠一网络科技有限公司 Campus floor intelligence protection system based on big data
CN110532977B (en) * 2019-09-02 2023-09-12 西南大学 Learning state determining method and device
CN112492389B (en) * 2019-09-12 2022-07-19 上海哔哩哔哩科技有限公司 Video pushing method, video playing method, computer device and storage medium
CN110879966A (en) * 2019-10-15 2020-03-13 杭州电子科技大学 Student class attendance comprehension degree evaluation method based on face recognition and image processing
CN110991277B (en) * 2019-11-20 2023-09-22 湖南检信智能科技有限公司 Multi-dimensional multi-task learning evaluation system based on deep learning
CN111209867A (en) * 2020-01-08 2020-05-29 上海商汤临港智能科技有限公司 Expression recognition method and device
CN111507241A (en) * 2020-04-14 2020-08-07 四川聚阳科技集团有限公司 Lightweight network classroom expression monitoring method
CN112528777A (en) * 2020-11-27 2021-03-19 富盛科技股份有限公司 Student facial expression recognition method and system used in classroom environment


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101415103A (en) * 2007-10-19 2009-04-22 英华达(南京)科技有限公司 Method for regulating video signal picture
WO2011155902A1 (en) * 2010-06-11 2011-12-15 National University Of Singapore General motion-based face recognition
CN102479388A (en) * 2010-11-22 2012-05-30 北京盛开互动科技有限公司 Expression interaction method based on face tracking and analysis
CN102945624A (en) * 2012-11-14 2013-02-27 南京航空航天大学 Intelligent video teaching system based on cloud calculation model and expression information feedback
CN104123749A (en) * 2014-07-23 2014-10-29 邢小月 Picture processing method and system
CN104732506A (en) * 2015-03-27 2015-06-24 浙江大学 Character picture color style converting method based on face semantic analysis
CN105608447A (en) * 2016-02-17 2016-05-25 陕西师范大学 Method for detecting human face smile expression depth convolution nerve network

Also Published As

Publication number Publication date
CN106127139A (en) 2016-11-16


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant