CN106127139B - Dynamic recognition method for facial expressions of students in MOOC courses - Google Patents

Dynamic recognition method for facial expressions of students in MOOC courses

Info

Publication number
CN106127139B
CN106127139B (application CN201610453639.4A)
Authority
CN
China
Prior art keywords
mouth
characteristic
eye
points
students
Prior art date
Legal status
Active
Application number
CN201610453639.4A
Other languages
Chinese (zh)
Other versions
CN106127139A (en)
Inventor
郦泽坤
苏航
陈美月
赵长宽
高克宁
Current Assignee
Northeastern University China
Original Assignee
Northeastern University China
Priority date
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201610453639.4A
Publication of CN106127139A
Application granted
Publication of CN106127139B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 - Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A dynamic recognition method for the facial expressions of students in MOOC courses, belonging to the field of image processing. The method splits video shot during an MOOC course into frames, extracts facial feature points from each frame, groups the feature points of the key facial regions into feature vectors, and trains expression-pattern classifiers; the classifiers are then used to classify the facial expressions of students in real-time video shot during the MOOC course. The invention redefines the six basic expressions around the facial-expression characteristics of students in the MOOC classroom, making the recognition results more useful for analysing the students' classroom state; expression patterns are obtained by combining action patterns, making expression-pattern recognition more accurate; and the action-pattern feature vectors built from Euclidean distances are low-dimensional and few in number, making expression recognition faster.

Description

Dynamic recognition method for facial expressions of students in MOOC courses
Technical Field
The invention belongs to the field of image processing, and particularly relates to a dynamic recognition method for facial expressions of students in MOOC courses.
Background
Massive Open Online Courses (MOOCs) are a course format that has developed rapidly in recent years; they have attracted strong attention from governments, universities and enterprises worldwide and have become an important force in promoting the transformation of higher education. MOOCs exploit the speed and convenience of video delivery to distribute teaching at scale, and introduce interactive exercises to compensate for the lack of teaching feedback caused by one-way video transmission. The feedback provided by interactive exercises is nevertheless still limited compared with traditional offline teaching: in an offline course the lecturer can read the students' facial expressions and ask questions to obtain feedback and adjust the teaching in time, which a MOOC course cannot yet do.
Expression recognition is regarded as a key technology for future affective human-computer interaction and has attracted research from many universities and research institutes at home and abroad. Many expression recognition methods aimed at standard expression databases such as JAFFE have already achieved high recognition rates. Applying expression recognition to MOOC courses would make it possible to obtain the students' classroom state in real time and adjust the class according to their reactions, but existing methods have the following shortcomings when applied to MOOC courses: 1. they assume that a posed expression is present on the face and ignore the diversity of expression components, so their practical value is limited; 2. the expression categories are not fine-grained enough: facial expressions are not limited to the 6 basic expressions; 3. the expressions are not scenario-specific: in a particular scenario facial expressions follow a tendency, i.e. some expressions appear with higher probability while others rarely appear.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a dynamic recognition method for facial expressions of students in MOOC courses.
The technical scheme of the invention is as follows:
A dynamic recognition method for facial expressions of students in MOOC courses comprises the following steps:
Step 1: shooting the facial expressions of students during the learning process of an MOOC course to obtain videos of the students' facial expressions;
Step 2: acquiring a historical video of the students' facial expressions in the MOOC course;
Step 3: splitting the historical video of the students' facial expressions into frames to convert it into n historical still images containing the students' facial features;
Step 4: using the Face++ face recognition system to extract the feature points of the students' facial features from each historical still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 5: preprocessing the feature points of the students' facial features in each historical still image, specifically: converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates;
Step 6: extracting the feature points of the key facial regions (the eyes, the mouth and the eyebrows) from the preprocessed feature points of each image;
Step 7: building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each image from the feature points of the eyes, mouth and eyebrows in that image;
Step 8: defining the eye action patterns as glaring, normal and closed eyes; the mouth action patterns as normal, lips pursed in thinking or sadness, lips pursed in anger, and grinning; and the eyebrow action patterns as frowning, normal and raised eyebrows;
Step 9: according to the action-pattern definitions, labelling the action pattern to which the eye feature vector, the mouth feature vector and the eyebrow feature vector of each image belong;
Step 10: storing the eye, mouth and eyebrow feature vectors of each image in the feature-vector set of the action pattern with which each is labelled;
Step 11: training a Support Vector Machine (SVM) on the feature-vector set of each action pattern to obtain a classifier for each action pattern;
Step 12: acquiring a real-time video of the students' facial expressions in the MOOC course;
Step 13: splitting the real-time video of the students' facial expressions into frames to convert it into m real-time still images containing the students' facial features;
Step 14: using the Face++ API to extract the feature points of the students' facial features from each real-time still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 15: preprocessing the feature points of the students' facial features in each real-time image, specifically: converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates;
Step 16: extracting the feature points of the eyes, mouth and eyebrows from each preprocessed real-time image, and building an eye feature vector, a mouth feature vector and an eyebrow feature vector;
Step 17: classifying the mouth, eye and eyebrow feature vectors of each real-time image with the action-pattern classifiers to obtain the action patterns of the mouth, eyes and eyebrows in that image;
Step 18: determining the expression pattern to which the student's expression in each real-time image belongs from the combination of the action patterns of the mouth, eyes and eyebrows in that image.
the method for establishing the eye feature vector, the mouth feature vector and the eyebrow feature vector in the steps 7 and 16 comprises the following steps:
respectively calculating Euclidean distances between every two points in the eye characteristic points, namely Euclidean distance values of the eye characteristic points, Euclidean distances between every two points in the mouth characteristic points, namely Euclidean distance values of the mouth characteristic points, and Euclidean distances between every two points in the eyebrow characteristic points, namely Euclidean distance values of the eyebrow characteristic points, and respectively forming an eye characteristic vector, a mouth characteristic vector and an eyebrow characteristic vector by using the Euclidean distance values of the eye characteristic points, the Euclidean distance values of the mouth characteristic points and the Euclidean distance values of the eyebrow characteristic points.
In Step 18, the expression pattern is determined from the mapping relation in Table 1:
Table 1: Mapping between action-pattern combinations and expression patterns
Advantageous effects: the dynamic recognition method for facial expressions of students in MOOC courses has the following advantages:
1. The six basic expressions are happiness, sadness, fear, anger, surprise and disgust, but fear and disgust appear with low probability in the MOOC classroom, and even when they do appear they are mostly unrelated to the course content; fear and disgust are therefore removed from the expression patterns, and thinking, chatting and normal, which occur more frequently, are added, making the recognition results more useful for the MOOC classroom;
2. The expression pattern is obtained from the combination of action patterns, which makes expression-pattern recognition more accurate;
3. The action-pattern feature vectors built from Euclidean distances are low-dimensional and few in number, which makes expression recognition faster.
Drawings
FIG. 1 is a flow chart of a method for dynamically identifying facial expressions of students in MOOC courses;
FIG. 2 is a schematic diagram of an expression recognition process in the MOOC course;
FIG. 3(a) is a schematic diagram of glaring in the eye action patterns;
FIG. 3(b) is a schematic diagram of normal eyes in the eye action patterns;
FIG. 3(c) is a schematic diagram of closed eyes in the eye action patterns;
FIG. 4(a) is a schematic diagram of a normal mouth in the mouth action patterns;
FIG. 4(b) is a schematic diagram of lips pursed in thinking or sadness in the mouth action patterns;
FIG. 4(c) is a schematic diagram of lips pursed in anger in the mouth action patterns;
FIG. 4(d) is a schematic diagram of grinning in the mouth action patterns;
FIG. 5(a) is a schematic diagram of frowning in the eyebrow action patterns;
FIG. 5(b) is a schematic diagram of normal eyebrows in the eyebrow action patterns;
FIG. 5(c) is a schematic diagram of raised eyebrows in the eyebrow action patterns;
FIG. 6 shows some of the still images captured from the video;
FIG. 7 is a schematic view of facial feature points;
FIG. 8 is a schematic diagram of facial feature point coordinates;
FIG. 9 is a schematic diagram of the rotation reference;
FIG. 10 is a schematic diagram of the translation and scaling reference;
FIG. 11 is a schematic view of eye feature points;
FIG. 12 is a schematic view of feature points of the mouth;
FIG. 13 is a schematic view of eyebrow feature points;
FIG. 14 is a schematic diagram of the classification-tree model, where E denotes the eyes and E1, E2 and E3 denote normal, closed and glaring eyes respectively; M denotes the mouth and M1, M2, M3 and M4 denote a normal mouth, lips pursed in thinking or sadness, lips pursed in anger, and a grinning mouth respectively; B denotes the eyebrows and B1, B2 and B3 denote normal, frowning and raised eyebrows respectively.
Detailed Description
An embodiment of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, the dynamic recognition method for facial expressions of students in an MOOC course according to this embodiment comprises the following steps:
Step 1: shooting the facial expressions of students during the learning process of an MOOC course to obtain videos of the students' facial expressions;
Step 2: acquiring a historical video of the students' facial expressions in the MOOC course;
Step 3: splitting the historical video of the students' facial expressions into frames to convert it into n historical still images containing the students' facial features. Starting from the first still frame, one image is kept every 5 frames until the whole historical video has been processed; this sampling loses no expressions and effectively avoids data redundancy. The historical images referred to below in this embodiment are the historical still images captured from the historical video.
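For illustration, the frame sampling of Step 3 can be sketched as follows; OpenCV is an assumption here, since the patent does not name a particular video library:

    import cv2

    def sample_frames(video_path, step=5):
        # Split a lecture video into still images, keeping one frame out of every
        # `step` frames, as in Step 3 of the embodiment (every 5th frame).
        capture = cv2.VideoCapture(video_path)
        frames, index = [], 0
        while True:
            ok, frame = capture.read()
            if not ok:                      # end of the video
                break
            if index % step == 0:           # keep the 1st, 6th, 11th, ... frame
                frames.append(frame)
            index += 1
        capture.release()
        return frames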
Step 4: using the Face++ face recognition system to extract the 83 feature points of the students' facial features from each historical image in turn, and storing the extracted feature points as pixel-coordinate vectors;
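The patent names Face++ as the landmark extractor but does not show the call itself; the sketch below assumes the public Face++ v3 detect endpoint and its return_landmark parameter (return_landmark=1 requests the 83-point landmark set) and flattens the returned landmark dictionary into a pixel-coordinate array. The endpoint, parameter names and response layout are assumptions about the Face++ API, not text from the patent.

    import requests
    import numpy as np

    FACEPP_DETECT_URL = "https://api-us.faceplusplus.com/facepp/v3/detect"  # assumed endpoint

    def extract_landmarks(image_path, api_key, api_secret):
        # Ask Face++ for the 83-point landmark set of the first detected face
        # and return it as an (83, 2) array of pixel coordinates.
        with open(image_path, "rb") as f:
            response = requests.post(
                FACEPP_DETECT_URL,
                data={"api_key": api_key, "api_secret": api_secret, "return_landmark": 1},
                files={"image_file": f},
            )
        faces = response.json().get("faces", [])
        if not faces:
            return None                               # no face found in this frame
        landmark = faces[0]["landmark"]               # dict: point name -> {"x": ..., "y": ...}
        names = sorted(landmark)                      # fix an ordering so vectors are comparable
        return np.array([[landmark[n]["x"], landmark[n]["y"]] for n in names])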
Step 5: because the faces in the historical images differ in size and coordinate scale, the action patterns cannot be described directly by the Euclidean distances between feature points; the feature points of the students' facial features are therefore preprocessed so that the faces are kept horizontal in the images and can be described in a uniform coordinate system, specifically:
Step 5.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C′_{i′k′} = (x′_{i′k′}, y′_{i′k′}) is converted into M′_{i′k′} = (x′_{i′k′}, y′_{i′k′}, 1)  (1),
where C′_{i′k′} is the pixel coordinate of the k′-th feature point in the i′-th historical image, M′_{i′k′} is the homogeneous coordinate of the k′-th feature point in the i′-th historical image, and k′ ∈ [1, 83];
Step 5.2: rotating the homogeneous coordinates by an angle θ′ so that the line through the left and right eye corners becomes horizontal. Let the right corner point of the left eye in the i′-th historical still image be the first feature point (k′ = 1) with homogeneous coordinate M′_{i′1} = (x′_{i′1}, y′_{i′1}, 1), and the left corner point of the right eye be the second feature point (k′ = 2) with homogeneous coordinate M′_{i′2} = (x′_{i′2}, y′_{i′2}, 1); the rotation matrix N′_1 is then defined as: (formula not reproduced here);
translating the rotated coordinates, with the translation matrix N′_2 defined as: (formula not reproduced here),
where p′_1 = (x′_{i′2} - x′_{i′1}) + x′_{i′2} and q′_1 = (y′_{i′2} - y′_{i′1}) + y′_{i′2};
scaling the translated coordinates, with L′ the distance between the left and right sides of the nose bridge and the scaling matrix N′_3 defined as: (formula not reproduced here);
Step 5.3: transforming the homogeneous coordinates M′_{i′k′} to obtain the transformed homogeneous coordinates M′*_{i′k′}, the transformation formula being M′*_{i′k′} = M′_{i′k′} × N′_1 × N′_2 × N′_3  (5);
Step 5.4: converting the transformed homogeneous coordinates M′*_{i′k′} back into the transformed pixel coordinates C′*_{i′k′} by the inverse process of formula (1).
Step 6: because the only facial regions that visibly change with expression are the mouth, the eyes and the eyebrows, while the nose does not change with expression, the invention takes the mouth, eyes and eyebrows as the key facial regions and extracts their feature points from the preprocessed feature points of each historical image;
Step 7: building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each historical image from the feature points of the eyes, mouth and eyebrows in that image, as follows:
computing the Euclidean distance between every pair of eye feature points (the eye Euclidean distance values), between every pair of mouth feature points (the mouth Euclidean distance values), and between every pair of eyebrow feature points (the eyebrow Euclidean distance values), and forming the eye feature vector, the mouth feature vector and the eyebrow feature vector from the eye, mouth and eyebrow Euclidean distance values respectively;
and 8: predefining an eye action mode, a mouth action mode and an eyebrow action mode; as shown in fig. 3, the eye movement patterns are the glaring eye shown in fig. 3(a), the normal eye shown in fig. 3(b), and the closed eye shown in fig. 3(c), respectively; as shown in fig. 4, the mouth action patterns are normal as shown in fig. 4(a), sipping biased to thinking and sadness as shown in fig. 4(b), sipping biased to anger as shown in fig. 4(c), and grinning as shown in fig. 4(d), respectively; as shown in FIG. 5, the eyebrow movement patterns are frown as shown in FIG. 5(a), normal as shown in FIG. 5(b), and eyebrow plucking as shown in FIG. 5(c), respectively.
Step 9: according to the action-pattern definitions, labelling the action pattern to which the eye feature vector, the mouth feature vector and the eyebrow feature vector of each historical image belong;
Step 10: storing the eye, mouth and eyebrow feature vectors of each historical image in the feature-vector set of the action pattern with which each is labelled;
Step 11: programming with the LIBSVM toolbox in MATLAB, training a Support Vector Machine (SVM) on the feature-vector set of each action pattern to obtain the action-pattern classifiers (an illustrative sketch follows);
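The embodiment trains its classifiers with the LIBSVM toolbox in MATLAB; as an illustrative substitute, the sketch below uses scikit-learn's SVC in Python to train one classifier per facial region on the labelled feature-vector sets of Steps 7-10. The RBF kernel and parameters are assumptions, not values taken from the patent.

    import numpy as np
    from sklearn.svm import SVC

    def train_action_classifier(feature_vectors, labels):
        # feature_vectors: one Euclidean-distance vector per frame (Steps 7-10).
        # labels: the action-pattern label of each frame, e.g. "E1", "E2", "E3".
        X = np.asarray(feature_vectors)
        y = np.asarray(labels)
        clf = SVC(kernel="rbf", C=1.0)   # stand-in for the MATLAB LIBSVM training
        clf.fit(X, y)
        return clf

    # One classifier per facial region, as in Step 11:
    # eye_clf = train_action_classifier(eye_vectors, eye_labels)
    # mouth_clf = train_action_classifier(mouth_vectors, mouth_labels)
    # brow_clf = train_action_classifier(brow_vectors, brow_labels)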
Step 12: acquiring a real-time video of the students' facial expressions in the MOOC course;
Step 13: splitting the real-time video of the students' facial expressions into frames to convert it into m real-time still images containing the students' facial features. Starting from the first real-time still frame, one image is kept every 5 frames until the whole real-time video has been processed; this sampling loses no expressions and effectively avoids data redundancy. FIG. 6 shows some of the real-time still images captured from the real-time video in this embodiment.
Step 14: as shown in FIG. 7, using the Face++ face recognition system to extract the 83 feature points of the students' facial features from each real-time image in turn, and storing the extracted feature points as pixel-coordinate vectors, as shown in FIG. 8;
Step 15: because the faces in the real-time images differ in size and coordinate scale, the action patterns cannot be described directly by the Euclidean distances between feature points; the feature points of the students' facial features are therefore preprocessed so that the faces are kept horizontal in the images and can be described in a uniform coordinate system, specifically:
Step 15.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C_{ik} = (x_{ik}, y_{ik}) is converted into M_{ik} = (x_{ik}, y_{ik}, 1)  (6),
where C_{ik} is the pixel coordinate of the k-th feature point in the i-th real-time image, M_{ik} is the homogeneous coordinate of the k-th feature point in the i-th real-time image, and k ∈ [1, 83];
Step 15.2: as shown in FIG. 9, rotating the homogeneous coordinates by an angle θ so that the line through the left and right eye corners becomes horizontal. Let the right corner point of the left eye in the i-th real-time image be the first feature point (k = 1) with homogeneous coordinate M_{i1} = (x_{i1}, y_{i1}, 1), and the left corner point of the right eye be the second feature point (k = 2) with homogeneous coordinate M_{i2} = (x_{i2}, y_{i2}, 1); the rotation matrix N_1 is then defined as: (formula not reproduced here);
translating the rotated coordinates, with the translated origin and coordinate axes as shown in FIG. 10, and defining the translation matrix N_2 as: (formula not reproduced here),
where p_1 = (x_{i2} - x_{i1}) + x_{i2} and q_1 = (y_{i2} - y_{i1}) + y_{i2};
scaling the translated coordinates, with L the distance between the left and right sides of the nose bridge (as shown in FIG. 10) and the scaling matrix N_3 defined as: (formula not reproduced here);
Step 15.3: transforming the homogeneous coordinates M_{ik} to obtain the transformed homogeneous coordinates M*_{ik}, the transformation formula being M*_{ik} = M_{ik} × N_1 × N_2 × N_3;
Step 15.4: converting the transformed homogeneous coordinates M*_{ik} back into the transformed pixel coordinates C*_{ik} by the inverse process of formula (6).
Step 16: extracting the feature points of the key facial regions (the eyes, the mouth and the eyebrows) from the preprocessed feature points of each real-time image, and building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each real-time image from those feature points.
The feature vectors in this embodiment are generated as follows:
(1) As shown in FIG. 11, the two pupil points are removed from the feature points in the image, leaving 8 feature points for the eye region; the Euclidean distance between every pair of these points (the eye Euclidean distance values) is computed, and the 28 resulting values form the eye feature vector;
(2) As shown in FIG. 12, the two points to the left and right of the nose and the 4 inner-lip points on the left and right are removed, leaving 15 feature points for the mouth region, of which 14 come from the mouth and one from below the nose tip; the Euclidean distance between every pair of these points (the mouth Euclidean distance values) is computed, and the 105 resulting values form the mouth feature vector;
(3) As shown in FIG. 13, all eye points are removed, leaving 9 feature points for the eyebrow region: the 8 eyebrow-contour points plus one eye-corner point chosen as a reference; the Euclidean distance between every pair of these points (the eyebrow Euclidean distance values) is computed, and the 36 resulting values form the eyebrow feature vector.
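A compact way to read items (1) to (3): each region's feature vector is simply the Euclidean distance between every unordered pair of its points, so 8 eye points give 28 values, 15 mouth points give 105, and 9 eyebrow points give 36. A minimal sketch, assuming the points are already normalized (x, y) pairs:

    from itertools import combinations
    import numpy as np

    def pairwise_distance_vector(points):
        # Euclidean distance between every pair of feature points:
        # 8 eye points -> 28 values, 15 mouth points -> 105, 9 eyebrow points -> 36.
        pts = np.asarray(points, dtype=float)
        return np.array([np.linalg.norm(pts[a] - pts[b])
                         for a, b in combinations(range(len(pts)), 2)])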
Step 17: classifying the mouth, eye and eyebrow feature vectors of each real-time image with the action-pattern classifiers to obtain the action patterns of the mouth, eyes and eyebrows in that image. In this embodiment the accuracy of action-pattern recognition was tested; the results are shown in Table 2:
Table 2: Action-pattern recognition accuracy
Step 18: determining the expression pattern to which the student's expression in each real-time image belongs from the combination of the action patterns of the mouth, eyes and eyebrows in that image, using the mapping relation in Table 3:
Table 3: Mapping between action-pattern combinations and expression patterns
The expression patterns in Table 3 are defined following the idea of a classification-tree model, as follows:
As shown in FIG. 14, E denotes the eyes, with E1, E2 and E3 denoting normal, closed and glaring eyes respectively; M denotes the mouth, with M1, M2, M3 and M4 denoting a normal mouth, lips pursed in thinking or sadness, lips pursed in anger, and a grinning mouth respectively; B denotes the eyebrows, with B1, B2 and B3 denoting normal, frowning and raised eyebrows respectively. The eyes, whose action-pattern recognition rate is highest, form the first layer of the tree, the mouth forms the second layer, and the eyebrows are the leaves; the 36 leaves correspond to the 36 combinations of action patterns. When the 36 combinations are grouped into 7 expression patterns, the eye action pattern is given the greatest weight, i.e. the eye recognition result has the strongest influence on the expression-pattern result, the mouth action pattern comes second, and the eyebrow action pattern comes last; defining the expression patterns in this way improves the accuracy of expression recognition.
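Once the three action patterns of a frame are known, Step 18 reduces to a lookup over the 36 (eye, mouth, eyebrow) combinations. Table 3 itself is not reproduced above, so the entries below are purely hypothetical placeholders; only the lookup structure reflects the method.

    # Hypothetical fragment of the Table 3 mapping: keys are (eye, mouth, eyebrow)
    # action-pattern codes as in FIG. 14, values are expression patterns.  The real
    # assignments are those of Table 3 in the patent and are not reproduced here.
    EXPRESSION_TABLE = {
        ("E1", "M1", "B1"): "normal",      # placeholder entry
        ("E1", "M2", "B2"): "thinking",    # placeholder entry
        # ... remaining combinations as defined by Table 3
    }

    def classify_expression(eye_pattern, mouth_pattern, brow_pattern):
        # Map one frame's action-pattern combination to an expression pattern.
        return EXPRESSION_TABLE.get((eye_pattern, mouth_pattern, brow_pattern), "unknown")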
In this embodiment the expression-pattern recognition accuracy is shown in Table 4:
Table 4: Expression-pattern recognition accuracy

Claims (1)

1. A dynamic recognition method for facial expressions of students in MOOC courses, characterized by comprising the following steps:
Step 1: shooting the facial expressions of students during the learning process of an MOOC course to obtain videos of the students' facial expressions;
Step 2: acquiring a historical video of the students' facial expressions in the MOOC course;
Step 3: splitting the historical video of the students' facial expressions into frames to convert it into n historical still images containing the students' facial features;
Step 4: using the Face++ face recognition system to extract the feature points of the students' facial features from each historical still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 5: preprocessing the feature points of the students' facial features in each still image;
converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates, specifically:
Step 5.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C′_{i′k′} = (x′_{i′k′}, y′_{i′k′}) is converted into M′_{i′k′} = (x′_{i′k′}, y′_{i′k′}, 1)  (1),
where C′_{i′k′} is the pixel coordinate of the k′-th feature point in the i′-th historical image, M′_{i′k′} is the homogeneous coordinate of the k′-th feature point in the i′-th historical image, and k′ ∈ [1, 83];
Step 5.2: rotating the homogeneous coordinates by an angle θ′ so that the line through the left and right eye corners becomes horizontal, where the right corner point of the left eye in the i′-th historical still image is the first feature point, i.e. k′ = 1, with homogeneous coordinate M′_{i′1} = (x′_{i′1}, y′_{i′1}, 1), and the left corner point of the right eye is the second feature point, i.e. k′ = 2, with homogeneous coordinate M′_{i′2} = (x′_{i′2}, y′_{i′2}, 1), and defining the rotation matrix N′_1 as: (formula not reproduced here);
translating the rotated coordinates and defining the translation matrix N′_2 as: (formula not reproduced here),
where p′_1 = (x′_{i′2} - x′_{i′1}) + x′_{i′2} and q′_1 = (y′_{i′2} - y′_{i′1}) + y′_{i′2};
scaling the translated coordinates, letting L′ be the distance between the left and right sides of the nose bridge, and defining the scaling matrix N′_3 as: (formula not reproduced here);
Step 5.3: transforming the homogeneous coordinates M′_{i′k′} to obtain the transformed homogeneous coordinates M′*_{i′k′}, the transformation formula being:
M′*_{i′k′} = M′_{i′k′} × N′_1 × N′_2 × N′_3  (5)
Step 5.4: converting the transformed homogeneous coordinates M′*_{i′k′} back into the transformed pixel coordinates C′*_{i′k′} by the inverse process of formula (1);
Step 6: extracting the feature points of the key facial regions (the eyes, the mouth and the eyebrows) from the preprocessed feature points of each still image;
Step 7: building an eye feature vector, a mouth feature vector and an eyebrow feature vector for each image from the feature points of the eyes, mouth and eyebrows in that image;
Step 8: predefining the eye action patterns, the mouth action patterns and the eyebrow action patterns;
Step 9: according to the action-pattern definitions, labelling the action pattern to which the eye feature vector, the mouth feature vector and the eyebrow feature vector of each still image belong;
Step 10: storing the eye, mouth and eyebrow feature vectors of each image in the feature-vector set of the action pattern with which each is labelled;
Step 11: training a Support Vector Machine (SVM) on the feature-vector set of each action pattern to obtain a classifier for each action pattern;
Step 12: acquiring a real-time video of the students' facial expressions in the MOOC course;
Step 13: splitting the real-time video of the students' facial expressions into frames to convert it into m real-time still images containing the students' facial features;
Step 14: using the Face++ API to extract the feature points of the students' facial features from each real-time still image in turn, and storing the extracted feature points as pixel-coordinate vectors;
Step 15: preprocessing the feature points of the students' facial features in each real-time still image;
converting the pixel coordinates of the feature points into homogeneous coordinates, applying rotation, translation and scaling transformations to the homogeneous coordinates in turn, and converting them back into pixel coordinates, specifically:
Step 15.1: converting the 2-dimensional pixel coordinates into 3-dimensional homogeneous coordinates, i.e. C_{ik} = (x_{ik}, y_{ik}) is converted into M_{ik} = (x_{ik}, y_{ik}, 1)  (6),
where C_{ik} is the pixel coordinate of the k-th feature point in the i-th real-time image, M_{ik} is the homogeneous coordinate of the k-th feature point in the i-th real-time image, and k ∈ [1, 83];
Step 15.2: rotating the obtained homogeneous coordinates by an angle θ so that the line through the left and right eye corners becomes horizontal, where the right corner point of the left eye in the i-th real-time image is the first feature point, i.e. k = 1, with homogeneous coordinate M_{i1} = (x_{i1}, y_{i1}, 1), and the left corner point of the right eye is the second feature point, i.e. k = 2, with homogeneous coordinate M_{i2} = (x_{i2}, y_{i2}, 1), and defining the rotation matrix N_1 as: (formula not reproduced here);
translating the rotated coordinates and defining the translation matrix N_2 as: (formula not reproduced here),
where p_1 = (x_{i2} - x_{i1}) + x_{i2} and q_1 = (y_{i2} - y_{i1}) + y_{i2};
scaling the translated coordinates, where L is the distance between the left and right sides of the nose bridge, and defining the scaling matrix N_3 as: (formula not reproduced here);
Step 15.3: transforming the homogeneous coordinates M_{ik} to obtain the transformed homogeneous coordinates M*_{ik}, the transformation formula being M*_{ik} = M_{ik} × N_1 × N_2 × N_3;
Step 15.4: converting the transformed homogeneous coordinates M*_{ik} back into the transformed pixel coordinates C*_{ik} by the inverse process of formula (6);
Step 16: extracting the feature points of the eyes, mouth and eyebrows from each preprocessed real-time image, and building an eye feature vector, a mouth feature vector and an eyebrow feature vector;
Step 17: classifying the mouth, eye and eyebrow feature vectors of each real-time image with the action-pattern classifiers to obtain the action patterns of the mouth, eyes and eyebrows in that image;
Step 18: determining the expression pattern to which the student's expression in each real-time image belongs from the combination of the action patterns of the mouth, eyes and eyebrows in that image;
the expression pattern being determined from the mapping relation in Table 1:
Table 1: Mapping between action-pattern combinations and expression patterns;
the eye, mouth and eyebrow feature vectors in Steps 7 and 16 being built by: computing the Euclidean distance between every pair of eye feature points (the eye Euclidean distance values), between every pair of mouth feature points (the mouth Euclidean distance values), and between every pair of eyebrow feature points (the eyebrow Euclidean distance values), and forming the eye feature vector, the mouth feature vector and the eyebrow feature vector from the eye, mouth and eyebrow Euclidean distance values respectively;
the eye action patterns being glaring, normal and closed eyes; the mouth action patterns being normal, lips pursed in thinking or sadness, lips pursed in anger, and grinning; and the eyebrow action patterns being frowning, normal and raised eyebrows.
CN201610453639.4A 2016-06-21 2016-06-21 Dynamic recognition method for facial expressions of students in MOOC courses Active CN106127139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610453639.4A CN106127139B (en) 2016-06-21 2016-06-21 Dynamic recognition method for facial expressions of students in MOOC courses


Publications (2)

Publication Number Publication Date
CN106127139A CN106127139A (en) 2016-11-16
CN106127139B true CN106127139B (en) 2019-06-25

Family

ID=57470165

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610453639.4A Active CN106127139B (en) 2016-06-21 2016-06-21 Dynamic recognition method for facial expressions of students in MOOC courses

Country Status (1)

Country Link
CN (1) CN106127139B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846949A (en) * 2017-03-07 2017-06-13 佛山市金蓝领教育科技有限公司 A kind of long-range Emotional Teaching system
CN106652605A (en) * 2017-03-07 2017-05-10 佛山市金蓝领教育科技有限公司 Remote emotion teaching method
CN107145326B (en) * 2017-03-28 2020-07-28 浙江大学 Music automatic playing system and method based on target facial expression collection
CN107133593A (en) * 2017-05-08 2017-09-05 湖南科乐坊教育科技股份有限公司 A kind of child's mood acquisition methods and system
CN107292778A (en) * 2017-05-19 2017-10-24 华中师范大学 A kind of cloud classroom learning evaluation method and its device based on cognitive emotion perception
CN107169902A (en) * 2017-06-02 2017-09-15 武汉纺织大学 The classroom teaching appraisal system of micro- Expression analysis based on artificial intelligence
CN109101103B (en) * 2017-06-21 2022-04-12 阿里巴巴集团控股有限公司 Blink detection method and device
CN107292289A (en) * 2017-07-17 2017-10-24 东北大学 Facial expression recognizing method based on video time sequence
CN108216254B (en) * 2018-01-10 2020-03-10 山东大学 Road anger emotion recognition method based on fusion of facial image and pulse information
CN108647710B (en) * 2018-04-28 2022-10-18 山东影响力智能科技有限公司 Video processing method and device, computer and storage medium
CN109344723A (en) * 2018-09-04 2019-02-15 四川文轩教育科技有限公司 A kind of student's monitoring method based on sighting distance algorithm
CN111382648A (en) * 2018-12-30 2020-07-07 广州市百果园信息技术有限公司 Method, device and equipment for detecting dynamic facial expression and storage medium
CN109961054A (en) * 2019-03-29 2019-07-02 山东大学 It is a kind of based on area-of-interest characteristic point movement anxiety, depression, angry facial expression recognition methods
CN110206330B (en) * 2019-06-10 2020-03-03 广东叠一网络科技有限公司 Campus floor intelligence protection system based on big data
CN110532977B (en) * 2019-09-02 2023-09-12 西南大学 Learning state determining method and device
CN112492389B (en) * 2019-09-12 2022-07-19 上海哔哩哔哩科技有限公司 Video pushing method, video playing method, computer device and storage medium
CN110879966A (en) * 2019-10-15 2020-03-13 杭州电子科技大学 Student class attendance comprehension degree evaluation method based on face recognition and image processing
CN110991277B (en) * 2019-11-20 2023-09-22 湖南检信智能科技有限公司 Multi-dimensional multi-task learning evaluation system based on deep learning
CN111209867A (en) * 2020-01-08 2020-05-29 上海商汤临港智能科技有限公司 Expression recognition method and device
CN111507241A (en) * 2020-04-14 2020-08-07 四川聚阳科技集团有限公司 Lightweight network classroom expression monitoring method
CN112528777A (en) * 2020-11-27 2021-03-19 富盛科技股份有限公司 Student facial expression recognition method and system used in classroom environment


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101415103A (en) * 2007-10-19 2009-04-22 英华达(南京)科技有限公司 Method for regulating video signal picture
WO2011155902A1 (en) * 2010-06-11 2011-12-15 National University Of Singapore General motion-based face recognition
CN102479388A (en) * 2010-11-22 2012-05-30 北京盛开互动科技有限公司 Expression interaction method based on face tracking and analysis
CN102945624A (en) * 2012-11-14 2013-02-27 南京航空航天大学 Intelligent video teaching system based on cloud calculation model and expression information feedback
CN104123749A (en) * 2014-07-23 2014-10-29 邢小月 Picture processing method and system
CN104732506A (en) * 2015-03-27 2015-06-24 浙江大学 Character picture color style converting method based on face semantic analysis
CN105608447A (en) * 2016-02-17 2016-05-25 陕西师范大学 Method for detecting human face smile expression depth convolution nerve network

Also Published As

Publication number Publication date
CN106127139A (en) 2016-11-16


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant