CN115050075A

CN115050075A - Cross-granularity interactive learning micro-expression image labeling method and device

Info

Publication number: CN115050075A
Application number: CN202210736803.8A
Authority: CN
Inventors: 刘海; 张昭理; 周启云; 石佛波; 朱俊艳; 宋云霄; 刘婷婷; 杨兵
Original assignee: Hubei University; Central China Normal University
Current assignee: Hubei University; Central China Normal University
Priority date: 2022-06-27
Filing date: 2022-06-27
Publication date: 2022-09-13

Abstract

The invention provides a micro-expression image labeling method and device for cross-granularity interactive learning, which relate to the technical field of image processing and comprise the following steps: acquiring a micro-expression image sequence to be marked; acquiring a preset number of marked micro expression images, and inputting the marked micro expression images and a micro expression image sequence to be marked into a pre-trained feature extractor model; labeling the category of each micro expression to be labeled; acquiring a standard confidence score corresponding to each micro expression category; obtaining the confidence score of the recognized micro expression, comparing the confidence score with the standard confidence score of the micro expression of the corresponding category, and outputting the micro expression of which the confidence score is greater than or equal to the standard confidence score; and updating the micro expression image sequence to be labeled until all the micro expressions are labeled and outputting the labeled micro expression image set. The method and the device have the advantages that the micro-expressions of the students can be efficiently and accurately automatically marked in a teaching scene, the subjectivity of manual marking and the ambiguity of the collected micro-expressions are avoided, and a large amount of manpower and material resources are saved.

Description

Cross-granularity interactive learning micro-expression image labeling method and device

Technical Field

The invention relates to the technical field of image processing, in particular to a micro-expression image labeling method and device for cross-granularity interactive learning.

Background

In order to study the classroom performance states of students in classroom concentration, entrance degree, liveness and the like in a teaching scene, the students in classroom performance states are generally analyzed and studied by collecting video data of the students in the teaching scene, so that the expression states of the students in classroom are analyzed according to the expressions of the students. The micro-expression of the students in the classroom is usually weak facial activity with extremely short duration, when the students try to hide the true emotion of the inner heart, the expression is unconsciously generated, and compared with other data information, the micro-expression of the students can reflect the true emotional state of the students in the classroom.

At present, the micro-expression data of students in a teaching scene are labeled and analyzed in a manual mode, the manual labeling can cause certain subjectivity in the judgment of the type of expressions, and the results of identifying the same micro-expression data by different people are possibly different, so that ambiguity is easily generated in data processing; the subjectivity of the label and the ambiguity of the expression will seriously affect the accuracy of the analysis and research on the learning state of the student; in addition, with the rapid development of information technology, the educational big data is massive in daily teaching activities, if a traditional manual labeling method is adopted, a large amount of manual labor is consumed, and the cost of paid time is extremely expensive.

Disclosure of Invention

The invention provides a micro expression image labeling method for cross-granularity interactive learning, which is used for solving the defect that manual labeling in the prior art cannot accurately identify the category of each student micro expression in a large amount of expression data, realizing the automatic labeling of the student micro expressions in a teaching scene efficiently and accurately, effectively avoiding the subjectivity of the manual labeling and the ambiguity of the acquired micro expressions on one hand, and saving a large amount of manpower and material resources on the other hand.

The invention provides a micro-expression image labeling method for cross-granularity interactive learning, which comprises the following steps of:

s1, acquiring a micro expression image sequence to be annotated;

s2, acquiring a preset number of marked micro expression images, and inputting the marked micro expression images and the micro expression image sequence to be marked into a pre-trained feature extractor model;

marking the category corresponding to each micro expression in the micro expression image sequence to be marked through the characteristic extractor model; acquiring a standard confidence score corresponding to each micro-expression category based on the marked micro-expression image;

s3, obtaining the confidence score of the micro expression recognized in the micro expression image sequence to be labeled and comparing the confidence score with the standard confidence score of the micro expression of the corresponding category, if the confidence score is more than or equal to the standard confidence score, adding the micro expression more than or equal to the standard confidence score into the labeled micro expression image set;

and S4, updating the micro expression image sequence to be labeled until all micro expressions are labeled and then outputting a labeled micro expression image set.

According to the micro-expression image annotation method provided by the invention, before the step S1, the method comprises the following steps:

collecting video data of students in a classroom;

preprocessing the video data, and converting the video data into image data of each frame;

carrying out face detection on the image data to generate a plurality of face images;

and storing a plurality of face images as the micro-expression image sequence to be labeled.

According to the micro-expression image labeling method provided by the invention, training the feature extractor model comprises the following steps:

inputting the marked micro expression image and the micro expression image sequence to be marked into a feature extractor model after performing a data enhancement strategy:

and acquiring the category corresponding to the marked micro expression image, acquiring the standard confidence score corresponding to each micro expression category, and outputting the trained feature extractor model.

selecting any image from the micro-expression image sequence to be marked, carrying out a strong enhancement strategy and a weak enhancement strategy, and obtaining a corresponding strong enhancement image u _s And weakly enhanced image u _w (ii) a Selecting any image from the marked micro-expression image, performing weak enhancement strategy, and obtaining a weak enhancement image s _w ；

Will strongly enhance the image u _s And weakly enhanced image u _w 、s _w Respectively inputting the images into the feature extractor, respectively acquiring corresponding image features, and judging the category of the micro expression;

based on a plurality of weakly enhanced images u _w Outputting the category probability distribution corresponding to each micro-expression category through the feature extractor, comparing the maximum value of the category probability distribution with a preset threshold value, and taking the micro-expression categories which are more than or equal to the preset threshold value as pseudo categories of the weakly enhanced image; obtaining a plurality of strongly enhanced images u _s The cross entropy of the micro-expression class and the pseudo class;

based on a plurality of weakly enhanced images s _w The micro-expression categories are output through the feature extractor, category probability distribution corresponding to each micro-expression category is output, and cross entropy between the output category probability distribution and the original category probability distribution of the marked micro-expression images is obtained.

According to the micro expression image labeling method provided by the invention, the confidence score of the micro expression recognized in the micro expression image sequence to be labeled is compared with the standard confidence score of the micro expression of the corresponding category, and the method comprises the following steps:

the probability distribution of the first class for obtaining the image data in the micro-expression image sequence to be labeled is

Obtaining a second class probability distribution of the labeled micro-expression dataset as

For the same micro-expression category, obtaining the average probability distribution of the first category probability distribution and the second category probability distribution

Will be provided with

Adaptive fraction T with preset _c Make a comparison if

Judging that the confidence score of the corresponding micro expression is more than or equal to the standard confidence score, updating the micro expression image sequence to be labeled, and adding the micro expressions which are more than or equal to the standard confidence score in the micro expression image sequence to be labeled into the labeled micro expression image set;

if it is

And judging that the confidence score of the corresponding micro expression is smaller than the standard confidence score, adding the corresponding micro expression into a retraining image set, and using the retraining image set for retraining the feature extractor model.

According to the micro-expression image labeling method provided by the invention, the feature extractor model comprises a fine-grained feature extractor model and a coarse-grained feature extractor model, and the training of the feature extractor model comprises the following steps:

acquiring a neutral expression image and other expression images of the same target to be identified in the micro-expression image data; acquiring the identity characteristics of the neutral expression image, and acquiring the identity characteristics and expression category characteristics of other expression images through an encoder;

performing expression reconstruction by combining the identity characteristics of the neutral expression image and the expression category characteristics of the other expression images through a decoder;

the encoder and the decoder perform counterstudy, and the encoder classifies the micro-expression images of different targets to be recognized according to the identity characteristics, so that the difference between the micro-expression category distributions of the different targets to be recognized is minimum;

and analyzing the reconstructed expression image through an expression classifier, acquiring corresponding expression category characteristics, and outputting category probability distribution.

According to the micro-expression image labeling method provided by the invention, the training of the feature extractor model comprises the following steps:

anchor the micro-expression x _anchor And the macro expression positive example x with the same expression category as the micro expression anchor _posi And micro-expression counter-example x of different expression categories from micro-expression _nega Form a triplet (x) _anchor ，x _nega ，x _posi ) Input to the feature extractor model;

wherein the micro-expression anchor and the micro-expression counter-example are input into the fine-grained feature extractor to respectively obtain expression embedding

And

inputting the macro expression positive example into the coarse-grained feature extractor model, and acquiring expression embedded into

Obtaining triple loss between multiple expression embeddings:

performing a counterstudy between the fine-grained feature extractor and the coarse-grained feature extractor model; the coarse-grained feature learning module provides embedded expression of macro expression and records the classification of the corresponding macro expression image as a correct category; the fine-grained feature extractor provides embedded representation of the micro expression, and the classification of the corresponding micro expression image is recorded as an error category;

distinguishing the two embedded representations through a discriminator, and adjusting common expression features between a macro expression and a micro expression through a fine-grained feature extractor model so that the gradient between the macro expression and the micro expression is larger than or equal to a lowest threshold value;

wherein m is a hyperparameter.

On the other hand, the invention also provides a micro-expression image labeling device for cross-granularity interactive learning, which comprises:

the image processing module is used for acquiring micro-expression image sequences to be labeled, acquiring a preset number of labeled micro-expression images, and inputting the labeled micro-expression images and the micro-expression image sequences to be labeled into the feature extractor module;

the characteristic extractor module is used for marking the category corresponding to each micro expression in the micro expression image sequence to be marked through the characteristic extractor model; acquiring a standard confidence score corresponding to each micro-expression category based on the marked micro-expression image;

the image labeling module is used for acquiring the confidence score of the identified micro expression in the micro expression image sequence to be labeled, comparing the confidence score with the standard confidence score of the micro expression of the corresponding category, and adding the micro expression of which the confidence score is more than or equal to the standard confidence score into the labeled micro expression image set if the confidence score is more than or equal to the standard confidence score;

the image labeling module is also used for updating the micro-expression image sequence to be labeled until all the micro-expressions are labeled and then outputting a labeled micro-expression image set.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for annotating a microexpressive image as described in any of the above.

The invention provides a micro-expression image labeling method and device for cross-granularity interactive learning, wherein a labeled micro-expression image and a micro-expression image sequence to be labeled are input into a pre-trained feature extractor model, and a standard confidence score corresponding to each micro-expression category is obtained from the labeled micro-expression image, so that the category corresponding to each micro-expression in the micro-expression image sequence to be labeled is further identified, the confidence score of the identified micro-expression in the micro-expression image sequence to be labeled is obtained and is compared with the standard confidence score of the micro-expression of the corresponding category, and the identification accuracy of the micro-expression is improved; the micro-expression images which do not meet the confidence score requirement are input into the feature extractor model again for training, so that the precision of the feature extractor model can be effectively improved; the micro-expression of the students in the teaching scene is identified by the method provided by the invention, so that the feedback of the students on the teaching contents in the classroom and the classroom emotion of the students are obtained, and the teacher or the teaching staff can effectively master the classroom states of different students, thereby being convenient for the teacher or the teaching staff to pertinently provide guidance for different students and improve the teaching effect in the classroom.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is a schematic flow chart of a micro-expression image annotation method according to the present invention;

FIG. 2 is a schematic diagram of image acquisition of the micro-expression image annotation method provided by the present invention;

FIG. 3 is a second schematic flow chart of the micro-expression image annotation method provided by the present invention;

fig. 4 is a third schematic flow chart of the micro-expression image annotation method provided by the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the term "first \ second" referred to in the present invention is only used for distinguishing similar objects, and does not represent a specific ordering for the objects, and it should be understood that "first \ second" may be interchanged in a specific order or sequence, if allowed. It should be understood that "first \ second" distinct objects may be interchanged under appropriate circumstances such that embodiments of the invention described herein may be practiced in sequences other than those described or illustrated herein.

In one embodiment, as shown in fig. 1, the present invention provides a cross-granularity interactive learning micro-expression image labeling method, including the steps of:

s1, acquiring a micro expression image sequence to be annotated;

it should be noted that the labeled micro-expression image can select the micro-expression image and the labeled category from the currently disclosed micro-expression database;

further, if the micro expression image is smaller than the standard confidence score, the micro expression image which is smaller than the standard confidence score is input into the feature extractor model again for training;

it should be noted that learning the confidence score corresponding to each micro expression category provides a reference basis for judging and comparing whether the recognition result of the micro expression data to be annotated is qualified or not; the confidence score represents the correct recognition probability of each micro-expression category, and the higher the score is, the higher the probability of containing the category is;

Optionally, before step S1, the method includes the steps of:

collecting video data of students in a classroom;

storing a plurality of face images as the micro-expression image sequence to be labeled;

specifically, video data of students in a classroom are collected through a front RGB intelligent camera and an intelligent tracking camera;

as shown in fig. 2, the camera includes an intelligent tracking camera facing the teaching scene; still including two leading intelligent RGB cameras CL, CR that set up respectively to guarantee to the collection of every student under the teaching scene, guarantee the collection of all-round multi-angle student's facial little expression data in to the teaching scene, can enlarge, reduce the operation to the camera lens to the student of different distances.

Specifically, after video data of students in a classroom is collected by using a specific camera device, data preprocessing is performed on the video data of the students in the classroom, and the data preprocessing comprises the following steps: and converting the video data into image data of each frame, carrying out face detection on students, cutting face images of each frame of students according to the same size, and finally storing a data set subjected to data preprocessing as a micro-expression image sequence to be labeled.

In one embodiment, as shown in FIG. 3, training the feature extractor model comprises:

It should be noted that the feature extractor model is a feature extractor for cross-granularity interactive learning, and the trained feature extractor model is stored by performing model training on the labeled micro-expression data set and learning the confidence score corresponding to each micro-expression category; the macro expression and the micro expression have some shared characteristics, but the characteristics of the macro expression are coarse-grained, and the characteristics of the micro expression are finer-grained, so that the fine-grained micro expression characteristic training characteristic extractor is guided by the coarse-grained macro expression characteristics, and the accuracy of the micro expression identification to be marked is better facilitated;

it should be noted that the micro-expression has the characteristics of short duration (less than 0.5 second), small amplitude and difficult perception by naked eyes, is an unconscious spontaneous behavior, and the reaction quickly appears after an emotion evoking event and is difficult to inhibit, so that the real emotional state of a person can be revealed; the macro expression has the characteristics of long duration (1s-5s), large amplitude and easy observation;

wherein, the macro expression is coarse-grained, the micro expression is fine-grained, and the granularity refers to the fineness of the expression characteristics; micro-expressions are usually short in duration, evanescent at the moment, small in facial muscle movement amplitude, and difficult to capture, for example, a moment of raising of the corners of the mouth, eyebrows, a minute movement contraction or stretching of the eyes, etc. are more fine expressive features, and therefore fine-grained; macro expression has a long duration and large facial muscle movement amplitude compared with micro expression, and is easy to extract captured features, so that the macro expression is coarse-grained;

specifically, training the feature extractor model includes two stages:

the first stage comprises:

dividing the acquired video data into a micro expression data set and a macro expression data set; acquiring neutral expressions in the same video

(Neutral), other expressions

(Other Expression) image and corresponding Expression category y ⁱ ；

In-micro-expression data set

On-pretraining a fine-grained feature extractor model, M ₁ Representing the number of videos; simultaneously in the macro expression data set

Pre-training a coarse-grained feature extractor model;

the inputs of the fine-grained feature extractor model and the coarse-grained feature extractor model are paired images from the same pair of videos, including a neutral micro-expression image x _N Other expression image x _O (ii) a The training of both feature extractor models comprises four steps: expression feature extraction, expression reconstruction, identity classification and expression classification:

wherein, expression feature extraction includes: the fine-grained characteristic learning module carries out encoding on neutral expression images x from the micro expression video _N Other expression image x _O Encode and obtain their embedded representations, respectively:

obtaining the identity characteristics of the neutral expression image through an encoder

Obtaining identity characteristics of other expression images

And expression category characteristics

Expression category characteristics of neutral expression image

Can be considered as a fixed value; the neutral expression image refers to the situation of facial expression, and can be generally regarded as the neutral expression image only comprising identity features; the other expression images refer to expressions generated by facial muscle movement except neutral expressions, such as basic expressions of happiness, sadness, anger, surprise, aversion and the like;

note that the neutral expression image x _N Other expression image x _O For images originating from the same object, their identity-related features

And

similarly, the difference between the two can be expressed as a loss function as follows:

further, other expression-related features

Enough category characteristic information of the original expression is transmitted, so that the expression reconstruction comprises the following steps: through a decoder D _r Combining the identity characteristics of the neutral expression image and the expression category characteristics of the other expression images to reconstruct the expression, and outputting a reconstruction loss function

Further, the step of identity classification comprises: the encoder E and the identity classifier D _s Performing antagonistic learning; classifying the micro expression images of different targets to be recognized according to the identity characteristics through an encoder to obtain micro expression category distribution of a plurality of different targets to be recognized, so that the difference between the micro expression category distribution of the different targets to be recognized is minimum, namely the probability distribution difference of expression categories in the recognition results of the plurality of images of the different targets is minimum; thereby making D _s Related expression characteristics of target difficult to recognize

Classifying; the goal of the confrontational training is expression class distribution

Cross entropy loss with the real expression class s of the recognition target:

it should be noted that the similarity between the expression category distribution and the real category distribution obtained by recognition can be measured through cross entropy loss, which is beneficial to improving the learning rate of the recognition model;

further, the step of classifying the expressions comprises: by expression classifier D _e Generating characteristics related to expressions, namely generating various expression characteristics based on micro expressions and/or macro expressions, and introducing cross entropy loss L _c Is defined as:

wherein y is x _O The category of the expression of (a) is,

the obtained expression category distribution is identified;

finally, a total loss function is obtained, and for the fine-grained feature extractor model, the total loss function L _micro Is defined as:

wherein λ, β, γ are hyper-parameters of the control loss function;

it should be noted that the cross entropy loss is the difference between the real probability distribution and the prediction probability distribution, and the smaller the cross entropy is, the better the model prediction effect is; smaller values for the overall loss function indicate better model training.

Further, the second stage comprises two steps of feature space guidance and category space guidance:

wherein the feature space guide comprises: fixing the coarse-grained feature extractor model, and training the fine-grained feature extractor model to improve the identification precision of the fine-grained feature extractor model;

anchor the micro-expression x _anchor And the macro expression positive example x with the same expression category as the micro expression anchor _posi And micro-expression counter-example x of different expression categories from micro-expression _nega Form a triplet (x) _anchor ,x _nega ,x _posi ) Inputting the fine-grained feature extractor model;

for each input triple, inputting the micro-expression anchor and the micro-expression counter-example into the fine-grained feature extractor, and respectively acquiring expression embedded as

And

inputting the macro expression into the coarse-grained feature extractionGet the model of the device, get the expression to embed as

Obtaining triple loss between multiple expression embeddings:

specifically, the details of the expression can be better distinguished by acquiring triple loss, which is beneficial to improving the recognition precision of the expression.

Wherein m is a hyper-parameter;

it should be noted that the micro-expression anchor is a micro-expression reconstructed image from a fine-granularity feature extractor, the macro-expression positive example is a macro-expression reconstructed image from a coarse-granularity feature extractor, and the micro-expression negative example is an expression image with the same identity feature as the micro-expression anchor;

further, performing countermeasure training between the coarse-grained feature extractor model and the fine-grained feature extractor model; the coarse-grained feature learning module provides embedded expression of macro expression and records the classification of the corresponding macro expression image as a correct category; the fine-grained feature extractor provides embedded representation of the micro expression, and the classification of the corresponding micro expression image is recorded as an error category;

distinguishing the two embedded representations through a discriminator D, adjusting common expression features between the macro expression and the micro expression through a fine-grained feature extractor model, enabling the difference gradient between the macro expression and the micro expression to be larger than or equal to a minimum threshold value, and aiming at counterstudy:

the loss of the discriminator is defined as

To avoid disappearance of the gradient, minimize

Calculating the confrontation loss of the fine-grained feature learning module:

it should be noted that the common expressive features of the macro expression and the micro expression refer to: for example, the smile of the macro expression may be a laugh, the smile of the micro expression may be a light smile, and the shared features between the two may include muscle stretching of the cheek, mouth corner rising, eye corner change, and the like, which are common feature points of the macro expression and the micro expression, except for the degree of muscle stretching, mouth corner rising, eye corner change.

Further, the class space guidance includes:

the expression classification loss is used for controlling the recognition precision of the expression, the classification loss is introduced as follows, and the classification loss of the fine-grained encoder branch is as follows:

wherein y represents x _anchor The category of the expression of (a) is,

a distribution of identified expression categories;

during training, assuming that a fine-grained feature learning module and a coarse-grained feature learning module generate similar outputs, so as to jointly train the two networks, and considering the difference that a regularization term is added in a loss function to punish the two networks;

it should be noted that the fine-grained features are more important than the coarse-grained features, the implied feature information is richer, and regularization can be added for constraint, so that the network learns the finer features

Wherein the loss function is defined as:

L _LIR ＝max{L _ds -L _cls′ ，0}；

wherein L is _cls′ Representing positive case characteristics for classification loss of coarse-grained encoder branches

And cross entropy loss of classification results between the true expression categories y of the regular images;

the overall loss function is defined as:

L _MM ＝L _cls +λ ₂ L _tri +λ ₃ L _adv +λ ₄ L _LIR ；

wherein λ ₂ ，λ ₃ ，λ ₄ A hyper-parameter to control the loss-factor;

it should be noted that, in the training process, the total loss function needs to be minimized as much as possible, so as to improve the effect of the trained fine-grained recognition model, and update some model parameters in the training process, optimize the model, and improve the performance, which is not limited in this invention;

and acquiring the category identification results of all micro expression images by the feature extractor model obtained after the training, and comparing the identification results with the real categories of the micro expression images:

selecting samples that identify the correct

Wherein s is _i Represents the confidence score of the ith token data,

denotes the ith identification mark, N denotes the sample S ^C The number of (2);

constructing an adaptive confidence interval T { (T) ₁ ，...，T _c )|T _c E.g. R, C1. ·, C }, the specific formula is:

wherein the content of the first and second substances,

representative sample S ^C The number of the middle samples marked as the c-th expression;

after a complete training process is carried out, storing a primary feature extractor model with the best effect and the highest prediction accuracy and a confidence score corresponding to each expression category;

in one embodiment, training the feature extractor model as shown in FIG. 4 further comprises:

the data enhancement strategy is carried out on the micro-expression image, and comprises the following steps: selecting any image from the micro-expression image sequence to be marked, carrying out a strong enhancement strategy and a weak enhancement strategy, and obtaining a corresponding strong enhancement image u _s And weakly enhanced image u _w (ii) a Selecting any image from the marked micro-expression image, performing weak enhancement strategy, and obtaining a weak enhancement image s _w ；

It should be noted that the strong enhancement strategy is: and (2) enhancing by adopting random Augment and then applying CutOut, wherein the weak enhancement strategy comprises the following steps: the image u is obtained by Random Horizontal Flip (Random Horizontal Flip) with a probability of 50%, and Random Horizontal and vertical shift (Random Translation) with a probability of 13.5% _w (ii) a More image samples are run through a strong enhancement strategy and a weak enhancement strategy;

further, the image u will be strongly enhanced _s And weakly enhanced image u _w 、s _w Respectively inputting the images into the feature extractor, respectively acquiring corresponding image features, and judging the category of the micro expression;

based on a plurality of weakly enhanced images u _w Outputting the category probability distribution corresponding to each micro-expression category through the feature extractor, comparing the maximum value of the category probability distribution with a preset threshold tau, and taking the micro-expression categories which are greater than or equal to the preset threshold tau as the pseudo categories of the weakly enhanced image; acquiring a plurality of strongly enhanced images u _s The cross entropy of the micro-expression class and the pseudo-class:

where M denotes the number of unlabelled images, p _w Representing the probability distribution of the identified classes after a weak enhancement strategy,

a pseudo-class representing a weakly enhanced image,

representing the class probability distribution of the strongly enhanced image, H (-) is a cross entropy loss function;

the pseudo category is an image category which is greater than or equal to a preset threshold value in category probability distribution and is identified after a weak enhancement strategy;

optionally, the weakly enhanced image s is obtained after a weak enhancement strategy is carried out on a small number of identified micro-expression images _w Identifying to obtain corresponding class probability distribution, and obtaining cross entropy loss L between the class probability distribution and the real class distribution of the image s _s The definition is as follows:

wherein N represents the number of data, C represents the number of categories,

is a true class distribution, p _c (s _i Is a class probability distribution, s _i Is the ith data, theta represents a network parameter, p _c () represents a recognition function;

further, comparing the confidence scores of the micro expressions identified in the micro expression image sequence to be labeled with the standard confidence scores of the micro expressions of the corresponding categories, comprising the steps of:

Wherein c is the corresponding micro-expression category;

will be provided with

Adaptive fraction T with preset _c Make a comparison if

if it is

In another embodiment, the present invention further provides a micro-expression image labeling apparatus for cross-granularity interactive learning, including:

the image labeling module is also used for updating the micro expression image sequence to be labeled until all micro expressions are labeled and outputting a labeled micro expression image set;

specifically, the micro-expression image labeling device further comprises image acquisition devices arranged at various positions in a teaching scene, and video data of students in a classroom are acquired through devices including an RGB intelligent camera and an intelligent tracking camera; the invention is not limited in this respect, and how to acquire images and which devices to acquire image information in a classroom do not influence the implementation of the invention;

specifically, the student expression categories are divided into three categories: positive, neutral, and negative; wherein the neutral expression is that the face has no obvious expression characteristics and is a facial blankness; positive expressions include smiling, etc., and negative expressions include angry, too much, etc.;

specifically, the type of the expression can be judged through facial features including eye corners, mouth corners, muscle stretching of cheeks and the like, and the expression degree of the expression such as the change degree of the eye corners, the rising degree of the mouth corners, the stretching degree of facial muscles and the like is further judged to further judge the expression of the person;

by the device provided by the invention, the micro expression marking method provided by the invention is mutually and correspondingly referenced, the micro expressions of students can be identified and marked through the pre-trained feature extractor based on student images acquired by a teaching scene, and the micro expressions and the corresponding micro expression categories of the students can be accurately identified from student image data shot by the teaching scene;

through simple arrangement, the emotion state and the classroom state of each student in a classroom are recognized, the subjectivity of micro expression recognition is avoided, the micro expressions of the students in a teaching scene can be recognized based on the device and the method provided by the invention only by inputting the collected classroom images, large-scale expression data can be respectively labeled in batches, the teaching personnel can be concentrated on improving the teaching effect, and the teaching and research efficiency of the teaching personnel is improved.

In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions, which when executed by a computer, enable the computer to perform the micro-expression image annotation method provided by the above methods.

In still another aspect, the present invention also provides a non-transitory computer-readable storage medium, on which a computer program is stored, the computer program being implemented by a processor to execute the micro-expression image annotation method provided by the above methods.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A micro-expression image labeling method for cross-granularity interactive learning is characterized by comprising the following steps:

s1, acquiring a micro expression image sequence to be annotated;

2. The micro-expression image annotation method for cross-granularity interactive learning according to claim 1, wherein before step S1, the method comprises the steps of:

collecting video data of students in a classroom;

3. The method of claim 2, wherein training the feature extractor model comprises:

4. The method of claim 3, wherein training the feature extractor model comprises:

5. The method for labeling the micro-expression images for cross-granularity interactive learning according to claim 4, wherein the step of comparing the confidence scores of the micro-expressions identified in the micro-expression image sequence to be labeled with the standard confidence scores of the micro-expressions of the corresponding categories comprises the following steps:

Will be provided with

Adaptive fraction T with preset _c Make a comparison if

if it is

6. The method for labeling the micro-expression images for cross-granularity interactive learning according to claim 3 or 5, wherein the feature extractor model comprises a fine-granularity feature extractor model and a coarse-granularity feature extractor model, and training the feature extractor model comprises the following steps:

7. The method for labeling the micro-expression images for cross-granularity interactive learning according to claim 6, wherein the training of the feature extractor model comprises the following steps:

anchor the micro-expression x _anchor And the macro expression positive example x with the same expression category as the micro expression anchor _posi And micro-expression counter-example x of different expression categories from micro-expression _nega Form a triplet (x) _anchor ,x _nega ,x _posi ) Input to the feature extractor model;

And

Obtaining triple loss between multiple expression embeddings:

performing a counterstudy between the fine-grained feature extractor and the coarse-grained feature extractor model; the coarse-grained feature learning module provides embedded representation of macro expression and marks classification of corresponding macro expression images as correct categories; the fine-grained feature extractor provides embedded representation of the micro expression, and the classification of the corresponding micro expression image is recorded as an error category;

wherein m is a hyperparameter.

8. A micro-expression image labeling device for cross-granularity interactive learning is characterized by comprising:

9. A non-transitory computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the microexpressing image annotation method according to any one of claims 1 to 7.