CN113743275A - Micro-expression type determination method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113743275A
CN113743275A (application CN202111002587.6A)
Authority
CN
China
Prior art keywords
frame image
target
coordinate information
determining
micro
Prior art date
Legal status
Pending
Application number
CN202111002587.6A
Other languages
Chinese (zh)
Inventor
陈通
周巨
Current Assignee
Southwest University
Original Assignee
Southwest University
Application filed by Southwest University
Priority to CN202111002587.6A
Publication of CN113743275A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method and an apparatus for determining a micro-expression type, an electronic device and a storage medium, belonging to the field of expression recognition. The determination method comprises: acquiring a first frame image and a second frame image of a target living being, wherein the first frame image precedes the second frame image in time sequence; determining, based on a feature point detection model, first feature point coordinate information and second feature point coordinate information corresponding to the first frame image and the second frame image respectively; determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information; and determining the micro-expression type corresponding to the target living being according to the target feature vector. Because the micro-expression type is determined from coordinate information, the method avoids the situation in which pictures with highly similar motion states cannot be processed by traditional micro-expression recognition, and improves the accuracy of micro-expression recognition for the corresponding user.

Description

Micro-expression type determination method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of expression recognition technologies, and in particular, to a method and an apparatus for determining a micro-expression type, an electronic device, and a storage medium.
Background
Facial expression recognition technology is the basis for computers to understand human emotion and an effective way to explore intelligent human-computer interaction. Under the influence of self-defense mechanisms, humans often try to suppress or hide their true emotional state; that is, in daily life, besides common facial expressions (macro-expressions), there exists another kind of expression that is difficult to perceive: the micro-expression.
Micro-expression recognition in the prior art mainly comprises three steps: face detection, feature extraction and micro-expression recognition. The prior art usually recognizes micro-expressions by building a deep learning model with deep-learning feature techniques. However, because the convolution kernels in a deep learning model share weights and each convolution layer is followed by pooling, such a model cannot handle picture classification tasks in which the motion states are highly similar, so its accuracy in recognizing a user's micro-expression in pictures with highly similar motion states is low.
Disclosure of Invention
In view of this, an object of the present application is to provide a method, an apparatus, an electronic device and a storage medium for determining a micro-expression type, in which first feature point coordinate information and second feature point coordinate information corresponding to a first frame image and a second frame image are determined respectively, a target feature vector is then determined, and the micro-expression type corresponding to the target living being is determined according to the target feature vector. Because the micro-expression type is determined from coordinate information, the situation in which pictures with highly similar motion states cannot be processed by traditional micro-expression recognition is avoided, and the accuracy of micro-expression recognition for the corresponding user is improved.
The application mainly comprises the following aspects:
in a first aspect, an embodiment of the present application provides a method for determining a micro-expression type, where the method includes:
acquiring a first frame image and a second frame image of a target living being, wherein the first frame image precedes the second frame image in time sequence;
determining, based on a feature point detection model, first feature point coordinate information and second feature point coordinate information corresponding to the first frame image and the second frame image respectively;
determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information;
and determining the micro-expression type corresponding to the target living being according to the target feature vector.
In a possible implementation manner, the determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information includes:
and obtaining the target feature vector by calculating the difference value between the second feature point coordinate information and the first feature point coordinate information.
In a possible embodiment, the first frame image is a start frame image in the video data of the target living being.
In a possible implementation, the second frame image is the top frame image, i.e. the frame image with the largest expression change intensity in the video data of the target living being.
In one possible embodiment, the top frame image is determined by:
calculating the expression change intensity of each frame image in the video data of the target living being, wherein the expression change intensity is obtained by taking the difference values between the feature point coordinate information of the current frame image and that of the previous frame image and summing their absolute values;
and determining the frame image with the largest expression change intensity as the top frame image.
In one possible embodiment, the feature point detection model is obtained by:
acquiring a micro-expression sample image and corresponding characteristic point coordinate information thereof;
and training an initial training model according to the micro-expression sample image and the corresponding feature point coordinate information thereof to obtain the feature point detection model.
In a possible implementation manner, the determining, according to the target feature vector, a micro-expression type corresponding to the target creature includes:
inputting the target characteristic vector into a preset classifier, and determining a micro expression type corresponding to the target organism, wherein the preset classifier is obtained by training according to a characteristic vector sample in a sample library and a micro expression label corresponding to the characteristic vector sample; or
And determining the micro expression type corresponding to the target organism by calculating the similarity between the target characteristic vector and the characteristic vector in a preset data table, wherein the preset data table comprises the corresponding relation between a plurality of characteristic vectors and the micro expression type.
In one possible embodiment, the feature vector samples are determined by:
determining the coordinate information of the characteristic points corresponding to the target frame image and the initial frame image respectively;
and determining the characteristic vector sample by calculating the difference value of the characteristic point coordinate information of the target frame image and the starting frame image.
In a second aspect, an embodiment of the present application further provides a device for determining a micro-expression type, where the device for determining a micro-expression type includes:
the acquisition module is used for acquiring a first frame image and a second frame image of a target living being, wherein the first frame image precedes the second frame image in time sequence;
the first determining module is used for determining, based on a feature point detection model, first feature point coordinate information and second feature point coordinate information corresponding to the first frame image and the second frame image respectively;
the second determining module is used for determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information;
and the third determining module is used for determining the micro-expression type corresponding to the target living being according to the target feature vector.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, the processor and the memory communicate via the bus when the electronic device is running, and the machine-readable instructions are executed by the processor to perform the steps of the determination method in any one of the possible embodiments of the first aspect.
In a fourth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the determination method described in any one of the possible implementation manners of the first aspect.
In the embodiment of the application, a target feature vector is further determined by respectively determining first feature point coordinate information and second feature point coordinate information corresponding to a first frame image and a second frame image; the micro expression type corresponding to the target organism is determined according to the target characteristic vector, the micro expression type is determined through coordinate information, the situation that pictures with highly similar motion states cannot be processed in a traditional micro expression identification method is avoided, and the accuracy of micro expression identification of corresponding users is improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a flowchart of a method for determining a micro-expression type according to an embodiment of the present application;
Fig. 2 is a flowchart of another method for determining a micro-expression type according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a micro-expression type determination apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 5 is a schematic diagram of face feature point detection in the method for determining a micro-expression type according to an embodiment of the present application;
Fig. 6 is a statistical chart of the fifteen points with the minimum Euclidean distance in the method for determining a micro-expression type according to an embodiment of the present application;
Fig. 7 is a schematic diagram of the motion of several face action units (AU), one per sub-figure, in the method for determining a micro-expression type according to an embodiment of the present application;
Fig. 8 is a schematic comparison, in two sub-figures, of sparse features and dense features over different kernel functions in the method for determining a micro-expression type according to an embodiment of the present application;
Fig. 9 is a comparison result graph, in two sub-figures, of the data set expansion-equalization comparison test in the method for determining a micro-expression type according to an embodiment of the present application.
Description of the main element symbols:
in the figure: 300-a determination means; 310-an acquisition module; 320-a first determination module; 330-a second determination module; 340-a third determination module; 400-an electronic device; 410-a processor; 420-a memory; 430-bus.
Detailed Description
To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and that steps without logical context may be performed in reverse order or concurrently. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable one of ordinary skill in the art to utilize the present disclosure, the following embodiments are presented in conjunction with a specific application scenario "determination of micro-expression types," and it will be apparent to one of ordinary skill in the art that the general principles defined herein may be applied to other embodiments and application scenarios without departing from the spirit and scope of the present disclosure.
The method, the apparatus, the electronic device and the storage medium in the embodiments of the present application may be applied to any scenario that requires determining a micro-expression type. The embodiments of the present application do not limit the specific application scenario; any scheme using the method, apparatus, electronic device and storage medium for determining the micro-expression type provided in the embodiments of the present application is within the scope of protection of the present application.
It is noted that, before the present application was proposed, micro-expression recognition in the prior art mainly comprised three steps: face detection, feature extraction and micro-expression recognition, and usually recognized micro-expressions by building a deep learning model with deep-learning feature techniques. However, because the convolution kernels in a deep learning model share weights and each convolution layer is followed by pooling, such a model cannot handle picture classification tasks in which the motion states are highly similar, so its accuracy in recognizing a user's micro-expression in such pictures is low.
Based on this, the embodiments of the application provide a method and an apparatus for determining a micro-expression type, an electronic device and a storage medium: first feature point coordinate information and second feature point coordinate information corresponding to a first frame image and a second frame image are determined respectively, a target feature vector is then determined, and the micro-expression type corresponding to the target living being is determined according to the target feature vector. Because the micro-expression type is determined from coordinate information, the situation in which pictures with highly similar motion states cannot be processed by traditional micro-expression recognition is avoided, and the accuracy of micro-expression recognition for the corresponding user is improved.
For the convenience of understanding of the present application, the technical solutions provided in the present application will be described in detail below with reference to specific embodiments.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for determining a micro-expression type according to an embodiment of the present disclosure. As shown in fig. 1, a determination method provided by an embodiment of the present application includes the following steps:
s101, acquiring a first frame image and a second frame image of a target organism, wherein the first frame image is prior to the second frame image in time sequence.
In this step, an image set or a video of the target living being is acquired, and the first frame image and the second frame image are taken from that image set or video.
Here, the first frame image may be any micro-expression image to be recognized in the image set or video, and the second frame image is any micro-expression image to be recognized that follows the first frame image in time sequence.
The target living being may specifically include, but is not limited to, a human target user or a target animal; the following embodiments of the present application all take a human target user as an example.
S102, determining first characteristic point coordinate information and second characteristic point coordinate information which respectively correspond to the first frame image and the second frame image based on a characteristic point detection model.
In this step, the micro-expression information in the first frame image and the second frame image of the target living being is extracted by the feature point detection model to obtain first micro-expression feature information and second micro-expression feature information, where the micro-expression feature information comprises the feature points and their coordinate information, i.e. the first feature point coordinate information and the second feature point coordinate information corresponding to the first frame image and the second frame image respectively. Only when the feature point detection model detects the complete set of feature point coordinate information is the target living being considered complete, and the subsequent recognition process can then be performed.
Here, the feature point coordinate information may be specifically set to (x, y).
When the target living being is a human, the feature points of the face detected by the feature point detection model include the edge of the face, the edges of the eyebrows, the contours of the eyes, the nose and the contour of the mouth, 68 feature points in total.
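As an illustrative sketch only (the patent does not name a detection library), the 68-point layout described here matches dlib's pretrained facial landmark predictor, so the feature point extraction could look as follows; the model file path is an assumption:

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Pretrained 68-point model; the file path is an assumption of this sketch.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def extract_landmarks(gray_image):
    """Return a (68, 2) array of (x, y) feature point coordinates for the
    first detected face, or None if no complete face is found (in which
    case the integrity check fails and recognition is skipped)."""
    faces = detector(gray_image, 1)  # upsample once to find small faces
    if len(faces) == 0:
        return None
    shape = predictor(gray_image, faces[0])
    return np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
```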
Thus, as shown in fig. 5, a schematic diagram of face feature point detection, if there are N micro-expression images to be recognized in the video, the feature point coordinate information of the i-th frame image (including the first frame image and the second frame image) may specifically be:

$P_i = \{(x_1^i, y_1^i), (x_2^i, y_2^i), \ldots, (x_{68}^i, y_{68}^i)\}, \quad i = 1, 2, \ldots, N$

where $(x_j^i, y_j^i)$ is the coordinate of the j-th of the 68 feature points in the i-th frame.
further, the feature point detection model is obtained by the following method:
and acquiring the micro expression sample image and the corresponding characteristic point coordinate information thereof.
Here, the micro expression sample image is a large number of sample images acquired.
And training an initial training model according to the micro-expression sample image and the corresponding feature point coordinate information thereof to obtain the feature point detection model.
Here, the sample micro-expression feature information corresponding to a micro-expression sample image includes feature points such as the edge of the face, the edges of the eyebrows, the eye contours, the nose and the mouth contour, and the feature point detection model is obtained by training an initial training model with these feature points and their coordinate information.
The initial training model includes, but is not limited to, an initial neural network model; the feature point detection model may also be obtained by a regression-tree-based method or the like, or directly from micro-expression sample images and their annotated feature point coordinate information.
S103, determining a target characteristic vector according to the first characteristic point coordinate information and the second characteristic point coordinate information.
In this step, the first feature point coordinate information and the second feature point coordinate information are expressed as vectors by manual coding, and the target feature vector is generated by subtracting the temporally earlier first feature point coordinate information from the temporally later second feature point coordinate information.
Here, if there are N frames of micro-expression images to be recognized in the video, the vector of the i-th frame's micro-expression feature information is represented as:

$V_i = (x_1^i, y_1^i, x_2^i, y_2^i, \ldots, x_{68}^i, y_{68}^i) \in \mathbb{R}^{136}$

Subtracting the temporally earlier first feature point coordinate information from the temporally later second feature point coordinate information, the generated target feature vectors are represented as:

$T_i = V_i - V_1, \quad i = 2, 3, \ldots, N$
in summary, if there are N frames of micro expression images to be identified on the image video, N-1 target feature vectors can be obtained for identifying micro expressions.
Target feature vectors obtained in this way, without affine transformation, are used as sparse feature vectors.
That is, the coordinate information of the feature points in each frame of micro-expression image to be recognized is manually coded to generate a feature vector for each frame, and the target feature vectors are used to represent the movement trend of the facial muscles during a micro-expression, as the sketch below illustrates.
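A minimal sketch of this manual coding step, assuming landmarks have already been extracted as (68, 2) arrays per frame (the function names are illustrative):

```python
import numpy as np

def frame_vector(landmarks):
    """Flatten (68, 2) landmark coordinates into the 136-dimensional
    vector (x1, y1, x2, y2, ..., x68, y68)."""
    return np.asarray(landmarks, dtype=np.float64).reshape(-1)

def target_feature_vectors(landmark_seq):
    """Given landmarks for N frames, subtract the first (earlier) frame's
    vector from each later frame's vector, yielding the N-1 sparse target
    feature vectors described above, shape (N-1, 136)."""
    vectors = np.stack([frame_vector(lm) for lm in landmark_seq])
    return vectors[1:] - vectors[0]
```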
That the target feature vectors, i.e. the difference values of the feature point coordinate information, can characterize the movement trend of the facial muscles of a micro-expression is verified as follows:
four public data sets including CASMEI, CASMEII, SAMM and SMIC are selected for comparison test, and three-classification and four-classification task experiments of the mixed database are performed according to different tasks and databases. The number of different categories for each data set is shown in table 1 below:
TABLE 1 Data set categories and sample numbers
Dataset   Positive  Negative  Surprise  Other
CASMEI    9         53        21        112
CASMEII   30        69        26        121
SAMM      26        80        15        38
SMIC      51        70        43        -
The CASMEI data set was captured with a 60-frame-rate camera; the invention selected 195 micro-expression segments from 19 subjects. The CASMEII data set was captured with a 200-frame-rate camera; 246 segments from 26 subjects were selected. The SMIC data set was captured with a 100-frame-rate camera; 164 segments from 16 subjects were selected. The SAMM data set was captured with a 200-frame-rate camera; 159 segments from 29 subjects were selected. In the CASMEI, CASMEII and SAMM data sets, positive micro-expressions such as happiness are labeled "Positive"; disgust, sadness, fear, contempt and anger are labeled "Negative"; surprise micro-expressions are labeled "Surprise" on their own; and the remaining categories, such as repression, are labeled "Other". The SMIC data set already carries three classification labels and needs no relabeling.
To show that the manual coding features of the invention have good interpretability, the invention performs a statistical calculation on the CASMEII data set. Because the initial state of facial motion and the amplitude of muscle motion differ between people, each subject's videos are taken as a unit. With the first frame of each video as reference, the accumulated Euclidean distance between the face feature point coordinates in every frame and the corresponding coordinates in the first frame is calculated. Taking the first point $(x_1, y_1)$ as an example, with M the number of videos a subject owns and $N_m$ the number of frames of the m-th video, the sum of the Euclidean distances moved by a face feature point over all of a subject's videos is:
$D = \sum_{m=1}^{M} \sum_{n=2}^{N_m} \sqrt{\left(x_n^m - x_1^m\right)^2 + \left(y_n^m - y_1^m\right)^2}$
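A sketch of this statistic, assuming each of a subject's videos is given as an (N_m, 68, 2) landmark array; the per-point accumulation mirrors the sum above:

```python
import numpy as np

def accumulated_distances(videos):
    """videos: list of (N_m, 68, 2) landmark arrays, one per video of a
    subject. Returns a length-68 array holding, for each feature point,
    the Euclidean distance between its coordinates in every frame and its
    coordinates in the first (reference) frame, summed over all videos."""
    total = np.zeros(68)
    for lm in videos:
        diff = lm[1:] - lm[0]                      # displacement from frame 1
        total += np.sqrt((diff ** 2).sum(axis=2)).sum(axis=0)
    return total

# The fifteen most stable points of a subject:
# stable_idx = np.argsort(accumulated_distances(videos))[:15]
```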
fifteen points with the minimum Euclidean distance of each tested object (sub) are taken and counted. The counting result is shown in fig. 6, and fig. 6 is a statistical graph of fifteen points with the minimum euclidean distance in the embodiment, in which the frame in each video sequence is affine transformed with the first frame, the affine transformation is used for calibration, and after calibration, the euclidean distance accumulated value is calculated. Here, no matter whether calibration exists or not, the histogram in the graph has three troughs corresponding to eyebrows, eyes and mouths, respectively, and it can be seen that in the micro expression behavior, the coordinate of the human face characteristic point detected by the above regions changes drastically, which indicates that the facial muscle of the region moves more, and these regions are the main occurrence regions of the micro expression movement in AU coding theory.
Here, the target feature vector has better separability and interpretability, which is demonstrated by the following embodiments.
As shown in fig. 7, a schematic diagram of the motion of several face action units (AU), one per sub-figure, further visualization study shows that the features of the invention can effectively characterize the motion of AU units. Taking CASMEII as an example, for visualization the feature vector entries of the 17 points around the face edge are removed and absolute values are taken, so the abscissa corresponds to the remaining 102 of the 136 dimensions (136 - 34) and the ordinate is the absolute value of the feature. According to the annotation carried by the CASMEII data set, the facial muscle movement unit corresponding to the features in sub-figures (a), (b), (c) and (d) of fig. 7 is AU4; in facial action coding system theory, AU4 means the brow-lowering muscle moves downward, its main movement area being the eyebrows. After the 17 face-edge points are removed, dimensions 0-19 of the 102-dimensional feature represent the movement of the eyebrow area, and peaks can be seen in dimensions 0-19 in sub-figures (a), (b), (c) and (d), indicating greater movement intensity in the eyebrow region. Sub-figures (e), (f), (g) and (h) of fig. 7 are from different subjects; according to the CASMEII annotation, their corresponding facial muscle movement unit is AU4+AU7, where AU7 is described in the facial action coding system as tension and contraction within the orbicularis oculi muscle. In the present invention, dimensions 38-62 characterize the motion of the eye region, and as can be seen in the figure, the complex micro-expression motion combining AU4 and AU7 shows at least two peaks, mainly around dimensions 0-20 and dimension 40, meaning large muscle movement in those two dimension intervals, exactly in line with facial action coding system theory. The features of the invention can therefore effectively represent both micro-expressions with a single AU motion area and micro-expression motions involving several AUs, and thus have good interpretability.
Considering that the face as a whole may be displaced, so that the coordinates of the same face feature point may carry an offset, the invention takes the eye corners and the face feature point with the most stable coordinates, labelled 30 in fig. 5, as reference points and applies an affine transformation to the coordinate points, thereby generating N-1 dense feature vectors corresponding to the sparse feature vectors without affine transformation. To find the kernel function best suited to the features, the invention experiments with both kinds of features on support vector machines with different kernel functions.
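A sketch of the alignment step with OpenCV; the reference point indices are an assumption based on the standard 68-point layout (eye corners 36, 39, 42, 45 and the nose point labelled 30 in fig. 5):

```python
import cv2
import numpy as np

REF_IDX = [36, 39, 42, 45, 30]  # assumed indices of eye corners and point 30

def align_landmarks(landmarks, reference_landmarks):
    """Estimate the affine transform mapping this frame's reference points
    onto the first frame's, and apply it to all 68 points, removing whole-
    face displacement before the dense feature vector is formed."""
    src = landmarks[REF_IDX].astype(np.float32)
    dst = reference_landmarks[REF_IDX].astype(np.float32)
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    pts = np.hstack([landmarks, np.ones((68, 1))])  # homogeneous coordinates
    return (M @ pts.T).T                            # aligned (68, 2) points
```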
Here, default parameters are used for all parameters, the data set is CASMEII, and the training and test sets are split 7:3. The result is shown in fig. 8, which compares sparse and dense features over different kernel functions in two sub-figures: sub-figure (a) shows the sparse features without affine transformation and sub-figure (b) the dense features after affine transformation; the abscissa lists four SVM kernel functions (linear, rbf, poly and sigmoid), the ordinate is accuracy, the upper line is training-set accuracy and the bottom line is test-set accuracy. The experimental results show that the sparse features outperform the dense features under all four kernel function models, indicating that the sparse features represent micro-expression motion better: the zeros in a sparse feature directly reflect whether there is muscle motion in an AU region, and the fewer the zeros, the more intense the micro-expression muscle motion, which further shows that the features of the invention have good interpretability. The experiment also shows that the rbf kernel function suits the target feature vector of the application best.
Here, an SVM kernel function maps a problem that is nonlinear in a low-dimensional space into a high-dimensional space where it becomes linearly separable. Many nonlinear classification problems that are hard to handle in the low-dimensional space can, after the transformation to the high-dimensional space, easily yield the optimal separating hyperplane; this is the core idea of the SVM.
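The kernel comparison above can be reproduced in outline with scikit-learn; this is a sketch under the stated settings (default SVC parameters, 7:3 split), not the patent's exact experiment code:

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def compare_kernels(X, y):
    """X: target feature vectors, y: micro-expression labels."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7,
                                              random_state=0)
    for kernel in ("linear", "rbf", "poly", "sigmoid"):
        clf = SVC(kernel=kernel).fit(X_tr, y_tr)   # default parameters
        print(kernel,
              "train:", clf.score(X_tr, y_tr),
              "test:", clf.score(X_te, y_te))
```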
And S104, determining the micro expression type corresponding to the target creature according to the target feature vector.
In this step, the determining the micro-expression type corresponding to the target creature according to the target feature vector includes:
inputting the target characteristic vector into a preset classifier, and determining a micro expression type corresponding to the target organism, wherein the preset classifier is obtained by training according to a characteristic vector sample in a sample library and a micro expression label corresponding to the characteristic vector sample; or
And determining the micro expression type corresponding to the target organism by calculating the similarity between the target characteristic vector and the characteristic vector in a preset data table, wherein the preset data table comprises the corresponding relation between a plurality of characteristic vectors and the micro expression type.
The preset classifier includes, but is not limited to, a neural network classifier. The preset data table contains a plurality of feature vectors, each in one-to-one correspondence with a micro-expression type.
Here, the preset classifier is determined by:
and acquiring video sample data of the target organism and the corresponding micro-expression label.
And calculating a feature vector sample according to a target frame image and a starting frame image in the video sample data, wherein the target frame image is any frame image except the starting frame image in the video sample data.
And correspondingly storing the characteristic vector sample and the micro expression label into the sample library, and determining a preset classifier.
Determining the feature vector samples by:
and determining the coordinate information of the characteristic points corresponding to the target frame image and the initial frame image respectively.
The feature point coordinate information may be obtained by a trained feature point detection model or by manual labeling; the manner of obtaining it is not unique.
And determining the characteristic vector sample by calculating the difference value of the characteristic point coordinate information of the target frame image and the starting frame image.
Here, the expression types in the preset classifier include "Positive", "Negative", "Surprise" and "Other".
And determining the expression type corresponding to the target creature according to the target feature vector and the preset classifier.
Here, the target feature vector is input into a preset classifier, and the expression type corresponding to the target creature is determined to be any of the four types.
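For the data-table alternative of S104, a minimal sketch is given below; the patent does not fix a similarity measure, so cosine similarity is an assumption here:

```python
import numpy as np

def classify_by_table(target_vector, table_vectors, table_labels):
    """Return the micro-expression type of the most similar feature vector
    in the preset data table (cosine similarity is assumed)."""
    t = target_vector / np.linalg.norm(target_vector)
    v = table_vectors / np.linalg.norm(table_vectors, axis=1, keepdims=True)
    return table_labels[int(np.argmax(v @ t))]
```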
Compared with the prior art, the determining method provided by the embodiment of the application determines the coordinate information of the first characteristic point and the coordinate information of the second characteristic point corresponding to the first frame image and the second frame image respectively, and further determines the target characteristic vector; the micro expression type corresponding to the target organism is determined according to the target characteristic vector, the micro expression type is determined through coordinate information, the situation that pictures with highly similar motion states cannot be processed in a traditional micro expression identification method is avoided, and the accuracy of micro expression identification of corresponding users is improved.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for determining a micro-expression type according to another embodiment of the present application. As shown in fig. 2, the determining method provided by the embodiment of the present application includes the following steps:
s201, acquiring a first frame image and a second frame image of the target organism, wherein the first frame image is prior to the second frame image in time sequence.
Here, the first frame image is a start frame image in the video data of the target living being, and the second frame image is a top frame image in the video data of the target living being in which a change in expression intensity is largest.
Further, determining the top frame image by:
and calculating the expression change intensity of each frame of image in the video data of the target organism, wherein the expression change intensity is the difference value of the feature point coordinate information of the current frame of image and the previous frame of image, and taking the absolute value after the sum of the difference values.
Wherein the previous frame image refers to an image frame temporally located before the current frame image.
And determining the frame image with the maximum expression change intensity as the top frame image.
Here, the top frame image may specifically be determined as follows: after the difference values of the feature point coordinate information of the current frame image and the previous frame image are obtained, their absolute values are taken and summed, and the frame image whose sum of absolute values is the largest is the top frame image.
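A minimal sketch of this top (apex) frame selection, assuming per-frame (68, 2) landmark arrays:

```python
import numpy as np

def find_top_frame(landmark_seq):
    """Return the index of the top frame: the frame whose summed absolute
    coordinate difference from its previous frame is largest. Frame 0 has
    no predecessor, so candidate indices start at 1."""
    vectors = np.stack([np.asarray(lm, dtype=np.float64).reshape(-1)
                        for lm in landmark_seq])
    intensity = np.abs(vectors[1:] - vectors[:-1]).sum(axis=1)
    return int(np.argmax(intensity)) + 1
```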
S202, determining first characteristic point coordinate information and second characteristic point coordinate information which respectively correspond to the first frame image and the second frame image based on a characteristic point detection model.
S203, calculating the difference value between the second characteristic point coordinate information and the first characteristic point coordinate information to obtain the target characteristic vector.
Here, the most desirable case is that the first frame image is the start frame of the micro-expression images to be recognized and the second frame image is the top frame micro-expression image. Because a unique manual coding scheme is adopted, as long as two target feature vectors have the same signs in the same dimensions, the feature points have the same motion trend. In the expression classification task, secondary top frames whose frame-number distance from the top frame is smaller than a threshold share the motion-trend features of the top frame, so the features extracted from secondary top frames still have good separability and can effectively represent the motion trend of the expression. The data set can therefore be expanded by generating a feature vector from any frame and the start frame of the video, which can be used both for the recognition task and to relieve the over-fitting caused by unbalanced samples during training.
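A sketch of this expansion, assuming per-frame landmark arrays; it pairs the start frame with every later frame so that secondary top frames contribute extra training samples:

```python
import numpy as np

def augment_features(landmark_seq):
    """Build one training feature vector per frame after the start frame.
    All vectors share the start frame as reference, so frames near the
    apex carry the same motion-trend signs as the apex itself."""
    v0 = np.asarray(landmark_seq[0], dtype=np.float64).reshape(-1)
    return [np.asarray(lm, dtype=np.float64).reshape(-1) - v0
            for lm in landmark_seq[1:]]
```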
The following examples can demonstrate the good expandability of the present application:
a comparison test is carried out on a CASME1 data set with extremely uneven data, in the group of tests, the feature points which are not subjected to the expansion equalization processing and the feature points after the expansion equalization adopt the same parameters, and a left tested method is adopted for carrying out the test. It should be noted that, in order to better compare with other technologies, in the cross validation, the features without the extended equalization process are consistent with other technology experimental methods, the data after the extended equalization process is only trained, and the original features constructed by the start frame and the top frame are validated. The confusion matrix graph of the test result is shown in fig. 9, fig. 9 is a comparison result graph corresponding to two sub-graphs of the data set expansion balanced comparison test in the determination method of the micro-expression type provided by the application, wherein (a) the subgraph is the experimental result of the unexpanded equalized original data (SWU-oral), and (b) the subgraph is the experimental result of the expanded equalized data (SWU-authenticated), wherein the accuracy of a is 0.528 and is better than 0.441 of b, it can be seen that the F1 score for a is 0.22, the expression classifier overfitts the other class, has essentially no practical significance, whereas an F1 score of 0.40 for b is significantly better than a, since the data set is highly unbalanced, the positive class has only 9 samples, the other class has 112 samples, and the experimental result shows that the target characteristics have better expandability, and the data after the expansion and the balance can effectively solve the problem of sample imbalance.
Here, the motion representation capability of the secondary top frames can be proved by the following experiment:
Features constructed from every frame of the video and the start frame are used as independent samples to form the test set, the SWU-original training set is used as training data, and experiments are performed on CASMEI, CASMEII and SAMM with the LOSO cross-validation method; the results are shown in Table 2. Although the accuracy on every data set drops compared with using only the top frame as the test set, the drop is small, indicating that besides the top frame the secondary top frames in the video also have a certain representation capability. Hence, even if the top frame extracted in the preceding micro-expression detection step is erroneous, there is some error tolerance, which further shows the advantage of the target feature vector as the feature for the expression recognition task.
TABLE 2
Dataset  CASMEI  CASMEII  SAMM
Acc      0.511   0.522    0.480
The application is also compared with other manual coding features and some depth features on the basis of published experimental results; the comparison is shown in Table 3. Note that, because the SMIC data set does not give start-frame and end-frame positions, the data enhancement of the present application cannot be compared fairly with other methods there, so those entries are missing. Table 3 shows that the invention, whether with the original or the augmented features, is superior to the other manual coding features and to some classical depth features. On the CASMEII data set, the SWU-augmented features in particular exceed the other methods in both accuracy and the F1 index; on the SAMM data set they exceed all manual coding features. The SWU-original features are higher than most methods in accuracy and F1, but their F1 is lower than that of the SWU-augmented features on CASMEII and SAMM, showing that the expansion-equalization proposed by the invention effectively relieves over-fitting and makes the classification results more meaningful in practice. On the SMIC data set, the SWU-original evaluation result is lower than that of the FDM method, but viewed across the three data sets the target feature vector of the application is more stable and more robust, which proves its importance.
TABLE 3 Comparison of accuracy and F1 of the SWU-original and SWU-augmented features with other manual coding features and depth features on the CASMEI, CASMEII, SAMM and SMIC data sets (the table itself is reproduced as an image in the original publication)
Further, when the first frame micro-expression image to be recognized is the start frame image and the second frame micro-expression image to be recognized is the top frame image, the determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information includes:
and obtaining the target characteristic vector by calculating the difference value between the second characteristic point coordinate information corresponding to the top frame image and the first characteristic point coordinate information corresponding to the initial frame image.
And S204, determining the micro expression type corresponding to the target creature according to the target feature vector.
The descriptions of S201 to S202 and S204 may refer to the descriptions of S101 to S102 and S104, and the same technical effect can be achieved, which is not described in detail herein.
Compared with the prior art, the determining method provided by the embodiment of the application determines the coordinate information of the first characteristic point and the coordinate information of the second characteristic point corresponding to the first frame image and the second frame image respectively, and further determines the target characteristic vector; the micro expression type corresponding to the target organism is determined according to the target characteristic vector, the micro expression type is determined through coordinate information, the situation that pictures with highly similar motion states cannot be processed in a traditional micro expression identification method is avoided, and the accuracy of micro expression identification of corresponding users is improved.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a micro-expression type determination apparatus according to an embodiment of the present application, and as shown in fig. 3, the determination apparatus 300 includes:
an acquiring module 310 is configured to acquire a first frame image and a second frame image of the target living being, where the first frame image precedes the second frame image in time sequence.
A first determining module 320, configured to determine, based on a feature point detection model, first feature point coordinate information and second feature point coordinate information respectively corresponding to the first frame image and the second frame image.
Further, the feature point detection model is obtained by the following method:
and acquiring the micro expression sample image and the corresponding characteristic point coordinate information thereof.
And training an initial training model according to the micro-expression sample image and the corresponding feature point coordinate information thereof to obtain the feature point detection model.
A second determining module 330, configured to determine a target feature vector according to the first feature point coordinate information and the second feature point coordinate information.
And a third determining module 340, configured to determine, according to the target feature vector, a micro-expression type corresponding to the target living being.
Further, the first frame image is the start frame image in the video data of the target living being, and the second frame image is the top frame image, i.e. the frame image with the largest expression change intensity in the video data of the target living being.
Further, determining the top frame image by:
and calculating the expression change intensity of each frame of image in the video data of the target organism, wherein the expression change intensity is the difference value of the feature point coordinate information of the current frame of image and the previous frame of image, and taking the absolute value after the sum of the difference values.
And determining the frame image with the maximum expression change intensity as the top frame image.
Further, the third determining module 340 is specifically configured to:
inputting the target characteristic vector into a preset classifier, and determining a micro expression type corresponding to the target organism, wherein the preset classifier is obtained by training according to a characteristic vector sample in a sample library and a micro expression label corresponding to the characteristic vector sample; or
And determining the micro expression type corresponding to the target organism by calculating the similarity between the target characteristic vector and the characteristic vector in a preset data table, wherein the preset data table comprises the corresponding relation between a plurality of characteristic vectors and the micro expression type.
Further, the preset classifier is determined by:
and acquiring video sample data of the target organism and the corresponding micro-expression label.
And calculating a feature vector sample according to a target frame image and a starting frame image in the video sample data, wherein the target frame image is any frame image except the starting frame image in the video sample data.
And correspondingly storing the characteristic vector sample and the micro expression label into the sample library, and determining a preset classifier.
Further, the feature vector samples are determined by:
and determining the coordinate information of the characteristic points corresponding to the target frame image and the initial frame image respectively.
And determining the characteristic vector sample by calculating the difference value of the characteristic point coordinate information of the target frame image and the starting frame image.
Compared with the prior art, the determining device provided by the embodiment of the application determines the coordinate information of the first characteristic point and the coordinate information of the second characteristic point corresponding to the first frame image and the second frame image respectively, and further determines the target characteristic vector; the micro expression type corresponding to the target organism is determined according to the target characteristic vector, the micro expression type is determined through coordinate information, the situation that pictures with highly similar motion states cannot be processed in a traditional micro expression identification method is avoided, and the accuracy of micro expression identification of corresponding users is improved.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device 400 according to an embodiment of the present disclosure, including: a processor 410, a memory 420 and a bus 430, wherein the memory 420 stores machine-readable instructions executable by the processor 410, the processor 410 and the memory 420 communicate via the bus 430 when the electronic device 400 is running, and the machine-readable instructions are executed by the processor 410 to perform the steps of the determination method according to any of the above embodiments.
In particular, the machine readable instructions, when executed by the processor 410, may perform the following:
acquiring a first frame image and a second frame image of a target living being, wherein the first frame image is prior to the second frame image in time sequence.
And determining first characteristic point coordinate information and second characteristic point coordinate information which respectively correspond to the first frame image and the second frame image based on a characteristic point detection model.
And determining a target characteristic vector according to the first characteristic point coordinate information and the second characteristic point coordinate information.
And determining the micro-expression type corresponding to the target creature according to the target feature vector.
In the embodiment of the application, a target feature vector is further determined by respectively determining first feature point coordinate information and second feature point coordinate information corresponding to a first frame image and a second frame image; the micro expression type corresponding to the target organism is determined according to the target characteristic vector, the micro expression type is determined through coordinate information, the situation that pictures with highly similar motion states cannot be processed in a traditional micro expression identification method is avoided, and the accuracy of micro expression identification of corresponding users is improved.
Based on the same application concept, the embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the determination method provided by the above embodiment.
Specifically, the storage medium can be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above determination method can be executed, avoiding the situation in which pictures with highly similar motion states cannot be processed by traditional micro-expression recognition; even when the target feature vector determined from any two frame images is small, its difference values can still be recognized easily, improving the accuracy of micro-expression recognition for different types of users.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solutions of the present application may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description covers only specific embodiments of the present application, but the scope of the present application is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present application shall fall within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A method for determining a micro-expression type, the method comprising:
acquiring a first frame image and a second frame image of a target organism, wherein the first frame image precedes the second frame image in time sequence;
determining first feature point coordinate information and second feature point coordinate information corresponding to the first frame image and the second frame image, respectively, based on a feature point detection model;
determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information; and
determining the micro-expression type corresponding to the target organism according to the target feature vector.
2. The determination method according to claim 1, wherein the determining a target feature vector according to the first feature point coordinate information and the second feature point coordinate information comprises:
obtaining the target feature vector by calculating the difference between the second feature point coordinate information and the first feature point coordinate information.
3. The determination method according to claim 1 or 2, wherein the first frame image is a start frame image in the video data of the target organism.
4. The determination method according to claim 1 or 2, wherein the second frame image is the top frame image with the largest expression change intensity in the video data of the target organism.
5. The determination method according to claim 4, wherein determining the top frame image comprises:
calculating the expression change intensity of each frame image in the video data of the target organism, wherein the expression change intensity is the absolute value of the summed differences between the feature point coordinate information of the current frame image and that of the previous frame image; and
determining the frame image with the largest expression change intensity as the top frame image.
6. The determination method according to claim 1, wherein the feature point detection model is obtained by:
acquiring a micro-expression sample image and its corresponding feature point coordinate information; and
training an initial training model according to the micro-expression sample image and its corresponding feature point coordinate information to obtain the feature point detection model.
7. The determination method according to claim 1, wherein the determining the micro-expression type corresponding to the target organism according to the target feature vector comprises:
inputting the target feature vector into a preset classifier to determine the micro-expression type corresponding to the target organism, wherein the preset classifier is obtained by training on feature vector samples in a sample library and the micro-expression labels corresponding to the feature vector samples; or
determining the micro-expression type corresponding to the target organism by calculating the similarity between the target feature vector and the feature vectors in a preset data table, wherein the preset data table comprises correspondences between a plurality of feature vectors and micro-expression types.
8. The determination method according to claim 7, wherein the preset classifier is determined by:
acquiring video sample data of the target organism and the micro-expression label corresponding to the video sample data;
calculating a feature vector sample according to a target frame image and a start frame image in the video sample data, wherein the target frame image is any frame image other than the start frame image in the video sample data; and
storing the feature vector sample and the micro-expression label correspondingly in the sample library, and determining the preset classifier.
9. The determination method according to claim 8, wherein the feature vector sample is determined by:
determining the feature point coordinate information corresponding to the target frame image and the start frame image, respectively; and
determining the feature vector sample by calculating the difference between the feature point coordinate information of the target frame image and that of the start frame image.
10. An apparatus for determining a micro-expression type, the apparatus comprising:
an acquisition module configured to acquire a first frame image and a second frame image of a target organism, wherein the first frame image precedes the second frame image in time sequence;
a first determining module configured to determine first feature point coordinate information and second feature point coordinate information corresponding to the first frame image and the second frame image, respectively, based on a feature point detection model;
a second determining module configured to determine a target feature vector according to the first feature point coordinate information and the second feature point coordinate information; and
a third determining module configured to determine the micro-expression type corresponding to the target organism according to the target feature vector.
11. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device operates, the processor and the memory communicate over the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the determination method according to any one of claims 1 to 9.
12. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, performs the steps of the determination method according to any one of claims 1 to 9.
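Purely as a hedged illustration of the top-frame selection in claims 4 and 5 (the detect_feature_points helper is a hypothetical stand-in for the feature point detection model):

import numpy as np

def find_top_frame(frames, detect_feature_points):
    # Expression change intensity per claim 5: sum the differences between
    # the feature point coordinates of the current and previous frame
    # images, then take the absolute value of that sum.
    coords = [detect_feature_points(f) for f in frames]
    intensities = [0.0]  # the start frame has no previous frame
    for prev, curr in zip(coords[:-1], coords[1:]):
        intensities.append(abs(float((curr - prev).sum())))
    # The frame with the largest expression change intensity is the top frame.
    return int(np.argmax(intensities))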
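The alternative branch of claim 7 (matching against a preset data table) could be sketched as below; the cosine similarity measure and the table contents are assumptions, since the claim fixes neither:

import numpy as np

# Hypothetical preset data table mapping micro-expression types to
# representative feature vectors (values are illustrative only).
PRESET_TABLE = {
    "happiness": np.array([0.5, -0.2, 0.3, -0.4, 0.0, 0.9]),
    "surprise":  np.array([0.1, 0.8, -0.1, 0.7, 0.2, -0.3]),
}

def match_micro_expression(target_vector):
    # Return the type whose stored vector is most similar to the target.
    def cosine(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(PRESET_TABLE, key=lambda t: cosine(target_vector, PRESET_TABLE[t]))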
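Likewise, the construction of the preset classifier in claims 8 and 9 might be sketched as follows; the choice of scikit-learn's SVC is an assumption, since the claims do not name a specific classifier:

import numpy as np
from sklearn.svm import SVC

def build_preset_classifier(video_samples, labels, detect_feature_points):
    # video_samples: list of frame sequences; labels: the micro-expression
    # label for each sequence. Per claim 9, each feature vector sample is
    # the coordinate difference between a target frame and the start frame.
    X, y = [], []
    for frames, label in zip(video_samples, labels):
        start_coords = detect_feature_points(frames[0])
        for frame in frames[1:]:  # every frame except the start frame
            X.append((detect_feature_points(frame) - start_coords).ravel())
            y.append(label)
    clf = SVC()  # assumed model; the claims leave the classifier open
    clf.fit(np.array(X), y)
    return clf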

Priority Applications (1)

CN202111002587.6A · Priority date: 2021-08-30 · Filing date: 2021-08-30 · Title: Micro-expression type determination method and device, electronic equipment and storage medium


Publications (1)

CN113743275A (en) · Publication date: 2021-12-03

Family

ID=78733717

Family Applications (1)

CN202111002587.6A · Status: Pending · Publication: CN113743275A (en) · Title: Micro-expression type determination method and device, electronic equipment and storage medium

Country Status (1)

CN: CN113743275A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
CN114842523A * · Southwest University · Priority date: 2022-03-16 · Publication date: 2022-08-02 · Title: Expression recognition method and device, electronic equipment and computer readable storage medium



Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination