CN109800771B

CN109800771B - Spontaneous micro-expression positioning method of local binary pattern of mixed space-time plane

Info

Publication number: CN109800771B
Application number: CN201910089341.3A
Authority: CN
Inventors: 付晓峰; 吴俊�; 付晓鹃; 崔扬; 徐岗; 计忠平; 姚金良; 柯进华; 吴卿
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2019-01-30
Filing date: 2019-01-30
Publication date: 2021-03-05
Anticipated expiration: 2039-01-30
Also published as: CN109800771A

Abstract

The invention discloses a spontaneous micro-expression positioning method of a local binary pattern of a mixed space-time plane. According to the correlation of continuous frames in the spontaneous micro-expression video, the alignment of the pixel-level face region is realized through fine matching, so that the anti-interference capability on interference such as head deviation is strong. Meanwhile, the fan-shaped region characteristics are extracted on the spatial axis plane, and the redundancy-removed linear characteristics are extracted on the time axis, so that the redundant calculation of characteristic points is reduced, and the more complete characteristic representation is formed by combining the space-time characteristics in a nonlinear characteristic fusion mode, therefore, the spontaneous micro expression can be represented more robustly, and the accuracy rate of the spontaneous micro expression positioning in the spontaneous micro expression video is improved.

Description

Spontaneous micro-expression positioning method of local binary pattern of mixed space-time plane

Technical Field

The invention belongs to the technical field of computer video and image processing, and relates to a spontaneous micro-expression positioning method of a local binary pattern of a mixed space-time plane.

Background

National security, psychology and the like are closely related to the daily life of people. With the development of science and technology, people find that the special properties of spontaneous micro-expression are helpful for confirming identity, detecting lie, cognitive state and the like. The most difference of the spontaneous micro expression compared with the common expression is that the spontaneous micro expression is not controlled by people when occurring, so that the spontaneous micro expression can represent the real feelings of people. And spontaneous micro-expression is characterized by mechanism inhibition, and the facial muscle movement module only contains part of all muscle modules with common expression when the spontaneous micro-expression occurs, so the amplitude is weak, different categories are easy to be confused, and the spontaneous micro-expression cannot be observed by naked eyes in many times. Just because spontaneous micro expression takes place that the range is little, duration is short, the research method of spontaneous micro expression location in the video that has now has very big promotion space.

In the current machine learning method, a spontaneous micro-expression positioning method in a video is based on spontaneous micro-expressions of continuous frames, corresponding features are extracted according to the variation amplitude of the spontaneous micro-expressions, and the spontaneous micro-expressions are judged, so that a spontaneous micro-expression frame sequence is found. To achieve this, many efficient localization algorithms are proposed, such as optical flow method, Local Binary Pattern (LBP), etc., which all achieve a certain effect when localizing spontaneous micro-expressions in video. Meanwhile, the spontaneous micro-expression positioning method based on the neural network can also extract the characteristics and position the characteristics.

Among the above mentioned algorithms, many algorithms are not fully applied to data redundancy processing and spatio-temporal feature representation of consecutive frames in a video, so that the extracted features are not complete, resulting in a low final positioning accuracy. The invention aims to overcome the problems of calculation redundancy of feature points, insufficient completeness of space-time features and the like, and provides a method for positioning spontaneous micro-expressions in a video with higher accuracy.

Disclosure of Invention

The invention aims to provide a method for positioning spontaneous micro expressions in a video with high accuracy aiming at the defects of redundancy of feature calculation, incomplete utilization of space-time features, low positioning accuracy and the like of the spontaneous micro expressions in the current video.

According to the correlation of continuous frames in the spontaneous micro-expression video, the alignment of the pixel-level face region is realized through fine matching, so that the anti-interference capability on interference such as head deviation is strong. Meanwhile, the fan-shaped region characteristics are extracted on the spatial axis plane, and the redundancy-removed linear characteristics are extracted on the time axis, so that the redundant calculation of characteristic points is reduced, and the more complete characteristic representation is formed by combining the space-time characteristics in a nonlinear characteristic fusion mode, therefore, the spontaneous micro expression can be represented more robustly, and the accuracy rate of the spontaneous micro expression positioning in the spontaneous micro expression video is improved.

The method for positioning the spontaneous micro-expression in the video comprises the steps of aligning the face in the video, extracting the space-time characteristics of the eye region and the mouth region of the face, fusing the characteristics and judging the spontaneous micro-expression frame in the video.

1) And face alignment, including coarse matching alignment and fine matching alignment.

The steps of the face alignment of the present invention are as follows:

step S1: a plurality of face mark points are extracted from each frame of the spontaneous micro-expression video by using an ASM algorithm, wherein the inner eye corner points and the nose tip points are three relatively stable points in the front view angle and are used for affine transformation of face regions, so that rough matching alignment of the face regions is realized.

Step S2: on the basis of the picture which is subjected to rough matching alignment, the similarity between adjacent frames is maximized by using a fine matching algorithm, and the face area is divided into an eye area and a mouth area to be respectively subjected to fine matching alignment.

Specifically, the decomposition step of step S2 is as follows:

inputting: a standard template picture N, a picture to be matched U _ P and an accurate matching picture N _ P;

step 1: and acquiring the size of the standard template picture N, wherein the length and the width of the picture N are respectively represented by N _ x and N _ y.

Step 2: if the x and y values are not null, executing the step 3; otherwise, initializing values, making x equal to 1 and y equal to 1, and executing step 3. x represents the abscissa of the starting point of the upper left corner of the picture to be matched, y represents the ordinate of the starting point of the upper left corner of the picture to be matched, and x and y form a pair of coordinates representing the position of the point.

And step 3: n _ Value is NCC (N, U _ P (x: x + N _ x, y: y + N _ y)), N _ Value represents the similarity Value between the standard template picture and the picture to be matched, U _ P (x: x + N _ x, y: y + N _ y)) represents that x and y are selected as the reference point of the starting coordinate of the picture to be matched, and N _ x and N _ y are the length and width of the selected picture respectively.

And 4, step 4: and selecting a neighborhood point (x {1..8} and y {1.. 8}) with (x, y) as a circle center and 1 as a radius. x {1..8} represents the horizontal coordinates of 8 neighborhood points around the center of a circle, and y {1..8} represents the vertical coordinates of 8 neighborhood points around the center of a circle, so that a group of coordinates of a certain neighborhood point is formed in sequence.

And 5: value {1..8}, NCC (N, U _ P (x {1.. 8}: x {1..8} + N _ x, y {1.. 8}: y {1..8} + N _ y)), Value {1..8} represents that the standard template picture and the picture to be matched selected with the new coordinates as the reference are subjected to similarity calculation, and 8 similarity values are sequentially obtained.

Step 6: value _ Max, which is the maximum Value of all the similarity values calculated as described above, is Max (N _ Value, Value {1..8 }).

And 7: if N _ Value _ Max, then N _ P _ U _ P (x: x + N _ x, y: y + N _ y); otherwise, let x be x _ Value _ Max and y be y _ Value _ Max, and execute step 4. And x _ Value _ Max represents the abscissa of the starting point at the upper left corner of the picture to be matched corresponding to the maximum similarity Value, and y _ Value _ Max represents the ordinate of the starting point at the upper left corner of the picture to be matched corresponding to the maximum similarity Value.

And (3) outputting: exactly match picture N _ P.

Specifically, the formula for calculating NCC is shown in formula (1).

Wherein λ ∈ [ λ [ ]₁,λ₂]The interval value applied to the matching template may be selected for the standard template. The position of the sliding matching window is located in the interval [ sigma, sigma + W-1]And σ is the matching starting position, and W is the size of the window. f (n),

Respectively representing the entropy values of the standard template and the region to be matched. The larger the value obtained by NCC, the more f (n) and

the more matched.

Specifically, max is taken to be the maximum value.

Specifically, the operator is assigned.

Specifically, the operator is equal.

Specifically, the following components: is represented by a section.

Specifically, {1..8} represents the surrounding 8 neighborhood points with a radius of 1.

Specifically, the to-be-matched picture selected based on the new coordinates refers to selecting the to-be-matched picture based on the coordinates of the neighborhood point as the coordinate reference point.

2) Extracting the space-time characteristics of the eye region and the mouth region of the human face, namely extracting the space axis fan-shaped plane characteristics and the redundancy-removing time axis linear characteristics from the eye region and the mouth region respectively.

The steps of the spatio-temporal feature extraction of the invention are as follows:

s3: and (3) respectively extracting spatial axis fan-shaped plane features of the eye region and the mouth region by using an LBP method.

S4: the LBP method is used for extracting more compact and more effective redundancy-removing time axis linear characteristics of the eye region and the mouth region respectively.

Specifically, the extraction method of the spatial axis fan-shaped plane sampling point is shown in formula (2).

Wherein, on the circle with C as the center point and r as the radius, when the sampling point is g_n，n∈[0,p_s]In g_nThe arc length which is used as the center and expands to r/2 of the two circles forms a section of circular arc. p is a radical of_SThe number of spatial axis plane sampling points is,

values representing sample points in a sector area surrounded by circular arcs,

representing the center C and the mean of all the samples contained in the sector. g_cRepresenting the pixel value of the center point C.

Specifically, the extraction manner of the linear sampling points of the redundancy-free time axis is shown in formula (3).

Wherein p is_TThe number of time axis linear sampling points. L is_nRepresenting feature points extracted in the range of radius r on each line. m (L)_nAnd r) represents the mean of all samples contained in a straight line. g_cRepresenting the pixel value of the center point C.

Specifically, the way of converting the sampling points into the feature vectors is as follows: separately counting LBP_SAnd LBP_TThe number of the same value is arranged to form a vector, namely the LBP characteristic vector H of the sub-sampling_S、H_T。

3) The characteristic fusion is to adopt a nonlinear characteristic fusion mode to fuse characteristic vectors obtained by linear extraction of a spatial axis plane and a time axis into more complete characteristic vector representation.

Specifically, the calculation formula fused into the column vector is shown in formula (4).

Specifically, [ ] in the formula (4) represents the column arrangement of the feature vector.

H_S、H_TRespectively, a space axis fan-shaped plane and a characteristic vector p on a time axis line_S、p_TRespectively the number of its sampling points.

4) The spontaneous microexpression frames in the video are judged to be the sequence positions of the spontaneous microexpression frames in the spontaneous microexpression video through correlation calculation and threshold setting, and the frames with the frame characteristics larger than the threshold are selected as the spontaneous microexpression frames.

Specifically, the calculation of the correlation is shown in the formulas (5) and (6).

χ²Indicating the degree of correlation between different feature vectors, for measuring the "distance" between feature vectors,

representing the characteristic distance of the current frame. H_CF、H_NFFeature vectors of a Current Frame (Current Frame) and a Next Frame (Next Frame) of the Current Frame are respectively represented.

Wherein, C_jRepresenting the frame characteristics of the jth frame,

representing the characteristic distance of the jth frame,

respectively, representing the characteristic distance of the video frame differing from the j-th frame by k frames.

Specifically, the threshold value is set as shown in equation (7).

T＝C_mean+p×(C_max-C_mean) (7)

Where T represents a threshold, p ∈ [0,1 ]]And determining the value of the spontaneous microexpression frame according to the positioning accuracy of the spontaneous microexpression frame. C_mean、C_maxRespectively representing the mean and maximum values of the frame features in the current video.

The invention has the beneficial effects that:

in order to reduce the interference of the positioning point offset on the face matching effect, the invention provides a fine matching algorithm to maximize the similarity between adjacent frames, thereby realizing the alignment of the pixel level precision of the face region. The face area is divided into an eye area and a mouth area at the same time because the expression of the area is most prominent.

The invention provides a mixed spatio-temporal plane LBP algorithm for respectively extracting features of an eye region and a mouth region in a video. During spatial axis plane feature extraction, sector area sampling is adopted to keep pixel information around sampling points as much as possible; in the time axis feature extraction, in order to remove information redundancy caused by the existing method, a more compact and more effective time axis linear feature extraction method is used.

The invention adopts a nonlinear feature fusion mode to fuse the space axis and time axis feature vectors with different sampling point numbers, and compared with the common serial combination feature vector method, the method has more robustness.

The invention converts the feature vector into different frame features in order to eliminate the influence of background noise and increase the contrast intensity between adjacent frames. Comparing the frame characteristics can distinguish spontaneous micro-expression frames from non-expression common frames more clearly.

Drawings

FIG. 1 shows a flow chart of the present invention.

Fig. 2 shows a comparison of the face alignment result and the unaligned result after affine transformation.

Fig. 3 shows an exemplary diagram of spatial axis sampling.

Fig. 4 shows an example diagram of time axis sampling.

Fig. 5 is a diagram showing an example of threshold determination for fine matching in the present invention.

FIG. 6 shows an exemplary graph of threshold decision for fine matching in the ULBP algorithm.

Fig. 7 is a diagram showing an exemplary threshold determination of rough matching in the present invention.

Fig. 8 shows an exemplary diagram of threshold decision for coarse matching in the ULBP algorithm.

Detailed Description

The present invention will be described in detail below with reference to the attached drawings, and it should be noted that the described embodiments are only intended to facilitate understanding of the present invention, and do not limit the present invention in any way.

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

Fig. 1 is a flowchart of a method for positioning spontaneous micro-expressions in a video according to the present invention, which shows the whole process from the input of spontaneous micro-expression video to the output of final positioning result.

The embodiment selects CAS (ME)²And taking the complete spontaneous micro-expression video in the spontaneous micro-expression database as a test database.

1) The face positioning in fig. 1 is to extract 68 face mark points for each frame of the spontaneous micro-expression video by using an ASM algorithm, wherein an inner eye corner point and a nose tip point are three relatively stable points in a front view angle, and the face positioning has good positioning capability.

2) The affine transformation in fig. 1 is a transformation method for realizing the alignment of the face region, and the purpose is to realize the coarse matching alignment of the face region. As shown in fig. 2, it represents the comparison of the human face area in the spontaneous microexpressing video frame without transformation (as in fig. 2(a)) and with affine transformation (as in fig. 2 (b)).

3) The fine matching in fig. 1 is a method for eliminating the pixel level offset of the face picture after the coarse matching alignment, and meanwhile, in order to reduce the amount of calculation, the face area is divided into an eye area and a mouth area for performing the fine matching alignment, respectively, and specific embodiments are as follows.

Inputting: the method comprises the steps of obtaining a standard template picture N, a picture to be matched U _ P and an accurate matching picture N _ P.

And (3) outputting: exactly match picture N _ P.

It should be noted that the standard template picture is a first frame picture in each spontaneous microexpression video, and is used as a standard reference object. The pictures to be matched are all the other pictures except the first frame in the video.

Specifically, the formula for calculating NCC is shown in formula (1).

the more matched.

It should be noted that the face region is divided into an eye region and a mouth region, because the expression of the region is most prominent, and the computation time complexity of the correlation matching is also reduced.

It should be noted that the exact-match picture is the final output result, i.e. the finely-matched picture is output for the subsequent frame feature calculation.

4) The eye region and the mouth region in fig. 1 are pictures of different face regions in the same spontaneous micro-expression video obtained in the previous step.

5) The spatial axis plane features in fig. 1 are features extracted from the eye region and the mouth region with reference to the spatial axis. The extraction of the features comprises two aspects of determination of the sampling points and conversion of the sampling points and the feature vectors.

The sampling points for the spatial axis planes are determined as follows:

values representing sample points in a sector area surrounded by circular arcs,

It should be noted that, the determination of the sampling point of the spatial axis plane can refer to fig. 3, where C is the center of the sampling point, and r is the sampling radius.

Accordingly, the utility model can be used forObtaining a sampling point LBP of a spatial axis plane_S。

The conversion of the samples and features is as follows:

the method for converting the sampling points into the characteristic vectors is to respectively count the LBP_SThe same number of values are arranged to form a column vector, which is the LBP feature vector H of the sub-sample_S。

6) The time-axis linear features in fig. 1 are features extracted from the eye region and the mouth region with reference to the time axis. The extraction of the features comprises two aspects of determination of the sampling points and conversion of the sampling points and the features.

The sampling points of the time axis linearity are determined as follows:

It should be noted that, the determination of the sampling point in the time axis is described with reference to fig. 4, where C represents the central point of the sampling, and the picture in which the sampling point is located is the current picture. The white points crossed by the gray vertical lines represent the unselected feature points on the time axis, which are the sampling points that need to be included in the sub-sample.

Accordingly, a sampling point LBP linear with the time axis is obtained_T。

The conversion of the samples and features is as follows:

the method for converting the sampling points into the characteristic vectors is to respectively count the LBP_TThe same number of values are arranged to form a column vector, which is the LBP feature vector H of the sub-sample_T。

7) The features in FIG. 1 are fused into a spatial-axis planar feature vector H for the same region_SAnd time axis linear feature vector H_TFused into a feature vector in a non-linear manner。

The feature fusion approach is as follows:

in the formula (4), [ ] represents the column arrangement of the feature vector.

H_S、H_TRespectively, a space axis plane and a linear feature vector, p_S、p_TRespectively the number of its sampling points.

It should be noted that the sampling point p_SThe number of (2) is the number of black points excluding the center point in FIG. 3, sample point p_TThe number of (2) is the sum of the numbers of black dots and white dots excluding the center dot in fig. 4.

It should be noted that the final feature vector is formed by merging feature vectors obtained by fusing the eye region and the mouth region, respectively.

8) The frame feature in fig. 1 is to position the sequence position of the spontaneous microexpression frame in the self-luminous microexpression video by calculating the correlation of the final feature vector and setting the threshold, and the frame with the frame feature greater than the threshold is set as the spontaneous microexpression frame.

The correlation is calculated as shown in the formulas (5) and (6).

Wherein, C_jRepresenting the frame characteristics of the jth frame,

representing the characteristic distance of the jth frame,

The threshold value is set as shown in equation (7).

T＝C_mean+p×(C_max-C_mean) (7)

It should be noted that, in order to select the vertex frame where the spontaneous microexpression is located, all the obtained frame features C are subjected to threshold T_jAnd dividing, wherein the part larger than the threshold is the spontaneous micro-expression bit sequence obtained by detection.

Example (b):

the invention is in CAS (ME)²Experiments are performed on a database, and the spontaneous microexpression library is widely used for spontaneous microexpression frame positioning and classification.

1)CAS(ME)²Database with a plurality of databases

CAS (ME)2(Chinese Academy of Sciences Macro-expression and Micro-expression) database, which is a database published by the Chinese Academy of Sciences about spontaneous Micro-expressions. CAS (ME)2 database has a video resolution of 640 × 480 pixels, 30 frames per second. The database a section contains 94 spontaneous microexpression videos. The database contains 126 negative (negative) expression segments, 124 positive (positive) expression segments, 25 surprised (surrise) expression segments, and 82 other kinds (others) expression segments. In order to verify that the method has robustness to the head deviation, 80% of videos are randomly selected for experiment.

2) Experiment 1: coarse matching and fine matching analysis contrast experiment

The analysis is performed by taking fig. 5 as an example, which is an effect diagram of the present invention after the fine matching process. The spontaneous microexpression is detected in all frames except the 1002 th frame. As shown in fig. 5-8, frame 1002 is not detected, mainly because the frame is too weak in spontaneous microexpression amplitude.

As can be seen from fig. 5, although frames 174, 632, and 1038 are detected, all of the frames are blinking of the captured object in the video (whether blinking is an expression is still controversial), so that the frame characteristics of the video frame in which the blinking is performed exceed the threshold, thereby causing a deviation of the detection result. Meanwhile, it was found that although the effect of blinking eyes is produced, the intensity is sometimes lower than that of spontaneous microexpression, so that the amplitude is smaller than that of spontaneous microexpression. When the threshold value is set to be relatively large, the number of determination frames is relatively small, but the influence of blinking can be removed to some extent. Unfortunately, for spontaneous microexpressions, the effects of blinking are not removed by thresholding, so the choice of threshold size is also of great significance for the detection result.

It is observed from the comparison between fig. 5 and fig. 6 that although most of spontaneous microexpression frames can be detected by the present invention and the Uniform local binary pattern (Uniform LBP, ULBP) algorithm, the saliency of the peak value in fig. 5 is significantly higher than that in fig. 6 (frame 825 is an interference frame), which indicates that the present invention is more sensitive to spontaneous microexpression and is robust in anti-interference capability than the ULBP algorithm. Meanwhile, as can be seen from the comparison between fig. 7 and fig. 8, under the processing condition of only rough matching, the number of the spontaneous microexpression fragments hit by the method is more, that is, the hit rate is higher. In addition, as is apparent from fig. 7 and 8, the amplitude of the interference frame is smaller than that of the spontaneous microexpression frame, as compared with fig. 5 and 6.

3) Experiment 2: detection and analysis experiment for spontaneous micro-expression fragments of several types

As shown in table 1, the invention has certain improvements for the three types of expressions, namely negative, surprise and others, and proves that the invention is more effective and comprehensive for capturing spontaneous micro-expression changes than the ULBP algorithm. However, for positive expression, the hit rate is not obviously improved compared with the ULBP algorithm, because the positive expression is more negative expression than other expressions, and the eye area and the mouth area are obviously changed and are easier to detect.

TABLE 1 comparison table of four types of spontaneous microexpression positioning results

4) Experiment 3: analysis experiment of different algorithms

The evaluation formula is shown in formula (8) and formula (9):

as shown in table 2, the accuracy of the present invention is optimal. In the experiment, in order to reduce the influence caused by factors such as blinking eyes and improve the precision rate, the threshold value is set to be higher, so that the number of relevant frames extracted when spontaneous micro-expressions are detected is small, and the recall rate is low. The MDMD algorithm is an abbreviation of Main Directional maximum Difference (Main Directional maximum Difference).

TABLE 2 comparison of accuracy and recall

Claims

1. The spontaneous micro-expression positioning method of the local binary pattern of the mixed space-time plane is characterized by comprising the following steps:

step 1), face alignment, including coarse matching alignment and fine matching alignment;

step S1: extracting face mark points for each frame of the spontaneous micro-expression video by using an ASM algorithm, wherein the inner eye corner points and the nose tip points are three relatively stable points in a front view angle and are used for affine transformation of face regions to realize rough matching alignment of the face regions;

step S2: on the basis of the picture which is subjected to rough matching alignment, the similarity between adjacent frames is maximized by using a fine matching algorithm, and the face area is divided into an eye area and a mouth area to be respectively subjected to fine matching alignment;

step 2) extracting space-time characteristics of the eye region and the mouth region of the human face, namely extracting space axis fan-shaped plane characteristics and redundancy-removing time axis linear characteristics from the eye region and the mouth region respectively;

step S3: respectively extracting spatial axis fan-shaped plane features of the eye region and the mouth region by using an LBP (local binary pattern) method;

step S4: respectively extracting redundancy-removing time axis linear features of the eye region and the mouth region by using an LBP (local binary pattern) method;

step 3) fusing the feature vectors obtained by linear extraction of the spatial axis plane and the time axis into more complete feature vector representation by adopting a nonlinear feature fusion mode;

and 4) judging the spontaneous microexpression frames in the video through correlation calculation and threshold setting, positioning the sequence positions of the spontaneous microexpression frames in the self-expression-caused video, and selecting the frames with the frame characteristics larger than the threshold as the spontaneous microexpression frames.

2. The method for locating spontaneous microexpression of local binary pattern in hybrid spatiotemporal plane according to claim 1, wherein the step S2 is as follows:

step 1: acquiring the size of a standard template picture N, wherein the length and the width of the picture N are respectively represented by N _ x and N _ y;

step 2: if the x and y values are not null, executing the step 3; otherwise, initializing values, setting x to 1 and y to 1, and executing step 3;

wherein x represents the abscissa of the starting point of the upper left corner of the picture to be matched, y represents the ordinate of the starting point of the upper left corner of the picture to be matched, and x and y form a pair of coordinates representing the position of a point;

and step 3: n _ Value is NCC (N, U _ P (x: x + N _ x, y: y + N _ y)), N _ Value represents the similarity Value between the standard template picture and the picture to be matched, U _ P (x: x + N _ x, y: y + N _ y)) represents that x and y are selected as the reference point of the initial coordinate of the picture to be matched, and N _ x and N _ y are the length and width of the selected picture respectively;

and 4, step 4: selecting neighborhood points (x {1..8} and y {1.. 8}) which take (x, y) as a circle center and 1 as a radius; wherein x {1..8} represents the horizontal coordinates of 8 neighborhood points around the center of a circle, and y {1..8} represents the vertical coordinates of 8 neighborhood points around the center of a circle, so as to form a group of coordinates of a certain neighborhood point;

and 5: value {1..8}, namely NCC (N, U _ P (x {1.. 8}: x {1..8} + N _ x, y {1.. 8}: y {1..8} + N _ y)), Value {1..8} represents that the standard template picture and the picture to be matched selected by taking the new coordinates as the reference are subjected to similarity calculation, and 8 similarity values are sequentially obtained; the picture to be matched selected by taking the new coordinates as the reference points is selected by taking the coordinates of the neighborhood points as the reference points of the coordinates;

step 6: value _ Max, which is the maximum Value of all the similarity values obtained by the above calculation, is Max (N _ Value, Value {1..8 });

and 7: if N _ Value _ Max, then N _ P _ U _ P (x: x + N _ x, y: y + N _ y); otherwise, let x be x _ Value _ Max, y be y _ Value _ Max, execute step 4; x _ Value _ Max represents the abscissa of the starting point of the upper left corner of the picture to be matched corresponding to the maximum similarity Value, and y _ Value _ Max represents the ordinate of the starting point of the upper left corner of the picture to be matched corresponding to the maximum similarity Value;

and (3) outputting: accurately matching the picture N _ P;

wherein "max" is the operation of taking the maximum value, "═ is the operator of assignment," ═ is the operator of equality, ": "is interval representation, {1..8} represents the surrounding 8 neighborhood points with 1 as radius.

3. The spontaneous microexpression positioning method of the local binary pattern of the mixed spatiotemporal plane as recited in claim 2, wherein the formula for calculating the NCC function is shown in formula (1);

wherein λ ∈ [ λ [ ]₁,λ₂]Interval values applied to the matching template may be selected for the standard template; the position of the sliding matching window is located in the interval [ sigma, sigma + W-1]In, σ is the matching starting position, and W is the size of the window; f (n),

Respectively representing the entropy values of the standard template and the region to be matched.

4. The method for locating spontaneous microexpression of local binary pattern in mixed spatiotemporal plane according to claim 1, wherein the step S3 is specifically: the extraction mode of the spatial axis fan-shaped plane sampling point is shown as the formula (2):

wherein, on the circle with C as the center point and r as the radius, when the sampling point is g_n，n∈[0,p_s]In g_nAn arc length which is used as the center and is expanded to r/2 of the two circumferences of the circle forms a section of circular arc; p is a radical of_SThe number of spatial axis plane sampling points is,

values representing sample points in a sector area surrounded by circular arcs,

representing the mean value of the circle center C and all sampling points contained in the sector area; g_cA pixel value representing the center point C;

the way of converting the sampling points into the feature vectors is as follows: statistical LBP_SThe number of the same value is arranged to form a vector, namely the LBP characteristic vector H of the sampling_S。

5. The method for locating spontaneous microexpression of local binary pattern in mixed spatiotemporal plane according to claim 1, wherein the step S4 is specifically: the extraction mode of the linear sampling points of the redundancy removing time axis is shown as the formula (3):

wherein p is_TThe number of linear sampling points of a time axis; l is_nRepresenting feature points extracted in the range of the radius r' on each straight line; m (L)_nR') represents the mean value of the straight line containing all the sampling points; g_cA pixel value representing the center point C;

the way of converting the sampling points into the feature vectors is as follows: statistical LBP_TThe number of the same value is arranged to form a vector, namely the LBP characteristic vector H of the sampling_T。

6. The spontaneous microexpression positioning method of the local binary pattern of the mixed space-time plane according to claim 1, wherein the calculation formula of the column vectors fused in the feature fusion mode of step 3) is as shown in formula (4):

in the formula (4)]Column arrangement representing a feature vector, H_S、H_TRespectively, a space axis fan-shaped plane and a characteristic vector p on a time axis line_S、p_TRespectively the number of its sampling points.

7. The spontaneous microexpression positioning method of the local binary pattern of the mixed spatiotemporal plane as claimed in claim 1, wherein the calculation of the correlation in step 4) is as shown in formula (5) and formula (6);

representing the characteristic distance of the current frame; h_CF、H_NFRespectively representing feature vectors of a current frame and a next frame of the current frame;

wherein, C_jRepresenting the frame characteristics of the jth frame,

representing the characteristic distance of the jth frame,

8. The spontaneous microexpression positioning method of the local binary pattern of the mixed space-time plane as claimed in claim 1, wherein the threshold value in step 4) is set as shown in formula (7);

T＝C_mean+p×(C_max-C_mean) (7)

where T represents a threshold, p ∈ [0,1 ]]Determining the value of the spontaneous microexpression frame according to the positioning accuracy of the spontaneous microexpression frame; c_mean、C_maxRespectively representing the mean and maximum values of the frame features in the current video.