WO2021259005A1 - Video-based micro-expression recognition method and apparatus, computer device, and storage medium - Google Patents


Info

Publication number: WO2021259005A1
Application number: PCT/CN2021/097208
Authority: WO (WIPO, PCT)
Prior art keywords: image, micro, frame, expression, feature vector
Other languages: French (fr), Chinese (zh)
Inventor: 熊玮
Original assignee / applicant: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021259005A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G06V 40/172 - Classification, e.g. identification
    • G06V 40/174 - Facial expression recognition

Definitions

  • This application relates to the field of artificial intelligence biometric recognition technology, and in particular to a video-based micro-expression recognition method, apparatus, computer device, and storage medium.
  • Feature extraction refers to detecting and extracting micro-expressions, through various feature extraction methods, from a video image sequence that has undergone suitable preprocessing, for example, feature extraction based on optical flow or on the LBP-TOP operator (a spatio-temporal local texture operator).
  • Expression recognition is essentially a classification task: the extracted micro-expressions are classified into preset categories so as to finally determine the specific meaning of each micro-expression, for example, happy, sad, surprised, angry, disgusted, or afraid.
  • The existing expression recognition method is implemented with a CNN (Convolutional Neural Network): the built CNN model is first trained on a training data set, and classification and recognition are then performed with the trained model.
  • However, a CNN cannot use the relevant information of the video image sequence in the time domain (the feature input layer of a CNN does not reflect the relationships between features, and the input-layer neurons are equivalent). That is, a CNN can only recognize a single image frame in the video image information and cannot learn changes or associations between adjacent image frames.
  • A micro-expression is a movement of the user's face over a short period of time, and relevant information in the time domain is a very important part of identifying and distinguishing micro-expressions. Ignoring the time-domain information therefore degrades the CNN's micro-expression recognition performance.
  • The embodiments of the present application provide a video micro-expression recognition method, apparatus, computer device, and storage medium, aiming to solve the prior-art problem that micro-expression recognition is based only on the motion presented in a local area of the face; that is, the convolutional neural network can only recognize a single image frame in the video image information and cannot learn the changes or associations between adjacent image frames, which reduces the accuracy of the convolutional neural network for micro-expression recognition.
  • an embodiment of the present application provides a video micro-expression recognition method, which includes:
  • according to a preset empirical frame value, acquiring, from the image frames containing the micro-expression, a number of consecutive frames equal to the empirical frame value to form a micro-expression sequence;
  • an embodiment of the present application provides a video micro-expression recognition device, which includes:
  • the micro-expression image frame acquisition unit is configured to, if user video data corresponding to the user terminal is received, acquire the image frames containing a micro-expression in the video image sequence of the user video data;
  • the micro-expression sequence acquisition unit is configured to acquire, from the image frames containing the micro-expression and according to a preset empirical frame value, a number of consecutive frames equal to the empirical frame value to form a micro-expression sequence;
  • the weight value feature vector acquisition unit is configured to call a pre-built weight calculation layer to calculate the weight feature vector of each frame of image in the micro-expression sequence, so as to obtain the image feature vector of each frame combined with its weight value;
  • the comprehensive image feature vector acquisition unit is configured to sum the image feature vectors of all frames combined with their weight values to obtain the comprehensive image feature vector corresponding to the user video data;
  • the micro-expression recognition unit is configured to input the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result;
  • the item process information acquisition unit is configured to call a pre-stored item handling micro-expression strategy, obtain item processing flow information related to the micro-expression recognition result, and send the item processing flow information to the user terminal, wherein the item handling micro-expression strategy stores several pieces of item processing flow information, each corresponding to a micro-expression recognition result.
  • an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor; when the processor executes the computer program, the following steps are implemented:
  • according to a preset empirical frame value, acquiring, from the image frames containing the micro-expression, a number of consecutive frames equal to the empirical frame value to form a micro-expression sequence;
  • the embodiments of the present application also provide a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following operations:
  • according to a preset empirical frame value, acquiring, from the image frames containing the micro-expression, a number of consecutive frames equal to the empirical frame value to form a micro-expression sequence;
  • the embodiments of the present application provide a video micro-expression recognition method, device, computer equipment, and storage medium.
  • the method realizes that, when a neural network is used to classify micro-expressions, the time-sequence relationship of the micro-expression across multiple consecutive image frames is fully considered and the time-domain information of the micro-expression in the video image sequence is learned, so that more accurate micro-expression recognition results can be provided.
  • FIG. 1 is a schematic diagram of an application scenario of a video micro-expression recognition method provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of a video micro-expression recognition method provided by an embodiment of the application
  • FIG. 3 is a schematic diagram of a sub-flow of a video micro-expression recognition method provided by an embodiment of the application.
  • FIG. 4 is a schematic block diagram of a video micro-expression recognition device provided by an embodiment of the application.
  • FIG. 5 is a schematic block diagram of subunits of a video micro-expression recognition device provided by an embodiment of the application.
  • Fig. 6 is a schematic block diagram of a computer device provided by an embodiment of the application.
  • Figure 1 is a schematic diagram of an application scenario of a video micro-expression recognition method provided by an embodiment of this application
  • Figure 2 is a schematic flow chart of a video micro-expression recognition method provided in an embodiment of this application. The method is applied to a server, and the method is executed by application software installed in the server.
  • the method includes steps S110 to S160.
  • After the user terminal establishes a connection with the server, when the user views the user interaction interface provided by the server on the user terminal, the user handles items according to the item process corresponding to the item selected on the user interaction interface.
  • During handling, the user terminal needs to start its camera to collect the user's video data and upload it to the server.
  • After receiving the user video data sent by the user terminal, the server obtains the image frames containing the micro-expression to perform subsequent micro-expression recognition.
  • step S110 includes:
  • In specific implementation, any suitable feature extraction method can be selected to extract the image frames containing the micro-expression from the video image sequence.
  • The optical flow algorithm estimates the optical flow in the video image sequence under certain constraints to identify the subtle movements of the user's face, thereby realizing feature extraction of the micro-expression.
  • The LBP-TOP operator (a spatio-temporal local texture operator) is an extension of the LBP operator (local binary patterns) and can likewise be used for feature extraction.
  • Obtaining the image frames containing the micro-expression from the video image sequence of the user video data by the optical flow method includes: acquiring a velocity vector feature corresponding to each pixel of the video image sequence; and, if the velocity vector features of at least one frame of image in the video image sequence do not keep changing continuously, composing the image frames containing the micro-expression from the corresponding pictures.
  • When objects move, the scene forms a series of continuously changing images on the retina of the human eye; this series of continuously changing information constantly "flows through" the retina (that is, the image plane), like a "flow" of light, and is therefore called optical flow.
  • Optical flow expresses the change of the image, contains information about the target's motion, and can be used by the observer to determine the target's motion.
  • Optical flow has three elements: first, a motion velocity field, which is a necessary condition for the formation of optical flow; second, parts with optical characteristics, such as gray-scale pixels, which can carry motion information; and third, the imaging projection from the scene to the image plane, which makes the flow observable.
  • Optical flow is defined at points. Specifically, let (u, v) be the optical flow at the image point (x, y); then (x, y, u, v) is called an optical flow point.
  • The collection of all optical flow points is called the optical flow field.
  • When objects in space move, a corresponding image motion field, or image velocity field, is formed on the image plane. In an ideal situation, the optical flow field corresponds to the motion field.
  • Given the optical flow, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vector changes continuously over the whole image area. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vector formed by the moving object necessarily differs from the background velocity vector, so the position of the moving object can be calculated. Preprocessing by the optical flow method thus yields the image frames containing the micro-expression in the video image sequence of the user video data.
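As an illustration of the preprocessing described above, the sketch below flags the frames of a synthetic sequence in which facial motion occurs. It is a simplified stand-in, not the patent's method: the mean inter-frame grey-level difference replaces a full per-pixel optical-flow (velocity vector) estimate, and the threshold value is an arbitrary assumption.

```python
import numpy as np

def find_motion_frames(frames, threshold=2.0):
    """Return indices of frames whose change relative to the previous
    frame exceeds a threshold (a crude stand-in for an optical-flow
    estimate of facial motion)."""
    indices = []
    for i in range(1, len(frames)):
        # Mean absolute grey-level change approximates motion magnitude.
        motion = np.abs(frames[i].astype(float) - frames[i - 1].astype(float)).mean()
        if motion > threshold:
            indices.append(i)
    return indices

# Synthetic 10-frame sequence: static background, movement in frames 4-6.
frames = [np.zeros((8, 8)) for _ in range(10)]
for i in (4, 5, 6):
    frames[i] = frames[i] + 50.0 * (i - 3)  # simulated facial movement

print(find_motion_frames(frames))  # frames 4-6 plus the return to rest at 7
```

Note that the transition back to the static background also registers as motion, which is consistent with the observation above that the velocity vectors differ from the background wherever the face moves.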
  • S120: According to a preset empirical frame value, obtain, from the image frames containing the micro-expression, a number of consecutive frames equal to the empirical frame value to form a micro-expression sequence.
  • The empirical frame value is denoted as N; N is an empirical value that can be set by a technician according to actual needs, ensuring that a complete micro-expression process, from onset through peak to offset, is recorded in the N frames of images.
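The selection of the N-frame micro-expression sequence can be sketched as follows; anchoring the window at the first detected frame is an illustrative assumption, since the embodiment only requires N consecutive frames covering the onset-to-offset process.

```python
def build_micro_expression_sequence(motion_indices, n):
    """Select N consecutive frame indices covering the detected
    micro-expression from onset through peak to offset. Anchoring the
    window at the first detected frame is an assumption made for
    illustration; the patent only requires N consecutive frames."""
    if not motion_indices:
        return []
    start = motion_indices[0]
    return list(range(start, start + n))

# With frames 4-7 flagged as containing the micro-expression and N = 6:
print(build_micro_expression_sequence([4, 5, 6, 7], 6))  # [4, 5, 6, 7, 8, 9]
```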
  • The connection between the image frames in the micro-expression sequence (that is, the time-domain information of the micro-expression image sequence) can be represented by differences in weight values.
  • the time domain information of the sequence can be obtained by increasing the weight of these jointly appearing image frames.
  • a pre-built weight calculation layer needs to be called to calculate the weight feature vector of each frame of the image in the micro-expression sequence.
  • step S130 includes:
  • S133: Normalize the similarity value set corresponding to each frame of image in the micro-expression sequence to obtain a normalized similarity value set corresponding to each frame;
  • Since each frame of image in the micro-expression sequence initially has no weight value, the weight value of each frame can be obtained through the following process:
  • First, each frame of image can be input into the trained convolutional neural network to obtain the picture feature vector corresponding to that frame; the picture feature vector set corresponding to each frame is then obtained, where the picture feature vector set corresponding to the i-th frame image in the micro-expression sequence consists of the picture feature vectors of the frames other than the i-th frame in the sequence.
  • Mark the i-th image among the N frames of the micro-expression sequence as N_i. Input the picture feature vector corresponding to this image into the weight calculation layer to calculate the similarity values between this frame and the remaining frames of the micro-expression sequence; these similarity values between the picture feature vectors constitute the similarity value set of the i-th frame image.
  • The similarity can be evaluated in any suitable way, for example, the vector dot product between the picture feature vectors of two frames, the cosine similarity, or a dedicated neural network introduced for the calculation;
  • After normalization, each normalized similarity value in the normalized similarity value set is multiplied by the picture feature vector of the corresponding frame, and the products are summed to obtain the image feature vector of each frame combined with its weight value.
  • In this way, the internal connections between different image frames in the micro-expression image sequence can be mined; that is, closely related image frames receive weight values significantly higher than those of other image frames, so that more attention is paid to them in the micro-expression recognition process.
  • step S134 includes:
  • The image feature vector combined with the weight value of the i-th frame image obtained in this way fully considers the internal relationship between different image frames.
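The weight calculation layer described in the steps above behaves like a self-attention mechanism over the per-frame picture feature vectors. The sketch below, using the dot product as the similarity measure and softmax as the (assumed) normalization, is one possible reading of the described procedure, not the patent's exact implementation:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax, used here for the normalization step."""
    e = np.exp(x - x.max())
    return e / e.sum()

def combine_with_weights(features):
    """features: (N, d) array, one picture feature vector per frame (as
    produced by the CNN). For each frame i: compute similarity values
    against every other frame (dot product, one of the options named in
    the text), normalize them, and sum the other frames' vectors
    weighted by the normalized similarities."""
    n = features.shape[0]
    weighted = np.zeros_like(features)
    for i in range(n):
        others = [j for j in range(n) if j != i]          # exclude the i-th frame
        sims = np.array([features[i] @ features[j] for j in others])
        alphas = softmax(sims)                            # normalized similarity value set
        weighted[i] = sum(a * features[j] for a, j in zip(alphas, others))
    return weighted

feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])   # toy 3-frame, 2-dim features
print(combine_with_weights(feats).round(3))
```

Frames whose feature vectors resemble each other receive larger normalized similarities, so their contributions dominate the weighted sum, which is exactly the "closely related frames get higher weight" behaviour described above.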
  • S140: Sum the image feature vectors of all frames combined with their weight values to obtain a comprehensive image feature vector corresponding to the user video data.
  • That is, the image feature vectors of all frames combined with their weight values are summed to obtain the comprehensive image feature vector corresponding to the user video data, and the comprehensive image feature vector is then used as the recognition vector for micro-expression recognition.
  • This comprehensive image feature vector represents the N frames of images in the video image sequence; it is then input into the convolutional neural network used in the weight calculation layer, and the micro-expression recognition result can be obtained.
  • step S150 includes:
  • The comprehensive image feature vector is input into the softmax layer of the pre-trained convolutional neural network to obtain the micro-expression recognition result.
  • Since the convolutional layer, the pooling layer, and the fully connected layer of the convolutional neural network have already been used in the weight calculation layer, the corresponding image feature vectors have already been obtained.
  • Therefore, the comprehensive image feature vector only needs to be input into the softmax layer of the convolutional neural network to obtain the final micro-expression recognition result. Specifically, the probability that the micro-expression belongs to each category is obtained, and the category with the highest probability is selected as the micro-expression recognition result of the micro-expression sequence.
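Steps S140 and S150 can be sketched together as follows; the softmax-layer weight matrix, the feature dimension, and the category labels are all illustrative assumptions, not values from the patent:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def recognize(weighted_vectors, softmax_weights, labels):
    """Sum the per-frame weighted image feature vectors into the
    comprehensive image feature vector (step S140), pass it through a
    hypothetical softmax layer (step S150), and return the category
    with the highest probability."""
    comprehensive = weighted_vectors.sum(axis=0)
    probs = softmax(softmax_weights @ comprehensive)
    return labels[int(np.argmax(probs))], probs

labels = ["happy", "sad", "surprised", "angry", "disgusted", "afraid"]
rng = np.random.default_rng(1)
weighted = rng.normal(size=(6, 4))     # 6 frames of 4-dim weighted vectors (toy data)
W = rng.normal(size=(len(labels), 4))  # hypothetical softmax-layer weights
category, probs = recognize(weighted, W, labels)
print(category, probs.round(3))
```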
  • S160: Invoke a pre-stored item handling micro-expression strategy, obtain item handling process information related to the micro-expression recognition result, and send the item handling process information to the user terminal, wherein the item handling micro-expression strategy stores several pieces of item handling process information, each corresponding to a micro-expression recognition result.
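A minimal sketch of such a strategy is a lookup table from recognition results to flow information; the specific flow descriptions below are invented for illustration and do not come from the patent:

```python
# Sketch of a pre-stored item handling micro-expression strategy: a
# lookup from recognition result to item processing flow information.
# All flow descriptions here are illustrative assumptions.
ITEM_STRATEGY = {
    "happy": "continue with the standard item process",
    "afraid": "switch to a guided, step-by-step item process",
    "angry": "escalate the item to a human agent",
}

def item_flow_for(recognition_result):
    """Return the item processing flow information corresponding to a
    micro-expression recognition result, falling back to the standard
    process for results without a dedicated flow."""
    return ITEM_STRATEGY.get(recognition_result,
                             "continue with the standard item process")

print(item_flow_for("angry"))
```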
  • This method realizes that, when a neural network is used to classify micro-expressions, the timing relationship between micro-expressions in multiple consecutive image frames is fully considered and the time-domain information of the micro-expressions in the video image sequence is learned, so that more accurate micro-expression recognition results can be provided.
  • An embodiment of the present application also provides a video micro-expression recognition device, which is used to execute any embodiment of the aforementioned video micro-expression recognition method.
  • Fig. 4 is a schematic block diagram of a video micro-expression recognition device provided by an embodiment of the present application.
  • the video micro-expression recognition device 100 can be configured in a server.
  • the video micro-expression recognition device 100 includes: a micro-expression image frame acquisition unit 110, a micro-expression sequence acquisition unit 120, a weight value feature vector acquisition unit 130, a comprehensive image feature vector acquisition unit 140, a micro-expression recognition unit 150, and an item flow information acquisition unit 160.
  • the micro-expression image frame obtaining unit 110 is configured to, if user video data corresponding to the user terminal is received, obtain an image frame containing a micro-expression in a video image sequence of the user video data.
  • After the user terminal establishes a connection with the server, when the user views the user interaction interface provided by the server on the user terminal, the user handles items according to the item process corresponding to the item selected on the user interaction interface.
  • During handling, the user terminal needs to start its camera to collect the user's video data and upload it to the server.
  • After receiving the user video data sent by the user terminal, the server obtains the image frames containing the micro-expression to perform subsequent micro-expression recognition.
  • micro-expression image frame obtaining unit 110 is further configured to:
  • In specific implementation, any suitable feature extraction method can be selected to extract the image frames containing the micro-expression from the video image sequence.
  • The optical flow algorithm estimates the optical flow in the video image sequence under certain constraints to identify the subtle movements of the user's face, thereby realizing feature extraction of the micro-expression.
  • The LBP-TOP operator (a spatio-temporal local texture operator) is an extension of the LBP operator (local binary patterns) and can likewise be used for feature extraction.
  • the micro-expression image frame obtaining unit 110 includes:
  • a speed vector feature acquiring unit configured to acquire a speed vector feature corresponding to each pixel of the video image sequence of the user video data
  • the target image frame acquisition unit is configured to, if the velocity vector feature of at least one frame of image in the video image sequence does not keep changing continuously, form an image frame containing micro-expression from the corresponding pictures.
  • When objects move, the scene forms a series of continuously changing images on the retina of the human eye; this series of continuously changing information constantly "flows through" the retina (that is, the image plane), like a "flow" of light, and is therefore called optical flow.
  • Optical flow expresses the change of the image, contains information about the target's motion, and can be used by the observer to determine the target's motion.
  • Optical flow has three elements: first, a motion velocity field, which is a necessary condition for the formation of optical flow; second, parts with optical characteristics, such as gray-scale pixels, which can carry motion information; and third, the imaging projection from the scene to the image plane, which makes the flow observable.
  • Optical flow is defined at points. Specifically, let (u, v) be the optical flow at the image point (x, y); then (x, y, u, v) is called an optical flow point.
  • The collection of all optical flow points is called the optical flow field.
  • When objects in space move, a corresponding image motion field, or image velocity field, is formed on the image plane. In an ideal situation, the optical flow field corresponds to the motion field.
  • Given the optical flow, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vector changes continuously over the whole image area. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vector formed by the moving object necessarily differs from the background velocity vector, so the position of the moving object can be calculated. Preprocessing by the optical flow method thus yields the image frames containing the micro-expression in the video image sequence of the user video data.
  • the micro-expression sequence acquiring unit 120 is configured to acquire, in the image frame containing the micro-expression, a number of consecutive images equal to the number of the empirical frame according to the preset empirical frame value to form a micro-expression sequence.
  • The empirical frame value is denoted as N; N is an empirical value that can be set by a technician according to actual needs, ensuring that a complete micro-expression process, from onset through peak to offset, is recorded in the N frames of images.
  • the weight value feature vector obtaining unit 130 is configured to call a pre-built weight calculation layer to calculate the weight feature vector of each frame of the image in the micro-expression sequence to obtain the image feature vector combined with the weight value of each frame of image.
  • The connection between the image frames in the micro-expression sequence (that is, the time-domain information of the micro-expression image sequence) can be represented by differences in weight values.
  • the time domain information of the sequence can be obtained by increasing the weight of these jointly appearing image frames.
  • a pre-built weight calculation layer needs to be called to calculate the weight feature vector of each frame of the image in the micro-expression sequence.
  • the weight value feature vector obtaining unit 130 includes:
  • the picture feature vector acquiring unit 131 is configured to acquire the picture feature vector corresponding to each frame of image in the micro-expression sequence and the picture feature vector set corresponding to each frame, wherein the picture feature vector set corresponding to the i-th frame image in the micro-expression sequence consists of the picture feature vectors of the frames other than the i-th frame in the sequence;
  • the similarity value obtaining unit 132 is configured to obtain the similarity values between the picture feature vector of each frame of image in the micro-expression sequence and the picture feature vectors of the other frames, to obtain the similarity value set corresponding to each frame, wherein the similarity values between the picture feature vector of the i-th frame image and the picture feature vectors of the other frames in the micro-expression sequence constitute the similarity value set of the i-th frame image;
  • the normalization unit 133 is configured to normalize the similarity value set corresponding to each frame of image in the micro-expression sequence to obtain a normalized similarity value set corresponding to each frame;
  • the weight feature vector obtaining unit 134 is configured to obtain the weight feature vector corresponding to each frame of image according to the normalized similarity value set and the picture feature vector set corresponding to each frame, so as to obtain the image feature vector of each frame combined with its weight value.
  • Since each frame of image in the micro-expression sequence initially has no weight value, the weight value of each frame can be obtained through the following process:
  • First, each frame of image can be input into the trained convolutional neural network to obtain the picture feature vector corresponding to that frame; the picture feature vector set corresponding to each frame is then obtained, where the picture feature vector set corresponding to the i-th frame image in the micro-expression sequence consists of the picture feature vectors of the frames other than the i-th frame in the sequence.
  • Mark the i-th image among the N frames of the micro-expression sequence as N_i. Input the picture feature vector corresponding to this image into the weight calculation layer to calculate the similarity values between this frame and the remaining frames of the micro-expression sequence; these similarity values between the picture feature vectors constitute the similarity value set of the i-th frame image.
  • The similarity can be evaluated in any suitable way, for example, the vector dot product between the picture feature vectors of two frames, the cosine similarity, or a dedicated neural network introduced for the calculation;
  • After normalization, each normalized similarity value in the normalized similarity value set is multiplied by the picture feature vector of the corresponding frame, and the products are summed to obtain the image feature vector of each frame combined with its weight value.
  • In this way, the internal connections between different image frames in the micro-expression image sequence can be mined; that is, closely related image frames receive weight values significantly higher than those of other image frames, so that more attention is paid to them in the micro-expression recognition process.
  • the weight feature vector obtaining unit 134 is further configured to:
  • The image feature vector combined with the weight value of the i-th frame image obtained in this way fully considers the internal relationship between different image frames.
  • the comprehensive image feature vector obtaining unit 140 is configured to sum the image feature vectors of all frames combined with their weight values to obtain the comprehensive image feature vector corresponding to the user video data.
  • That is, the image feature vectors of all frames combined with their weight values are summed to obtain the comprehensive image feature vector corresponding to the user video data, and the comprehensive image feature vector is then used as the recognition vector for micro-expression recognition.
  • the micro-expression recognition unit 150 is configured to input the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result.
  • This comprehensive image feature vector represents the N frames of images in the video image sequence; it is then input into the convolutional neural network used in the weight calculation layer, and the micro-expression recognition result can be obtained.
  • micro-expression recognition unit 150 is also used to:
  • The comprehensive image feature vector is input into the softmax layer of the pre-trained convolutional neural network to obtain the micro-expression recognition result.
  • Since the convolutional layer, the pooling layer, and the fully connected layer of the convolutional neural network have already been used in the weight calculation layer, the corresponding image feature vectors have already been obtained.
  • Therefore, the comprehensive image feature vector only needs to be input into the softmax layer of the convolutional neural network to obtain the final micro-expression recognition result. Specifically, the probability that the micro-expression belongs to each category is obtained, and the category with the highest probability is selected as the micro-expression recognition result of the micro-expression sequence.
  • the item process information acquiring unit 160 is configured to call a pre-stored item handling micro-expression strategy, obtain item processing flow information related to the micro-expression recognition result, and send the item processing flow information to the user terminal, wherein the item handling micro-expression strategy stores several pieces of item processing flow information, each corresponding to a micro-expression recognition result.
  • The device realizes that, when a neural network is used to classify micro-expressions, the timing relationship between micro-expressions in multiple consecutive image frames is fully considered and the time-domain information of the micro-expressions in the video image sequence is learned, so that more accurate micro-expression recognition results can be provided.
  • the above-mentioned video micro-expression recognition device can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in FIG. 6.
  • FIG. 6 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 500 is a server, and the server may be an independent server or a server cluster composed of multiple servers.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
  • when the computer program 5032 is executed, the processor 502 can execute the video micro-expression recognition method.
  • the processor 502 is used to provide calculation and control capabilities, and support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503.
  • when the computer program 5032 runs in the internal memory 504, the processor 502 can execute the video micro-expression recognition method.
  • the network interface 505 is used for network communication, such as providing data information transmission.
  • the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
  • the specific computer device 500 may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
  • the processor 502 is configured to run a computer program 5032 stored in a memory to implement the video micro-expression recognition method disclosed in the embodiment of the present application.
  • the embodiment of the computer device shown in FIG. 6 does not constitute a limitation on the specific configuration of the computer device.
  • the computer device may include more or fewer components than those shown in the figure, combine certain components, or adopt a different component arrangement.
  • the computer device may only include a memory and a processor. In such embodiments, the structures and functions of the memory and the processor are the same as those of the embodiment shown in FIG. 6, which will not be repeated here.
  • the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • a computer-readable storage medium may be a non-volatile computer-readable storage medium, or may be a volatile computer-readable storage medium.
  • the computer-readable storage medium stores a computer program, where the computer program is executed by a processor to implement the video micro-expression recognition method disclosed in the embodiments of the present application.
  • the disclosed equipment, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division manners, or units with the same function may be combined into one unit; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), magnetic disks, optical disks, and other media that can store program code.


Abstract

A video-based micro-expression recognition method and apparatus, a computer device, and a storage medium, relating to biometric recognition in artificial intelligence. The method comprises: obtaining, from user video data, the image frames containing micro-expressions; obtaining, from those image frames, a number of consecutive frames equal to a preset empirical frame count to form a micro-expression sequence; calling a weight calculation layer to calculate a weight-combined image feature vector for each frame in the micro-expression sequence; summing the weight-combined image feature vectors of the frames to obtain a corresponding comprehensive image feature vector; inputting the comprehensive image feature vector into a convolutional neural network to obtain a micro-expression recognition result; and calling an item-handling micro-expression strategy to obtain the corresponding item processing flow information. When a neural network is used to classify micro-expressions, the method fully considers the temporal relationship of the micro-expression across multiple consecutive image frames and learns the time-domain information of the micro-expression in the video image sequence, so that the micro-expression recognition result is obtained more accurately.

Description

Video micro-expression recognition method, apparatus, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on June 23, 2020, with application number 202010583481.9 and entitled "Video micro-expression recognition method, apparatus, computer device, and storage medium", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of biometric recognition in artificial intelligence, and in particular to a video micro-expression recognition method, apparatus, computer device, and storage medium.
Background Art
With the continuous development of Internet technology, financial products have begun to be sold through online channels for the convenience of purchasers. During the sales process, micro-expression recognition technology can be applied to the recorded sales video to analyze the genuine emotions that a customer attempts to suppress, so that the customer's emotions can be identified and risks in the sales process can be avoided.
Existing micro-expression recognition algorithms need to complete two tasks: feature extraction and expression recognition. Here, "feature extraction" refers to detecting and extracting micro-expressions from a suitably preprocessed video image sequence by various feature extraction methods, for example, feature extraction based on optical flow or on the LBP-TOP operator (a spatio-temporal local texture operator).
"Expression recognition" is in fact a classification task: the extracted micro-expressions are assigned to preset categories so that the specific meaning of each micro-expression can finally be determined, for example happiness, sadness, surprise, anger, disgust, fear, and so on.
Existing expression recognition methods are implemented with a CNN (convolutional neural network): a constructed CNN model is first trained on a training data set, and the trained CNN model then performs classification and recognition.
However, the inventor realized that when a CNN is used for recognition and classification, the CNN cannot exploit the information that a video image sequence carries in the time domain (in the feature input layer of a CNN, the relationships between features are not represented; the input-layer neurons are equivalent). That is, a CNN can only recognize a single image frame in the video image information and cannot learn the changes or associations between adjacent image frames.
A micro-expression, however, is a movement that appears in a local region of the customer's face within a short period of time, and the related information in the time domain is a very important part of recognizing and distinguishing micro-expressions. Ignoring this time-domain information therefore degrades the CNN's micro-expression recognition performance.
Summary of the Invention
The embodiments of the present application provide a video micro-expression recognition method, apparatus, computer device, and storage medium, aiming to solve the problem in the prior art that micro-expression recognition is based only on the motion presented in a local region of the face; that is, the convolutional neural network used can only recognize a single image frame in the video image information and cannot learn the changes or associations between adjacent image frames, which reduces the accuracy of micro-expression recognition by the convolutional neural network.
In a first aspect, an embodiment of the present application provides a video micro-expression recognition method, which includes:
if user video data corresponding to a user terminal is received, acquiring the image frames containing micro-expressions in the video image sequence of the user video data;
according to a preset empirical frame count, acquiring from the image frames containing micro-expressions a number of consecutive frames equal to the empirical frame count, to form a micro-expression sequence;
calling a pre-built weight calculation layer to calculate the weight feature vector of each frame of image in the micro-expression sequence, to obtain a weight-combined image feature vector for each frame;
summing the weight-combined image feature vectors of all the frames to obtain a comprehensive image feature vector corresponding to the user video data;
inputting the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
calling a pre-stored item-handling micro-expression strategy, obtaining the item processing flow information corresponding to the micro-expression recognition result, and sending the item processing flow information to the user terminal, wherein the item-handling micro-expression strategy stores several pieces of item processing flow information, each corresponding to one micro-expression recognition result.
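To make the data flow of the claimed steps concrete, the sketch below wires stand-in stages into the sequence described above (frame selection, per-frame features, weight calculation, weighted summation, classification). All helper names (`feat_fn`, `weight_fn`, `cnn_fn`), the toy data, and the label set are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def recognize_micro_expression(frames, n_frames, feat_fn, weight_fn, cnn_fn):
    """End-to-end flow of the claimed steps with pluggable stand-in stages.

    frames    : list of images already detected as containing micro-expressions
    n_frames  : the preset empirical frame count N
    feat_fn   : frame -> feature vector          (stand-in for the CNN layers)
    weight_fn : N×D features -> N weights        (stand-in weight calculation layer)
    cnn_fn    : comprehensive vector -> label    (stand-in pre-trained classifier)
    """
    sequence = frames[:n_frames]                         # micro-expression sequence
    feats = np.stack([feat_fn(f) for f in sequence])     # per-frame feature vectors
    weights = weight_fn(feats)                           # one weight per frame
    comprehensive = (weights[:, None] * feats).sum(0)    # weighted sum of vectors
    return cnn_fn(comprehensive)                         # recognition result

# toy stand-ins to exercise the flow
label = recognize_micro_expression(
    frames=[np.ones((4, 4)) * k for k in range(6)],
    n_frames=4,
    feat_fn=lambda f: f.ravel(),
    weight_fn=lambda x: np.full(len(x), 1.0 / len(x)),   # uniform weights
    cnn_fn=lambda v: "happy" if v.mean() > 1.0 else "neutral",
)
print(label)  # → happy
```

The point of the sketch is the shape of the pipeline: because the weighting happens before the summation, the comprehensive vector can encode which frames of the sequence matter jointly, which is the time-domain information a plain per-frame CNN would discard.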
In a second aspect, an embodiment of the present application provides a video micro-expression recognition apparatus, which includes:
a micro-expression image frame acquisition unit, configured to, if user video data corresponding to a user terminal is received, acquire the image frames containing micro-expressions in the video image sequence of the user video data;
a micro-expression sequence acquisition unit, configured to acquire, according to a preset empirical frame count, a number of consecutive frames equal to the empirical frame count from the image frames containing micro-expressions, to form a micro-expression sequence;
a weight feature vector acquisition unit, configured to call a pre-built weight calculation layer to calculate the weight feature vector of each frame of image in the micro-expression sequence, to obtain a weight-combined image feature vector for each frame;
a comprehensive image feature vector acquisition unit, configured to sum the weight-combined image feature vectors of all the frames to obtain a comprehensive image feature vector corresponding to the user video data;
a micro-expression recognition unit, configured to input the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
an item process information acquisition unit, configured to call a pre-stored item-handling micro-expression strategy, obtain the item processing flow information corresponding to the micro-expression recognition result, and send the item processing flow information to the user terminal, wherein the item-handling micro-expression strategy stores several pieces of item processing flow information, each corresponding to one micro-expression recognition result.
In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the following steps when executing the computer program:
if user video data corresponding to a user terminal is received, acquiring the image frames containing micro-expressions in the video image sequence of the user video data;
according to a preset empirical frame count, acquiring from the image frames containing micro-expressions a number of consecutive frames equal to the empirical frame count, to form a micro-expression sequence;
calling a pre-built weight calculation layer to calculate the weight feature vector of each frame of image in the micro-expression sequence, to obtain a weight-combined image feature vector for each frame;
summing the weight-combined image feature vectors of all the frames to obtain a comprehensive image feature vector corresponding to the user video data;
inputting the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
calling a pre-stored item-handling micro-expression strategy, obtaining the item processing flow information corresponding to the micro-expression recognition result, and sending the item processing flow information to the user terminal, wherein the item-handling micro-expression strategy stores several pieces of item processing flow information, each corresponding to one micro-expression recognition result.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following operations:
if user video data corresponding to a user terminal is received, acquiring the image frames containing micro-expressions in the video image sequence of the user video data;
according to a preset empirical frame count, acquiring from the image frames containing micro-expressions a number of consecutive frames equal to the empirical frame count, to form a micro-expression sequence;
calling a pre-built weight calculation layer to calculate the weight feature vector of each frame of image in the micro-expression sequence, to obtain a weight-combined image feature vector for each frame;
summing the weight-combined image feature vectors of all the frames to obtain a comprehensive image feature vector corresponding to the user video data;
inputting the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
calling a pre-stored item-handling micro-expression strategy, obtaining the item processing flow information corresponding to the micro-expression recognition result, and sending the item processing flow information to the user terminal, wherein the item-handling micro-expression strategy stores several pieces of item processing flow information, each corresponding to one micro-expression recognition result.
The embodiments of the present application provide a video micro-expression recognition method, apparatus, computer device, and storage medium. When a neural network is used to classify micro-expressions, the method fully considers the temporal relationship of the micro-expression across multiple consecutive image frames and learns the time-domain information of the micro-expression in the video image sequence, so that more accurate micro-expression recognition results can be provided.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the video micro-expression recognition method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of the video micro-expression recognition method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a sub-flow of the video micro-expression recognition method provided by an embodiment of the present application;
FIG. 4 is a schematic block diagram of the video micro-expression recognition apparatus provided by an embodiment of the present application;
FIG. 5 is a schematic block diagram of subunits of the video micro-expression recognition apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of the computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "include" and "comprise" indicate the presence of the described features, wholes, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terms used in this specification are only for the purpose of describing specific embodiments and are not intended to limit the application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should further be understood that the term "and/or" used in this specification and the appended claims refers to, and includes, any and all possible combinations of one or more of the associated listed items.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of an application scenario of the video micro-expression recognition method provided by an embodiment of this application, and FIG. 2 is a schematic flowchart of that method. The method is applied to a server and is executed by application software installed in the server.
As shown in FIG. 2, the method includes steps S110 to S160.
S110: if user video data corresponding to the user terminal is received, acquiring the image frames containing micro-expressions in the video image sequence of the user video data.
In this embodiment, after the user terminal establishes a connection with the server and the user views the user interaction interface provided by the server on the user terminal, the user handles an item according to the item flow corresponding to the item selected on that interface. During the handling of the item, the user terminal starts its camera to collect user video data and uploads it to the server. After receiving the user video data sent by the user terminal, the server acquires the image frames containing micro-expressions for subsequent micro-expression recognition.
In an embodiment, step S110 includes:
acquiring the image frames containing micro-expressions in the video image sequence of the user video data by an optical flow method; or acquiring the image frames containing micro-expressions in the video image sequence of the user video data by a spatio-temporal local texture operator.
In this embodiment, any suitable feature extraction method can be selected to extract the image frames containing micro-expressions from the video image sequence, for example, feature extraction based on optical flow or on the LBP-TOP operator:
The optical flow algorithm estimates the optical flow in the video image sequence under certain constraints so as to identify the subtle movements of the customer's face and thereby extract micro-expression features. The LBP-TOP operator (spatio-temporal local texture) is developed from the local binary pattern (LBP operator) and reflects the spatial distribution of pixels in the video image sequence. Simply put, it adds a temporal dimension to the LBP operator, so that the variation of each pixel over time in the video image sequence can be extracted and the subtle changes in the customer's facial expression can be identified.
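As a rough illustration of the LBP-TOP idea described above, the sketch below computes basic 8-neighbour LBP codes on the three orthogonal central planes (XY, XT, YT) of a small frame volume and concatenates their histograms. This is a simplified stand-in (full LBP-TOP aggregates codes over all positions in each plane, not only the central slices); the function names and dimensions are assumptions for illustration.

```python
import numpy as np

def lbp_plane(plane):
    """Basic 8-neighbour LBP codes for one 2-D plane (interior pixels only)."""
    c = plane[1:-1, 1:-1]
    # neighbours in a fixed clockwise order, each compared with the centre pixel
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = plane[1 + dy:plane.shape[0] - 1 + dy, 1 + dx:plane.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def lbp_top_histogram(volume):
    """Concatenate LBP histograms from the XY, XT and YT planes of a T×H×W volume."""
    t, h, w = volume.shape
    planes = [volume[t // 2], volume[:, h // 2, :], volume[:, :, w // 2]]
    hists = [np.bincount(lbp_plane(p).ravel(), minlength=256) for p in planes]
    return np.concatenate(hists)  # 3 × 256 = 768-dim descriptor

rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(9, 32, 32))  # 9 frames of a 32×32 face crop
feat = lbp_top_histogram(clip)
print(feat.shape)  # → (768,)
```

The XT and YT planes are what add the temporal dimension: their codes change only when a pixel's intensity varies over time, which is how the operator picks up subtle facial motion.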
In an embodiment, acquiring the image frames containing micro-expressions in the video image sequence of the user video data by the optical flow method includes:
acquiring the velocity vector feature corresponding to each pixel of the video image sequence of the user video data;
if the velocity vector features of at least one frame in the video image sequence do not keep changing continuously, composing the image frames containing micro-expressions from the corresponding pictures.
In this embodiment, when the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina, and this series of continuously changing information constantly "flows through" the retina (that is, the image plane) like a "flow" of light, hence the name optical flow. Optical flow expresses the change of the image and contains the information of the target's movement, so it can be used to determine the target's motion. Optical flow has three elements: first, a motion velocity field, which is a necessary condition for forming optical flow; second, parts carrying optical features, such as gray-scale pixels, which can carry motion information; third, the imaging projection from the scene to the image plane, which allows the flow to be observed.
Optical flow is defined on a per-point basis. Specifically, let (u, v) be the optical flow of the image point (x, y); then (x, y, u, v) is called an optical flow point. The set of all optical flow points is called the optical flow field. When an object with optical properties moves in three-dimensional space, a corresponding image motion field, also called the image velocity field, is formed on the image plane. In the ideal case, the optical flow field corresponds to the motion field.
A velocity vector is assigned to each pixel in the image, forming a motion vector field. According to the velocity vector feature of each pixel, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vectors change continuously over the whole image region. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vector formed by the moving object necessarily differs from that of the background, so the position of the moving object can be calculated. Preprocessing by the optical flow method thus yields the image frames containing micro-expressions in the video image sequence of the user video data.
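A minimal numpy sketch of the idea that a moving facial region disrupts an otherwise continuously varying flow field: per-frame motion energy is compared against the sequence baseline, and frames whose energy jumps abruptly are flagged as candidate micro-expression frames. Plain frame differencing stands in here for a full optical-flow estimate, and the z-score threshold, function name, and toy data are illustrative assumptions rather than the patent's method.

```python
import numpy as np

def flag_motion_frames(frames, z_thresh=2.0):
    """Flag frames whose motion energy deviates sharply from the sequence baseline.

    `frames` is a T×H×W float array. The per-transition mean absolute frame
    difference approximates the magnitude of the per-pixel velocity field; a
    transition whose energy is far above the baseline marks a candidate frame.
    """
    diffs = np.abs(np.diff(frames, axis=0))   # (T-1, H, W) inter-frame change
    energy = diffs.mean(axis=(1, 2))          # motion energy per transition
    mu, sigma = energy.mean(), energy.std() + 1e-8
    flagged = np.where((energy - mu) / sigma > z_thresh)[0] + 1
    return flagged  # indices of frames following an abrupt motion change

# toy sequence: a static face, then a small local "movement" starting at frame 5
frames = np.zeros((10, 16, 16))
frames[5:, 4:8, 4:8] = 1.0
print(flag_motion_frames(frames))  # → [5]
```

In a real pipeline the flagged indices would select which frames are passed on to step S120; everything before and after the flagged region is treated as expressionless background.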
S120: According to a preset empirical frame value, obtain, from the image frames containing micro-expressions, consecutive frames whose number equals the empirical frame value, to form a micro-expression sequence.
In this embodiment, the empirical frame value is denoted N. N is an empirical value that can be set by a technician according to actual needs; that is, it guarantees that the N frames record the complete course of a micro-expression from onset through apex to offset.
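The selection of N consecutive frames in S120 can be sketched as follows. It is assumed that the preprocessing step identifies micro-expression frames by their indices; the function name is hypothetical:

```python
def take_expression_sequence(flagged_indices, n):
    """Return the first run of n consecutive flagged frame indices.

    `flagged_indices` are indices of frames found to contain a
    micro-expression; the empirical value n should span a full
    onset-apex-offset cycle. Returns None if no run is long enough.
    """
    run = []
    for idx in flagged_indices:
        if run and idx != run[-1] + 1:
            run = []                 # consecutiveness broken, restart
        run.append(idx)
        if len(run) == n:
            return run
    return None

print(take_expression_sequence([3, 4, 7, 8, 9, 10], 3))  # → [7, 8, 9]
```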
S130: Invoke a pre-built weight calculation layer to compute the weight feature vector of each frame in the micro-expression sequence, so as to obtain, for each frame, an image feature vector combined with its weight value.
In this embodiment, differences in weight values can represent the relations between the image frames in the micro-expression sequence (that is, the temporal information of the micro-expression image sequence). For example, in a smiling micro-expression sequence, certain image frames always appear together; increasing the weight of these jointly appearing frames captures the temporal information of the sequence.
To assign a weight value to each frame in the micro-expression sequence, the pre-built weight calculation layer is invoked to compute each frame's weight feature vector.
In an embodiment, as shown in FIG. 3, step S130 includes:
S131: Obtain the picture feature vector corresponding to each frame in the micro-expression sequence, and the picture feature vector set corresponding to each frame; the picture feature vector set corresponding to the i-th frame consists of the picture feature vectors of all frames in the sequence other than the i-th frame, where i ranges over [1, N] and N is the empirical frame value.
S132: Obtain the similarity values between the picture feature vector of each frame and the picture feature vectors of the other frames, to obtain a similarity value set for each frame; the similarity values between the i-th frame's picture feature vector and those of the other frames constitute the similarity value set of the i-th frame.
S133: Normalize the similarity value set of each frame in the micro-expression sequence, to obtain a normalized similarity value set for each frame.
S134: According to each frame's normalized similarity value set and picture feature vector set, obtain the weight feature vector of each frame, so as to obtain each frame's image feature vector combined with its weight value.
In this embodiment, since the frames in the micro-expression sequence initially carry no weight values, the weight value of each frame can be obtained as follows:
1) Obtain the picture feature vector of each frame in the micro-expression sequence; specifically, each frame may be input into a trained convolutional neural network to obtain its picture feature vector. Then obtain the picture feature vector set of each frame, where the set for the i-th frame consists of the picture feature vectors of all frames other than the i-th, i ranges over [1, N], and N is the empirical frame value.
2) Denote the i-th of the N frames as N_i. Input one frame's picture feature vector into the weight calculation layer and compute its similarity to the picture feature vectors of the remaining N-1 frames, thereby obtaining the similarity value set of each frame; the similarity values between the i-th frame's picture feature vector and those of the other frames constitute the i-th frame's similarity value set. The similarity may be evaluated in any suitable way, for example by the vector dot product or the cosine similarity between two frames' feature vectors, or by introducing a new neural network to compute it.
3) Normalize each computed similarity value set, obtaining a normalized similarity value set for each frame.
4) Since each frame corresponds to one normalized similarity value set, multiply each normalized similarity value in the set by the picture feature vector of the corresponding frame and sum the products, yielding each frame's image feature vector combined with its weight value.
Through the above weight calculation layer, the intrinsic relations between different image frames in the micro-expression image sequence can be mined; that is, some closely related frames receive weight values significantly higher than the others and thus receive more attention in the micro-expression recognition process.
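Steps 1) to 4) above can be sketched in numpy. The dot product is used here as the similarity measure and softmax as the normalization; both are assumptions, since the source leaves the concrete choices open (cosine similarity or a learned network would also satisfy step 2)):

```python
import numpy as np

def weighted_frame_features(feats):
    """Combine each frame's feature vector with weights over the others.

    `feats` is (N, D): one picture feature vector per frame (e.g. from a
    CNN). For frame i, the similarity to every other frame j is the dot
    product feats[i]·feats[j]; the N-1 similarities are softmax-normalised
    and used to weight the other frames' vectors, giving frame i's
    weight-combined image feature vector (step 4).
    """
    feats = np.asarray(feats, dtype=np.float64)
    n = feats.shape[0]
    out = np.empty_like(feats)
    for i in range(n):
        others = np.delete(feats, i, axis=0)     # the N-1 other frames
        sims = others @ feats[i]                 # dot-product similarity
        sims = np.exp(sims - sims.max())
        weights = sims / sims.sum()              # normalised similarity set
        out[i] = weights @ others                # weighted sum of vectors
    return out

feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(weighted_frame_features(feats).shape)      # → (3, 2)
```

Because the weights of each frame sum to one, a frame that is highly similar to the rest of the sequence dominates the combined vectors, which is exactly the "jointly appearing frames get more attention" behaviour described above.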
In an embodiment, step S134 includes:
multiplying each normalized similarity value in the i-th frame's normalized similarity value set by the corresponding picture feature vector in the i-th frame's picture feature vector set and summing the products, to obtain the i-th frame's weight feature vector and thereby the i-th frame's image feature vector combined with its weight value.
In this embodiment, the image feature vector combined with the weight value obtained for the i-th frame in this way fully accounts for the intrinsic relations between different frames.
S140: Sum the image feature vectors combined with weight values of all frames, to obtain the comprehensive image feature vector corresponding to the user video data.
In this embodiment, after the image feature vector combined with the weight value has been obtained for each frame, in order to jointly consider the micro-expression recognition results of all these frames, the per-frame vectors are summed to obtain the comprehensive image feature vector of the user video data; this comprehensive image feature vector then serves as the recognition vector for micro-expression recognition.
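The summation of step S140 is a single reduction over the frame axis; the numeric values below are hypothetical per-frame weight-combined vectors:

```python
import numpy as np

# Hypothetical weight-combined per-frame image feature vectors (one row
# per frame). Summing over the frame axis gives the single comprehensive
# image feature vector that represents the whole user-video clip.
frame_vectors = np.array([[0.2, 0.5],
                          [0.1, 0.4],
                          [0.3, 0.1]])
clip_vector = frame_vectors.sum(axis=0)   # comprehensive image feature vector
print(clip_vector)
```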
S150: Input the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result.
In this embodiment, once obtained, the comprehensive image feature vector represents the N frames of the video image sequence; inputting it into the convolutional neural network used by the weight calculation layer yields the micro-expression recognition result.
In an embodiment, step S150 includes:
inputting the comprehensive image feature vector into the softmax layer of the pre-trained convolutional neural network to obtain the micro-expression recognition result.
In this embodiment, since the convolutional, pooling, and fully connected layers of the convolutional neural network used by the weight calculation layer have already produced the picture feature vectors, once the comprehensive image feature vector is obtained it can be input into the network's softmax layer to obtain the final micro-expression recognition result. Specifically, the probability that the micro-expression belongs to each category is obtained, and the category with the highest probability is selected as the recognition result for the micro-expression sequence.
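A minimal sketch of the softmax classification described above; the trained parameters `w` and `b` and the class labels are hypothetical placeholders:

```python
import numpy as np

def classify(clip_vector, w, b, labels):
    """Softmax layer: per-class probability, argmax as the result.

    w is (C, D) and b is (C,) — stand-ins for the trained softmax-layer
    parameters; labels names the C expression categories.
    """
    logits = w @ clip_vector + b
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()                # probability of each category
    return labels[int(np.argmax(probs))], probs

labels = ["happy", "disgust", "neutral"]
w = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
b = np.zeros(3)
label, probs = classify(np.array([1.5, 0.2]), w, b, labels)
print(label)    # → happy
```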
S160: Invoke a pre-stored item-handling micro-expression strategy, obtain the item-handling process information corresponding to the micro-expression recognition result, and send the item-handling process information to the user terminal; the item-handling micro-expression strategy stores several pieces of item-handling process information, each corresponding to one micro-expression recognition result.
In this embodiment, after the micro-expression recognition result of the user video data has been obtained, in order to trigger the subsequent item-handling process in a targeted manner, the pre-stored item-handling micro-expression strategy is invoked.
For example, the item-handling micro-expression strategy may set the following three strategies:
A) When the micro-expression recognition result is happy, obtain the first item-handling process corresponding to the happy label in the strategy (for example, the first process keeps the current self-service flow, with no manual intervention step inserted);
B) When the micro-expression recognition result is disgust, obtain the second item-handling process corresponding to the disgust label in the strategy (for example, the second process interrupts the current self-service flow and inserts a manual intervention step to verify whether the user is dissatisfied with the handling of the item);
C) When the micro-expression recognition result is anything other than happy or disgust, obtain the third item-handling process corresponding to the other labels in the strategy (for example, the third process keeps the current self-service flow, with the earlier steps requiring no manual intervention and only the last three steps requiring a manual intervention step).
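The strategy lookup of S160 can be sketched as a simple mapping from recognition labels to item-handling flows; the flow descriptions below are hypothetical condensations of the example strategies A), B) and C):

```python
# Stand-in for the pre-stored item-handling micro-expression strategy:
# each recognised label maps to one piece of item-handling process info.
FLOWS = {
    "happy":   "flow-1: keep the self-service process, no manual step",
    "disgust": "flow-2: interrupt the process, insert a manual check",
}
# Any other recognition result falls through to the third process.
DEFAULT_FLOW = "flow-3: self-service, manual check for the last 3 steps"

def pick_flow(recognised_label):
    """Return the item-handling flow for a micro-expression result."""
    return FLOWS.get(recognised_label, DEFAULT_FLOW)

print(pick_flow("disgust"))   # flow-2 ...
print(pick_flow("surprise"))  # falls back to flow-3 ...
```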
This method ensures that, when a neural network classifies micro-expressions, the temporal relationship of the micro-expression across multiple consecutive image frames is fully considered and the temporal information of the micro-expression in the image/video sequence is learned, so that more accurate micro-expression recognition results can be provided.
An embodiment of the present application further provides a video micro-expression recognition apparatus for executing any embodiment of the aforementioned video micro-expression recognition method. Specifically, referring to FIG. 4, FIG. 4 is a schematic block diagram of the video micro-expression recognition apparatus provided by an embodiment of the present application. The video micro-expression recognition apparatus 100 may be configured in a server.
As shown in FIG. 4, the video micro-expression recognition apparatus 100 includes: a micro-expression image frame acquisition unit 110, a micro-expression sequence acquisition unit 120, a weight value feature vector acquisition unit 130, a comprehensive image feature vector acquisition unit 140, a micro-expression recognition unit 150, and an item process information acquisition unit 160.
The micro-expression image frame acquisition unit 110 is configured to, if user video data corresponding to a user terminal is received, obtain the image frames containing micro-expressions in the video image sequence of the user video data.
In this embodiment, after the user terminal establishes a connection with the server and the user views the user interaction interface provided by the server, the user handles an item according to the item process corresponding to the item selected on the interface. During item handling, the user terminal starts its camera to collect user video data and uploads it to the server. After receiving the user video data sent by the user terminal, the server obtains the image frames containing micro-expressions for subsequent micro-expression recognition.
In an embodiment, the micro-expression image frame acquisition unit 110 is further configured to:
obtain the image frames containing micro-expressions in the video image sequence of the user video data by the optical flow method; or obtain the image frames containing micro-expressions in the video image sequence of the user video data by a spatio-temporal local texture operator.
In this embodiment, any suitable feature extraction method may be selected to extract the micro-expression image frames from the video image sequence, for example feature extraction based on optical flow or on the LBP-TOP operator:
The optical flow algorithm estimates, under certain constraints, the optical flow in the video image sequence so as to recognize subtle movements of the customer's face, realizing feature extraction of the micro-expression. The LBP-TOP operator (spatio-temporal local texture), developed from the local binary pattern (LBP) operator, reflects the spatial distribution of pixels in the video image sequence. In short, it adds a temporal dimension to the LBP operator, so that the variation of each pixel over time can be extracted from the video image sequence, thereby recognizing subtle changes in the customer's facial expression.
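The LBP operator that LBP-TOP builds on can be sketched as follows. This is a simplified single-plane version: LBP-TOP would apply the same operator on the XY, XT and YT planes of the frame volume and concatenate the three histograms. The neighbour ordering chosen here is one common convention, an assumption rather than something the source specifies:

```python
import numpy as np

def lbp_codes(img):
    """8-neighbour local binary pattern codes for the interior pixels.

    Each interior pixel is compared with its 8 neighbours; a neighbour
    that is >= the centre contributes a 1-bit at its position in a fixed
    clockwise ordering. LBP-TOP applies this same operator on the XY, XT
    and YT planes of the video volume; only one plane is sketched here.
    """
    img = np.asarray(img, dtype=np.float64)
    c = img[1:-1, 1:-1]                      # interior (centre) pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(c.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        codes |= (nb >= c).astype(np.uint8) << np.uint8(bit)
    return codes

img = np.array([[9, 9, 9],
                [0, 5, 9],
                [0, 0, 0]])
print(lbp_codes(img))   # code for the single interior pixel
```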
In an embodiment, the micro-expression image frame acquisition unit 110 includes:
a velocity vector feature acquisition unit, configured to acquire the velocity vector feature corresponding to each pixel of the video image sequence of the user video data; and
a target image frame acquisition unit, configured to, if the velocity vector features of at least one frame in the video image sequence do not remain continuously varying, form the image frames containing micro-expressions from the corresponding pictures.
In this embodiment, when the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina. This continuously changing information keeps "flowing" across the retina (that is, the image plane) like a "flow" of light, hence the name optical flow. Optical flow expresses the change of an image and carries information about the motion of a target, so it can be used to determine that motion. Optical flow has three elements: first, a motion velocity field, which is a necessary condition for forming optical flow; second, parts with optical characteristics, such as gray-scale pixels, which can carry motion information; and third, an imaging projection from the scene onto the image plane, so that the motion can be observed.
Optical flow is defined on a per-point basis: let (u, v) be the optical flow of the image point (x, y); then (x, y, u, v) is called an optical flow point. The set of all optical flow points is called the optical flow field. When an object with optical properties moves in three-dimensional space, a corresponding image motion field (also called the image velocity field) is formed on the image plane. In the ideal case, the optical flow field corresponds to the motion field.
A velocity vector is assigned to each pixel in the image, forming a motion vector field. The image can then be analyzed dynamically according to the velocity vector characteristics of each pixel. If there is no moving target in the image, the optical flow vector varies continuously over the whole image region. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background, and the velocity vectors formed by the moving object necessarily differ from those of the background, so the position of the moving object can be computed. Preprocessing with the optical flow method thus yields the image frames containing micro-expressions in the video image sequence of the user video data.
The micro-expression sequence acquisition unit 120 is configured to obtain, according to a preset empirical frame value, consecutive frames whose number equals the empirical frame value from the image frames containing micro-expressions, to form a micro-expression sequence.
In this embodiment, the empirical frame value is denoted N. N is an empirical value that can be set by a technician according to actual needs; that is, it guarantees that the N frames record the complete course of a micro-expression from onset through apex to offset.
The weight value feature vector acquisition unit 130 is configured to invoke a pre-built weight calculation layer to compute the weight feature vector of each frame in the micro-expression sequence, so as to obtain each frame's image feature vector combined with its weight value.
In this embodiment, differences in weight values can represent the relations between the image frames in the micro-expression sequence (that is, the temporal information of the micro-expression image sequence). For example, in a smiling micro-expression sequence, certain image frames always appear together; increasing the weight of these jointly appearing frames captures the temporal information of the sequence.
To assign a weight value to each frame in the micro-expression sequence, the pre-built weight calculation layer is invoked to compute each frame's weight feature vector.
In an embodiment, as shown in FIG. 5, the weight value feature vector acquisition unit 130 includes:
a picture feature vector acquisition unit 131, configured to obtain the picture feature vector corresponding to each frame in the micro-expression sequence, and the picture feature vector set corresponding to each frame; the set corresponding to the i-th frame consists of the picture feature vectors of all frames other than the i-th, where i ranges over [1, N] and N is the empirical frame value;
a similarity value acquisition unit 132, configured to obtain the similarity values between each frame's picture feature vector and those of the other frames, to obtain a similarity value set for each frame; the similarity values between the i-th frame's picture feature vector and those of the other frames constitute the i-th frame's similarity value set;
a normalization unit 133, configured to normalize the similarity value set of each frame in the micro-expression sequence, to obtain a normalized similarity value set for each frame; and
a weight feature vector acquisition unit 134, configured to obtain, according to each frame's normalized similarity value set and picture feature vector set, the weight feature vector of each frame, so as to obtain each frame's image feature vector combined with its weight value.
In this embodiment, since the frames in the micro-expression sequence initially carry no weight values, the weight value of each frame can be obtained as follows:
1) Obtain the picture feature vector of each frame in the micro-expression sequence; specifically, each frame may be input into a trained convolutional neural network to obtain its picture feature vector. Then obtain the picture feature vector set of each frame, where the set for the i-th frame consists of the picture feature vectors of all frames other than the i-th, i ranges over [1, N], and N is the empirical frame value.
2) Denote the i-th of the N frames as N_i. Input one frame's picture feature vector into the weight calculation layer and compute its similarity to the picture feature vectors of the remaining N-1 frames, thereby obtaining the similarity value set of each frame; the similarity values between the i-th frame's picture feature vector and those of the other frames constitute the i-th frame's similarity value set. The similarity may be evaluated in any suitable way, for example by the vector dot product or the cosine similarity between two frames' feature vectors, or by introducing a new neural network to compute it.
3) Normalize each computed similarity value set, obtaining a normalized similarity value set for each frame.
4) Since each frame corresponds to one normalized similarity value set, multiply each normalized similarity value in the set by the picture feature vector of the corresponding frame and sum the products, yielding each frame's image feature vector combined with its weight value.
Through the above weight calculation layer, the intrinsic relations between different image frames in the micro-expression image sequence can be mined; that is, some closely related frames receive weight values significantly higher than the others and thus receive more attention in the micro-expression recognition process.
In an embodiment, the weight feature vector acquisition unit 134 is further configured to:
multiply each normalized similarity value in the i-th frame's normalized similarity value set by the corresponding picture feature vector in the i-th frame's picture feature vector set and sum the products, to obtain the i-th frame's weight feature vector and thereby the i-th frame's image feature vector combined with its weight value.
In this embodiment, the image feature vector combined with the weight value obtained for the i-th frame in this way fully accounts for the intrinsic relations between different frames.
The comprehensive image feature vector acquisition unit 140 is configured to sum the image feature vectors combined with weight values of all frames, to obtain the comprehensive image feature vector corresponding to the user video data.
In this embodiment, after the image feature vector combined with the weight value has been obtained for each frame, in order to jointly consider the micro-expression recognition results of all these frames, the per-frame vectors are summed to obtain the comprehensive image feature vector of the user video data; this comprehensive image feature vector then serves as the recognition vector for micro-expression recognition.
The micro-expression recognition unit 150 is configured to input the comprehensive image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result.
In this embodiment, once obtained, the comprehensive image feature vector represents the N frames of the video image sequence; inputting it into the convolutional neural network used by the weight calculation layer yields the micro-expression recognition result.
In an embodiment, the micro-expression recognition unit 150 is further configured to:
input the comprehensive image feature vector into the softmax layer of the pre-trained convolutional neural network to obtain the micro-expression recognition result.
In this embodiment, since the convolutional, pooling, and fully connected layers of the convolutional neural network used by the weight calculation layer have already produced the picture feature vectors, once the comprehensive image feature vector is obtained it can be input into the network's softmax layer to obtain the final micro-expression recognition result. Specifically, the probability that the micro-expression belongs to each category is obtained, and the category with the highest probability is selected as the recognition result for the micro-expression sequence.
The item flow information acquiring unit 160 is configured to invoke a pre-stored item-handling micro-expression policy, acquire the item-handling flow information corresponding to the micro-expression recognition result, and send the item-handling flow information to the user terminal; the item-handling micro-expression policy stores several pieces of item-handling flow information, each corresponding to one micro-expression recognition result.
In this embodiment, after the micro-expression recognition result for the user video data has been obtained, the pre-stored item-handling micro-expression policy is invoked in order to trigger the subsequent item-handling flow in a targeted manner.
For example, the item-handling micro-expression policy may contain the following three rules:
A) when the micro-expression recognition result is happy, acquire the first item-handling flow corresponding to the happy tag in the policy (for example, the first flow keeps the current self-service process unchanged, with no manual intervention step inserted);
B) when the micro-expression recognition result is disgust, acquire the second item-handling flow corresponding to the disgust tag in the policy (for example, the second flow interrupts the current self-service process and inserts a manual intervention step, so as to verify whether the user is dissatisfied with the handling of the item);
C) when the micro-expression recognition result is anything other than happy or disgust, acquire the third item-handling flow corresponding to the other tags in the policy (for example, the third flow keeps the current self-service process, with no manual intervention for the earlier steps and a manual intervention step inserted only for the last three steps).
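The policy lookup in rules A) to C) above amounts to a table with a default fallback; a minimal sketch follows, in which the dictionary keys and flow descriptors are illustrative assumptions, not part of the embodiment:

```python
# Hypothetical policy table: each recognition result maps to a flow descriptor.
ITEM_POLICY = {
    "happy":   {"flow": "first",  "manual_intervention": "none"},
    "disgust": {"flow": "second", "manual_intervention": "immediate"},
}
# Rule C): every other recognition result falls back to the third flow.
DEFAULT_FLOW = {"flow": "third", "manual_intervention": "last_3_steps"}

def select_flow(recognition_result: str) -> dict:
    """Return the item-handling flow for a recognition result,
    falling back to the default flow for all other expressions."""
    return ITEM_POLICY.get(recognition_result, DEFAULT_FLOW)

print(select_flow("happy")["flow"])     # first
print(select_flow("disgust")["flow"])   # second
print(select_flow("surprise")["flow"])  # third
```

The selected descriptor is what would be sent to the user terminal as the item-handling flow information.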
When classifying micro-expressions with a neural network, the apparatus thus fully takes into account the temporal relationship of a micro-expression across multiple consecutive image frames and learns the time-domain information of the micro-expression within the video image sequence, and can therefore provide a more accurate micro-expression recognition result.
The above video micro-expression recognition apparatus may be implemented in the form of a computer program, and the computer program may be run on a computer device as shown in FIG. 6.
Referring to FIG. 6, FIG. 6 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 is a server, which may be an independent server or a server cluster composed of multiple servers.
Referring to FIG. 6, the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. When the computer program 5032 is executed, it may cause the processor 502 to perform the video micro-expression recognition method.
The processor 502 is configured to provide the computing and control capabilities that support the operation of the entire computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, the processor 502 may perform the video micro-expression recognition method.
The network interface 505 is used for network communication, such as transmitting data information. Those skilled in the art will understand that the structure shown in FIG. 6 is merely a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device 500 to which the solution is applied; a specific computer device 500 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the video micro-expression recognition method disclosed in the embodiments of the present application.
Those skilled in the art will understand that the embodiment of the computer device shown in FIG. 6 does not limit the specific configuration of the computer device; in other embodiments the computer device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, in some embodiments the computer device may include only a memory and a processor; in such embodiments the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 6 and are not repeated here.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or any conventional processor.
In another embodiment of the present application, a computer-readable storage medium is provided. The computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium. It stores a computer program which, when executed by a processor, implements the video micro-expression recognition method disclosed in the embodiments of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the devices, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here. A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed devices, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a logical functional division, and there may be other divisions in actual implementation, or units with the same function may be combined into one unit; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or other forms of connection.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a storage medium. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements shall all fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. A video micro-expression recognition method, comprising:
    if user video data corresponding to a user terminal is received, acquiring image frames containing micro-expressions in a video image sequence of the user video data;
    according to a preset empirical frame-count value, acquiring, from the image frames containing micro-expressions, a number of consecutive frames equal to the empirical frame-count value to form a micro-expression sequence;
    invoking a pre-built weight calculation layer to calculate a weight feature vector for each frame of the micro-expression sequence, so as to obtain a weighted image feature vector for each frame;
    summing the weighted image feature vectors of all frames to obtain an integrated image feature vector corresponding to the user video data;
    inputting the integrated image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
    invoking a pre-stored item-handling micro-expression policy, acquiring item-handling flow information corresponding to the micro-expression recognition result, and sending the item-handling flow information to the user terminal, wherein the item-handling micro-expression policy stores several pieces of item-handling flow information, each corresponding to one micro-expression recognition result.
  2. The video micro-expression recognition method according to claim 1, wherein acquiring the image frames containing micro-expressions in the video image sequence of the user video data comprises:
    acquiring the image frames containing micro-expressions in the video image sequence of the user video data by an optical flow method; or acquiring the image frames containing micro-expressions in the video image sequence of the user video data by a spatio-temporal local texture operator.
  3. The video micro-expression recognition method according to claim 2, wherein acquiring the image frames containing micro-expressions in the video image sequence of the user video data by the optical flow method comprises:
    acquiring a velocity vector feature corresponding to each pixel of the video image sequence of the user video data;
    if the velocity vector features of at least one frame in the video image sequence do not keep changing continuously, forming the image frames containing micro-expressions from the corresponding pictures.
  4. The video micro-expression recognition method according to claim 1, wherein invoking the pre-built weight calculation layer to calculate the weight feature vector for each frame of the micro-expression sequence, so as to obtain the weighted image feature vector of each frame, comprises:
    acquiring a picture feature vector corresponding to each frame of the micro-expression sequence and a picture feature vector set corresponding to each frame, wherein the picture feature vector set corresponding to the i-th frame of the micro-expression sequence consists of the picture feature vectors corresponding to the frames of the micro-expression sequence other than the i-th frame, i ranges over [1, N], and N is the empirical frame-count value;
    acquiring similarity values between the picture feature vector of each frame of the micro-expression sequence and the picture feature vectors of the other frames, so as to obtain a similarity value set corresponding to each frame, wherein the similarity values between the picture feature vector of the i-th frame and the picture feature vectors of the other frames form the similarity value set of the i-th frame;
    normalizing the similarity value set corresponding to each frame of the micro-expression sequence to obtain a normalized similarity value set corresponding to each frame;
    acquiring, according to the normalized similarity value set and the picture feature vector set corresponding to each frame, the weight feature vector corresponding to each frame, so as to obtain the weighted image feature vector of each frame.
  5. The video micro-expression recognition method according to claim 4, wherein acquiring, according to the normalized similarity value set and the picture feature vector set corresponding to each frame, the weight feature vector corresponding to each frame, so as to obtain the weighted image feature vector of each frame, comprises:
    multiplying each normalized similarity value in the normalized similarity value set of the i-th frame by the corresponding picture feature vector in the picture feature vector set of the i-th frame and summing the products, to obtain the weight feature vector corresponding to the i-th frame and thereby the weighted image feature vector of the i-th frame.
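The weight computation recited in claims 4 and 5 resembles a self-attention step over the frame features; the following is a minimal sketch under the assumptions of cosine similarity and softmax normalization, since the claims fix neither the similarity measure nor the normalization:

```python
import numpy as np

def weighted_frame_features(features):
    """For each frame i: compute the similarity of its picture feature
    vector to every other frame's, normalize those similarities, and take
    the similarity-weighted sum of the other frames' vectors as frame i's
    weighted image feature vector (cosine similarity and softmax are
    assumptions, not choices made by the claims)."""
    feats = np.asarray(features, dtype=float)
    n = len(feats)
    out = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        sims = np.array([
            feats[i] @ feats[j]
            / (np.linalg.norm(feats[i]) * np.linalg.norm(feats[j]))
            for j in others
        ])
        w = np.exp(sims - sims.max())
        w /= w.sum()  # normalized similarity value set of frame i
        out.append(sum(wj * feats[j] for wj, j in zip(w, others)))
    return out

# Three toy frame feature vectors of dimension 2 (illustrative values).
vecs = weighted_frame_features([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
print(len(vecs))  # one weighted vector per frame
```

Summing the returned vectors then gives the integrated image feature vector of claim 1.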
  6. The video micro-expression recognition method according to claim 5, wherein inputting the integrated image feature vector into the pre-trained convolutional neural network to obtain the micro-expression recognition result comprises:
    inputting the integrated image feature vector into a softmax layer of the pre-trained convolutional neural network to obtain the micro-expression recognition result.
  7. The video micro-expression recognition method according to claim 4, wherein acquiring the picture feature vector corresponding to each frame of the micro-expression sequence comprises:
    inputting each frame of the micro-expression sequence into a trained convolutional neural network to obtain the picture feature vector corresponding to each frame.
  8. The video micro-expression recognition method according to claim 1, wherein invoking the pre-stored item-handling micro-expression policy, acquiring the item-handling flow information corresponding to the micro-expression recognition result, and sending the item-handling flow information to the user terminal comprises:
    when the micro-expression recognition result is happy, acquiring a first item-handling flow corresponding to a happy tag in the item-handling micro-expression policy;
    when the micro-expression recognition result is disgust, acquiring a second item-handling flow corresponding to a disgust tag in the item-handling micro-expression policy.
  9. A video micro-expression recognition apparatus, comprising:
    a micro-expression image frame acquiring unit, configured to, if user video data corresponding to a user terminal is received, acquire image frames containing micro-expressions in a video image sequence of the user video data;
    a micro-expression sequence acquiring unit, configured to acquire, according to a preset empirical frame-count value, a number of consecutive frames equal to the empirical frame-count value from the image frames containing micro-expressions, to form a micro-expression sequence;
    a weight feature vector acquiring unit, configured to invoke a pre-built weight calculation layer to calculate a weight feature vector for each frame of the micro-expression sequence, so as to obtain a weighted image feature vector for each frame;
    an integrated image feature vector acquiring unit, configured to sum the weighted image feature vectors of all frames to obtain an integrated image feature vector corresponding to the user video data;
    a micro-expression recognition unit, configured to input the integrated image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
    an item flow information acquiring unit, configured to invoke a pre-stored item-handling micro-expression policy, acquire item-handling flow information corresponding to the micro-expression recognition result, and send the item-handling flow information to the user terminal, wherein the item-handling micro-expression policy stores several pieces of item-handling flow information, each corresponding to one micro-expression recognition result.
  10. A computer device, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the following steps:
    if user video data corresponding to a user terminal is received, acquiring image frames containing micro-expressions in a video image sequence of the user video data;
    according to a preset empirical frame-count value, acquiring, from the image frames containing micro-expressions, a number of consecutive frames equal to the empirical frame-count value to form a micro-expression sequence;
    invoking a pre-built weight calculation layer to calculate a weight feature vector for each frame of the micro-expression sequence, so as to obtain a weighted image feature vector for each frame;
    summing the weighted image feature vectors of all frames to obtain an integrated image feature vector corresponding to the user video data;
    inputting the integrated image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
    invoking a pre-stored item-handling micro-expression policy, acquiring item-handling flow information corresponding to the micro-expression recognition result, and sending the item-handling flow information to the user terminal, wherein the item-handling micro-expression policy stores several pieces of item-handling flow information, each corresponding to one micro-expression recognition result.
  11. The computer device according to claim 10, wherein acquiring the image frames containing micro-expressions in the video image sequence of the user video data comprises:
    acquiring the image frames containing micro-expressions in the video image sequence of the user video data by an optical flow method; or acquiring the image frames containing micro-expressions in the video image sequence of the user video data by a spatio-temporal local texture operator.
  12. The computer device according to claim 11, wherein acquiring the image frames containing micro-expressions in the video image sequence of the user video data by the optical flow method comprises:
    acquiring a velocity vector feature corresponding to each pixel of the video image sequence of the user video data;
    if the velocity vector features of at least one frame in the video image sequence do not keep changing continuously, forming the image frames containing micro-expressions from the corresponding pictures.
  13. The computer device according to claim 10, wherein invoking the pre-built weight calculation layer to calculate the weight feature vector for each frame of the micro-expression sequence, so as to obtain the weighted image feature vector of each frame, comprises:
    acquiring a picture feature vector corresponding to each frame of the micro-expression sequence and a picture feature vector set corresponding to each frame, wherein the picture feature vector set corresponding to the i-th frame of the micro-expression sequence consists of the picture feature vectors corresponding to the frames of the micro-expression sequence other than the i-th frame, i ranges over [1, N], and N is the empirical frame-count value;
    acquiring similarity values between the picture feature vector of each frame of the micro-expression sequence and the picture feature vectors of the other frames, so as to obtain a similarity value set corresponding to each frame, wherein the similarity values between the picture feature vector of the i-th frame and the picture feature vectors of the other frames form the similarity value set of the i-th frame;
    normalizing the similarity value set corresponding to each frame of the micro-expression sequence to obtain a normalized similarity value set corresponding to each frame;
    acquiring, according to the normalized similarity value set and the picture feature vector set corresponding to each frame, the weight feature vector corresponding to each frame, so as to obtain the weighted image feature vector of each frame.
  14. The computer device according to claim 13, wherein acquiring, according to the normalized similarity value set and the picture feature vector set corresponding to each frame, the weight feature vector corresponding to each frame, so as to obtain the weighted image feature vector of each frame, comprises:
    multiplying each normalized similarity value in the normalized similarity value set of the i-th frame by the corresponding picture feature vector in the picture feature vector set of the i-th frame and summing the products, to obtain the weight feature vector corresponding to the i-th frame and thereby the weighted image feature vector of the i-th frame.
  15. The computer device according to claim 14, wherein inputting the integrated image feature vector into the pre-trained convolutional neural network to obtain the micro-expression recognition result comprises:
    inputting the integrated image feature vector into a softmax layer of the pre-trained convolutional neural network to obtain the micro-expression recognition result.
  16. The computer device according to claim 13, wherein acquiring the picture feature vector corresponding to each frame of the micro-expression sequence comprises:
    inputting each frame of the micro-expression sequence into a trained convolutional neural network to obtain the picture feature vector corresponding to each frame.
  17. The computer device according to claim 10, wherein said invoking a pre-stored item-processing micro-expression strategy, acquiring the item-processing flow information corresponding to the micro-expression recognition result, and sending the item-processing flow information to the user terminal comprises:
    when the micro-expression recognition result is happy, acquiring the first item-processing flow corresponding to the happy tag in the item-processing micro-expression strategy;
    when the micro-expression recognition result is disgust, acquiring the second item-processing flow corresponding to the disgust tag in the item-processing micro-expression strategy.
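The tag lookup in this claim amounts to a mapping from each recognition label to one stored piece of processing-flow information. The labels and flow names below are illustrative placeholders, not values taken from the patent:

```python
# Hypothetical item-processing micro-expression strategy: each
# micro-expression label maps to one piece of processing-flow information.
ITEM_PROCESSING_STRATEGY = {
    "happy": "first item processing flow",
    "disgust": "second item processing flow",
}

def get_processing_flow(recognition_result):
    # Return the flow stored for the recognized label, or None when the
    # strategy stores no flow for that label.
    return ITEM_PROCESSING_STRATEGY.get(recognition_result)
```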
  18. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following operations:
    if user video data corresponding to a user terminal is received, acquiring the image frames containing micro-expressions in the video image sequence of the user video data;
    according to a preset empirical frame count, acquiring, from the image frames containing micro-expressions, a number of consecutive frames equal to the empirical frame count to form a micro-expression sequence;
    invoking a pre-built weight calculation layer to calculate the weighted feature vector of each frame of image in the micro-expression sequence, so as to obtain the image feature vector of each frame of image combined with its weight value;
    summing the weight-combined image feature vectors of all frames of image to obtain the integrated image feature vector corresponding to the user video data;
    inputting the integrated image feature vector into a pre-trained convolutional neural network to obtain a micro-expression recognition result; and
    invoking a pre-stored item-processing micro-expression strategy, acquiring the item-processing flow information corresponding to the micro-expression recognition result, and sending the item-processing flow information to the user terminal; wherein the item-processing micro-expression strategy stores several pieces of item-processing flow information, each piece of item-processing flow information corresponding to one micro-expression recognition result.
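The sequence-forming step above (taking a number of consecutive micro-expression frames equal to the preset empirical frame count) can be sketched as follows; `select_sequence` is a hypothetical helper operating on the indices of frames already flagged as containing a micro-expression:

```python
def select_sequence(frame_indices, empirical_n):
    """Return the first run of `empirical_n` consecutive frame indices
    from the flagged frames, forming the micro-expression sequence;
    return None when no run of that length exists."""
    run = []
    for idx in frame_indices:
        if run and idx == run[-1] + 1:
            run.append(idx)   # extend the current run of consecutive frames
        else:
            run = [idx]       # a gap breaks the run; start over
        if len(run) == empirical_n:
            return run
    return None

# Example: frames 5, 6, 7 are the first run of 3 consecutive flagged frames.
print(select_sequence([3, 5, 6, 7, 8, 12], 3))
# -> [5, 6, 7]
```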
  19. The computer-readable storage medium according to claim 18, wherein said acquiring the image frames containing micro-expressions in the video image sequence of the user video data comprises:
    acquiring the image frames containing micro-expressions in the video image sequence of the user video data by an optical flow method; or acquiring the image frames containing micro-expressions in the video image sequence of the user video data by a spatio-temporal local texture operator.
  20. The computer-readable storage medium according to claim 19, wherein said acquiring the image frames containing micro-expressions in the video image sequence of the user video data by the optical flow method comprises:
    acquiring the velocity vector feature corresponding to each pixel of the video image sequence of the user video data;
    if the velocity vector feature of at least one frame of image in the video image sequence does not maintain continuous change, forming the image frames containing micro-expressions from the corresponding pictures.
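The discontinuity test in claim 20 can be roughly sketched as follows, assuming a per-frame summary of the optical-flow velocity features (e.g. the mean flow magnitude) has already been computed; the threshold is an illustrative parameter, not a value from the patent:

```python
def detect_micro_expression_frames(flow_magnitudes, threshold):
    # Flag frame i when its optical-flow velocity feature jumps relative
    # to the previous frame, i.e. the change is no longer continuous;
    # those frames are taken to contain a micro-expression.
    flagged = []
    for i in range(1, len(flow_magnitudes)):
        if abs(flow_magnitudes[i] - flow_magnitudes[i - 1]) > threshold:
            flagged.append(i)
    return flagged

# Example: the jump into and out of frame 2 exceeds the threshold.
print(detect_micro_expression_frames([0.1, 0.1, 0.9, 0.1], 0.5))
# -> [2, 3]
```

In practice the per-pixel velocity vectors would come from a dense optical-flow estimator (e.g. OpenCV's `cv2.calcOpticalFlowFarneback`) before being summarized per frame.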
PCT/CN2021/097208 2020-06-23 2021-05-31 Video-based micro-expression recognition method and apparatus, computer device, and storage medium WO2021259005A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010583481.9A CN111738160B (en) 2020-06-23 2020-06-23 Video micro-expression recognition method and device, computer equipment and storage medium
CN202010583481.9 2020-06-23

Publications (1)

Publication Number Publication Date
WO2021259005A1 true WO2021259005A1 (en) 2021-12-30

Family

ID=72650730

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097208 WO2021259005A1 (en) 2020-06-23 2021-05-31 Video-based micro-expression recognition method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111738160B (en)
WO (1) WO2021259005A1 (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738160B (en) * 2020-06-23 2024-03-26 平安科技(深圳)有限公司 Video micro-expression recognition method and device, computer equipment and storage medium
CN112580555B (en) * 2020-12-25 2022-09-30 中国科学技术大学 Spontaneous micro-expression recognition method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8848068B2 (en) * 2012-05-08 2014-09-30 Oulun Yliopisto Automated recognition algorithm for detecting facial expressions
CN110175505A (en) * 2019-04-08 2019-08-27 北京网众共创科技有限公司 Determination method, apparatus, storage medium and the electronic device of micro- expression type
CN111738160A (en) * 2020-06-23 2020-10-02 平安科技(深圳)有限公司 Video micro-expression recognition method and device, computer equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI430185B (en) * 2010-06-17 2014-03-11 Inst Information Industry Facial expression recognition systems and methods and computer program products thereof
CN106980811A (en) * 2016-10-21 2017-07-25 商汤集团有限公司 Facial expression recognizing method and expression recognition device
CN109376598A (en) * 2018-09-17 2019-02-22 平安科技(深圳)有限公司 Facial expression image processing method, device, computer equipment and storage medium
CN109522818B (en) * 2018-10-29 2021-03-30 中国科学院深圳先进技术研究院 Expression recognition method and device, terminal equipment and storage medium
CN109684911B (en) * 2018-10-30 2021-05-11 百度在线网络技术(北京)有限公司 Expression recognition method and device, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XU Feng, ZHANG Jun-Ping: "Facial Microexpression Recognition: A Survey", Acta Automatica Sinica, vol. 43, no. 3, 1 January 2017, pages 333-348, XP055884915, ISSN: 0254-4156, DOI: 10.16383/j.aas.2017.c160398 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114373214A (en) * 2022-01-14 2022-04-19 平安普惠企业管理有限公司 User psychological analysis method, device, equipment and storage medium based on micro expression
CN114639136A (en) * 2022-01-22 2022-06-17 西北工业大学 Long video micro-expression detection method based on shallow network
CN114639136B (en) * 2022-01-22 2024-03-08 西北工业大学 Long video micro expression detection method based on shallow network
CN114708627A (en) * 2022-02-28 2022-07-05 厦门大学 Micro-expression recognition method applied to social robot
CN114708627B (en) * 2022-02-28 2024-05-31 厦门大学 Micro-expression recognition method applied to social robot
CN114743235A (en) * 2022-03-01 2022-07-12 东南大学 Micro-expression identification method and system based on sparsification self-attention mechanism
CN114863515A (en) * 2022-04-18 2022-08-05 厦门大学 Human face living body detection method and device based on micro-expression semantics
CN115221954B (en) * 2022-07-12 2023-10-31 中国电信股份有限公司 User portrait method, device, electronic equipment and storage medium
CN115221954A (en) * 2022-07-12 2022-10-21 中国电信股份有限公司 User portrait method, device, electronic equipment and storage medium
CN115396743B (en) * 2022-08-26 2023-08-11 深圳万兴软件有限公司 Video watermark removing method, device, equipment and storage medium
CN115396743A (en) * 2022-08-26 2022-11-25 深圳万兴软件有限公司 Video watermark removing method, device, equipment and storage medium
CN116071810A (en) * 2023-04-03 2023-05-05 中国科学技术大学 Micro expression detection method, system, equipment and storage medium
CN116824280A (en) * 2023-08-30 2023-09-29 安徽爱学堂教育科技有限公司 Psychological early warning method based on micro-expression change
CN116824280B (en) * 2023-08-30 2023-11-24 安徽爱学堂教育科技有限公司 Psychological early warning method based on micro-expression change
CN117314890A (en) * 2023-11-07 2023-12-29 东莞市富明钮扣有限公司 Safety control method, device, equipment and storage medium for button making processing
CN117314890B (en) * 2023-11-07 2024-04-23 东莞市富明钮扣有限公司 Safety control method, device, equipment and storage medium for button making processing

Also Published As

Publication number Publication date
CN111738160B (en) 2024-03-26
CN111738160A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
WO2021259005A1 (en) Video-based micro-expression recognition method and apparatus, computer device, and storage medium
CN107203753B (en) Action recognition method based on fuzzy neural network and graph model reasoning
WO2019174439A1 (en) Image recognition method and apparatus, and terminal and storage medium
US11741736B2 (en) Determining associations between objects and persons using machine learning models
US10776470B2 (en) Verifying identity based on facial dynamics
WO2020078119A1 (en) Method, device and system for simulating user wearing clothing and accessories
US11093734B2 (en) Method and apparatus with emotion recognition
JP6411510B2 (en) System and method for identifying faces in unconstrained media
Youssif et al. Automatic facial expression recognition system based on geometric and appearance features
WO2020103700A1 (en) Image recognition method based on micro facial expressions, apparatus and related device
JP2020522285A (en) System and method for whole body measurement extraction
WO2018128996A1 (en) System and method for facilitating dynamic avatar based on real-time facial expression detection
CN112800903B (en) Dynamic expression recognition method and system based on space-time diagram convolutional neural network
US20190347522A1 (en) Training Set Sufficiency For Image Analysis
Hatem et al. A survey of feature base methods for human face detection
Durga et al. A ResNet deep learning based facial recognition design for future multimedia applications
US20230172457A1 (en) Systems and methods for temperature measurement
WO2021184754A1 (en) Video comparison method and apparatus, computer device and storage medium
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
Yi et al. Facial expression recognition of intercepted video sequences based on feature point movement trend and feature block texture variation
Purps et al. Reconstructing facial expressions of HMD users for avatars in VR
Dinakaran et al. Efficient regional multi feature similarity measure based emotion detection system in web portal using artificial neural network
CN108399358B (en) Expression display method and system for video chat
Tu et al. Face and gesture based human computer interaction
KR20230077560A (en) Appartus of providing service customized on exhibit hall and controlling method of the same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21829603

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21829603

Country of ref document: EP

Kind code of ref document: A1