WO2018154098A1 - Method and system for recognizing mood by means of image analysis - Google Patents

Method and system for recognizing mood by means of image analysis

Info

Publication number
WO2018154098A1
WO2018154098A1 · PCT/EP2018/054622 · EP2018054622W
Authority
WO
WIPO (PCT)
Prior art keywords
mood
subject
facial
images
distance
Prior art date
Application number
PCT/EP2018/054622
Other languages
French (fr)
Inventor
Javier Varona Gómez
Diana Arellano Távara
Miquel Mascaró Oliver
Cristina Manresa Yee
Simón Garcés Rayo
Juan Sebastián Filippini
Original Assignee
Universitat De Les Illes Balears
Priority date
Filing date
Publication date
Application filed by Universitat De Les Illes Balears filed Critical Universitat De Les Illes Balears
Publication of WO2018154098A1 publication Critical patent/WO2018154098A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions


Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to a mood recognition method for recognizing the mood of a subject (1) based on the subject's relationship with facial expressions/movements. The method of the invention focuses on recognizing moods, a concept that is different from emotion. The manner of transforming the captured images of the subjects (1) into facial movements is customized by learning the particular form of the facial features of the analyzed subject (1). The invention is based on the analysis of a set of a given number of images, said number being greater than that used in standard emotion recognition. A more robust mood recognition method is thereby defined. The method comprises three fundamental steps: defining general previous criteria and data, defining customized resting patterns, and evaluating the mood.

Description

DESCRIPTION
"Method and system for recognizing mood by means of image analysis"
FIELD OF THE INVENTION
The present invention is comprised in the technical field corresponding to the sector of artificial intelligence and facial expression recognition. More specifically, the invention relates to a mood recognition method based on image sequence processing.
BACKGROUND OF THE INVENTION
The recognition of emotions from facial expressions is a very dynamic field today given its various applications in the field of psychology, advertising or marketing, among others. Said recognition is typically performed according to the system known as the Facial Action Coding System (FACS). FACS allows analyzing human facial expressions through facial coding, and it can be used to classify virtually any anatomical facial expression by analyzing the possible movements of the muscles associated with said facial expression. These movements are divided into what is commonly referred to as Action Units (AU), which are the fundamental actions of muscles or individual muscle groups (for example, according to the mentioned classification, AU6 refers to raising cheeks). Terms such as action units, gestures, facial expressions and AU will be used interchangeably herein.
On the other hand, the terms mood and emotion are normally confused in colloquial language and in their formal definitions. There is a general consensus today that establishes at least three main differences between both terms:
- moods last longer than emotions do;
- moods are not outwardly expressed in a direct manner, unlike emotions;
- moods relate to emotions insofar as a person who is in a certain mood tends to experience certain emotions. In other words, by means of noticeable effects produced by emotions, facial expressions or gestures, it is possible to recognize a person's mood.
As mentioned, the applications for mood-based facial recognition may be very useful in various sectors, such as commercial or political marketing, human resources, video games, distance learning, digital signage and human-computer interactions in general.
In the field of facial recognition for recognizing emotions, different analysis technologies are known, such as those disclosed in patents US 8798374 B2, US 8879854 B2 or US 9405962 B2. These patent documents disclose systems focusing on recognizing emotions, not moods (which are different concepts), and their associated methods of analysis therefore focus on the recognition and processing of instantaneous images of the subjects under study. These patent documents primarily disclose the construction of a set of descriptors based on detectable geometric facial features, and a method of classifying AUs based on these descriptors. The heuristic definition of a set of rules is used to obtain suitable descriptors, and even automatic methods for selecting features in the context of automatic learning methods are used for the same purpose. Therefore, US 8798374 B2 discloses an automatic method for image processing for the detection of AUs, and US 8879854 B2 discloses a method and apparatus for recognizing emotions based on action units. The descriptors constructed in a heuristic manner have very little discriminatory power, fundamentally in interpersonal detection. This is why various lines of work have tended to construct more complex descriptors by means of automatic methods for selecting features. For example, US 9405962 B2 discloses a method for determining emotions in a set of images in the presence of a facial artifact (beard, mustache, glasses, etc.), including the detection of action units.
On the other hand, the "Pleasure-Arousal-Dominance" (or PAD) model is also known today as a theoretical framework for mood recognition. The PAD model is a system that allows defining and measuring different moods, emotional traits and personality traits as a function of three orthogonal dimensions: pleasure (P), arousal (A), and dominance (D). The PAD model is a framework that is generally used for defining moods and it allows the interrelation thereof with the facial coding in FACS. In other words, PAD can describe a mood in terms of action units. In the PAD model, based on the intersection of the pleasure, arousal and dominance axes, eight octants representing the basic categories of moods can be derived (Table 1 ).
Figure imgf000004_0001
Table 1 : Moods, PAD space octants.
It is possible to express emotions in terms of pleasure, arousal and dominance according to a certain correlation (Table 2). Therefore, a mood can give rise to various emotions. For example, the mood "anxious" can manifest itself in emotions such as "confused", "fearful", "worried", "ashamed", etc., which in turn can be related to action units (AUs).
Table 2: Example of emotions represented in PAD space. (The table itself is provided as an image in the original document.)
In particular, it is possible to define the correspondence between AUs and PAD space octants by means of the PAD model. The main objective of this correspondence is the description of each of the eight moods in AU terms. The Facial Expression Repertoire (or FER) is known for this description. In the state of the art, captured images of people are transformed into facial expressions/movements through generic methods based on processing instantaneous images of the subjects under analysis. However, these methods entail errors, since the particular form of the facial features of the analyzed subject cannot be "learned" and customized in a way that would make the recognition method more precise. Additionally, said methods of the state of the art are restricted to the identification of emotions (happiness, sadness, etc.), but they do not allow detecting complex constructs such as moods, the activation of which may comprise, at the same time, different configurations of emotions, sometimes even opposing emotions. For example, an anxious mood can be reflected both in a sad subject and in a happy subject. Therefore, the known solutions of the state of the art are still unable to solve the technical problem of providing a precise mood recognition method.
The present invention proposes a solution to this technical problem by means of a novel facial recognition method for recognizing moods in a set of images, which provides for the customization of the subject to minimize AU detection errors.
BRIEF DISCLOSURE OF THE INVENTION
The main object of the invention relates to a method for recognizing the mood of a subject based on their relationship with facial expressions/movements. The method of the invention focuses on recognizing moods, a concept that is technically different from emotion. In the method of the invention, the manner of transforming the captured images of the subjects into facial gestures/movements is customized, "learning" the particular form of the facial features of the person analyzed, such that the mood recognition method is more precise than if this customization were not performed.
The mentioned object of the invention is achieved by means of a mood recognition method for recognizing the mood of a subject based on facial images of said subject, obtained by means of a system comprising a camera suitable for taking said images and a processor for storing and/or processing said images. Advantageously, said method comprises carrying out the following steps:
a) registering one or more facial images of the subject in a reference mood;
b) defining a plurality of characteristic facial points of the subject in one or more of the images associated with the reference mood;
c) defining one or more resting patterns corresponding to the distances between the characteristic facial points of the subject, defined in step b);
d) defining one or more action units (AUs) corresponding to the movement of the facial points with respect to the resting patterns;
e) defining one or more activation rules of each action unit for the mood to be recognized based on threshold values associated with the amount of movement of the characteristic facial points with respect to the resting patterns;
f) defining a standard probability distribution associated with the activation of one or more action units associated with a mood;
g) registering a sequence of facial images of the subject that is associated with the mood to be recognized;
h) obtaining, for each image of the sequence, the activation probability distribution of the action units associated with the mood to be recognized, according to the rules defined in step e);
i) determining the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f).
A reliable and robust mood recognition method is thereby achieved, where image analysis is performed in sequences captured by the camera, such that said sequences allow dynamically evaluating the contribution of the action units to the mood of the subject.
In another preferred embodiment of the invention, the mood recognition method further comprises carrying out the following steps in step f):
- defining a standard probability distribution associated with the activation of one or more action units associated with a mood i, defining to that end a value p_ij between 0 and 1 to designate the contribution of each action unit j, where the value 0 is assigned to the minimum contribution and the value 1 to the maximum;
- constructing with these values p_ij a vector p_i for each mood i, where n is the number of action units involved in determining the moods:

p_i = C_p {p_ij} = C_p {p_i1, p_i2, ..., p_in},

where C_p is a normalization constant for imposing the condition that ∑_{j=1}^{n} p_ij = 1.
In another preferred embodiment of the invention, the mood recognition method further comprises carrying out the following steps in step h):
- registering a number W of facial images of the subject;
- obtaining, for the set of W images, the activation probability distribution of the action units j associated with the mood i to be recognized, defining to that end a value q_j to designate the contribution of each action unit j, according to the expression:

q_j = C_q (1/W) ∑_k s_kj,

where k = 1, 2, ..., W indexes the images; j = 1, 2, ..., n; s_kj is assigned the value s_kj = 1 if the action unit j is activated in image k, and s_kj = 0 if the action unit j is not activated; and C_q is a normalization constant for imposing the condition that ∑_{j=1}^{n} q_j = 1.
In another preferred embodiment of the invention, the mood recognition method further comprises carrying out the following step in step i):
- determining the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f) by calculating the Bhattacharyya coefficient, D_i, for each mood i, according to the expression:

D_i = ∑_{j=1}^{n} √(p_ij · q_j)
More preferably, the W facial images of the subject are consecutive in the sequence captured by the camera.
In another preferred embodiment of the method of the invention, the set of n action units involved in determining the mood or moods of the subject are selected from all the action units existing in FACS.
More preferably, the action units involved in determining the mood or moods of the subject are one or more of the following: inner brow raiser; outer brow raiser; brow lowerer; upper lid raiser; cheek raiser; upper lip raiser; lip corner puller; lip corner depressor; lips part; jaw drop; eyes closed.
In another preferred embodiment of the method of the invention, the moods considered are the eight moods of the Pleasure-Arousal-Dominance (PAD) space.
More preferably, the relationship between the eight moods of the PAD space developed by Mehrabian and the action units that are activated in each of them is that defined in the Facial Expression Repertoire (FER).
In another preferred embodiment of the method of the invention, one or more resting patterns corresponding to the distances between the characteristic facial points of the subject are defined, with said distances being one or more of the following: middle right eye-eyebrow distance; inner right eye-eyebrow distance; middle left eye-eyebrow distance; inner left eye-eyebrow distance; open right eye distance; open left eye distance; horizontal mouth distance; upper mouth-nose distance; jaw-nose distance; almost lower mouth-outer mouth distance; left eyebrow-upper lid distance; left eyebrow-lower lid distance; right eyebrow-upper lid distance; right eyebrow-lower lid distance.
In another preferred embodiment of the method of the invention, the mood or moods of the subject are gauged in a session with known and controlled stimuli, such that one or more action units can be associated with one or more moods i of said subject.
Another object of the invention relates to a mood recognition system for recognizing the mood of a subject through the mood recognition method according to any of the embodiments described herein, comprising:
- a camera suitable for taking facial images of said subject;
- one or more processing means (3) for storing and/or processing the facial images, wherein said processing means (3) are configured by means of hardware and/or software for carrying out an emotional state recognition method according to any of the embodiments described herein.
In a preferred embodiment of the system of the invention, said system additionally comprises a learning subsystem configured by means of hardware and/or software to establish classification criteria for the sequences taken by the camera, as a function of results obtained in previous analyses. More preferably, said learning subsystem is locally or remotely connected to the processing means.
DESCRIPTION OF THE DRAWINGS
Figure 1 shows a flowchart of the steps of the method of the invention according to a preferred embodiment thereof.
Figure 2 shows the characteristic facial points used in detecting action units of the method of the invention according to a preferred embodiment thereof.
Figure 3 depicts the detection of the activation of an action unit (specifically, AU1) in a sequence of images upon comparing the minimum theoretical variation in pixels with the experimental variation of facial parameters with respect to the customized resting pattern parameters (in this case parameter P2).
Figure 4 shows a mood recognition system according to a preferred embodiment of the invention, showing in detail the elements thereof.
DETAILED DISCLOSURE OF THE INVENTION
A detailed description of the method of the invention is provided below in reference to a preferred embodiment thereof based on Figure 1 of the present patent document. Said embodiment is provided for the purpose of illustrating the claimed invention in a non-limiting manner.
One object of the invention relates to a mood recognition method for recognizing the mood of a subject (1) based on their relationship with facial expressions/movements. The method of the invention focuses on recognizing moods, a concept that is different from emotion. In defining said relationship, the theory relating facial gestures/movements and emotions (FACS coding) and the theory relating emotions and moods (PAD model) are used. In the method of the invention, the manner of transforming the captured images of the subjects (1) into facial gestures/movements is customized, "learning" the particular form of the facial features of the analyzed subject (1), such that the mood recognition method is more precise than if this customization were not performed.
The method of the invention furthermore takes into account the prior history of the sequence of images (i.e., the recognition of expressions in the images preceding the processed image). The invention is therefore based on the analysis of a set of a given number of images, unlike methods based on instantaneous recognition for the identification of emotions.
According to Figure 1 , the method comprises three fundamental steps: defining general previous criteria and data, defining customized resting patterns, and evaluating the mood. Each of these steps is described below in detail.
1. Defining general previous criteria and data
The method requires basic data, prior to the analysis of the mood of the subject (1):
- Firstly, a subset of n action units (AUs), which are considered sufficient for describing and recognizing any mood of the PAD space, must be selected from among all those existing in FACS. For example, Table 3 shows a possible subset with n = 11. There is therefore a set of gestures or AUj, with j = 1, 2, ..., n, whose combinations can describe moods.
Table 3: Subset of action units considered in mood recognition. (The table itself is provided as an image in the original document.)
- Secondly, a previous criterion relating the eight moods of the PAD space developed by Mehrabian with the facial gestures or action units (AUs) that are activated in each of them is required. Table 4 shows all this starting data, defined by Russell and Mehrabian and the Facial Expression Repertoire (FER), according to the subset of action units considered.
Mood | Active AUs
Exuberant | AU5, AU6, AU12, AU25, AU26
Anxious | AU1, AU2, AU4, AU5, AU15, AU25, AU26
Bored | AU1, AU2, AU4, AU15, AU43
Docile | AU1, AU2, AU12, AU43
Hostile | AU4, AU10, AU5, AU15, AU25, AU26
Relaxed | AU6, AU12, AU43
Dependent | AU1, AU2, AU5, AU12, AU25, AU26
Disdainful | AU4, AU15, AU43
Table 4: Active AUs per PAD octant.
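For convenience when following the sketches below, Table 4 can be transcribed into a simple lookup structure. The Python mapping here is an illustration added for this purpose; the patent itself defines only the table, not any particular data structure.

```python
# Table 4 transcribed as a mood -> active-AUs mapping (an illustrative
# convenience; the patent defines the table, not this structure).
ACTIVE_AUS = {
    "Exuberant":  ["AU5", "AU6", "AU12", "AU25", "AU26"],
    "Anxious":    ["AU1", "AU2", "AU4", "AU5", "AU15", "AU25", "AU26"],
    "Bored":      ["AU1", "AU2", "AU4", "AU15", "AU43"],
    "Docile":     ["AU1", "AU2", "AU12", "AU43"],
    "Hostile":    ["AU4", "AU10", "AU5", "AU15", "AU25", "AU26"],
    "Relaxed":    ["AU6", "AU12", "AU43"],
    "Dependent":  ["AU1", "AU2", "AU5", "AU12", "AU25", "AU26"],
    "Disdainful": ["AU4", "AU15", "AU43"],
}
```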
The starting data must also indicate the importance of each gesture or AUj in the corresponding mood. To that end, a number between 0 and 1 is defined to determine the weight of each gesture or AUj. If an AUj is highly determinant, it is assigned the value 1, whereas if it is not important for a certain mood, it is assigned the value 0. A vector is constructed for each mood i with these values. This vector is called p_i, for example:

p_i = C_p {p_ij} = C_p {p_i1, p_i2, ..., p_in} = C_p {1, 1, 1, 0.7, 0, 0, 0, 1, 1, 1, 0} (Eq. 1)

Each p_ij is a scalar that determines the importance of an AUj in the mood i, and C_p is a normalization constant for imposing the condition that ∑_{j=1}^{n} p_ij = 1 in Eq. 1. Then p_i is a pattern of the mood that relates it with gestures or AUs.
A standard probability distribution associated with the activation of one or more action units associated with a mood is thereby defined.
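As a minimal illustrative sketch of this step (assuming Python with numpy; the AU ordering below is an assumption, chosen because it reproduces the example vector of Eq. 1, which the patent does not fix), the pattern vector p_i can be built as follows.

```python
import numpy as np

# Subset of n = 11 action units (Table 3). The ordering is assumed here;
# it is chosen so that the example below reproduces Eq. 1.
AUS = ["AU1", "AU2", "AU4", "AU5", "AU6", "AU10",
       "AU12", "AU15", "AU25", "AU26", "AU43"]

def mood_pattern(weights):
    """Build the normalized pattern vector p_i = C_p {p_ij} of Eq. 1.

    `weights` maps an AU name to a scalar in [0, 1] grading its importance
    for the mood; AUs not listed get weight 0. The constant C_p rescales
    the vector so that its components sum to 1.
    """
    p = np.array([weights.get(au, 0.0) for au in AUS], dtype=float)
    total = p.sum()
    return p / total if total > 0 else p  # C_p = 1 / sum_j p_ij

# Example weights yielding the patent's Eq. 1 vector
# C_p {1, 1, 1, 0.7, 0, 0, 0, 1, 1, 1, 0}:
p_example = mood_pattern({"AU1": 1, "AU2": 1, "AU4": 1, "AU5": 0.7,
                          "AU15": 1, "AU25": 1, "AU26": 1})
assert abs(p_example.sum() - 1.0) < 1e-9
```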
2. Defining customized resting patterns
In a second step, the method requires defining criteria for activating each AUj, which can be used to determine whether a gesture or AUj has been made by the subject (1) under study when interpreting the image data. To define said criteria, the following steps are carried out:
- Registering one or more facial images of the subject (1) in a reference mood.
- Defining a plurality of characteristic facial points of the subject (1) in one or more of the images in the reference mood. For example, as shown in Figure 2, 24 facial points or curves can be taken. These characteristic points are strategically associated with the facial points or curves that are most susceptible to undergoing changes in position upon activating one or more AUj.
- Defining a plurality of distances between the characteristic facial points selected in the preceding step. These distances are called parameters P. As an example, 15 distance parameters that will be used in detecting AUs are defined in Table 5.
Table 5: Distance parameters for detecting AUs. (The table itself is provided as an image in the original document.)
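As a hedged sketch of how the parameters P of Table 5 can be measured (the landmark names below are hypothetical, since the patent's Figure 2 defines 24 characteristic points but no identifiers), each parameter reduces to a Euclidean distance between two tracked facial points.

```python
import numpy as np

# Hypothetical landmark names; only the distances themselves are defined
# by the patent (Table 5), not these identifiers.
PARAM_DEFS = {
    "P2": ("inner_right_eyebrow", "inner_right_eye"),   # inner right eye-eyebrow
    "P7": ("left_mouth_corner", "right_mouth_corner"),  # horizontal mouth
    # ... the remaining parameters of Table 5 follow the same pattern
}

def measure_parameters(landmarks):
    """Compute each distance parameter P as the Euclidean distance (in
    pixels) between its two characteristic facial points."""
    return {
        name: float(np.linalg.norm(np.asarray(landmarks[a], dtype=float)
                                   - np.asarray(landmarks[b], dtype=float)))
        for name, (a, b) in PARAM_DEFS.items()
    }
```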
- Defining one or more resting patterns for each parameter P. The definition of these resting patterns includes the definition of a mean value μ and a maximum deviation σ from the mean value. These resting patterns must be found for each subject (1) subjected to the method of facial recognition analysis. It is a step included in each analysis, not a prior independent gauging.
- Defining one or more rules relating the non-resting measurements with the resting patterns to indicate the activation of each AUj. If the comparison with respect to the resting pattern,

ΔP = P − μ,

is a positive number, it is an expansion ΔP(+), and if in contrast there is a negative difference, it is a contraction of this facial parameter ΔP(-). Table 6 shows an example of a set of rules for detecting activations of AUs; these rules describe a threshold value for each variation of the parameters relating to the AUs and are defined as a function of the deviation σ. For example, if in an image ΔP7(+) > 2σ, AU12 will have been activated.
Table 6: Rules used in detecting AUs. (The table itself is provided as an image in the original document.)
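A minimal sketch of this second step, assuming per-parameter samples taken from the subject's reference-mood images; the 2σ rule shown is the patent's own AU12/P7 example from Table 6, while the function names and threshold handling are ours.

```python
import numpy as np

def resting_pattern(samples):
    """Estimate the customized resting pattern of one parameter P: the mean
    mu and the deviation sigma of its values in the reference-mood images."""
    s = np.asarray(samples, dtype=float)
    return float(s.mean()), float(s.std())

def au12_active(p7, mu, sigma):
    """Patent's example rule (Table 6): AU12 is considered activated when
    the expansion of parameter P7 exceeds 2 sigma."""
    delta = p7 - mu              # delta_P = P - mu; positive means expansion (+)
    return delta > 2.0 * sigma   # activation rule: delta_P7(+) > 2 sigma

# Usage: mu, sigma = resting_pattern([101.2, 100.8, 101.0])
#        au12_active(108.0, mu, sigma)  # True if the mouth has widened enough
```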
3. Evaluating the mood
With the steps described above, it is possible to determine changes in facial parameters in a package of consecutive images, as shown in Figure 3 by way of example. According to the activation rules, the theoretical change required for activating an AU can be compared with the actual changes experienced by the subject (1) throughout a sequence of images. Figure 3 shows as an example the rule for activating AU1 with respect to parameter P2 and the experimental value of parameter P2 in pixels in a sequence of images. Since the theoretical variation fits the experimental variation, AU1 is considered to have been activated in that set of images.
The method of the invention then comprises a comparison for carrying out the final step of evaluating the mood:
Assume that there is a number W of images, which are preferably consecutive. If each of those images is compared with the criteria for activating the AUs, it can be determined for each gesture or AUj whether it has been activated. By repeating that comparison with all the images, it is possible to determine whether it has been activated in one or in several images. In other words, an occurrence or relevance value can be obtained for each gesture or AUj. Each of those occurrence values can be referred to as q_j. Each q_j is calculated with the following expression:

q_j = C_q (1/W) ∑_k s_kj (Eq. 2)

where k = 1, 2, ..., W designates an image and s_kj represents the activation or non-activation of the gesture AUj in image k. If the gesture AUj has been activated, s_kj is assigned the value s_kj = 1, whereas if it has not been activated, s_kj = 0. Finally, C_q is a normalization constant for imposing the condition that ∑_{j=1}^{n} q_j = 1 in Eq. 2.
With the set of resulting scalars, a vector q = {q_j} having the same dimensions as the pattern p_i can be constructed, but this time it denotes the experimental weight or activation probability distribution of each gesture in a set of W images associated with the mood to be recognized.
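Under the same assumptions as the sketches above, Eq. 2 reduces to a column mean over a W × n activation matrix followed by normalization.

```python
import numpy as np

def activation_distribution(activations):
    """Eq. 2: build the vector q = {q_j} from a W x n matrix in which
    activations[k, j] is 1 (or True) iff gesture AU_j was detected in image k."""
    s = np.asarray(activations, dtype=float)  # s_kj in {0, 1}
    q = s.mean(axis=0)                        # (1/W) * sum_k s_kj
    total = q.sum()
    return q / total if total > 0 else q      # C_q normalization: sum_j q_j = 1
```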
The final step of the method of the invention consists of comparing the pattern with the experiment. To that end, the Bhattacharyya coefficient, D_i, is used for each mood i:

D_i = ∑_{j=1}^{n} √(p_ij · q_j)
This coefficient gives a value indicating the proximity of the probability distribution of the experiment with respect to the standard probability distribution.
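A short illustrative sketch of this final comparison: compute D_i for every mood pattern and rank the moods by proximity (D_i approaches 1 as the experimental distribution q approaches the pattern p_i).

```python
import numpy as np

def bhattacharyya(p_i, q):
    """Bhattacharyya coefficient D_i = sum_j sqrt(p_ij * q_j)."""
    return float(np.sum(np.sqrt(np.asarray(p_i) * np.asarray(q))))

def closest_moods(patterns, q):
    """Rank moods by the proximity of the experimental distribution q to
    each standard pattern p_i; the highest D_i comes first."""
    scores = {mood: bhattacharyya(p, q) for mood, p in patterns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```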
Through these steps it is possible to determine the mood or moods that are closest to the experimental mood of the subject (1) under analysis. In conclusion, this invention considers the use of descriptors of the temporal dynamics of a person's facial expression to determine said person's mood. These descriptors encode the importance of the occurrence of each AU for each mood. The invention uses a method of detecting AUs capable of learning the particular parameters of the appearance of the facial movement in a customized manner within the same analysis session, without a prior learning step. The final system that is provided also allows defining a temporal analysis parameter W relating to the set of images to be processed, which allows a robust interpretation of the mood of the person participating in the analysis that is tolerant of errors in individual images. The analysis process is an iteration whose duration depends on the number of image sequences.
Alternatively, it is possible to perform a special gauging of the subject (1) in a session with known stimuli, which allow evaluating the degree of response of the subject (1) to standard stimuli, to then recognize moods in non-standard stimuli with greater precision.
Another object of the invention relates to a facial recognition system for recognizing the mood of a subject (1) through the mood recognition method such as the one described in the preceding embodiment, comprising:
- a camera (2) suitable for taking facial images of said subject (1);
- one or more processing means (3) for storing and/or processing the facial images, where said processing means (3) are configured by means of hardware and/or software for carrying out an emotional state recognition method according to any of the embodiments described herein.
The system of the invention can additionally comprise a learning subsystem configured by means of hardware and/or software to establish classification criteria for the sequences taken by the camera (2), as a function of results obtained in previous analyses. This allows progressively improving system precision and feeding the previously obtained information back into said system, associating certain action units with moods of the subject in a customized manner. The learning subsystem can be locally or remotely connected to the processing means (3).

Claims

1. A mood recognition method for recognizing the mood of a subject (1) based on facial images of said subject (1) obtained by means of a system comprising a camera (2) suitable for taking said images, and a processor (3) for storing and/or processing said images; where said method is characterized in that it comprises carrying out the following steps:
a) registering one or more facial images of the subject (1) in a reference mood;
b) defining a plurality of characteristic facial points of the subject (1) in one or more of the images associated with the reference mood;
c) defining one or more resting patterns corresponding to the distances between the characteristic facial points of the subject (1), defined in step b);
d) defining one or more action units (AUs) corresponding to the movement of the facial points with respect to the resting patterns;
e) defining one or more activation rules of each action unit for the mood to be recognized based on threshold values associated with the amount of movement of the characteristic facial points with respect to the resting patterns;
f) defining a standard probability distribution associated with the activation of one or more action units associated with a mood;
g) registering a sequence of facial images of the subject (1) that is associated with the mood to be recognized;
h) obtaining, for each image of the sequence, the activation probability distribution of the action units associated with the mood to be recognized, according to the rules defined in step e);
i) determining the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f).
2. The mood recognition method according to the preceding claim, wherein:
- a standard probability distribution associated with the activation of one or more action units associated with a mood i is defined, defining to that end a value p_ij between 0 and 1 to designate the contribution of each action unit j, where the value 0 is assigned to the minimum contribution and the value 1 is assigned to the maximum contribution;
- a vector p_i is constructed with these values p_ij for each mood i, where n is the number of action units involved in determining the moods:

p_i = C_p {p_ij} = C_p {p_i1, p_i2, ..., p_in},

where C_p is a normalization constant for imposing the condition that ∑_{j=1}^{n} p_ij = 1.
3. The mood recognition method according to any of the preceding claims, wherein:
- a number W of facial images of the subject (1) are registered;
- the activation probability distribution of the action units j associated with the mood i to be recognized is obtained for the set of W images, defining to that end a value q_j to designate the contribution of each action unit j, according to the expression:

q_j = C_q (1/W) ∑_k s_kj,

where k = 1, 2, ..., W; j = 1, 2, ..., n; s_kj is assigned the value s_kj = 1 if the action unit j is activated in image k, and s_kj = 0 if the action unit j is not activated; and C_q is a normalization constant for imposing the condition that ∑_{j=1}^{n} q_j = 1.
4. The mood recognition method according to any of the preceding claims, wherein the similarity between the probability distribution obtained in step h) and the standard probability distribution defined in step f) is determined by calculating the Bhattacharyya coefficient, D_i, for each mood i, according to the expression:

D_i = ∑_{j=1}^{n} √(p_ij · q_j)
5. The mood recognition method according to the preceding claim, wherein the W facial images of the subject (1) are consecutive images in a sequence captured by the camera (2).
6. The mood recognition method according to any of the preceding claims, wherein the set of n action units involved in determining the mood or moods of the subject (1) are selected from all the action units existing in the Facial Action Coding System (FACS).
7. The mood recognition method according to the preceding claim, wherein the action units involved in determining the mood or moods of the subject (1) are one or more of the following: inner brow raiser; outer brow raiser; brow lowerer; upper lid raiser; cheek raiser; upper lip raiser; lip corner puller; lip corner depressor; lips part; jaw drop; eyes closed.
8. The mood recognition method according to any of the preceding claims, wherein the moods considered are the eight moods of the PAD space developed by Mehrabian.
9. The mood recognition method according to the preceding claim, wherein the relationship between the eight moods of the PAD space developed by Mehrabian and the action units that are activated in each of them is that defined by Russell and Mehrabian and the Facial Expression Repertoire (FER).
10. The mood recognition method according to any of the preceding claims, wherein one or more resting patterns corresponding to the distances between the characteristic facial points of the subject (1) are defined, with said distances being one or more of the following: middle right eye-eyebrow distance; inner right eye-eyebrow distance; middle left eye-eyebrow distance; inner left eye-eyebrow distance; open right eye distance; open left eye distance; horizontal mouth distance; upper mouth-nose distance; jaw-nose distance; almost lower mouth-outer mouth distance; left eyebrow-upper lid distance; left eyebrow-lower lid distance; right eyebrow-upper lid distance; right eyebrow-lower lid distance.
11. The mood recognition method according to any of the preceding claims, wherein the mood or moods of the subject (1) are gauged in a session with known stimuli.
12. A mood recognition system for recognizing the mood of a subject (1) through a mood recognition method according to any of claims 1 to 11, comprising:
- a camera (2) suitable for taking facial images of said subject (1);
- one or more processing means (3) for storing and/or processing the facial images, wherein said processing means (3) are configured by means of hardware and/or software for carrying out a mood recognition method according to any of the preceding claims.
13. The mood recognition system for recognizing the mood of a subject (1) according to the preceding claim, wherein the images of the registered sequence are consecutive images obtained by the camera (2).
14. The mood recognition system for recognizing the mood of a subject (1) according to any of the preceding claims, additionally comprising a learning subsystem configured by means of hardware and/or software to establish classification criteria for the sequences taken by the camera (2) as a function of results obtained in previous analyses, wherein said learning subsystem is locally or remotely connected to the processing means (3).
PCT/EP2018/054622 2017-02-27 2018-02-26 Method and system for recognizing mood by means of image analysis WO2018154098A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ESP201730259 2017-02-27
ES201730259A ES2633152B1 (en) 2017-02-27 2017-02-27 METHOD AND SYSTEM FOR THE RECOGNITION OF THE STATE OF MOOD THROUGH IMAGE ANALYSIS

Publications (1)

Publication Number Publication Date
WO2018154098A1 true WO2018154098A1 (en) 2018-08-30

Family

ID=59846800

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2018/054622 WO2018154098A1 (en) 2017-02-27 2018-02-26 Method and system for recognizing mood by means of image analysis

Country Status (2)

Country Link
ES (1) ES2633152B1 (en)
WO (1) WO2018154098A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI756681B (en) * 2019-05-09 2022-03-01 李至偉 Artificial intelligence assisted evaluation method applied to aesthetic medicine and system using the same

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100266213A1 (en) * 2009-04-16 2010-10-21 Hill Daniel A Method of assessing people's self-presentation and actions to evaluate personality type, behavioral tendencies, credibility, motivations and other insights through facial muscle activity and expressions

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110263946A1 (en) * 2010-04-22 2011-10-27 Mit Media Lab Method and system for real-time and offline analysis, inference, tagging of and responding to person(s) experiences

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100266213A1 (en) * 2009-04-16 2010-10-21 Hill Daniel A Method of assessing people's self-presentation and actions to evaluate personality type, behavioral tendencies, credibility, motivations and other insights through facial muscle activity and expressions

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ADAMS ANDRA ET AL: "Automated recognition of complex categorical emotions from facial expressions and head motions", 2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), IEEE, 21 September 2015 (2015-09-21), pages 355 - 361, XP032825053, DOI: 10.1109/ACII.2015.7344595 *
BOUKRICHA H ET AL: "Pleasure-arousal-dominance driven facial expression simulation", AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION AND WORKSHOPS, 2009. ACII 2009. 3RD INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 10 September 2009 (2009-09-10), pages 1 - 7, XP031577868, ISBN: 978-1-4244-4800-5 *
DIANA DI LORENZA E ARELLANO TAVARA: "Visualization of Affect in Faces based on Context Appraisal", 1 January 2012 (2012-01-01), pages 881 - 905, XP055481766, Retrieved from the Internet <URL:https://www.tdx.cat/bitstream/handle/10803/84078/Tddlat1de1.pdf?sequence=1> [retrieved on 20180606], DOI: 10.1016/j.jm.2004.06.005 *
EL KALIOUBY R ET AL: "Real-Time Inference of Complex Mental States from Facial Expressions and Head Gestures", 20040627; 20040627 - 20040602, 27 June 2004 (2004-06-27), pages 154 - 154, XP010761935 *
GUNES HATICE ET AL: "Categorical and dimensional affect analysis in continuous input: Current trends and future directions", IMAGE AND VISION COMPUTING, ELSEVIER, GUILDFORD, GB, vol. 31, no. 2, 20 July 2012 (2012-07-20), pages 120 - 136, XP028973723, ISSN: 0262-8856, DOI: 10.1016/J.IMAVIS.2012.06.016 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523290A (en) * 2018-09-14 2019-03-26 平安科技(深圳)有限公司 Evaluation method, device, equipment and medium are paid attention to the class based on the micro- expression of audience
CN109961054A (en) * 2019-03-29 2019-07-02 山东大学 It is a kind of based on area-of-interest characteristic point movement anxiety, depression, angry facial expression recognition methods
CN112115751A (en) * 2019-06-21 2020-12-22 北京百度网讯科技有限公司 Training method and device for animal mood recognition model
CN110889908A (en) * 2019-12-10 2020-03-17 吴仁超 Intelligent sign-in system integrating face recognition and data analysis
CN110889908B (en) * 2019-12-10 2020-11-27 苏州鱼得水电气科技有限公司 Intelligent sign-in system integrating face recognition and data analysis
CN112507959A (en) * 2020-12-21 2021-03-16 中国科学院心理研究所 Method for establishing emotion perception model based on individual face analysis in video

Also Published As

Publication number Publication date
ES2633152A1 (en) 2017-09-19
ES2633152B1 (en) 2018-05-03

Similar Documents

Publication Publication Date Title
WO2018154098A1 (en) Method and system for recognizing mood by means of image analysis
US10573313B2 (en) Audio analysis learning with video data
Bandini et al. Analysis of facial expressions in parkinson's disease through video-based automatic methods
Girard et al. Spontaneous facial expression in unscripted social interactions can be measured automatically
Bishay et al. Schinet: Automatic estimation of symptoms of schizophrenia from facial behaviour analysis
Rudovic et al. Context-sensitive dynamic ordinal regression for intensity estimation of facial action units
US9547808B2 (en) Head-pose invariant recognition of facial attributes
JP6467965B2 (en) Emotion estimation device and emotion estimation method
EP3740898A1 (en) Systems and methods for evaluating individual, group, and crowd emotion engagement and attention
Griffin et al. Laughter type recognition from whole body motion
Al Osman et al. Multimodal affect recognition: Current approaches and challenges
Miyakoshi et al. Facial emotion detection considering partial occlusion of face using Bayesian network
US20220101146A1 (en) Neural network training with bias mitigation
Khatri et al. Facial expression recognition: A survey
Wilhelm Towards facial expression analysis in a driver assistance system
Alshamsi et al. Automated facial expression and speech emotion recognition app development on smart phones using cloud computing
JP2022553779A (en) Method and device for adjusting environment in cabin
Rudovic et al. 1 Machine Learning Methods for Social Signal Processing
Silva et al. Real-time emotions recognition system
Ponce-López et al. Non-verbal communication analysis in victim–offender mediations
Alugupally et al. Analysis of landmarks in recognition of face expressions
Bakchy et al. Facial expression recognition based on support vector machine using Gabor wavelet filter
Chiarugi et al. Facial Signs and Psycho-physical Status Estimation for Well-being Assessment.
Liliana et al. The fuzzy emotion recognition framework using semantic-linguistic facial features
EP3799407B1 (en) Initiating communication between first and second users

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18710784

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18710784

Country of ref document: EP

Kind code of ref document: A1