WO2020029406A1 - 人脸情绪识别方法、装置、计算机设备及存储介质 (Facial emotion recognition method, apparatus, computer device and storage medium) - Google Patents

人脸情绪识别方法、装置、计算机设备及存储介质 (Facial emotion recognition method, apparatus, computer device and storage medium)

Info

Publication number
WO2020029406A1
WO2020029406A1 (application PCT/CN2018/108251, CN2018108251W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
facial
emotion
training sample
feature vector
Prior art date
Application number
PCT/CN2018/108251
Other languages
English (en)
French (fr)
Inventor
吴壮伟
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2020029406A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition

Definitions

  • the present application relates to the field of computer technology, and in particular, to a method, a device, a computer device, and a storage medium for facial emotion recognition.
  • Facial expressions are an important carrier of human communication and an important form of non-verbal communication; they can express human emotions well.
  • Human emotions affect human behavior to a certain extent. For example, when drivers are in negative emotional states such as anger, sadness, or anxiety, it is easy for them to ignore the surrounding road conditions and to react more slowly to emergencies, resulting in a high incidence of traffic accidents. On this basis, the behavior of drivers and other personnel can be guided by recognizing facial emotions. For example, when a driver's facial emotion is recognized and the driver is identified as being in a negative emotional state, the driver can be prompted to adjust his or her emotional state to avoid a traffic accident. Therefore, how to accurately recognize facial emotions has become an urgent technical problem.
  • The present application provides a facial emotion recognition method, apparatus, computer device, and storage medium, so as to accurately recognize facial emotions.
  • In a first aspect, the present application provides a facial emotion recognition method, which includes: obtaining a video image collected in real time; performing wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if any of the plurality of Euclidean distance values exceeds the preset threshold, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing the facial emotion in each key frame image based on the emotion recognition model; and obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
  • In a second aspect, the present application provides a facial emotion recognition apparatus, which includes: an obtaining unit for obtaining a video image collected in real time; a transform unit for performing wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; a distance calculation unit for obtaining a standard energy feature vector and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; a distance judgment unit for determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; a key frame obtaining unit for taking, if any of the plurality of Euclidean distance values exceeds the preset threshold, the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; an emotion recognition unit for obtaining a pre-stored emotion recognition model and recognizing the facial emotion in each key frame image based on the emotion recognition model; and an emotion obtaining unit for obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
  • In a third aspect, the present application further provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the facial emotion recognition method provided in the first aspect.
  • In a fourth aspect, the present application also provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the facial emotion recognition method provided in the first aspect.
  • The present application provides a facial emotion recognition method, apparatus, computer device, and storage medium. The method can accurately recognize facial emotions.
  • FIG. 1 is a schematic flowchart of a facial emotion recognition method according to an embodiment of the present application
  • FIGS. 2 to 6 are further schematic flowcharts of a facial emotion recognition method provided by an embodiment of the present application.
  • FIGS. 7 to 8 are detailed schematic flowcharts of a facial emotion recognition method provided by an embodiment of the present application.
  • FIG. 9 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • FIGS. 11 to 15 are further schematic block diagrams of a facial emotion recognition device according to an embodiment of the present application.
  • FIG. 16 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • the facial emotion recognition method can be applied to a facial emotion recognition system, and the facial emotion recognition system can be installed in a device with a camera function such as a mobile phone or a car.
  • the facial emotion recognition system can exist in the device as an independent system, or it can be embedded in other systems of the device.
  • a facial emotion recognition system can be embedded in a car driving system to identify the driver's emotions.
  • the facial emotion recognition system may be embedded in an application program of a mobile phone to assist the application program to realize a facial emotion recognition function.
  • the facial emotion recognition method includes steps S101 to S107.
  • the device where the facial emotion recognition system is located invokes a camera to perform real-time image acquisition of the user.
  • The device acquires, through a camera, video images collected within a certain period of time, for example, the video captured in real time within 10 seconds. It can be understood that the video image will include multiple frames of images.
  • Because the facial emotion recognition method uses information such as a neutral expression image, a standard energy feature vector, and an emotion recognition model when performing facial emotion recognition, the facial emotion recognition system also needs to perform the following operations before the user uses it for facial emotion recognition:
  • FIG. 2 is a further schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101a, S101b, and S101c are also included.
  • S101a: Obtain a neutral expression image.
  • S101b: Perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
  • S101c: Store the neutral expression image and the standard energy feature vector.
  • a neutral expression image and a standard energy feature vector need to be prepared in advance.
  • the neutral expression may be a facial expression of the user in a relatively stable mood.
  • the facial expressions commonly used by users when taking ID photos can be understood as neutral expressions.
  • the device may issue a voice prompt or a text prompt to prompt the user to make a neutral expression.
  • the image of the user's neutral expression is captured by the camera to obtain a neutral expression image.
  • neutral expression images can also be obtained in other ways.
  • For example, a neutral expression image input by the user, such as an ID photo, is acquired.
  • the user transfers the image of the neutral expression taken in the past into the device where the facial emotion recognition system is located as a neutral expression image.
  • Alternatively, the identity information input by the user is obtained, and the ID photo corresponding to the identity information is then obtained from a background server as the neutral expression image.
  • The background server may be the background server of the in-vehicle system, the background server of a mobile phone application, the background server of the facial emotion recognition system, and so on.
  • The background server may store the ID photo corresponding to the user's identity information, or, after obtaining the identity information, it may call a third-party server or use techniques such as web crawling to obtain the ID photo corresponding to the user's identity information from network data.
  • The neutral expression image is subjected to wavelet transform, such as Gabor wavelet transform, to obtain the corresponding standard energy feature vector, and the neutral expression image and the corresponding standard energy feature vector are stored, so that they can be called for facial emotion recognition when the user uses the facial emotion recognition system.
  • FIG. 3 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101d and S101e are also included.
  • S101d: Obtain an emotion training sample image set, where the emotion training sample image set includes multiple emotion training sample images and the emotion labels of the human faces in the emotion training sample images.
  • S101e: Input the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and store the emotion recognition model.
  • an emotion recognition model needs to be prepared in advance.
  • the facial emotion recognition system needs to acquire a set of emotional training sample images.
  • the emotional training sample image set includes a large number of emotional training sample images and an emotional label of a human face corresponding to each emotional training sample image. It should be noted that the emotional labels of the faces in each of the emotional training sample images can be labeled manually, or can be labeled by other methods, which are not specifically limited here.
  • After the emotion training sample image set is obtained, the emotion training sample images and the corresponding face emotion labels are input into a convolutional neural network (CNN) model for machine learning to obtain the emotion recognition model, which is then stored in the device where the facial emotion recognition system is located, so that the emotion recognition model can be called for emotion recognition when the facial emotion recognition system is subsequently used.
  • The camera of the device where the facial emotion recognition system is located may fail to collect images of the user properly in real time; for example, if the angle of the camera is wrong, the user's face information may not be collected at all in the video images collected in real time, or only half of it may be collected.
  • Such video images will inevitably reduce the accuracy of subsequent facial emotion recognition. Therefore, to ensure that good video images of the face can be captured during subsequent facial emotion recognition and to improve its accuracy, the camera needs to be calibrated before the video image collected in real time is acquired.
  • FIG. 4 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • Prior to step S101, steps S101f, S101g, S101h, and S101j are also included.
  • S101f: Obtain a calibration video image collected in real time.
  • S101g: Extract a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images.
  • S101h: Based on a pre-stored face detection and recognition model, determine whether face information exists in every frame of the calibration images.
  • S101j: If face information does not exist in at least one frame of the calibration images, issue prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the camera angle is adjusted, return to step S101f until face information exists in every frame of the calibration images.
  • the facial emotion recognition system needs to acquire a segment of a calibration video image collected in real time.
  • The calibration video image includes multiple frames of images. Then, according to a preset extraction rule, images of a preset number of frames are extracted from the multiple frames of the calibration video image as calibration images.
  • the preset extraction rule may extract one image as a calibration image every 1 second.
  • The preset number of frames may be set to 100, for example, and can be set according to actual requirements.
  • the preset extraction rule may not be limited to the above-mentioned rules, and may be set according to actual requirements, and is not limited here.
  • If the face detection and recognition model determines that face information exists in every calibration image, the current camera angle is good; at this point, step S101 can be performed, that is, the step of acquiring the video image collected in real time.
  • If face information does not exist in at least one frame, the camera angle needs adjustment; a prompt message can be sent by voice or on a display so that the user readjusts the camera angle according to the prompt.
  • After the camera angle is adjusted, the method returns to step S101f, that is, the step of obtaining a calibration video image collected in real time, until face information exists in every frame of the calibration images, thereby completing the calibration of the camera angle.
  • FIG. 5 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101k, S101m, S101n, and S101p are also included.
  • S101k: Obtain a training sample image set, where the training sample image set includes multiple training sample images and face labels used to indicate whether face information exists in the training sample images.
  • S101m: Obtain the face Haar feature vectors of the training sample images.
  • S101n: Input the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on decision tree models for training, to obtain a face detection and recognition model.
  • S101p: Store the face detection and recognition model.
  • a face detection recognition model needs to be prepared in advance, so as to be used when performing camera angle calibration.
  • a training sample image set is first obtained, where the training sample image set includes multiple training sample images, and a face label corresponding to each training sample image.
  • the face label is used to characterize whether there is face information in a corresponding face sample image.
  • Haar feature extraction is performed on each training sample image to obtain the face Haar feature vector corresponding to each training sample image.
  • The face Haar feature vector corresponding to each training sample image and the corresponding face label are input into an AdaBoost boosting model based on decision tree models for training, which yields the face detection and recognition model.
  • the face detection and recognition model is stored in the device where the facial emotion recognition system is located.
  • wavelet transform is required for all frame images in the video image to obtain the energy feature vector corresponding to each frame image.
  • the wavelet transform may be, for example, a Gabor wavelet transform.
  • the wavelet transform may also adopt other methods, which is not limited herein.
  • the standard energy feature vector is an energy feature vector obtained by performing wavelet transform on a neutral expression image of a user collected in advance.
  • the standard energy feature vector is stored in advance in a device where the facial emotion recognition system is located. Because the standard energy feature vector is stored in the device in advance, obtaining the standard energy feature vector is specifically to obtain the previously stored standard energy feature vector.
  • The Euclidean distance value between each energy feature vector obtained in step S102 and the standard energy feature vector is then calculated according to the image difference operation method.
  • Since the standard energy feature vector is stored in advance in the device where the facial emotion recognition system is located, it can be called directly in step S103, which reduces the occupation of the CPU resources of the device where the facial emotion recognition system is located and reduces the computation time.
  • the device on which the facial emotion recognition system is located may only store the neutral expression image in advance. In this way, when obtaining the standard energy feature vector in step S103, the previously stored neutral expression image is obtained first, and then Wavelet transform is performed on the neutral expression image to obtain a standard energy feature vector, and there is no limitation on the time for calculating the standard energy feature vector.
  • After the Euclidean distance value between each energy feature vector and the standard energy feature vector is calculated in step S103, a plurality of Euclidean distance values are obtained, and it is then determined whether any of the plurality of Euclidean distance values exceeds the preset threshold. If a Euclidean distance value exceeds the preset threshold, the facial expression in the video image differs significantly from the neutral expression, and step S105 is executed.
  • FIG. 6 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • When step S104 determines that no Euclidean distance value exceeds the preset threshold, the facial expression in the current video image differs little from the neutral expression.
  • In this case, step S108 will be executed, that is, the neutral expression image corresponding to the standard energy feature vector is used as the key frame image.
  • The subsequent steps S106 and S107 are then performed.
  • Alternatively, the facial emotion corresponding to the video image may be directly set to a neutral emotion to complete the recognition of the facial emotion.
  • The images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold are used as key frame images, where the number of key frame images is at least one.
  • the number of Euclidean distance values exceeding a preset threshold may be one, or may be two or more. At this time, the number of key frame images is at least one.
  • The emotion recognition model is a model for recognizing facial emotions obtained in advance through machine learning training; it may be, for example, a convolutional neural network model.
  • the device where the facial emotion recognition system is located first obtains the emotion recognition model, and then inputs the key frame image as an input value into the emotion recognition model.
  • The emotion recognition model performs emotion recognition on the key frame images to output the facial emotion in each key frame image.
  • FIG. 7 is a specific schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • This step S106 includes steps S1061 to S1063.
  • Each of the key frame images is sequentially input as an input value into the emotion recognition model.
  • each key frame image is sequentially input as an input value into an emotion recognition model, and then the emotion recognition model outputs probability values of each key frame image on various preset emotions.
  • The multiple preset emotions include seven preset emotions: fear, anger, sadness, disgust, happiness, surprise, and neutral.
  • The emotion recognition model recognizes the probability of the facial emotion in each key frame image over these seven preset emotions.
  • For example, the emotion recognition model may recognize the facial emotion in a key frame image over the above seven preset emotions with probabilities of 10%, 70%, 15%, 5%, 0%, 0%, and 0%, respectively.
  • The emotion corresponding to the largest of the multiple probability values of each key frame image is taken as the facial emotion in that key frame image.
  • For example, the anger emotion, with the largest probability value of 70%, is taken as the facial emotion corresponding to that key frame image.
  • FIG. 8 is a specific schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • This step S107 includes steps S1071 to S1072.
  • The facial emotion with the highest probability of occurrence is used as the facial emotion corresponding to the video image to complete the recognition of the facial emotion.
  • For example, if the number of key frame images is 10 and the emotion recognition model identifies the facial emotion of 8 key frame images as anger, of 1 key frame image as disgust, and of 1 key frame image as fear, then probability statistics over the facial emotions of the 10 key frame images show that the probability of occurrence of the anger emotion is 80%, that of the disgust emotion is 10%, and that of the fear emotion is 10%. The anger emotion, with the highest probability of occurrence, can thus be used as the facial emotion corresponding to the entire video image, thereby completing the recognition of the facial emotion within the time period corresponding to the video image.
  • FIG. 9 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. After step S107, steps S109 to S112 are also included.
  • S111 Determine whether a probability of a facial emotion belonging to a preset emotion category exceeds a preset probability value.
  • For example, suppose the preset time period is 2 minutes and the preset emotion category is the negative emotion category.
  • The facial emotions included in the negative emotion category are four types: fear, anger, sadness, and disgust.
  • Suppose the number of video images in the emotion list within the 2 minutes is 100; there will then be 100 facial emotions, and the probability of facial emotions belonging to the negative emotion category among these 100 facial emotions is counted, say 99%.
  • If the probability of facial emotions belonging to the negative emotion category exceeds the preset probability value of 80%, the user has been in a negative emotional state within these 2 minutes.
  • In this case, the preset prompt mode and preset prompt information will be obtained, and the preset prompt information will be presented to the user according to the preset prompt mode.
  • the preset prompt mode may be, for example, a voice prompt mode, a text display mode, a voice prompt and vibration combination mode, and the like.
  • The preset prompt information may be, for example, "You are currently in a low mood, please drive safely" and the like.
  • the facial emotion recognition method in this embodiment can accurately recognize facial emotions.
  • An embodiment of the present application further provides a facial emotion recognition device, which is configured to execute any one of the foregoing facial emotion recognition methods.
  • FIG. 10 is a schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition device 300 may be installed in a device such as a car or a mobile phone.
  • the facial emotion recognition device 300 includes an acquisition unit 301, a transformation unit 302, a distance calculation unit 303, a distance judgment unit 304, a key frame acquisition unit 305, an emotion recognition unit 306 and an emotion acquisition unit 307.
  • the obtaining unit 301 is configured to obtain a video image collected in real time.
  • FIG. 11 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition apparatus 300 further includes a storage unit 308.
  • the obtaining unit 301 is further configured to obtain a neutral expression image.
  • the transform unit 302 is further configured to perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
  • the storage unit 308 is configured to store the neutral expression image and the standard energy feature vector.
  • FIG. 12 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition apparatus 300 further includes an emotion model training unit 309.
  • the obtaining unit 301 is further configured to obtain an emotional training sample image set, where the emotional training sample image set includes a plurality of emotional training sample images and an emotional label of a human face in the emotional training sample image.
  • An emotional model training unit 309 is configured to input the emotional training sample image and a corresponding emotional label into a convolutional neural network model and perform machine learning to obtain an emotional recognition model, and store the emotional recognition model.
  • FIG. 13 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition device 300 further includes an extraction unit 310, a face determination unit 311, and a prompting unit 312.
  • the obtaining unit 301 is further configured to obtain a calibration video image collected in real time.
  • the extraction unit 310 is configured to extract an image of a preset number of frames from a plurality of frames of the calibration video image according to a preset rule as a calibration image.
  • the face judging unit 311 is configured to determine whether face information exists in the calibration image in each frame based on a face detection and recognition model stored in advance.
  • the obtaining unit 301 is further configured to obtain real-time collected video images if face information exists in the calibration image in each frame.
  • The prompting unit 312 is configured to send a prompt message if face information does not exist in at least one frame of the calibration image, so that the user can adjust the angle of the camera according to the prompt information; after the camera angle is adjusted, the obtaining unit 301 returns to performing the step of acquiring a calibration video image collected in real time, until face information exists in every frame of the calibration image.
  • FIG. 14 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition apparatus 300 further includes a vector acquisition unit 313 and a face model training unit 314.
  • the obtaining unit 301 is further configured to obtain a training sample image set, where the training sample image set includes a plurality of training sample images and a face label used to characterize whether there is face information in the training sample image.
  • The vector obtaining unit 313 is configured to obtain the face Haar feature vectors of the training sample images.
  • The face model training unit 314 is configured to input the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on decision tree models for training, to obtain a face detection and recognition model, and to store the face detection and recognition model.
  • a transformation unit 302 is configured to perform wavelet transformation on all frame images in the video image to obtain corresponding energy feature vectors.
  • the distance calculation unit 303 is configured to obtain a standard energy feature vector, and calculate a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method.
  • the distance judging unit 304 is configured to judge whether there is an Euclidean distance value among the plurality of Euclidean distance values that exceeds a preset threshold.
  • The key frame obtaining unit 305 is configured to use, as key frame images, the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold, if any of the plurality of Euclidean distance values exceeds the preset threshold.
  • the key frame obtaining unit 305 is further configured to, if there is no Euclidean distance value exceeding the preset threshold value among the plurality of Euclidean distance values, convert the neutral expression image corresponding to the standard energy feature vector As the key frame image.
  • the emotion recognition unit 306 is configured to obtain a pre-stored emotion recognition model, and recognize a facial emotion in each of the key frame images based on the emotion recognition model.
  • The emotion recognition unit 306 is specifically configured to: sequentially input each of the key frame images as an input value into the emotion recognition model; obtain the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions; and take the emotion corresponding to the largest probability value of each key frame image as the facial emotion in that key frame image.
  • the emotion acquiring unit 307 is configured to acquire the facial emotion corresponding to the video image according to the facial emotions in all the key frame images to complete the recognition of the facial emotion.
  • The emotion acquiring unit 307 is specifically configured to: perform probability statistics on the facial emotions in all the key frame images; and use the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, to complete the recognition of the facial emotion.
  • FIG. 15 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition device 300 further includes a recording unit 315, a statistics unit 316, a probability judgment unit 317, and an information prompting unit 318.
  • the recording unit 315 is configured to record a time period corresponding to the video image and a facial emotion corresponding to the video image into an emotion list.
  • the statistics unit 316 is configured to count, according to the emotion list, a probability of a facial emotion belonging to a preset emotional category among facial emotions corresponding to all the video images in a preset time period.
  • the probability judging unit 317 is configured to judge whether a probability of a facial emotion belonging to a preset emotion category exceeds a preset probability value.
  • The information prompting unit 318 is configured to obtain a preset prompt mode and preset prompt information if the probability of facial emotions belonging to the preset emotion category exceeds the preset probability value, and to present the preset prompt information to the user according to the preset prompt mode.
  • the facial emotion recognition device 300 in this embodiment can accurately recognize facial emotions.
  • the above-mentioned facial emotion recognition device can be implemented in the form of a computer program, which can be run on a computer device as shown in FIG. 16.
  • FIG. 16 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 500 may be a terminal such as a mobile phone, or may be a device used in a car.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501.
  • the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
  • the computer program 5032 includes program instructions. When the program instructions are executed, the processor 502 can execute a method for facial emotion recognition.
  • the processor 502 is used to provide computing and control capabilities to support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for running a computer program 5032 in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, the processor 502 can execute a method for facial emotion recognition.
  • the network interface 505 is used for network communication, such as sending assigned tasks.
  • FIG. 16 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied; the specific computer device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
  • The processor 502 is configured to run a computer program 5032 stored in the memory to implement the following functions: acquiring a video image collected in real time; performing wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if any of the plurality of Euclidean distance values exceeds the preset threshold, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; acquiring a pre-stored emotion recognition model, and recognizing the facial emotion in each key frame image based on the emotion recognition model; and acquiring the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
  • In an embodiment, before acquiring the video image collected in real time, the processor 502 also implements the following functions: acquiring a neutral expression image; performing wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
  • In an embodiment, before acquiring the video image collected in real time, the processor 502 also implements the following functions: acquiring an emotion training sample image set, where the emotion training sample image set includes multiple emotion training sample images and the emotion labels of the human faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
  • In an embodiment, before acquiring the video image collected in real time, the processor 502 also implements the following functions: acquiring a training sample image set, where the training sample image set includes multiple training sample images and face labels used to indicate whether face information exists in the training sample images; obtaining the face Haar feature vectors of the training sample images; inputting the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on decision tree models for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
  • In an embodiment, before acquiring the video image collected in real time, the processor 502 also implements the following functions: acquiring a calibration video image collected in real time; extracting a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images; determining, based on a pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images; if face information exists in every frame, performing the step of acquiring the video image collected in real time; and if face information does not exist in at least one frame, issuing prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the camera angle is adjusted, returning to the step of acquiring a calibration video image collected in real time, until face information exists in every frame of the calibration images.
  • In an embodiment, when recognizing the facial emotion in each key frame image based on the emotion recognition model, the processor 502 specifically implements the following functions: sequentially inputting each key frame image as an input value into the emotion recognition model; obtaining the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions; and taking the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
  • In an embodiment, when acquiring the facial emotion corresponding to the video image according to the facial emotions in all the key frame images to complete the recognition of the facial emotion, the processor 502 specifically implements the following functions: performing probability statistics on the facial emotions in all the key frame images; and using the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image to complete the recognition of the facial emotion.
  • The processor 502 may be a central processing unit, and the processor 502 may also be another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • a person of ordinary skill in the art can understand that all or part of the processes in the embodiment of the method for recognizing facial emotions described above can be performed by a computer program instructing related hardware.
  • the computer program may be stored in a computer-readable storage medium.
  • the computer program is executed by at least one processor in the computer system to implement the process steps of the embodiment including the facial emotion recognition method as described above.
  • The storage medium may be any of various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
  • each unit is only a logical function division, and there may be another division manner in actual implementation.
  • the steps in the method of the embodiment of the present application can be adjusted, combined, and deleted according to actual needs.
  • the units in the apparatus of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
  • Each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • the integrated unit When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium.
  • The technical solution of this application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A facial emotion recognition method, apparatus, computer device, and storage medium. The method includes: obtaining the energy feature vector of each frame image in a video image; calculating the Euclidean distance value between each energy feature vector and a standard energy feature vector; selecting key frame images according to the Euclidean distance values; recognizing the facial emotion in each key frame image; and obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.

Description

Facial emotion recognition method, apparatus, computer device and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on August 7, 2018, with application number 201810892915.6 and invention title "Facial emotion recognition method, apparatus, computer device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to a facial emotion recognition method, apparatus, computer device, and storage medium.
Background
In people's daily life, about 7% of information is conveyed through language, 38% through voice, and as much as 55% through facial expressions. Facial expressions are thus an important carrier of human communication and an important form of non-verbal communication, and they can express human emotional states well.
In general, human emotional states affect human behavior to a certain extent. For example, when a driver is in a negative emotional state such as anger, sadness, or anxiety, it is easy for the driver to ignore the surrounding road conditions and to react more slowly to emergencies, leading to a high incidence of traffic accidents. On this basis, the behavior of drivers and other personnel can be guided by recognizing facial emotions. For example, when a driver's facial emotion is recognized and the driver is found to be in a negative emotional state, the driver can be prompted to adjust his or her emotional state to avoid a traffic accident. Therefore, how to accurately recognize facial emotions has become an urgent technical problem.
Summary
The present application provides a facial emotion recognition method, apparatus, computer device, and storage medium, so as to accurately recognize facial emotions.
In a first aspect, the present application provides a facial emotion recognition method, which includes: obtaining a video image collected in real time; performing wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if any of the plurality of Euclidean distance values exceeds the preset threshold, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing the facial emotion in each key frame image based on the emotion recognition model; and obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
In a second aspect, the present application provides a facial emotion recognition apparatus, which includes: an obtaining unit for obtaining a video image collected in real time; a transform unit for performing wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; a distance calculation unit for obtaining a standard energy feature vector and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; a distance judgment unit for determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; a key frame obtaining unit for taking, if any of the plurality of Euclidean distance values exceeds the preset threshold, the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; an emotion recognition unit for obtaining a pre-stored emotion recognition model and recognizing the facial emotion in each key frame image based on the emotion recognition model; and an emotion obtaining unit for obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
In a third aspect, the present application further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the facial emotion recognition method provided in the first aspect.
In a fourth aspect, the present application further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the facial emotion recognition method provided in the first aspect.
The present application provides a facial emotion recognition method, apparatus, computer device, and storage medium. The method can accurately recognize facial emotions.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a facial emotion recognition method provided by an embodiment of the present application;
FIGS. 2 to 6 are further schematic flowcharts of the facial emotion recognition method provided by an embodiment of the present application;
FIGS. 7 to 8 are detailed schematic flowcharts of the facial emotion recognition method provided by an embodiment of the present application;
FIG. 9 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application;
FIG. 10 is a schematic block diagram of a facial emotion recognition apparatus provided by an embodiment of the present application;
FIGS. 11 to 15 are further schematic block diagrams of the facial emotion recognition apparatus provided by an embodiment of the present application;
FIG. 16 is a schematic block diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a facial emotion recognition method provided by an embodiment of the present application. The facial emotion recognition method can be applied to a facial emotion recognition system, which can be installed in a device with a camera function, such as a mobile phone or a car. The facial emotion recognition system can exist in the device as an independent system, or it can be embedded into other systems of the device. For example, the facial emotion recognition system can be embedded into a car driving system to recognize the driver's emotions. As another example, the facial emotion recognition system can be embedded into an application on a mobile phone to help that application implement a facial emotion recognition function. As shown in FIG. 1, the facial emotion recognition method includes steps S101 to S107.
S101: Obtain a video image collected in real time.
When the user starts the facial emotion recognition system to perform facial emotion recognition, the device where the facial emotion recognition system is located invokes a camera to collect images of the user in real time. The device obtains, through the camera, the video image collected in real time within a certain period of time, for example, the video image collected within 10 seconds. It can be understood that the video image will include multiple frames of images.
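A minimal Python sketch of step S101, assuming OpenCV and a default camera at device index 0 (both illustrative choices; the application itself does not fix the capture library or duration):

```python
import time
import cv2  # OpenCV; assumed here for camera access

def capture_video_frames(duration_s=10, device_index=0):
    """Collect frames from the camera in real time for `duration_s` seconds."""
    cap = cv2.VideoCapture(device_index)
    frames = []
    start = time.time()
    while time.time() - start < duration_s:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames  # the "video image" of step S101: multiple frame images
```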
Since the facial emotion recognition method needs information such as a neutral expression image, a standard energy feature vector, and an emotion recognition model when performing facial emotion recognition, the facial emotion recognition system also needs to perform the following operations before the user uses it for facial emotion recognition, that is, before step S101:
In an embodiment, as shown in FIG. 2, FIG. 2 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. Before step S101, steps S101a, S101b, and S101c are also included.
S101a: Obtain a neutral expression image.
S101b: Perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
S101c: Store the neutral expression image and the standard energy feature vector.
In the embodiment shown in FIG. 2, before facial emotion recognition is performed, a neutral expression image and a standard energy feature vector need to be prepared in advance. The neutral expression may be the user's facial expression in a relatively calm emotional state. For example, the expression users typically adopt when taking ID photos can be understood as a neutral expression.
When the user uses the facial emotion recognition system for the first time, the device may issue a voice prompt or a text prompt to ask the user to adopt a neutral expression. After the user adopts a neutral expression, the camera captures an image of the user's neutral expression to obtain the neutral expression image.
Of course, the neutral expression image can also be obtained in other ways. For example, when the user uses the facial emotion recognition system for the first time, a neutral expression image input by the user, such as an ID photo, is obtained; that is, the user transfers a previously taken neutral expression image into the device where the facial emotion recognition system is located as the neutral expression image. As another example, when the user uses the system for the first time, the identity information input by the user is obtained, and the ID photo corresponding to the identity information is then obtained from a background server as the neutral expression image. The background server may be the background server of the in-vehicle system, the background server of a mobile phone application, the background server of the facial emotion recognition system, and so on. The background server may store the ID photo corresponding to the user's identity information, or, after obtaining the identity information, it may call a third-party server or use techniques such as web crawling to obtain the ID photo corresponding to the user's identity information from network data. The way of obtaining the neutral expression image is not limited here.
After the neutral expression image is obtained, wavelet transform, such as Gabor wavelet transform, is performed on it to obtain the corresponding standard energy feature vector, and the neutral expression image and the corresponding standard energy feature vector are stored, so that they can be called for facial emotion recognition when the user uses the facial emotion recognition system.
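The application does not spell out how the Gabor responses are reduced to an energy feature vector; a common reduction, sketched below under stated assumptions (grayscale input, a small bank of scales and orientations, mean squared response as the per-filter energy), is:

```python
import cv2
import numpy as np

def gabor_energy_vector(image_bgr, ksize=31, scales=(4.0, 8.0), n_orient=4):
    """Gabor wavelet transform of one frame -> energy feature vector.

    Assumption: each filter's energy is its mean squared response; the
    application leaves this reduction unspecified.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    features = []
    for lambd in scales:                          # wavelengths (scales)
        for k in range(n_orient):                 # orientations
            theta = k * np.pi / n_orient
            kernel = cv2.getGaborKernel((ksize, ksize), sigma=lambd / 2,
                                        theta=theta, lambd=lambd,
                                        gamma=0.5, psi=0)
            response = cv2.filter2D(gray, cv2.CV_32F, kernel)
            features.append(np.mean(response ** 2))  # energy of this band
    return np.asarray(features)
```

The same helper can be applied both to the neutral expression image (yielding the standard energy feature vector) and, in step S102, to every frame of the video image.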
In an embodiment, as shown in FIG. 3, FIG. 3 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. Before step S101, steps S101d and S101e are also included.
S101d: Obtain an emotion training sample image set, where the emotion training sample image set includes multiple emotion training sample images and the emotion labels of the human faces in the emotion training sample images.
S101e: Input the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and store the emotion recognition model.
In the embodiment shown in FIG. 3, before facial emotion recognition is performed, an emotion recognition model needs to be prepared in advance. Specifically, the facial emotion recognition system needs to obtain an emotion training sample image set, which includes a large number of emotion training sample images and the emotion label of the face corresponding to each emotion training sample image. It should be noted that the emotion labels of the faces in the emotion training sample images can be annotated manually or by other methods, which is not specifically limited here.
After the emotion training sample image set is obtained, the emotion training sample images and the corresponding face emotion labels are input into a convolutional neural network (CNN) model for machine learning to obtain the emotion recognition model, which is then stored in the device where the facial emotion recognition system is located, so that the emotion recognition model can be called for emotion recognition when the system is subsequently used.
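The application names only "a convolutional neural network"; the following is a minimal PyTorch stand-in (the layer sizes, 48x48 grayscale inputs, and seven output classes are assumptions for illustration, not the application's architecture):

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    """Minimal CNN for seven-class facial emotion recognition (illustrative)."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 12 * 12, n_classes)  # for 48x48 input

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = EmotionCNN()
loss_fn = nn.CrossEntropyLoss()      # targets are the face emotion labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# The training loop over (emotion training sample image, emotion label)
# pairs is omitted; torch.save(model.state_dict(), ...) stores the model.
```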
In an embodiment, when the user starts the facial emotion recognition system to perform facial emotion recognition, the camera of the device where the system is located may fail to collect images of the user properly in real time; for example, if the camera angle is wrong, the user's face information may not be collected at all in the video images collected in real time, or only half of it may be collected. Such video images will inevitably reduce the accuracy of subsequent facial emotion recognition. Therefore, to ensure that good video images of the face can be captured during subsequent facial emotion recognition and to improve its accuracy, the camera needs to be calibrated before the video image collected in real time is obtained.
Specifically, as shown in FIG. 4, FIG. 4 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. Before step S101, steps S101f, S101g, S101h, and S101j are also included.
S101f: Obtain a calibration video image collected in real time.
S101g: Extract a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images.
S101h: Based on a pre-stored face detection and recognition model, determine whether face information exists in every frame of the calibration images.
S101j: If face information does not exist in at least one frame of the calibration images, issue prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the camera angle is adjusted, return to step S101f until face information exists in every frame of the calibration images.
In the embodiment shown in FIG. 4, the facial emotion recognition system needs to obtain a segment of calibration video collected in real time. It can be understood that the calibration video image includes multiple frames of images. Images of a preset number of frames are then extracted from the multiple frames of the calibration video image according to a preset extraction rule as calibration images.
In an embodiment, the preset extraction rule may be to extract one image every 1 second as a calibration image. The preset number of frames may be set to 100, for example, and can be set according to actual requirements; the preset extraction rule is also not limited to the above and can be set according to actual requirements, which is not limited here. After the multiple calibration images are obtained, the pre-stored face detection and recognition model, which is used to identify whether face information exists in a calibration image, is obtained.
If the face detection and recognition model determines that face information exists in every calibration image, the current camera angle is good and good video images of the face can be captured; at this point, step S101 can be performed, that is, the step of obtaining the video image collected in real time.
If face information does not exist in at least one frame of the calibration images, the current camera angle is poor and needs adjustment; in this case, prompt information can be issued by voice, on a display, or the like, so that the user readjusts the camera angle according to the prompt, and after the camera angle is adjusted, the method returns to step S101f, that is, the step of obtaining a calibration video image collected in real time, until face information exists in every frame of the calibration images, thereby completing the calibration of the camera angle.
Since the face detection and recognition model is needed when calibrating the camera angle, it needs to be generated in advance and stored in the device where the facial emotion recognition system is located. In an embodiment, as shown in FIG. 5, FIG. 5 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. Before step S101, steps S101k, S101m, S101n, and S101p are also included.
S101k: Obtain a training sample image set, where the training sample image set includes multiple training sample images and face labels used to indicate whether face information exists in the training sample images.
S101m: Obtain the face Haar feature vectors of the training sample images.
S101n: Input the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on decision tree models for training, to obtain a face detection and recognition model.
S101p: Store the face detection and recognition model.
In the embodiment shown in FIG. 5, before facial emotion recognition is performed, a face detection and recognition model needs to be prepared in advance for use during camera angle calibration. Specifically, a training sample image set is first obtained, which includes multiple training sample images and the face label corresponding to each training sample image; the face label indicates whether face information exists in the corresponding face sample image. Then, Haar feature extraction is performed on each training sample image to obtain the face Haar feature vector corresponding to each training sample image. The face Haar feature vector and the corresponding face label of each training sample image are then input into an AdaBoost boosting model based on decision tree models for training, which yields the face detection and recognition model. Finally, the face detection and recognition model is stored in the device where the facial emotion recognition system is located.
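Haar features combined with an AdaBoost ensemble of weak tree-based classifiers is essentially the classical Viola-Jones detector; as a sketch, OpenCV's pre-trained frontal-face Haar cascade can stand in for the model trained in steps S101k to S101p when implementing the per-frame check of step S101h:

```python
import cv2

# Pre-trained Haar cascade shipped with OpenCV, used as a stand-in for the
# AdaBoost-based face detection and recognition model trained above.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def has_face(frame_bgr):
    """Step S101h: does this calibration image contain face information?"""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                          minNeighbors=5)
    return len(faces) > 0
```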
S102: Perform wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors.
After the video image is obtained in step S101, wavelet transform needs to be performed on all frame images in the video image to obtain the energy feature vector corresponding to each frame. In an embodiment, the wavelet transform may be, for example, a Gabor wavelet transform; of course, other methods may also be used, which is not limited here.
S103: Obtain a standard energy feature vector, and calculate the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method.
The standard energy feature vector is the energy feature vector obtained by performing wavelet transform on the user's neutral expression image collected in advance. In this embodiment, the standard energy feature vector is stored in advance in the device where the facial emotion recognition system is located. Since the device stores the standard energy feature vector in advance, obtaining the standard energy feature vector specifically means obtaining the pre-stored standard energy feature vector.
After the standard energy feature vector is obtained, the Euclidean distance value between each energy feature vector from step S102 and the standard energy feature vector is calculated according to the image difference operation method.
It should be noted that, in this embodiment, because the standard energy feature vector is stored in advance in the device where the facial emotion recognition system is located, it can be called directly in step S103, which reduces the occupation of the CPU resources of the device and the computation time. Of course, in other embodiments, the device where the facial emotion recognition system is located may pre-store only the neutral expression image; in that case, when the standard energy feature vector is obtained in step S103, the pre-stored neutral expression image is obtained first, and wavelet transform is then performed on it to obtain the standard energy feature vector. The time at which the standard energy feature vector is calculated is not limited here.
S104: Determine whether any of the plurality of Euclidean distance values exceeds a preset threshold.
After the Euclidean distance value between each energy feature vector and the standard energy feature vector is calculated in step S103, a plurality of Euclidean distance values are obtained, and it is then determined whether any of them exceeds the preset threshold. If a Euclidean distance value exceeds the preset threshold, the facial expression in the video image differs significantly from the neutral expression, and step S105 is executed.
In an embodiment, as shown in FIG. 6, FIG. 6 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. When step S104 determines that no Euclidean distance value exceeds the preset threshold, the facial expression in the current video image differs little from the neutral expression; in this case, step S108 is executed, that is, the neutral expression image corresponding to the standard energy feature vector is used as the key frame image, and the subsequent steps S106 and S107 are then performed. Of course, in other embodiments, if step S104 determines that no Euclidean distance value exceeds the preset threshold, the facial emotion corresponding to the video image may also be directly set to a neutral emotion, thereby completing the recognition of the facial emotion.
S105: If any of the plurality of Euclidean distance values exceeds the preset threshold, take the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one.
In this embodiment, the number of Euclidean distance values exceeding the preset threshold may be one, two, or more, so the number of key frame images is at least one.
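Putting steps S102 to S105 together, a minimal sketch (the threshold value is an assumption, and `gabor_energy_vector` is the illustrative helper sketched earlier):

```python
import numpy as np

def select_key_frames(frames, standard_vector, threshold=1.0):
    """Steps S102-S105: energy feature vectors, Euclidean distances to the
    standard (neutral) vector, and thresholding into key frame images."""
    vectors = [gabor_energy_vector(f) for f in frames]            # step S102
    distances = [np.linalg.norm(v - standard_vector) for v in vectors]
    key_frames = [f for f, d in zip(frames, distances) if d > threshold]
    return key_frames  # empty -> fall back to the neutral image (step S108)
```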
S106: Obtain a pre-stored emotion recognition model, and recognize the facial emotion in each key frame image based on the emotion recognition model.
In this embodiment, the emotion recognition model is a model for recognizing facial emotions obtained in advance through machine learning training; it may be, for example, a convolutional neural network model. The device where the facial emotion recognition system is located first obtains the emotion recognition model and then inputs the key frame images into it as input values; the emotion recognition model performs emotion recognition on the key frame images and outputs the facial emotion in each key frame image.
Specifically, in an embodiment, as shown in FIG. 7, FIG. 7 is a detailed schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. Step S106 includes steps S1061 to S1063.
S1061: Input each key frame image in turn into the emotion recognition model as an input value.
S1062: Obtain the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions.
S1063: Take the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
In the embodiment shown in FIG. 7, each key frame image is input in turn into the emotion recognition model as an input value, and the emotion recognition model then outputs the probability values of each key frame image over multiple preset emotions. For example, the multiple preset emotions include seven preset emotions: fear, anger, sadness, disgust, happiness, surprise, and neutral. The emotion recognition model recognizes the probabilities of the facial emotion in each key frame image over these seven preset emotions; for instance, the model may recognize the facial emotion in a key frame image over the above seven preset emotions with probabilities of 10%, 70%, 15%, 5%, 0%, 0%, and 0%, respectively.
Then, the emotion corresponding to the largest of the multiple probability values of each key frame image is taken as the facial emotion in that key frame image. For example, the anger emotion, with the largest probability value of 70%, is taken as the facial emotion corresponding to that key frame image.
S107: Obtain the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
Specifically, in an embodiment, as shown in FIG. 8, FIG. 8 is a detailed schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. Step S107 includes steps S1071 to S1072.
S1071: Perform probability statistics on the facial emotions in all the key frame images.
S1072: Take the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, so as to complete the recognition of the facial emotion.
For example, suppose there are 10 key frame images, and the emotion recognition model identifies the facial emotion of 8 key frame images as anger, of 1 key frame image as disgust, and of 1 key frame image as fear. By performing probability statistics on the facial emotions of the 10 key frame images, it can be concluded that the probability of occurrence of the anger emotion is 80%, that of the disgust emotion is 10%, and that of the fear emotion is 10%. The anger emotion, with the highest probability of occurrence, can thus be taken as the facial emotion corresponding to the whole video image, thereby completing the recognition of the facial emotion within the time period corresponding to the video image.
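Steps S1071 and S1072 amount to a majority vote over the per-key-frame predictions, for instance:

```python
from collections import Counter

def video_emotion(key_frame_emotions):
    """Step S107: the most frequent facial emotion across all key frame
    images is taken as the emotion of the whole video image."""
    counts = Counter(key_frame_emotions)  # e.g. 8 x anger, 1 x disgust, 1 x fear
    emotion, n = counts.most_common(1)[0]
    return emotion, n / len(key_frame_emotions)   # ('anger', 0.8)
```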
In an embodiment, as shown in FIG. 9, FIG. 9 is a further schematic flowchart of the facial emotion recognition method provided by an embodiment of the present application. After step S107, steps S109 to S112 are also included.
S109: Record the time period corresponding to the video image and the facial emotion corresponding to the video image into an emotion list.
S110: According to the emotion list, count the probability of facial emotions belonging to a preset emotion category among the facial emotions corresponding to all the video images within a preset time period.
S111: Determine whether the probability of facial emotions belonging to the preset emotion category exceeds a preset probability value.
S112: If the probability of facial emotions belonging to the preset emotion category exceeds the preset probability value, obtain a preset prompt mode and preset prompt information, and present the preset prompt information to the user according to the preset prompt mode.
For example, suppose the preset time period is 2 minutes, the preset emotion category is the negative emotion category, and the facial emotions included in the negative emotion category are the four types of fear, anger, sadness, and disgust. Suppose also that the number of video images within the 2 minutes in the emotion list is 100; there will then be 100 facial emotions, and the probability of facial emotions belonging to the negative emotion category among these 100 facial emotions is counted, say 99%. When the probability of facial emotions belonging to the negative emotion category exceeds the preset probability value of 80%, the user has been in a negative emotional state during these 2 minutes; at this point, the preset prompt mode and preset prompt information are obtained, and the preset prompt information is presented to the user according to the preset prompt mode. The preset prompt mode may be, for example, a voice prompt, a text display, a combination of voice prompt and vibration, and so on. The preset prompt information may be, for example, "You are currently in a low mood, please drive safely".
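The monitoring of steps S109 to S112 can be sketched as a sliding-window ratio test; the 2-minute window, the negative emotion set, and the 80% threshold follow the example above, while the data layout is an assumption of this sketch:

```python
import time

NEGATIVE_EMOTIONS = {"fear", "anger", "sadness", "disgust"}

def check_and_prompt(emotion_list, window_s=120, prob_threshold=0.8):
    """Steps S110-S112: emotion_list holds (timestamp, facial_emotion)
    records appended by step S109 after each video image is recognized."""
    now = time.time()
    recent = [e for t, e in emotion_list if now - t <= window_s]
    if not recent:
        return
    negative_ratio = sum(e in NEGATIVE_EMOTIONS for e in recent) / len(recent)
    if negative_ratio > prob_threshold:
        # Preset prompt mode (voice, text, vibration, ...) and preset prompt
        # information; printing stands in for the actual prompt here.
        print("You are currently in a low mood, please drive safely.")
```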
The facial emotion recognition method in this embodiment can accurately recognize facial emotions.
An embodiment of the present application further provides a facial emotion recognition apparatus, which is configured to execute any of the foregoing facial emotion recognition methods. Specifically, referring to FIG. 10, FIG. 10 is a schematic block diagram of a facial emotion recognition apparatus provided by an embodiment of the present application. The facial emotion recognition apparatus 300 can be installed in a device such as a car or a mobile phone.
As shown in FIG. 10, the facial emotion recognition apparatus 300 includes an obtaining unit 301, a transform unit 302, a distance calculation unit 303, a distance judgment unit 304, a key frame obtaining unit 305, an emotion recognition unit 306, and an emotion obtaining unit 307.
The obtaining unit 301 is configured to obtain a video image collected in real time.
In an embodiment, as shown in FIG. 11, FIG. 11 is a further schematic block diagram of the facial emotion recognition apparatus provided by an embodiment of the present application. The facial emotion recognition apparatus 300 further includes a storage unit 308.
The obtaining unit 301 is further configured to obtain a neutral expression image.
The transform unit 302 is further configured to perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
The storage unit 308 is configured to store the neutral expression image and the standard energy feature vector.
In an embodiment, as shown in FIG. 12, FIG. 12 is a further schematic block diagram of the facial emotion recognition apparatus provided by an embodiment of the present application. The facial emotion recognition apparatus 300 further includes an emotion model training unit 309.
The obtaining unit 301 is further configured to obtain an emotion training sample image set, where the emotion training sample image set includes multiple emotion training sample images and the emotion labels of the human faces in the emotion training sample images.
The emotion model training unit 309 is configured to input the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and to store the emotion recognition model.
In an embodiment, as shown in FIG. 13, FIG. 13 is a further schematic block diagram of the facial emotion recognition apparatus provided by an embodiment of the present application. The facial emotion recognition apparatus 300 further includes an extraction unit 310, a face judgment unit 311, and a prompt unit 312.
The obtaining unit 301 is further configured to obtain a calibration video image collected in real time.
The extraction unit 310 is configured to extract a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images.
The face judgment unit 311 is configured to determine, based on a pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images.
The obtaining unit 301 is further configured to obtain the video image collected in real time if face information exists in every frame of the calibration images.
The prompt unit 312 is configured to issue prompt information if face information does not exist in at least one frame of the calibration images, so that the user adjusts the angle of the camera according to the prompt information; after the camera angle is adjusted, the obtaining unit 301 returns to performing the step of obtaining a calibration video image collected in real time, until face information exists in every frame of the calibration images.
Correspondingly, in an embodiment, as shown in FIG. 14, FIG. 14 is a further schematic block diagram of the facial emotion recognition apparatus provided by an embodiment of the present application. The facial emotion recognition apparatus 300 further includes a vector obtaining unit 313 and a face model training unit 314.
The obtaining unit 301 is further configured to obtain a training sample image set, where the training sample image set includes multiple training sample images and face labels used to indicate whether face information exists in the training sample images.
The vector obtaining unit 313 is configured to obtain the face Haar feature vectors of the training sample images.
The face model training unit 314 is configured to input the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on decision tree models for training, to obtain a face detection and recognition model, and to store the face detection and recognition model.
The transform unit 302 is configured to perform wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors.
The distance calculation unit 303 is configured to obtain a standard energy feature vector, and to calculate the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method.
The distance judgment unit 304 is configured to determine whether any of the plurality of Euclidean distance values exceeds a preset threshold.
The key frame obtaining unit 305 is configured to take, if any of the plurality of Euclidean distance values exceeds the preset threshold, the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one.
In an embodiment, the key frame obtaining unit 305 is further configured to take the neutral expression image corresponding to the standard energy feature vector as the key frame image if none of the plurality of Euclidean distance values exceeds the preset threshold.
The emotion recognition unit 306 is configured to obtain a pre-stored emotion recognition model, and to recognize the facial emotion in each key frame image based on the emotion recognition model.
Specifically, in an embodiment, the emotion recognition unit 306 is specifically configured to: input each key frame image in turn into the emotion recognition model as an input value; obtain the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions; and take the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
The emotion obtaining unit 307 is configured to obtain the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
Specifically, in an embodiment, the emotion obtaining unit 307 is specifically configured to: perform probability statistics on the facial emotions in all the key frame images; and take the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, so as to complete the recognition of the facial emotion.
In an embodiment, as shown in FIG. 15, FIG. 15 is a further schematic block diagram of the facial emotion recognition apparatus provided by an embodiment of the present application. The facial emotion recognition apparatus 300 further includes a recording unit 315, a statistics unit 316, a probability judgment unit 317, and an information prompt unit 318.
The recording unit 315 is configured to record the time period corresponding to the video image and the facial emotion corresponding to the video image into an emotion list.
The statistics unit 316 is configured to count, according to the emotion list, the probability of facial emotions belonging to a preset emotion category among the facial emotions corresponding to all the video images within a preset time period.
The probability judgment unit 317 is configured to determine whether the probability of facial emotions belonging to the preset emotion category exceeds a preset probability value.
The information prompt unit 318 is configured to obtain a preset prompt mode and preset prompt information if the probability of facial emotions belonging to the preset emotion category exceeds the preset probability value, and to present the preset prompt information to the user according to the preset prompt mode.
It should be noted that a person skilled in the art can clearly understand that, for the specific implementation processes of the above facial emotion recognition apparatus 300 and its units, reference may be made to the corresponding descriptions in the foregoing embodiments of the facial emotion recognition method; for convenience and brevity of description, details are not repeated here.
The facial emotion recognition apparatus 300 in this embodiment can accurately recognize facial emotions.
The above facial emotion recognition apparatus can be implemented in the form of a computer program, which can run on a computer device as shown in FIG. 16.
Referring to FIG. 16, FIG. 16 is a schematic block diagram of a computer device provided by an embodiment of the present application. The computer device 500 may be a terminal such as a mobile phone, or a device used in a car.
Referring to FIG. 16, the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions which, when executed, cause the processor 502 to execute a facial emotion recognition method. The processor 502 is configured to provide computing and control capabilities and supports the operation of the whole computer device 500. The internal memory 504 provides an environment for running the computer program 5032 in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 can execute a facial emotion recognition method. The network interface 505 is used for network communication, such as sending assigned tasks. A person skilled in the art can understand that the structure shown in FIG. 16 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied; the specific computer device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the following functions: obtaining a video image collected in real time; performing wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if any of the plurality of Euclidean distance values exceeds the preset threshold, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing the facial emotion in each key frame image based on the emotion recognition model; and obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
In an embodiment, before obtaining the video image collected in real time, the processor 502 further implements the following functions: obtaining a neutral expression image; performing wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
In an embodiment, before obtaining the video image collected in real time, the processor 502 further implements the following functions: obtaining an emotion training sample image set, where the emotion training sample image set includes multiple emotion training sample images and the emotion labels of the human faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
In an embodiment, before obtaining the video image collected in real time, the processor 502 further implements the following functions: obtaining a training sample image set, where the training sample image set includes multiple training sample images and face labels used to indicate whether face information exists in the training sample images; obtaining the face Haar feature vectors of the training sample images; inputting the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on decision tree models for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
In an embodiment, before obtaining the video image collected in real time, the processor 502 further implements the following functions: obtaining a calibration video image collected in real time; extracting a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images; determining, based on a pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images; if face information exists in every frame of the calibration images, performing the step of obtaining the video image collected in real time; and if face information does not exist in at least one frame of the calibration images, issuing prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the camera angle is adjusted, returning to the step of obtaining a calibration video image collected in real time, until face information exists in every frame of the calibration images.
In an embodiment, when recognizing the facial emotion in each key frame image based on the emotion recognition model, the processor 502 specifically implements the following functions: inputting each key frame image in turn into the emotion recognition model as an input value; obtaining the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions; and taking the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
In an embodiment, when obtaining the facial emotion corresponding to the video image according to the facial emotions in all the key frame images so as to complete the recognition of the facial emotion, the processor 502 specifically implements the following functions: performing probability statistics on the facial emotions in all the key frame images; and taking the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, so as to complete the recognition of the facial emotion.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit, and may also be another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
A person of ordinary skill in the art can understand that all or part of the processes in the above embodiments of the facial emotion recognition method can be completed by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium. The computer program is executed by at least one processor in the computer system to implement the process steps of the embodiments of the facial emotion recognition methods described above.
The storage medium may be any of various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the division of the units is only a division by logical function, and there may be other division manners in actual implementation. The steps in the methods of the embodiments of the present application can be reordered, combined, and deleted according to actual needs. The units in the apparatuses of the embodiments of the present application can be combined, divided, and deleted according to actual needs. The functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. A facial emotion recognition method, comprising:
    acquiring video images collected in real time;
    performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors;
    acquiring a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method;
    judging whether there is a Euclidean distance value exceeding a preset threshold among the plurality of Euclidean distance values;
    if there is a Euclidean distance value exceeding the preset threshold among the plurality of Euclidean distance values, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of the key frame images is at least one;
    acquiring a pre-stored emotion recognition model, and recognizing the facial emotion in each of the key frame images based on the emotion recognition model; and
    obtaining the facial emotion corresponding to the video images according to the facial emotions in all of the key frame images, so as to complete the recognition of the facial emotion.
  2. The facial emotion recognition method according to claim 1, wherein before the acquiring of the video images collected in real time, the method further comprises: acquiring a neutral expression image; performing wavelet transformation on the neutral expression image to obtain the corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
  3. The facial emotion recognition method according to claim 1, wherein before the acquiring of the video images collected in real time, the method further comprises: acquiring an emotion training sample image set, wherein the emotion training sample image set comprises a plurality of emotion training sample images and emotion labels of the faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain the emotion recognition model, and storing the emotion recognition model.
  4. The facial emotion recognition method according to claim 1, wherein before the acquiring of the video images collected in real time, the method further comprises: acquiring a training sample image set, wherein the training sample image set comprises a plurality of training sample images and face labels indicating whether face information exists in the training sample images; acquiring the facial Haar feature vectors of the training sample images; inputting the facial Haar feature vectors and the face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
  5. The facial emotion recognition method according to claim 4, wherein before the acquiring of the video images collected in real time, the method further comprises: acquiring calibration video images collected in real time; extracting, according to a preset rule, a preset number of frames from the plurality of frame images in the calibration video images as calibration images; judging, based on the pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images; if face information exists in every frame of the calibration images, performing the step of acquiring the video images collected in real time; and if face information does not exist in at least one frame of the calibration images, issuing prompting information so that the user adjusts the angle of the camera according to the prompting information, and after the angle of the camera has been adjusted, returning to the step of acquiring the calibration video images collected in real time, until face information exists in every frame of the calibration images.
  6. The facial emotion recognition method according to claim 1, wherein the recognizing of the facial emotion in each of the key frame images based on the emotion recognition model comprises: inputting each of the key frame images into the emotion recognition model in turn as an input value; acquiring, for each of the key frame images, the probability values output by the emotion recognition model over a plurality of preset emotions; and taking the emotion corresponding to the largest of the probability values for each key frame image as the facial emotion in that key frame image.
  7. The facial emotion recognition method according to claim 1, wherein the obtaining of the facial emotion corresponding to the video images according to the facial emotions in all of the key frame images to complete the recognition of the facial emotion comprises: performing probability statistics on the facial emotions in all of the key frame images; and taking the facial emotion with the highest frequency of occurrence as the facial emotion corresponding to the video images, thereby completing the recognition of the facial emotion.
  8. The facial emotion recognition method according to claim 2, wherein after the judging of whether there is a Euclidean distance value exceeding the preset threshold among the plurality of Euclidean distance values, the method further comprises: if none of the plurality of Euclidean distance values exceeds the preset threshold, taking the neutral expression image corresponding to the standard energy feature vector as the key frame image, and returning to the step of acquiring the pre-stored emotion recognition model and recognizing the facial emotion in each of the key frame images based on the emotion recognition model.
  9. The facial emotion recognition method according to claim 2, wherein after the judging of whether there is a Euclidean distance value exceeding the preset threshold among the plurality of Euclidean distance values, the method further comprises: if none of the plurality of Euclidean distance values exceeds the preset threshold, setting the facial emotion corresponding to the video images as a neutral emotion, so as to complete the recognition of the facial emotion.
  10. The facial emotion recognition method according to claim 1, wherein after the obtaining of the facial emotion corresponding to the video images according to the facial emotions in all of the key frame images, the method further comprises: recording the time period corresponding to the video images and the facial emotion corresponding to the video images into an emotion list; calculating, according to the emotion list, the probability that the facial emotions corresponding to all of the video images within a preset time period belong to a preset emotion class; judging whether the probability of facial emotions belonging to the preset emotion class exceeds a preset probability value; and if the probability of facial emotions belonging to the preset emotion class exceeds the preset probability value, acquiring a preset prompting mode and preset prompting information, and prompting the user with the preset prompting information according to the preset prompting mode.
  11. A facial emotion recognition apparatus, comprising:
    an acquiring unit, configured to acquire video images collected in real time;
    a transformation unit, configured to perform wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors;
    a distance calculation unit, configured to acquire a standard energy feature vector and calculate a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method;
    a distance judging unit, configured to judge whether there is a Euclidean distance value exceeding a preset threshold among the plurality of Euclidean distance values;
    a key frame acquiring unit, configured to, if there is a Euclidean distance value exceeding the preset threshold among the plurality of Euclidean distance values, take the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of the key frame images is at least one;
    an emotion recognition unit, configured to acquire a pre-stored emotion recognition model and recognize the facial emotion in each of the key frame images based on the emotion recognition model; and
    an emotion acquiring unit, configured to obtain the facial emotion corresponding to the video images according to the facial emotions in all of the key frame images, so as to complete the recognition of the facial emotion.
  12. A computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when the processor executes the computer program, the following steps are implemented: acquiring video images collected in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; acquiring a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method; judging whether there is a Euclidean distance value exceeding a preset threshold among the plurality of Euclidean distance values; if there is a Euclidean distance value exceeding the preset threshold among the plurality of Euclidean distance values, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of the key frame images is at least one; acquiring a pre-stored emotion recognition model, and recognizing the facial emotion in each of the key frame images based on the emotion recognition model; and obtaining the facial emotion corresponding to the video images according to the facial emotions in all of the key frame images, so as to complete the recognition of the facial emotion.
  13. The computer device according to claim 12, wherein before executing the acquiring of the video images collected in real time, the processor further implements the following steps: acquiring a neutral expression image; performing wavelet transformation on the neutral expression image to obtain the corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
  14. The computer device according to claim 12, wherein before executing the acquiring of the video images collected in real time, the processor further implements the following steps: acquiring an emotion training sample image set, wherein the emotion training sample image set comprises a plurality of emotion training sample images and emotion labels of the faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain the emotion recognition model, and storing the emotion recognition model.
  15. The computer device according to claim 12, wherein before executing the acquiring of the video images collected in real time, the processor further implements the following steps: acquiring a training sample image set, wherein the training sample image set comprises a plurality of training sample images and face labels indicating whether face information exists in the training sample images; acquiring the facial Haar feature vectors of the training sample images; inputting the facial Haar feature vectors and the face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
  16. The computer device according to claim 15, wherein before executing the acquiring of the video images collected in real time, the processor further implements the following steps: acquiring calibration video images collected in real time; extracting, according to a preset rule, a preset number of frames from the plurality of frame images in the calibration video images as calibration images; judging, based on the pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images; if face information exists in every frame of the calibration images, performing the step of acquiring the video images collected in real time; and if face information does not exist in at least one frame of the calibration images, issuing prompting information so that the user adjusts the angle of the camera according to the prompting information, and after the angle of the camera has been adjusted, returning to the step of acquiring the calibration video images collected in real time, until face information exists in every frame of the calibration images.
  17. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps: acquiring video images collected in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; acquiring a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method; judging whether there is a Euclidean distance value exceeding a preset threshold among the plurality of Euclidean distance values; if there is a Euclidean distance value exceeding the preset threshold among the plurality of Euclidean distance values, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of the key frame images is at least one; acquiring a pre-stored emotion recognition model, and recognizing the facial emotion in each of the key frame images based on the emotion recognition model; and obtaining the facial emotion corresponding to the video images according to the facial emotions in all of the key frame images, so as to complete the recognition of the facial emotion.
  18. The computer-readable storage medium according to claim 17, wherein before the acquiring of the video images collected in real time, the computer program further causes the processor to perform the following steps: acquiring a neutral expression image; performing wavelet transformation on the neutral expression image to obtain the corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
  19. The computer-readable storage medium according to claim 17, wherein before the acquiring of the video images collected in real time, the computer program further causes the processor to perform the following steps: acquiring an emotion training sample image set, wherein the emotion training sample image set comprises a plurality of emotion training sample images and emotion labels of the faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain the emotion recognition model, and storing the emotion recognition model.
  20. The computer-readable storage medium according to claim 17, wherein before the acquiring of the video images collected in real time, the computer program further causes the processor to perform the following steps: acquiring a training sample image set, wherein the training sample image set comprises a plurality of training sample images and face labels indicating whether face information exists in the training sample images; acquiring the facial Haar feature vectors of the training sample images; inputting the facial Haar feature vectors and the face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
PCT/CN2018/108251 2018-08-07 2018-09-28 Facial emotion recognition method and apparatus, computer device, and storage medium WO2020029406A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810892915.6A CN109190487A (zh) 2018-08-07 2018-08-07 Facial emotion recognition method and apparatus, computer device, and storage medium
CN201810892915.6 2018-08-07

Publications (1)

Publication Number Publication Date
WO2020029406A1 (zh)

Family

ID=64921037

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/108251 WO2020029406A1 (zh) 2018-08-07 2018-09-28 Facial emotion recognition method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN109190487A (zh)
WO (1) WO2020029406A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104008391A * 2014-04-30 2014-08-27 首都医科大学 Facial micro-expression capture and recognition method based on nonlinear dimensionality reduction
CN105354527A * 2014-08-20 2016-02-24 南京普爱射线影像设备有限公司 Negative expression recognition and encouragement system
CN107403142A * 2017-07-05 2017-11-28 山东中磁视讯股份有限公司 Micro-expression detection method
CN107665074A * 2017-10-18 2018-02-06 维沃移动通信有限公司 Color temperature adjustment method and mobile terminal

Also Published As

Publication number Publication date
CN109190487A (zh) 2019-01-11

Legal Events

121: Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18929826; Country of ref document: EP; Kind code of ref document: A1)
NENP: Non-entry into the national phase (Ref country code: DE)
122: Ep: pct application non-entry in european phase (Ref document number: 18929826; Country of ref document: EP; Kind code of ref document: A1)