CN108921037B - Emotion recognition method based on BN-Inception two-stream network - Google Patents

Emotion recognition method based on BN-Inception two-stream network

Info

Publication number
CN108921037B
CN108921037B (application CN201810579049.5A)
Authority
CN
China
Prior art keywords
network
SPP
two-stream
individual
Inception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810579049.5A
Other languages
Chinese (zh)
Other versions
CN108921037A (en)
Inventor
卿粼波 (Qing Linbo)
王露 (Wang Lu)
滕奇志 (Teng Qizhi)
何小海 (He Xiaohai)
熊文诗 (Xiong Wenshi)
吴晓红 (Wu Xiaohong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201810579049.5A
Publication of CN108921037A
Application granted
Publication of CN108921037B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an individual emotion recognition method based on posture information, in which individual posture is studied with deep learning methods to judge the emotion of an individual. The method comprises the following steps: first, a two-stream network model based on BN-Inception is introduced, and the static and dynamic features of the input sequence are extracted by learning from the original images and the optical flow images; then spatial pyramid pooling (SPP) is added on top of the two-stream network so that images can be fed to the network at their original size, reducing the impact of deformation on model performance. The invention first uses the two-stream network to learn the spatio-temporal features of the input sequence and introduces pyramid pooling to retain the original information of the video frames, so that the network can effectively learn the features of individual posture and emotion and achieve a higher recognition rate.

Description

Emotion recognition method based on BN-Inception two-stream network
Technical Field
The invention relates to the problem of emotion recognition in the field of deep learning, and in particular to an emotion analysis method based on a BN-Inception + SPP two-stream network.
Background
Emotion is a state that integrates human feelings, thoughts and behaviors, and it plays an important role in interpersonal communication. A person's emotional state can usually be judged from facial expression, but in certain settings, such as surveillance viewpoints or scenes where the face is occluded, a clear view of the face is not always available. In fact, real emotion is expressed not only through facial expressions; a person's body movements can also convey emotional information. The research of the present invention therefore focuses mainly on video-based emotion recognition from individual posture.
Emotion recognition is an important research topic and direction in computer vision; many authoritative international journals and top conferences carry related topics and content, and many well-known universities abroad offer related courses. Traditional video-based emotion recognition methods rely mainly on manually selected features, which is time-consuming and labor-intensive, yields model parameters with poor generalization, and limits the attainable recognition quality. Deep learning is an important component of the development of artificial intelligence and has become a very active research direction in the field in recent years. It has produced major breakthroughs in many areas (such as image recognition and speech recognition), and in video analysis in particular it offers high recognition rates and strong generalization. The present method therefore exploits the advantages of deep learning in video analysis to study individual emotion recognition in video.
Emotion recognition based on posture information has developed only in recent years, and related research is relatively scarce, concentrating mainly on traditional algorithms. Li et al. [1] perform behavior recognition and classification using raw skeleton coordinates and skeleton motion; Piana et al. [2] propose an automatic emotion recognition model and system based on whole-body motion, used to help autistic children learn to recognize and express emotion through whole-body movement. Others combine the motion features of human posture with higher-level kinematic geometric features for clustering and classification. Crenn et al. [3] obtain low-level features such as motion data from 3D human skeleton sequences, decompose the features into three types (geometric, motion and Fourier features), compute meta-features of the low-level features (such as means and standard deviations), and finally classify the meta-features with a classifier. Deep learning greatly improves on traditional methods in both recognition time and accuracy, but because emotion data sets related to posture are scarce, research on individual emotion recognition from posture information using deep learning remains rare.
Disclosure of Invention
The invention aims to provide an individual emotion recognition method based on posture, which combines deep learning with human posture in video, makes full use of the strengths of the BN-Inception + SPP network structure, and introduces a two-stream network to perform video-based individual emotion recognition, effectively learning the emotional characteristics of individual posture and achieving a higher recognition rate.
For convenience of explanation, the following concepts are first introduced:
An optical flow method: a simple and practical way of expressing image motion. Optical flow is generally defined as the apparent motion of brightness patterns in an image sequence, i.e. the projection of the motion velocity of points on the surface of objects in space onto the imaging plane of the visual sensor.
A convolutional neural network: a multilayer feed-forward neural network in which each layer consists of several two-dimensional planes whose neurons operate independently; a convolutional neural network contains convolutional layers and pooling layers.
Two-stream convolutional neural network: a network designed for extracting video behavior features. It takes single-frame RGB images and optical flow images derived from the video data as its two inputs, so as to represent the spatial appearance of the behaving subject and extract the temporal dynamics of the behavior.
Spatial pyramid pooling (SPP): an SPP layer combines several down-sampling layers that partition the input feature map from coarse to fine and convert it into a fixed-length feature vector, allowing the layer to extract local information at multiple scales (a minimal sketch follows these definitions).
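As an illustration of the SPP idea only (the patent's experiments were run in Caffe, and the pyramid configuration is not stated here), the following minimal PyTorch sketch pools at assumed levels {1, 2, 4}; the class name and level choice are hypothetical:

```python
import torch
import torch.nn as nn

class SpatialPyramidPooling(nn.Module):
    """Pool an input feature map at several grid sizes and
    concatenate the results into one fixed-length vector."""
    def __init__(self, levels=(1, 2, 4)):   # assumed pyramid levels
        super().__init__()
        self.pools = nn.ModuleList(
            nn.AdaptiveMaxPool2d(n) for n in levels)

    def forward(self, x):                   # x: (N, C, H, W), any H and W
        feats = [p(x).flatten(start_dim=1) for p in self.pools]
        return torch.cat(feats, dim=1)      # (N, C * sum(n * n))

# Feature maps of different spatial sizes yield the same output length.
spp = SpatialPyramidPooling()
for h, w in [(7, 7), (9, 13)]:
    out = spp(torch.randn(2, 1024, h, w))
    print(out.shape)                        # torch.Size([2, 21504]) both times
```

Because the output length depends only on the channel count and the pyramid levels, the fully connected layer that follows can accept video frames at their original size, which is exactly the property the method relies on.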
The invention specifically adopts the following technical scheme:
the emotion recognition method based on the BN-acceptance double-flow network is mainly characterized by comprising the following steps:
1. the individual pose data set is divided into four mood categories: boredom (bored), agitation (excited), pneumatosis (free), relaxation (relax);
2. adding Space Pyramid Pooling (SPP) in front of a full connection layer of the BN-acceptance dual-flow network, and respectively training a Space-time network on a data set;
The method mainly comprises the following steps:
(1) the individual posture sequence data set is divided into four emotion categories: boredom, excitement, anger and relaxation;
(2) an optical flow image sequence corresponding to the data set is generated with the optical flow algorithm of reference [4] to represent the motion characteristics of individual posture;
(3) the original data set and the optical flow data set are divided in proportion into a training set, a validation set and a test set;
(4) a two-stream convolutional neural network model based on BN-Inception is introduced, an SPP layer is added before the fully connected layer to optimize the BN-Inception network, the spatial and temporal networks are trained with the training and validation sets, and verified with the test set;
(5) the BN-Inception + SPP spatial-stream and temporal-stream networks are fused by averaging to obtain the accuracy ACC (accuracy) and the macro average precision MAP (macro Average precision) on the test set (a metric sketch follows these steps).
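To make step (5) concrete, the sketch below fuses the two streams' per-class scores by averaging and then computes the two reported metrics. MAP is computed here as the macro-averaged precision over the four emotion classes, matching the wording "macro Average precision" above; the function name, the use of NumPy and scikit-learn, and the toy data are illustrative assumptions rather than the patent's actual evaluation code:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score

def fuse_and_score(spatial_probs, temporal_probs, labels):
    """Average fusion of the two streams' softmax scores,
    then accuracy (ACC) and macro average precision (MAP)."""
    fused = (spatial_probs + temporal_probs) / 2.0   # (N, 4) score average
    pred = fused.argmax(axis=1)                      # fused class decision
    acc = accuracy_score(labels, pred)
    map_ = precision_score(labels, pred, average="macro", zero_division=0)
    return acc, map_

# Toy usage with random stand-in scores for the four emotion classes.
rng = np.random.default_rng(0)
spatial = rng.dirichlet(np.ones(4), size=100)    # stand-in spatial-stream softmax
temporal = rng.dirichlet(np.ones(4), size=100)   # stand-in temporal-stream softmax
labels = rng.integers(0, 4, size=100)
print(fuse_and_score(spatial, temporal, labels))
```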
Drawings
FIG. 1 is a schematic diagram of the overall framework for emotion recognition based on the BN-Inception + SPP two-stream network.
FIGS. 2-a and 2-b are the accuracy confusion matrices obtained by the present invention on the test set without the SPP layer, where 2-a is the test matrix of the spatial-stream BN-Inception network and 2-b is that of the temporal-stream BN-Inception network.
FIGS. 3-a and 3-b are the accuracy confusion matrices obtained on the test set with the SPP layer added, where 3-a is the test matrix of the spatial-stream BN-Inception + SPP network and 3-b is that of the temporal-stream BN-Inception + SPP network.
FIG. 4 shows the ACC and MAP obtained by the present invention on the test set after average fusion of the BN-Inception + SPP spatial and temporal streams.
Detailed Description
The present invention is described in further detail below with reference to the drawings and examples. It should be noted that the following examples are intended only to illustrate the invention and should not be construed as limiting its scope; insubstantial modifications and adaptations made by those skilled in the art in light of the above disclosure still fall within the scope of the invention.
In fig. 1, an emotion recognition method based on the BN-Inception + SPP two-stream network comprises the following steps:
(1) first, after the individual data set captured in a public space is obtained, an optical flow image sequence for the original data set is generated with the optical flow algorithm of reference [4] to represent the motion characteristics of individual posture (a stand-in optical flow sketch follows this list);
(2) the original data set and the resulting optical flow data set are divided in proportion into test, validation and training sets, and the corresponding emotion categories are assigned;
(3) with the SPP layer shown in FIG. 1 removed, the training and validation data are fed into the spatial and temporal networks separately for learning to obtain trained models, and the effect is verified on the test data;
(4) the SPP layer is added, the training set is fed into the spatial and temporal networks at its original size for learning to obtain trained models, and the effect is verified on the test data;
(5) after average fusion of the BN-Inception + SPP spatial-stream and temporal-stream networks, the ACC and MAP on the test set are obtained.
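The optical flow in step (1) uses the high-accuracy method of Brox et al. [4]. That algorithm is not bundled with standard OpenCV, so the minimal sketch below substitutes OpenCV's Farnebäck dense flow purely to illustrate turning a clip into the optical flow image sequence consumed by the temporal stream; the file names and the displacement-to-image mapping are assumptions:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("posture_clip.avi")   # hypothetical input clip
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Dense flow between consecutive frames (Farnebäck stand-in for [4]).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Map the x/y displacement fields to 8-bit images, one per component,
    # to serve as the temporal-stream input frames.
    fx = cv2.normalize(flow[..., 0], None, 0, 255, cv2.NORM_MINMAX)
    fy = cv2.normalize(flow[..., 1], None, 0, 255, cv2.NORM_MINMAX)
    cv2.imwrite(f"flow_x_{idx:04d}.jpg", fx.astype(np.uint8))
    cv2.imwrite(f"flow_y_{idx:04d}.jpg", fy.astype(np.uint8))
    prev_gray = gray
    idx += 1
cap.release()
```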
the convolutional neural networks of two channels of spatial flow and time flow are separately trained by Caffe, and parameters of the time flow and spatial flow networks are set through experiments, as shown in Table 1. Because the number of the established samples of the individual posture emotion data set is small, in order to prevent the overfitting phenomenon, a method of data expansion and adding a Dropout layer in a network is adopted.
TABLE 1 Training parameter settings (the original table is an image; the values below are those recited in claim 1)
Basic learning rate (base_lr): 0.00000001
Learning rate change index (gamma): 0.01
Weight decay (weight_decay): 0.005
Maximum number of iterations (max_iter): 150000
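For illustration, these parameters would appear in a Caffe solver file roughly as follows; only the four values marked "from claim 1" come from the patent, while the net path, lr_policy, stepsize, momentum and snapshot settings are assumptions:

```
# solver.prototxt (sketch; only the values marked below come from claim 1)
net: "models/bn_inception_spp_train_val.prototxt"   # hypothetical path
base_lr: 0.00000001         # basic learning rate, from claim 1
lr_policy: "step"           # assumed policy
gamma: 0.01                 # learning rate change index, from claim 1
stepsize: 50000             # assumed step interval
momentum: 0.9               # assumed
weight_decay: 0.005         # from claim 1
max_iter: 150000            # from claim 1
snapshot_prefix: "snapshots/bn_inception_spp"       # hypothetical
solver_mode: GPU
```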
References:
[1] Li C, Zhong Q, Xie D, et al. Skeleton-based Action Recognition with Convolutional Neural Networks[C]//IEEE International Conference on Multimedia & Expo Workshops (ICMEW). 2017: 597-600.
[2] Piana S, Staglianò A, Odone F, et al. Adaptive Body Gesture Representation for Automatic Emotion Recognition[J]. ACM Transactions on Interactive Intelligent Systems (TiiS), 2016, 6(1): 6.
[3] Crenn A, Khan R A, Meyer A, et al. Body Expression Recognition from Animated 3D Skeleton[C]//International Conference on 3D Imaging. IEEE, 2017: 1-7.
[4] Brox T, Bruhn A, Papenberg N, et al. High Accuracy Optical Flow Estimation Based on a Theory for Warping[C]//European Conference on Computer Vision (ECCV). 2004: 25-36.

Claims (3)

1. An individual emotion recognition method based on a BN-Inception + SPP two-stream network, characterized by comprising the following steps:
a. the individual posture data set is divided into four emotion categories: bored, excited, angry and relaxed, and the emotion category of each sequence is given;
b. spatial pyramid pooling (SPP) is added before the fully connected layer of the BN-Inception two-stream network, and the spatial and temporal networks are trained separately on the data set;
c. the training parameters of the BN-Inception + SPP two-stream network are: basic learning rate base_lr: 0.00000001; learning rate change index gamma: 0.01; weight decay weight_decay: 0.005; maximum number of iterations max_iter: 150000;
the method mainly comprises the following steps:
(1) the data set is processed with an optical flow algorithm to generate the corresponding optical flow image sequence representing the motion characteristics of individual posture;
(2) the data set is divided into a training set, a validation set and a test set, and the emotion category of each sequence is given;
(3) a two-stream convolutional neural network model based on BN-Inception is introduced, an SPP layer is added before the fully connected layer to optimize the BN-Inception network, the spatial and temporal networks are trained with the training and validation sets, and verified with the test set;
(4) the BN-Inception + SPP spatial-stream and temporal-stream networks are fused by averaging to obtain the accuracy ACC and the macro average precision MAP on the test set.
2. The individual emotion recognition method based on the BN-Inception + SPP two-stream network according to claim 1, wherein in step (3) the spatio-temporal features of the data set are learned separately by the two streams of the network.
3. The individual emotion recognition method based on the BN-Inception + SPP two-stream network according to claim 1, wherein in step (3) the SPP layer is added before the fully connected layer of the BN-Inception two-stream network, so that the training set is fed into the network at its original size to avoid the loss of motion information caused by a fixed input size, and the spatial and temporal networks are then trained on the data set respectively.
CN201810579049.5A 2018-06-07 2018-06-07 Emotion recognition method based on BN-Inception two-stream network Active CN108921037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810579049.5A CN108921037B (en) 2018-06-07 2018-06-07 Emotion recognition method based on BN-Inception two-stream network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810579049.5A CN108921037B (en) 2018-06-07 2018-06-07 Emotion recognition method based on BN-Inception two-stream network

Publications (2)

Publication Number Publication Date
CN108921037A (en) 2018-11-30
CN108921037B (en) 2022-06-03

Family

ID=64418934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810579049.5A Active CN108921037B (en) 2018-06-07 2018-06-07 Emotion recognition method based on BN-Inception two-stream network

Country Status (1)

Country Link
CN (1) CN108921037B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766856B (en) * 2019-01-16 2022-11-15 华南农业大学 Method for recognizing the postures of lactating sows with a two-stream RGB-D Faster R-CNN
CN109886160B (en) * 2019-01-30 2021-03-09 浙江工商大学 Face recognition method under unconstrained conditions
CN109814565A (en) * 2019-01-30 2019-05-28 上海海事大学 Intelligent navigation control method for unmanned boats based on spatio-temporal two-stream data-driven deep Q-learning
CN110147729A (en) * 2019-04-16 2019-08-20 深圳壹账通智能科技有限公司 User emotion recognition method and apparatus, computer device and storage medium
CN110175596B (en) * 2019-06-04 2022-04-22 重庆邮电大学 Micro-expression recognition and interaction method for virtual learning environments based on a two-stream convolutional neural network
CN112131908B (en) * 2019-06-24 2024-06-11 北京眼神智能科技有限公司 Action recognition method, apparatus, storage medium and device based on a two-stream network
CN110414561A (en) * 2019-06-26 2019-11-05 武汉大学 Construction method of a natural scene data set suitable for machine vision
CN111968091B (en) * 2020-08-19 2022-04-01 南京图格医疗科技有限公司 Method for detecting and classifying lesion regions in clinical images


Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050265580A1 (en) * 2004-05-27 2005-12-01 Paul Antonucci System and method for a motion visualizer
CN103544963B (en) * 2013-11-07 2016-09-07 东南大学 Speech emotion recognition method based on kernel semi-supervised discriminant analysis
CN104732203B (en) * 2015-03-05 2019-03-26 中国科学院软件研究所 Emotion recognition and tracking method based on video information
CN106295568B (en) * 2016-08-11 2019-10-18 上海电力学院 Human natural emotion recognition method based on the bimodal combination of expression and behavior
CN106897671B (en) * 2017-01-19 2020-02-25 济南中磁电子科技有限公司 Micro-expression recognition method based on optical flow and Fisher vector coding
CN107784114A (en) * 2017-11-09 2018-03-09 广东欧珀移动通信有限公司 Recommendation method, apparatus, terminal and storage medium for facial expression images

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663429A (en) * 2012-04-11 2012-09-12 上海交通大学 Method for motion pattern classification and action recognition of moving target
CN107368798A (en) * 2017-07-07 2017-11-21 四川大学 Crowd emotion recognition method based on deep learning
CN107491731A (en) * 2017-07-17 2017-12-19 南京航空航天大学 Ground moving target detection and recognition method for precision strike
CN107944442A (en) * 2017-11-09 2018-04-20 北京智芯原动科技有限公司 Object detection device and method based on an improved convolutional neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Kaiming He et al.; Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2015-01-09; p. 3 *
Chen Shengdi et al.; Human action recognition method based on an improved deep convolutional neural network (基于改进的深度卷积神经网络的人体动作识别方法); Application Research of Computers (《计算机应用研究》); 2018-02-07; pp. 2-3, 5 *

Also Published As

Publication number Publication date
CN108921037A (en) 2018-11-30

Similar Documents

Publication Publication Date Title
CN108921037B (en) Emotion recognition method based on BN-Inception two-stream network
CN108520535B (en) Object classification method based on depth recovery information
Liu et al. Two-stream 3d convolutional neural network for skeleton-based action recognition
Wang et al. Large-scale isolated gesture recognition using convolutional neural networks
Hu et al. 3D separable convolutional neural network for dynamic hand gesture recognition
CN108596039B (en) Bimodal emotion recognition method and system based on 3D convolutional neural network
CN105224942B (en) RGB-D image classification method and system
CN109919031A (en) Human behavior recognition method based on a deep neural network
CN111274921B (en) Method for recognizing human body behaviors by using gesture mask
Rioux-Maldague et al. Sign language fingerspelling classification from depth and color images using a deep belief network
CN109190479A (en) Video sequence expression recognition method based on interactive deep learning
CN110580500A (en) Character interaction-oriented network weight generation few-sample image classification method
Li et al. Sign language recognition based on computer vision
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN113221663A (en) Real-time sign language intelligent identification method, device and system
CN107066979A (en) Human motion recognition method based on depth information and multi-dimensional convolutional neural networks
CN109086664A (en) Multi-form gesture recognition method based on static-dynamic fusion
CN112906520A (en) Gesture coding-based action recognition method and device
CN110490915A (en) Point cloud registration method based on convolutional restricted Boltzmann machines
CN110889335B (en) Human skeleton double interaction behavior identification method based on multichannel space-time fusion network
CN111401116B (en) Bimodal emotion recognition method based on enhanced convolution and space-time LSTM network
Agrawal et al. Redundancy removal for isolated gesture in Indian sign language and recognition using multi-class support vector machine
Özbay et al. 3D Human Activity Classification with 3D Zernike Moment Based Convolutional, LSTM-Deep Neural Networks.
Tur et al. Isolated sign recognition with a siamese neural network of RGB and depth streams
Ito et al. Efficient and accurate skeleton-based two-person interaction recognition using inter-and intra-body graphs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant