CN104464756A - Small speaker emotion recognition system - Google Patents

Info

Publication number
CN104464756A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410750977.5A
Other languages
Chinese (zh)
Inventor
冯秀霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heilongjiang Zhenmei Broadcasting Communications Equipment Co Ltd
Original Assignee
Heilongjiang Zhenmei Broadcasting Communications Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-12-10
Filing date
2014-12-10
Publication date
2015-03-25
Application filed by Heilongjiang Zhenmei Broadcasting Communications Equipment Co Ltd

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a small speaker emotion recognition system. First, a small emotional speech corpus is built; part of its recordings serve as training samples for building reference templates, and the rest serve as test samples for the subsequent emotion recognition tests. The recordings in the corpus are then preprocessed, and emotion parameters are extracted from the preprocessed speech signals; these parameters comprise the fundamental frequency, the formants, the Mel-frequency cepstral coefficients, and their related statistics. Finally, speech emotion recognition tests are carried out: a support vector machine based emotion classifier is trained on the emotion parameters of the training speech and then used to predict the test speech and determine its emotion.

Description

A small speaker emotion recognition system
Technical Field
The present invention relates to a speech emotion recognition system, and in particular to a small speaker emotion recognition system.
Background
Speech is an important means of communication between people. Sound is a carrier of information, and the information people obtain from it naturally includes emotional information. A speech signal carries not only its literal wording but also the speaker's emotion. The same sentence can be spoken with different emotions, and a different emotion may change what the sentence means. If a computer cannot recover the emotion in an operator's speech, communication will be less effective than it could be; the computer may even misunderstand the operator's intent and perform the wrong operation, inconveniencing the operator.
Speech processing is an important research field with a long history, whereas emotion research on speech signals is an emerging field that draws on several disciplines, chiefly physiology, psychology, and signal processing. The outcome of this research, the speech emotion recognition system, has broad application prospects, for example:
1. Distance learning. An emotion recognition system can be added to a distance education system to judge whether a learner's emotional expression is appropriate, helping learners read aloud with richer emotion and improve their reading ability.
2. Criminal investigation. An emotion recognition system can be built into a lie detector and used to infer how truthful a subject's statements are. As the technology improves, the lie detector's capabilities can be refined and put to practical use, so emotion recognition has considerable practical significance for criminal investigation.
3. Games and entertainment. Most games today convey information through text. Adding speech emotion recognition and expression to a game enriches the ways information is conveyed and makes the game more attractive to players. This novel approach can reduce player fatigue to some extent and gives players both auditory and visual enjoyment, making the game more playable.
Summary of the Invention
The object of the present invention is to provide a speaker emotion recognition system that uses a small emotional speech corpus, takes part of its recordings as training samples for building reference templates, and reports the recognition rate for each emotion.
The object is achieved as follows. In the first step, after a review of a large body of domestic and international literature, a small emotional speech corpus is built. Part of the recordings serve as training samples for building reference templates; the rest serve as test samples for the subsequent emotion recognition tests. In the second step, the recordings in the corpus are preprocessed, mainly by pre-emphasis, windowing and framing, and speech endpoint detection. In the third step, emotion parameters are extracted from the preprocessed speech signals: the fundamental frequency, the formants, the Mel-frequency cepstral coefficients (MFCCs), and their related statistics. Parameter extraction is simulated in software to obtain the distribution range of each parameter for each emotion type, and the results are briefly analyzed. In the fourth step, speech emotion recognition experiments are carried out: a support vector machine (SVM) based emotion classifier is trained on the emotion parameters of the training speech and then used to predict which emotion each test utterance belongs to. After the experiments, the recognition rate for each emotion is tallied and the final statistics are analyzed. Finally, a simple human-machine interface is designed for the whole system; it supports loading a test utterance, displaying the system's recognition result for it, and clearing the result.
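The preprocessing stage named in the second step (pre-emphasis, windowing and framing, endpoint detection) could be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the pre-emphasis coefficient 0.97, the Hamming window, the 25 ms frame with 12.5 ms hop, the 8 kHz sample rate, and the energy-threshold endpoint detector are all assumptions, since the source does not state which values or methods were used. All function names are hypothetical.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_and_window(signal, frame_len, hop):
    """Split the signal into overlapping frames and apply a Hamming window.

    Assumes frame_len > 1. Trailing samples that do not fill a frame are dropped.
    """
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames

def detect_endpoints(frames, ratio=0.1):
    """Simple energy-based endpoint detection (an assumed method).

    Returns (first, last) indices of the frames whose short-time energy
    exceeds ratio * max energy, or None if every frame is silent.
    """
    energies = [sum(x * x for x in frame) for frame in frames]
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]

# Synthetic input: 0.1 s silence, 0.2 s of a 200 Hz tone, 0.1 s silence at 8 kHz.
SAMPLE_RATE = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1600)]
signal = silence + tone + silence

frames = frame_and_window(pre_emphasis(signal), frame_len=200, hop=100)
endpoints = detect_endpoints(frames)
print(len(frames), endpoints)
```

The detected frame range should bracket the tone in the middle of the synthetic signal; features would then be extracted only from frames inside that range.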
A small Chinese emotional speech corpus was recorded in-house. Its utterances fall into four emotion classes: happy, angry, sad, and surprised. The 6 speakers were all male students. Each speaker read 4 texts in each of the 4 emotions, 4 times per emotion, giving 384 samples in total for the experimental corpus. Emotions are classified with an SVM, using the "one-versus-one" method to handle the multi-class problem. Recognition was run with three feature sets: prosodic features (parameters related to the pitch and the formants), spectral features (MFCC-related parameters), and the combination of the two, and the results were analyzed and compared. When all 11 parameters were used, the average recognition rate over the 4 emotions was 79.15%, and the highest rate, for sad, was 83.3%. The experiments also showed that happy and angry are the two emotions most easily mistaken for each other.
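The corpus size and the evaluation bookkeeping described above can be checked with a short sketch. The enumeration reproduces the 384-sample count and the number of pairwise classifiers that a one-versus-one scheme needs for four classes. The `recognition_rates` helper is hypothetical, and the label lists fed to it are made-up placeholders that only show how a per-emotion rate and its average would be tallied; they are not results from the patent.

```python
from collections import Counter
from itertools import combinations, product

EMOTIONS = ["happy", "angry", "sad", "surprised"]
SPEAKERS = range(6)      # 6 speakers
TEXTS = range(4)         # 4 texts
REPETITIONS = range(4)   # each emotion read 4 times

# One recording per (speaker, text, emotion, repetition) combination.
corpus = list(product(SPEAKERS, TEXTS, EMOTIONS, REPETITIONS))
print(len(corpus))  # 6 * 4 * 4 * 4 = 384

# One-versus-one needs one binary classifier per unordered pair of emotions.
pairs = list(combinations(EMOTIONS, 2))
print(len(pairs))  # 4 choose 2 = 6

def recognition_rates(true_labels, predicted_labels):
    """Per-emotion recognition rate: correctly recognized / total, per true class."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {emotion: correct[emotion] / totals[emotion] for emotion in totals}

# Placeholder labels for illustration only (not data from the patent).
true_labels = ["happy", "happy", "angry", "angry",
               "sad", "sad", "surprised", "surprised"]
predicted = ["happy", "angry", "angry", "angry",
             "sad", "sad", "surprised", "sad"]

rates = recognition_rates(true_labels, predicted)
average = sum(rates.values()) / len(rates)
print(rates, average)
```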
Brief Description of the Drawings
Fig. 1 is a flowchart of speech emotion recognition.
Detailed Description
The present invention is described in more detail below by way of example with reference to the accompanying drawing:
Embodiment 1
With reference to Fig. 1, which is a flowchart of speech emotion recognition:
1. Acquiring the emotional speech corpus. Most existing speech emotion recognition research targets other languages; comparatively little has been done for Chinese, and no Chinese emotional speech corpus dedicated to emotion recognition could be found. The preparatory work before the recognition study was therefore to record a small Chinese emotional speech corpus in-house and to base the subsequent study on it.
2. Preprocessing the speech signal. Emotion feature parameters cannot be extracted directly from the raw speech signals in the corpus; a front-end processing step is required first, comprising pre-emphasis, windowing and framing, and endpoint detection.
3. Extracting the emotion feature parameters. After preprocessing, the emotion feature parameters are extracted from the signal. They fall into two main groups. One is acoustic feature parameters, including the 12th-order MFCC parameters and the formant parameters. The other is prosodic feature parameters, including the fundamental frequency, short-time energy, and average zero-crossing rate of the speech. These were then narrowed down, and the parameters finally selected as emotion features were the mean, maximum, and minimum of the fundamental frequency; the mean and maximum of the first formant; and the 10th, 11th, and 12th MFCC parameters.
4. Designing the emotion classifier. The invention uses a speech emotion classifier based on the support vector machine (SVM). A standard SVM handles only two-class problems, so to classify among more than two classes, one SVM is trained for every pair of classes; an unknown sample is then passed to all of them, and its class is decided by voting. This is the so-called "one-versus-one" method.
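The one-versus-one voting scheme in step 4 can be sketched as follows. To keep the example dependency-free, the binary classifier here is a nearest-centroid stand-in, not an SVM; in the system described, each pairwise model would be a two-class SVM. The class names `CentroidBinaryClassifier` and `OneVsOneClassifier` are hypothetical, and the 2-D training points are arbitrary toy vectors, not acoustic measurements.

```python
from collections import Counter
from itertools import combinations

class CentroidBinaryClassifier:
    """Stand-in for a two-class SVM: predicts the class whose mean vector is nearer."""

    def __init__(self, class_a, class_b):
        self.classes = (class_a, class_b)
        self.centroids = {}

    def fit(self, samples, labels):
        for cls in self.classes:
            members = [x for x, y in zip(samples, labels) if y == cls]
            dims = len(members[0])
            self.centroids[cls] = [sum(m[d] for m in members) / len(members)
                                   for d in range(dims)]
        return self

    def predict(self, x):
        def sq_dist(centroid):
            return sum((xi - ci) ** 2 for xi, ci in zip(x, centroid))
        return min(self.classes, key=lambda cls: sq_dist(self.centroids[cls]))

class OneVsOneClassifier:
    """Trains one binary model per pair of classes and classifies by majority vote."""

    def __init__(self, make_binary=CentroidBinaryClassifier):
        self.make_binary = make_binary
        self.models = []

    def fit(self, samples, labels):
        self.models = []
        for a, b in combinations(sorted(set(labels)), 2):
            # Each pairwise model sees only the samples of its two classes.
            subset = [(x, y) for x, y in zip(samples, labels) if y in (a, b)]
            xs = [x for x, _ in subset]
            ys = [y for _, y in subset]
            self.models.append(self.make_binary(a, b).fit(xs, ys))
        return self

    def predict(self, x):
        # Every pairwise model casts one vote; ties are broken arbitrarily.
        votes = Counter(model.predict(x) for model in self.models)
        return votes.most_common(1)[0][0]

# Arbitrary toy feature vectors for illustration only.
train_x = [[0, 0], [1, 1], [10, 0], [9, 1], [0, 10], [1, 9], [10, 10], [9, 9]]
train_y = ["happy", "happy", "angry", "angry",
           "sad", "sad", "surprised", "surprised"]

classifier = OneVsOneClassifier().fit(train_x, train_y)
print(len(classifier.models), classifier.predict([8, 1]), classifier.predict([1, 8]))
```

With four emotion classes this trains six pairwise models, and a sample's predicted class can collect at most three votes, one from each pairwise model that includes that class.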

Claims (2)

1. A small speaker emotion recognition system, characterized in that: in the first step, after a review of a large body of domestic and international literature, a small emotional speech corpus is built, in which part of the recordings serve as training samples for building reference templates and the rest serve as test samples for the subsequent emotion recognition tests; in the second step, the recordings in the corpus are preprocessed, mainly by pre-emphasis, windowing and framing, and speech endpoint detection; in the third step, emotion parameters are extracted from the preprocessed speech signals, the emotion parameters comprising the fundamental frequency, the formants, the Mel-frequency cepstral coefficients, and their related statistics; parameter extraction is simulated in software to obtain the distribution range of each parameter for each emotion type, and the results are briefly analyzed; in the fourth step, speech emotion recognition experiments are carried out, in which a support vector machine based emotion classifier is trained on the emotion parameters of the training speech and then used to predict which emotion each test utterance belongs to; after the experiments, the recognition rate for each emotion is tallied and the final statistics are analyzed; finally, a simple human-machine interface is designed for the whole system, which supports loading a test utterance, displaying the system's recognition result for it, and clearing the result.
2. The small speaker emotion recognition system according to claim 1, characterized in that: a small Chinese emotional speech corpus is recorded, whose utterances fall into four emotion classes: happy, angry, sad, and surprised; emotions are classified with an SVM, using the "one-versus-one" method to handle the multi-class problem; and recognition is performed with three feature sets: prosodic features (parameters related to the pitch and the formants), spectral features (MFCC-related parameters), and the combination of the two.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410750977.5A CN104464756A (en) 2014-12-10 2014-12-10 Small speaker emotion recognition system

Publications (1)

Publication Number Publication Date
CN104464756A true CN104464756A (en) 2015-03-25

Family

ID=52910700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410750977.5A Pending CN104464756A (en) 2014-12-10 2014-12-10 Small speaker emotion recognition system

Country Status (1)

Country Link
CN (1) CN104464756A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016183961A1 (en) * 2015-05-18 2016-11-24 百度在线网络技术(北京)有限公司 Method, system and device for switching interface of smart device, and nonvolatile computer storage medium
CN106531158A (en) * 2016-11-30 2017-03-22 北京理工大学 Method and device for recognizing answer voice
WO2018095167A1 (en) * 2016-11-22 2018-05-31 北京京东尚科信息技术有限公司 Voiceprint identification method and voiceprint identification system

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080086791A (en) * 2007-03-23 2008-09-26 엘지전자 주식회사 Feeling recognition system based on voice
CN101599271A (en) * 2009-07-07 2009-12-09 华中科技大学 A kind of recognition methods of digital music emotion
CN101685634A (en) * 2008-09-27 2010-03-31 上海盛淘智能科技有限公司 Children speech emotion recognition method
US20110141258A1 (en) * 2007-02-16 2011-06-16 Industrial Technology Research Institute Emotion recognition method and system thereof
CN102723078A (en) * 2012-07-03 2012-10-10 武汉科技大学 Emotion speech recognition method based on natural language comprehension
CN103021406A (en) * 2012-12-18 2013-04-03 台州学院 Robust speech emotion recognition method based on compressive sensing
CN103440863A (en) * 2013-08-28 2013-12-11 华南理工大学 Speech emotion recognition method based on manifold
CN104008754A (en) * 2014-05-21 2014-08-27 华南理工大学 Speech emotion recognition method based on semi-supervised feature selection
CN104091602A (en) * 2014-07-11 2014-10-08 电子科技大学 Speech emotion recognition method based on fuzzy support vector machine
CN104681039A (en) * 2013-11-26 2015-06-03 哈尔滨智晟天诚科技开发有限公司 Voice information-based small speaker emotion recognition system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150325
