CN109785863A - Speech emotion recognition method and system based on a deep belief network - Google Patents

Speech emotion recognition method and system based on a deep belief network

Info

Publication number
CN109785863A
Authority
CN
China
Prior art keywords
speech
voice signal
obtains
belief network
emotion recognition
Prior art date
Legal status
Pending
Application number
CN201910173690.3A
Other languages
Chinese (zh)
Inventor
巩微
黄玮
伏文龙
黄晨晨
范文庆
Current Assignee
Communication University of China
Original Assignee
Communication University of China
Priority date
Filing date
Publication date
Application filed by Communication University of China
Priority to CN201910173690.3A
Publication of CN109785863A
Legal status: Pending


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a speech emotion recognition method and system based on a deep belief network. The recognition method comprises: acquiring a speech signal; pre-processing the speech signal to obtain a pre-processed speech signal; performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features; and classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result. Using a multi-classifier model based on the deep belief network and restricted Boltzmann machines, a multi-classifier speech emotion recognition system is established, which improves the recognition rate of speech emotion.

Description

Speech emotion recognition method and system based on a deep belief network
Technical field
The present invention relates to the field of speech recognition, and more particularly to a speech emotion recognition method and system based on a deep belief network.
Background art
With the development of cloud computing, the mobile Internet, and big data, machines serve humans ever more intelligently, and the dream of humans conversing with machines in natural language is gradually approaching reality; people's expectations of human-machine interaction capabilities rise accordingly. Simple recognition of speech content no longer satisfies these demands, and processing, recognizing, and understanding the emotion in speech has become particularly important in practical applications. Speech emotion recognition has broad application prospects: it can be applied not only to human-computer interaction systems, but also to speech recognition, where it enhances robustness, and to speaker identification, where it improves the recognition rate. Speech emotion recognition technology is widely used in intelligent human-computer interaction and in human-computer interactive teaching. Research on automatic speech emotion recognition not only pushes computer technology further forward, it can also greatly increase people's work and study efficiency and improve quality of life.
Emotions are identified by sampling various external emotion signals. In deep neural network research, the accuracy of emotion classification remains low; in pattern recognition, prior-art methods that extract emotion from speech with neural networks achieve relatively low recognition rates for sad, excited, happy, and angry emotions, and adaptive neural networks likewise recognize speech emotional states poorly.
When a traditional neural network is trained, all layers are trained together as a whole; when the data are large, the training time increases and the network converges more slowly. Back-propagation, the most widely used method in neural network training, trains the entire network iteratively: the network parameters are initialized randomly, and the difference between the output value currently computed at the top layer and the true value of the data is used to adjust the parameters of every layer. With traditional gradient descent, the goal of the parameter updates is to bring the network's predictions closer to the true values; however, with random initialization the error-correction signal grows weaker the further down it propagates during updates, and the gradient becomes sparser, so the network easily falls into a local optimum. This leads to a low recognition rate for speech emotional states.
Summary of the invention
The object of the present invention is to provide a speech emotion recognition method and system based on a deep belief network that can improve the speech emotion recognition rate.
To achieve the above object, the present invention provides the following scheme:
A speech emotion recognition method based on a deep belief network, characterized in that the recognition method comprises:
acquiring a speech signal;
pre-processing the speech signal to obtain a pre-processed speech signal;
performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features;
classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result.
Optionally, performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain the speech signal features specifically comprises:
stacking N layers of restricted Boltzmann machines from the lowest layer to the highest to obtain a deep belief network;
performing unsupervised training on the i-th layer restricted Boltzmann machine according to the pre-processed speech signal to obtain an i-th optimal parameter, the i-th optimal parameter being the optimal parameter of the i-th layer restricted Boltzmann machine, wherein i takes the values 1, 2, ..., N in turn;
performing unsupervised training on the (i+1)-th layer restricted Boltzmann machine according to the i-th optimal parameter and the pre-processed speech signal to obtain an (i+1)-th optimal parameter;
fine-tuning the optimal parameters by global training so that the deep belief network converges to a global optimum, obtaining fine-tuned optimal parameters;
extracting the speech signal features of the pre-processed speech signal according to the fine-tuned optimal parameters.
Optionally, classifying the speech signal features by speech emotion using a support vector machine to obtain the speech emotion recognition result specifically comprises:
mapping sample points of the speech signal features into a high-dimensional feature space using a kernel function to obtain samples that are linearly separable in that space;
the support vector machine making a logical decision on the speech signal features according to the linearly separable samples to obtain the speech emotion recognition result.
A speech emotion recognition system based on a deep belief network, the recognition system comprising:
a speech signal acquisition module for acquiring a speech signal;
a speech signal pre-processing module for pre-processing the speech signal to obtain a pre-processed speech signal;
a feature extraction module for performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features;
an emotion recognition module for classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result.
Optionally, the feature extraction module specifically comprises:
a deep belief network building unit for stacking N layers of restricted Boltzmann machines from the lowest layer to the highest to obtain a deep belief network;
an unsupervised training unit for performing unsupervised training on the i-th layer restricted Boltzmann machine according to the pre-processed speech signal to obtain an i-th optimal parameter, the i-th optimal parameter being the optimal parameter of the i-th layer restricted Boltzmann machine, wherein i takes the values 1, 2, ..., N in turn, and for performing unsupervised training on the (i+1)-th layer restricted Boltzmann machine according to the i-th optimal parameter and the pre-processed speech signal to obtain an (i+1)-th optimal parameter;
a parameter fine-tuning unit for fine-tuning the optimal parameters by global training so that the deep belief network converges to a global optimum, obtaining fine-tuned optimal parameters;
a feature extraction unit for extracting the speech signal features of the pre-processed speech signal according to the fine-tuned optimal parameters.
Optionally, the emotion recognition module specifically comprises:
a kernel function unit for mapping sample points of the speech signal features into a high-dimensional feature space using a kernel function to obtain samples that are linearly separable in that space;
a logical decision unit for the support vector machine to make a logical decision on the speech signal features according to the linearly separable samples to obtain the speech emotion recognition result.
According to the specific embodiments provided by the present invention, the invention discloses the following technical effects: the invention discloses a speech emotion recognition method and system based on a deep belief network. The recognition method comprises: acquiring a speech signal; pre-processing the speech signal to obtain a pre-processed speech signal; performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features; and classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result. The deep belief network is trained by training each restricted Boltzmann machine layer by layer until the whole network is fully trained, and a multi-classifier model based on the deep belief network and the restricted Boltzmann machines is used to establish a multi-classifier speech emotion recognition system, which improves the recognition rate of speech emotion.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the speech emotion recognition method based on a deep belief network provided by the present invention;
Fig. 2 is a structural diagram of the speech emotion recognition system based on a deep belief network provided by the present invention;
Fig. 3 is a block diagram of the emotion recognition system based on a support vector machine provided by the present invention.
Specific embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
The object of the present invention is to provide a speech emotion recognition method and system based on a deep belief network that can improve the speech emotion recognition rate.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, a speech emotion recognition method based on a deep belief network comprises:
Step 100: acquiring a speech signal;
Step 200: pre-processing the speech signal to obtain a pre-processed speech signal;
Step 300: performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features;
Step 400: classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result.
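Read end to end, steps 100-400 form a feature-learning and classification pipeline. The following is a minimal sketch of that flow in Python, assuming scikit-learn's BernoulliRBM and SVC; the layer sizes, hyperparameters, and synthetic data are illustrative assumptions, not values taken from the patent, and a scikit-learn Pipeline performs only the greedy layer-wise pre-training of step 300, not the global fine-tuning described later.

```python
# A minimal end-to-end sketch of steps 100-400 (illustrative assumptions).
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Steps 100/200: assume X holds pre-processed speech features scaled to
# [0, 1] (BernoulliRBM expects inputs in that range) and y holds emotion
# labels; both are synthetic stand-ins here.
rng = np.random.RandomState(0)
X = rng.rand(200, 128)            # 200 utterances, 128-dim features (assumed)
y = rng.randint(0, 4, size=200)   # 4 emotions encoded as 0..3 (assumed)

pipeline = Pipeline([
    # Step 300: two stacked RBMs stand in for the deep belief network;
    # each is trained unsupervised on the output of the layer below.
    ("rbm1", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)),
    # Step 400: a kernel SVM classifies the learned features by emotion.
    ("svm", SVC(kernel="rbf")),
])
pipeline.fit(X, y)
print(pipeline.predict(X[:5]))
```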
Step 300, performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain the speech signal features, specifically comprises:
stacking N layers of restricted Boltzmann machines from the lowest layer to the highest to obtain a deep belief network;
performing unsupervised training on the i-th layer restricted Boltzmann machine according to the pre-processed speech signal to obtain an i-th optimal parameter, the i-th optimal parameter being the optimal parameter of the i-th layer restricted Boltzmann machine, wherein i takes the values 1, 2, ..., N in turn;
performing unsupervised training on the (i+1)-th layer restricted Boltzmann machine according to the i-th optimal parameter and the pre-processed speech signal to obtain an (i+1)-th optimal parameter;
fine-tuning the optimal parameters by global training so that the deep belief network converges to a global optimum, obtaining fine-tuned optimal parameters;
extracting the speech signal features of the pre-processed speech signal according to the fine-tuned optimal parameters.
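A sketch of this greedy layer-by-layer procedure, assuming scikit-learn's BernoulliRBM and illustrative layer sizes: the i-th restricted Boltzmann machine is trained without supervision, and its hidden-layer activations become the training data for the (i+1)-th machine. The global fine-tuning pass, typically back-propagation over the whole stack, is outside scikit-learn's RBM API and is only noted in a comment.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

def pretrain_dbn(X, layer_sizes, n_iter=20, lr=0.05, seed=0):
    """Greedy layer-wise pre-training: stack RBMs from low to high."""
    rbms, layer_input = [], X
    for size in layer_sizes:
        rbm = BernoulliRBM(n_components=size, learning_rate=lr,
                           n_iter=n_iter, random_state=seed)
        rbm.fit(layer_input)                      # unsupervised training of layer i
        layer_input = rbm.transform(layer_input)  # becomes the input of layer i+1
        rbms.append(rbm)
    # A global fine-tuning pass (e.g. back-propagation through the whole
    # stack) would follow here to push the parameters toward a global optimum.
    return rbms

def extract_features(rbms, X):
    """Propagate data through the trained stack to obtain the DBN features."""
    h = X
    for rbm in rbms:
        h = rbm.transform(h)
    return h

rng = np.random.RandomState(0)
X = rng.rand(200, 128)                 # stand-in for pre-processed speech features
rbms = pretrain_dbn(X, layer_sizes=[64, 32])
print(extract_features(rbms, X).shape)  # (200, 32)
```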
Step 400, classifying the speech signal features by speech emotion using a support vector machine to obtain the speech emotion recognition result, specifically comprises:
mapping sample points of the speech signal features into a high-dimensional feature space using a kernel function to obtain samples that are linearly separable in that space;
the support vector machine making a logical decision on the speech signal features according to the linearly separable samples to obtain the speech emotion recognition result.
As shown in Fig. 2, a speech emotion recognition system based on a deep belief network comprises:
a speech signal acquisition module 1 for acquiring a speech signal;
a speech signal pre-processing module 2 for pre-processing the speech signal to obtain a pre-processed speech signal;
a feature extraction module 3 for performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features;
an emotion recognition module 4 for classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result.
The feature extraction module 3 specifically comprises:
a deep belief network building unit for stacking N layers of restricted Boltzmann machines from the lowest layer to the highest to obtain a deep belief network;
an unsupervised training unit for performing unsupervised training on the i-th layer restricted Boltzmann machine according to the pre-processed speech signal to obtain an i-th optimal parameter, the i-th optimal parameter being the optimal parameter of the i-th layer restricted Boltzmann machine, wherein i takes the values 1, 2, ..., N in turn, and for performing unsupervised training on the (i+1)-th layer restricted Boltzmann machine according to the i-th optimal parameter and the pre-processed speech signal to obtain an (i+1)-th optimal parameter;
a parameter fine-tuning unit for fine-tuning the optimal parameters by global training so that the deep belief network converges to a global optimum, obtaining fine-tuned optimal parameters;
a feature extraction unit for extracting the speech signal features of the pre-processed speech signal according to the fine-tuned optimal parameters.
The emotion recognition module 4 specifically comprises:
a kernel function unit for mapping sample points of the speech signal features into a high-dimensional feature space using a kernel function to obtain samples that are linearly separable in that space;
a logical decision unit for the support vector machine to make a logical decision on the speech signal features according to the linearly separable samples to obtain the speech emotion recognition result.
After the deep belief network has extracted multidimensional feature vectors of the emotional content of the speech signal, a suitable emotion classifier is needed. This method uses a support vector machine in a one-vs-one manner to classify four emotions (surprise, happiness, anger, sadness). The multidimensional feature vectors extracted by the deep belief network serve as the input of the support vector machine classifier, and a kernel function maps the sample points of this nonlinearly separable speech emotion problem into a high-dimensional feature space so that the corresponding sample space becomes linearly separable. A block diagram of the emotion recognition system based on the support vector machine is shown in Fig. 3.
" one-to-one " mode is to construct hyperplane to any two kinds of emotions, needs to train k* (k-1)/2 sub-classifier.It is whole A training process needs altogetherA support vector machines sub-classifier, i.e., 6.Each sub-classifier is by surprised, glad, anger Any two kinds of training in anger, sad four kinds of affective characteristics form.That is: glad-indignation, glad-sad, glad-surprised, anger Anger-sadness, indignation-is surprised, sad-surprised.One classifier of training between every two class is carried out when to a unknown speech emotional When classification, each classifier carries out its classification to judge and for corresponding classification " throwing a upper ticket ", last who gets the most votes's class It Ji not be as the classification of the unknown emotion.Decision phase uses ballot method, it is understood that there may be the identical situation of the poll of multiple classes, from And make unknown sample while belonging to multiple classifications, influence nicety of grading.
Before the support vector machine classifier is trained and used for recognition, each emotional speech signal must be given a label indicating the emotion class to which it belongs; within each sub-classifier the labels are binary. During emotion recognition, the feature vector is fed to all support vector machines at the same time, the outputs of the individual support vector machines pass through a logical decision that selects the most likely emotion class, and the emotion with the highest weight (the most votes) is finally taken as the emotional state of the speech signal to be recognized, yielding the recognition result.
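A minimal sketch of the one-vs-one training and voting decision described above, assuming the four emotions are mapped to integer labels and using synthetic stand-ins for the DBN feature vectors; all names and hyperparameters here are illustrative assumptions:

```python
# One-vs-one scheme: with k = 4 classes, k*(k-1)/2 = 6 pairwise SVMs are
# trained; at recognition time each casts one vote and the class with the
# most votes wins.
import itertools
import numpy as np
from sklearn.svm import SVC

EMOTIONS = ["surprise", "happy", "angry", "sad"]  # assumed label order

def train_pairwise_svms(features, labels):
    """Train one SVM per pair of emotion classes (6 sub-classifiers)."""
    svms = {}
    for a, b in itertools.combinations(range(len(EMOTIONS)), 2):
        mask = (labels == a) | (labels == b)      # binary labels per sub-classifier
        svm = SVC(kernel="rbf")
        svm.fit(features[mask], labels[mask])
        svms[(a, b)] = svm
    return svms

def classify_by_voting(svms, x):
    """Each pairwise SVM votes; the emotion with the most votes wins.
    Ties are possible, which is the precision issue noted above
    (np.argmax silently breaks ties by taking the first class)."""
    votes = np.zeros(len(EMOTIONS), dtype=int)
    for svm in svms.values():
        votes[int(svm.predict(x.reshape(1, -1))[0])] += 1
    return EMOTIONS[int(np.argmax(votes))], votes

rng = np.random.RandomState(0)
features = rng.rand(200, 32)                 # DBN feature vectors (assumed)
labels = rng.randint(0, 4, size=200)
svms = train_pairwise_svms(features, labels)
print(classify_by_voting(svms, features[0]))
```

Note that scikit-learn's SVC in fact applies this one-vs-one scheme internally for multi-class problems; the explicit loop above only makes the six sub-classifiers and the voting visible.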
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be referred to one another. Since the system disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
Specific examples are used herein to illustrate the principle and implementation of the present invention; the above embodiments are only intended to help understand the method of the present invention and its core idea. Meanwhile, those skilled in the art will, according to the idea of the present invention, make changes to the specific implementation and the scope of application. In conclusion, the content of this specification should not be construed as limiting the present invention.

Claims (6)

1. A speech emotion recognition method based on a deep belief network, characterized in that the recognition method comprises:
acquiring a speech signal;
pre-processing the speech signal to obtain a pre-processed speech signal;
performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features;
classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result.
2. The speech emotion recognition method based on a deep belief network according to claim 1, characterized in that performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain the speech signal features specifically comprises:
stacking N layers of restricted Boltzmann machines from the lowest layer to the highest to obtain a deep belief network;
performing unsupervised training on the i-th layer restricted Boltzmann machine according to the pre-processed speech signal to obtain an i-th optimal parameter, the i-th optimal parameter being the optimal parameter of the i-th layer restricted Boltzmann machine, wherein i takes the values 1, 2, ..., N in turn;
performing unsupervised training on the (i+1)-th layer restricted Boltzmann machine according to the i-th optimal parameter and the pre-processed speech signal to obtain an (i+1)-th optimal parameter;
fine-tuning the optimal parameters by global training so that the deep belief network converges to a global optimum, obtaining fine-tuned optimal parameters;
extracting the speech signal features of the pre-processed speech signal according to the fine-tuned optimal parameters.
3. The speech emotion recognition method based on a deep belief network according to claim 1, characterized in that classifying the speech signal features by speech emotion using a support vector machine to obtain the speech emotion recognition result specifically comprises:
mapping sample points of the speech signal features into a high-dimensional feature space using a kernel function to obtain samples that are linearly separable in that space;
the support vector machine making a logical decision on the speech signal features according to the linearly separable samples to obtain the speech emotion recognition result.
4. A speech emotion recognition system based on a deep belief network, characterized in that the recognition system comprises:
a speech signal acquisition module for acquiring a speech signal;
a speech signal pre-processing module for pre-processing the speech signal to obtain a pre-processed speech signal;
a feature extraction module for performing unsupervised speech signal feature extraction on the pre-processed speech signal using a deep belief network to obtain speech signal features;
an emotion recognition module for classifying the speech signal features by speech emotion using a support vector machine to obtain a speech emotion recognition result.
5. The speech emotion recognition system based on a deep belief network according to claim 4, characterized in that the feature extraction module specifically comprises:
a deep belief network building unit for stacking N layers of restricted Boltzmann machines from the lowest layer to the highest to obtain a deep belief network;
an unsupervised training unit for performing unsupervised training on the i-th layer restricted Boltzmann machine according to the pre-processed speech signal to obtain an i-th optimal parameter, the i-th optimal parameter being the optimal parameter of the i-th layer restricted Boltzmann machine, wherein i takes the values 1, 2, ..., N in turn, and for performing unsupervised training on the (i+1)-th layer restricted Boltzmann machine according to the i-th optimal parameter and the pre-processed speech signal to obtain an (i+1)-th optimal parameter;
a parameter fine-tuning unit for fine-tuning the optimal parameters by global training so that the deep belief network converges to a global optimum, obtaining fine-tuned optimal parameters;
a feature extraction unit for extracting the speech signal features of the pre-processed speech signal according to the fine-tuned optimal parameters.
6. The speech emotion recognition system based on a deep belief network according to claim 4, characterized in that the emotion recognition module specifically comprises:
a kernel function unit for mapping sample points of the speech signal features into a high-dimensional feature space using a kernel function to obtain samples that are linearly separable in that space;
a logical decision unit for the support vector machine to make a logical decision on the speech signal features according to the linearly separable samples to obtain the speech emotion recognition result.
CN201910173690.3A 2019-02-28 2019-02-28 Speech emotion recognition method and system based on a deep belief network Pending CN109785863A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910173690.3A CN109785863A (en) 2019-02-28 2019-02-28 Speech emotion recognition method and system based on a deep belief network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910173690.3A CN109785863A (en) 2019-02-28 2019-02-28 Speech emotion recognition method and system based on a deep belief network

Publications (1)

Publication Number Publication Date
CN109785863A true CN109785863A (en) 2019-05-21

Family

ID=66486177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910173690.3A Pending CN109785863A (en) Speech emotion recognition method and system based on a deep belief network

Country Status (1)

Country Link
CN (1) CN109785863A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101561651B1 (en) * 2014-05-23 2015-11-02 서강대학교산학협력단 Interest detecting method and apparatus based feature data of voice signal using Deep Belief Network, recording medium recording program of the method
CN106297825A (en) * 2016-07-25 2017-01-04 华南理工大学 A kind of speech-emotion recognition method based on integrated degree of depth belief network
CN107092895A (en) * 2017-05-09 2017-08-25 重庆邮电大学 A kind of multi-modal emotion identification method based on depth belief network
CN108717856A (en) * 2018-06-16 2018-10-30 台州学院 A kind of speech-emotion recognition method based on multiple dimensioned depth convolution loop neural network
CN109036468A (en) * 2018-11-06 2018-12-18 渤海大学 Speech-emotion recognition method based on deepness belief network and the non-linear PSVM of core

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUANG Chenchen et al., "Research on Speech Emotion Recognition Based on Deep Belief Networks", Journal of Computer Research and Development *
HUANG Jubin, "Speech Emotion Recognition Based on Deep Belief Networks", China Master's Theses Full-text Database *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619893A (en) * 2019-09-02 2019-12-27 合肥工业大学 Time-frequency feature extraction and artificial intelligence emotion monitoring method of voice signal
CN112687294A (en) * 2020-12-21 2021-04-20 重庆科技学院 Vehicle-mounted noise identification method

Similar Documents

Publication Publication Date Title
Kim et al. Towards speech emotion recognition "in the wild" using aggregated corpora and deep multi-task learning
CN108597539A (en) Speech-emotion recognition method based on parameter migration and sound spectrograph
CN110136690A (en) Phoneme synthesizing method, device and computer readable storage medium
CN107577662A (en) Towards the semantic understanding system and method for Chinese text
CN109523994A (en) A kind of multitask method of speech classification based on capsule neural network
CN109409296A (en) The video feeling recognition methods that facial expression recognition and speech emotion recognition are merged
CN107291822A (en) The problem of based on deep learning disaggregated model training method, sorting technique and device
CN108711421A (en) A kind of voice recognition acoustic model method for building up and device and electronic equipment
CN109036465A (en) Speech-emotion recognition method
CN110853656B (en) Audio tampering identification method based on improved neural network
CN111581385A (en) Chinese text type identification system and method for unbalanced data sampling
CN104091602A (en) Speech emotion recognition method based on fuzzy support vector machine
CN104834941A (en) Offline handwriting recognition method of sparse autoencoder based on computer input
CN110223714A (en) A kind of voice-based Emotion identification method
Kurpukdee et al. Speech emotion recognition using convolutional long short-term memory neural network and support vector machines
Maheswari et al. A hybrid model of neural network approach for speaker independent word recognition
CN111899766B (en) Speech emotion recognition method based on optimization fusion of depth features and acoustic features
CN110211595A (en) A kind of speaker clustering system based on deep learning
CN110289002A (en) A kind of speaker clustering method and system end to end
CN105845141A (en) Speaker confirmation model, speaker confirmation method and speaker confirmation device based on channel robustness
CN109785863A (en) A kind of speech-emotion recognition method and system of deepness belief network
Sun et al. A novel convolutional neural network voiceprint recognition method based on improved pooling method and dropout idea
Atkar et al. Speech emotion recognition using dialogue emotion decoder and CNN Classifier
Sivaram et al. Data-driven and feedback based spectro-temporal features for speech recognition
CN110348482A (en) A kind of speech emotion recognition system based on depth model integrated architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190521