CN108922512A - Personalized robot telephone customer service system - Google Patents

Personalized robot telephone customer service system

Info

Publication number
CN108922512A
CN108922512A CN201810726770.2A
Authority
CN
China
Prior art keywords
voice signal
information
customer service
service system
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810726770.2A
Other languages
Chinese (zh)
Inventor
陈志林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Pig Strong Internet Technology Co Ltd
Original Assignee
Guangdong Pig Strong Internet Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Pig Strong Internet Technology Co Ltd filed Critical Guangdong Pig Strong Internet Technology Co Ltd
Priority to CN201810726770.2A priority Critical patent/CN108922512A/en
Publication of CN108922512A publication Critical patent/CN108922512A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 Speech synthesis; Text to speech systems
    • G10L 13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L 25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 3/00 Automatic or semi-automatic exchanges
    • H04M 3/42 Systems providing special services or facilities to subscribers
    • H04M 3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Child & Adolescent Psychology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a personalized robot telephone customer service system, comprising: a communication device, for receiving a first voice signal sent by a user terminal and sending a second voice signal generated by an intelligent robot to the user terminal; an intelligent robot, connected to the communication device, for receiving the first voice signal, recognizing the first voice signal, looking up the corresponding reply information, and generating a corresponding second voice signal as the reply; and a database, communicatively connected to the intelligent robot, for storing the reply information and related data used for replies. The robot telephone customer service system of the present invention can accurately recognize the user's voice signal and the emotion information it carries, provides personalized replies for the user, is highly adaptable, and improves the system's level of intelligent interaction.

Description

Personalized robot telephone customer service system
Technical field
The present invention relates to the technical field of customer service robots, and in particular to a personalized robot telephone customer service system.
Background technique
Communication between customer service staff and users is part of an enterprise's follow-up service to its customers. At present, however, communication between an enterprise and its clients relies on manual service: the client dials in and is then connected to a human agent. Because of the limited capacity of telephone lines, many clients cannot get through to the enterprise in time, so their problems are not resolved promptly. The biggest problem today is that handling a single issue requires a long wait; clients often become irritated while waiting, which affects subsequent communication and damages the corporate image.
In the prior art, customer service robots have begun to be put into use, giving patterned feedback in response to the instructions they receive. However, such customer service robots can usually only passively receive user instructions; they cannot "understand" the user's emotions or behaviour the way a natural person does and respond in a humanized manner, so their interaction performance still needs to be improved.
Summary of the invention
In view of the above problems, the present invention aims to provide a personalized robot telephone customer service system.
The purpose of the present invention is achieved by the following technical solution:
A personalized robot telephone customer service system, comprising:
A communication device: for receiving a first voice signal sent by a user terminal, and sending a second voice signal generated by an intelligent robot to the user terminal;
An intelligent robot: connected to the communication device, for receiving the first voice signal, recognizing the first voice signal, looking up the corresponding reply information, and generating a corresponding second voice signal as the reply;
A database: communicatively connected to the intelligent robot, for storing the reply information and related data used for replies;
wherein the intelligent robot comprises:
A control processing module: for controlling the operation of the other modules;
A speech recognition module: for converting the received first voice signal into corresponding text, and obtaining the speech emotion information contained in the first voice signal;
A semantic recognition module: for segmenting the obtained text into words, determining the morpheme role of each segmented word, and determining the sentence type of the text;
A reply module: for querying the database for matching reply information according to the obtained segmented morphemes, sentence type and speech emotion information;
A speech synthesis module: for organizing the text of the retrieved reply information into natural language and converting it into the corresponding second voice signal.
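To make the division of labour between these modules concrete, the following minimal Python sketch traces one call through the system. The class and method names are illustrative assumptions only; the patent does not disclose an implementation.

```python
# Illustrative sketch of the module pipeline described above; all names are assumed.

class RobotCustomerService:
    def __init__(self, recognizer, semantic, reply_db, synthesizer):
        self.recognizer = recognizer    # speech recognition module
        self.semantic = semantic        # semantic recognition module
        self.reply_db = reply_db        # database queried by the reply module
        self.synthesizer = synthesizer  # speech synthesis module

    def handle_call(self, first_voice_signal):
        # 1. Speech recognition: text plus speech emotion information
        text, emotion = self.recognizer.recognize(first_voice_signal)
        # 2. Semantic recognition: word segmentation, morpheme roles, sentence type
        tokens, sentence_type = self.semantic.analyze(text)
        # 3. Reply module: query the database for matching reply information
        reply_text = self.reply_db.match(tokens, sentence_type, emotion)
        # 4. Speech synthesis: organize natural language, produce the second voice signal
        return self.synthesizer.synthesize(reply_text)
```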
Preferably, the database comprises an event base, an action base, an emotion base, a knowledge base, a think tank, a phrase base and a dictionary, which respectively store, in imitation of the human brain, event information, action information, emotion information, knowledge information, concept information, the correspondence between phrases and standard expressions, and standard words.
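As a rough illustration only (the patent does not specify a data model), the seven stores could be organized along the following lines; every key and sample entry below is an assumption.

```python
# Hypothetical layout of the database contents; not part of the patent disclosure.
reply_database = {
    "event_base":     {},                                        # simulated event information
    "action_base":    {},                                        # action information
    "emotion_base":   {"angry": "很抱歉给您带来不便，我马上为您处理。"},  # soothing replies per emotion
    "knowledge_base": {},                                        # knowledge information
    "think_tank":     {},                                        # concept information
    "phrase_base":    {"咋办": "怎么办"},                          # phrase -> standard expression
    "dictionary":     ["退款", "订单", "发票"],                     # standard words
}
```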
Preferably, the system further comprises a modification module for modifying the information stored in the database or adding new information to it.
Preferably, the speech recognition module further comprises:
A preprocessing unit, which performs denoising, framing and windowing on the first voice signal;
A feature extraction unit, which performs feature extraction on the preprocessed first voice signal to obtain the affective characteristic parameters of the first voice signal;
An emotion recognition unit, which obtains the speech emotion information of the first voice signal according to the affective characteristic parameters.
Beneficial effects of the present invention: the robot telephone customer service system of the present invention establishes a communication connection with the user terminal, receives the user terminal's voice signal, processes the acquired voice signal to recognize its semantics and speech emotion information, analyzes them, and replies to the user with matching reply information. It can accurately recognize the user's voice signal and the emotion information it carries, provides personalized replies for the user, is highly adaptable, and improves the system's level of intelligent interaction.
Brief description of the drawings
The present invention will be further described with reference to the accompanying drawings. The embodiments in the drawings do not constitute any limitation of the present invention; for those of ordinary skill in the art, other drawings can be obtained from the following drawings without creative effort.
Fig. 1 is a block diagram of the present invention;
Fig. 2 is a block diagram of the intelligent robot of the present invention;
Fig. 3 is a block diagram of the database of the present invention;
Fig. 4 is a block diagram of the speech recognition module of the present invention.
Reference numerals:
Communication device 1, intelligent robot 2, database 3, control processing module 21, speech recognition module 22, semantic recognition module 23, reply module 24, speech synthesis module 25, event base 31, action base 32, emotion base 33, knowledge base 34, think tank 35, phrase base 36, dictionary 37, preprocessing unit 221, feature extraction unit 222, emotion recognition unit 223.
Specific embodiment
The invention will be further described in conjunction with the following application scenarios.
Referring to Fig. 1 and Fig. 2, a personalized robot telephone customer service system is shown, comprising:
A communication device 1: for receiving the first voice signal sent by a user terminal, and sending the second voice signal generated by the intelligent robot 2 to the user terminal;
An intelligent robot 2: connected to the communication device 1, for receiving the first voice signal, recognizing the first voice signal, looking up the corresponding reply information, and generating a corresponding second voice signal as the reply;
A database 3: communicatively connected to the intelligent robot 2, for storing the reply information and related data used for replies;
wherein the intelligent robot 2 comprises:
A control processing module 21: for controlling the operation of the other modules;
A speech recognition module 22: for converting the received first voice signal into corresponding text, and obtaining the speech emotion information contained in the first voice signal;
A semantic recognition module 23: for segmenting the obtained text into words, determining the morpheme role of each segmented word, and determining the sentence type of the text;
A reply module 24: for querying the database 3 for matching reply information according to the obtained segmented morphemes, sentence type and speech emotion information;
A speech synthesis module 25: for organizing the text of the retrieved reply information into natural language and converting it into the corresponding second voice signal.
In the above embodiment of the present invention, the system establishes a communication connection with the user terminal, receives the user terminal's voice signal, processes the acquired voice signal to recognize its semantics and speech emotion information, analyzes them, and replies to the user with matching reply information. It can accurately recognize the user's voice signal and the emotion information it carries, provides personalized replies for the user, is highly adaptable, and improves the system's level of intelligent interaction.
Preferably, the morpheme roles include subject terms, subordinate terms, action terms and thought-expression terms.
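A minimal sketch of the semantic recognition step, assuming the open-source jieba segmenter for word segmentation and a rule-based sentence-type check; the patent names neither a tool nor concrete rules, so everything below is illustrative.

```python
# Illustrative only; jieba is one possible segmenter, not one named in the patent.
import jieba.posseg as pseg

def analyze(text: str):
    # Word segmentation with part-of-speech tags (used here as a stand-in for morpheme roles)
    tokens = [(pair.word, pair.flag) for pair in pseg.cut(text)]
    # Simple rule-based sentence-type detection
    if text.endswith(("?", "？")) or "吗" in text:
        sentence_type = "question"
    elif text.endswith(("!", "！")):
        sentence_type = "exclamation"
    else:
        sentence_type = "statement"
    return tokens, sentence_type

print(analyze("我的订单怎么还没有发货？"))
```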
Preferably, the communication device is communicatively connected to the user terminal.
Preferably, referring to Fig. 3, the database 3 comprises an event base 31, an action base 32, an emotion base 33, a knowledge base 34, a think tank 35, a phrase base 36 and a dictionary 37, which respectively store, in imitation of the human brain, event information, action information, emotion information, knowledge information, concept information, the correspondence between phrases and standard expressions, and standard words.
Preferably, the system further comprises a modification module for modifying the information stored in the database 3 or adding new information to it.
Preferably, referring to Fig. 4, the speech recognition module 22 further comprises:
A preprocessing unit 221, which performs denoising, framing and windowing on the first voice signal;
A feature extraction unit 222, which performs feature extraction on the preprocessed first voice signal to obtain the affective characteristic parameters of the first voice signal;
An emotion recognition unit 223, which obtains the speech emotion information of the first voice signal according to the affective characteristic parameters.
In the above embodiment of the present invention, feature extraction is performed on the acquired voice signal to obtain its affective characteristic parameters, and the speech emotion information of the voice signal is then accurately recognized from these parameters; this lays the foundation for the system to reply appropriately according to the speech emotion information in the user's voice signal.
Preferably, the preprocessing unit 221 further comprises:
performing framing on the received first voice signal to obtain the voice sequence δc(τ), c = 1, 2, ..., C, where c indexes the frames, C denotes the number of frames, and τ denotes the sample index within a frame.
In the above embodiment of the present invention, framing the voice signal lays the foundation for the subsequent feature extraction.
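A short numpy sketch of the framing and Hamming-window step; the frame length and hop size are arbitrary illustration values, since the patent does not fix them.

```python
import numpy as np

def frame_signal(signal: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Split a 1-D voice signal into overlapping frames delta_c(tau) and apply a Hamming window."""
    num_frames = 1 + max(0, (len(signal) - frame_len) // hop)   # C
    window = np.hamming(frame_len)                              # lambda(tau0 - tau)
    return np.stack([signal[c * hop : c * hop + frame_len] * window
                     for c in range(num_frames)])               # shape (C, frame_len)

# Example: one second of a 16 kHz test tone
x = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
print(frame_signal(x).shape)   # (98, 400)
```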
Preferably, the feature extraction unit 222 further comprises:
(1) STFT transform: a Short-Time Fourier Transform (STFT) is applied, frame by frame, to the acquired voice sequence.
Here, δc(τ) denotes the c-th frame of the voice sequence, λ(τ0 − τ) denotes the Hamming window function, Sc(ι) denotes the STFT coefficients of δc(τ), ι = 0, 1, ..., L, L denotes the Hamming window length, and j denotes the imaginary unit of the complex exponential.
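The formula image of the original filing is not reproduced in this text version; a standard windowed STFT consistent with the symbols defined above would read as follows (a reconstruction under that assumption, not the filed expression):

```latex
S_c(\iota) = \sum_{\tau=0}^{L-1} \delta_c(\tau)\,\lambda(\tau_0-\tau)\,
             e^{-j\,2\pi \iota \tau / L}, \qquad \iota = 0,1,\dots,L
```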
(2) Obtain the logarithmic energy spectrum: the logarithmic energy spectrum of the STFT coefficients of the voice sequence is obtained by passing them through a Mel filter bank and taking the logarithm of the filtered energies.
Here, Melg(μ) denotes the Mel filter function, μl and μt denote the lower and upper limits of the Mel spectrum respectively, 0 ≤ g < G, G denotes the number of Mel filters, the per-frame logarithmic energies of the Mel filter bank outputs of all frames together constitute the logarithmic energy spectrum, and γ denotes an adjustment factor.
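The logarithmic energy formula image is likewise missing; one standard log-Mel energy form matching the listed symbols, with Ec(g) used here as an assumed name for the per-frame logarithmic energy, is:

```latex
E_c(g) = \log\!\Big(\gamma \sum_{\iota=0}^{L} \mathrm{Mel}_g(\iota)\,\big|S_c(\iota)\big|^{2}\Big),
\qquad 0 \le g < G
```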
(3) Obtain the Gabor spectrum: the Gabor spectrum of the voice signal is obtained by convolving the logarithmic energy spectrum with a bank of Gabor wavelet kernels.
Here, conv(·) denotes the convolution operator, the convolved quantity is the logarithmic energy spectrum, and ξx,y denotes the Gabor wavelet kernel function,
wherein:
x denotes the kernel direction of the Gabor wavelet, x = 1, 2, ..., X, X denotes the number of Gabor wavelet kernel directions; y denotes the kernel scale of the Gabor wavelet, y = 1, 2, ..., Y, Y denotes the number of Gabor wavelet kernel scales; k = (u, v) denotes the spatial position of a point of the spectrum; σ denotes the radius of the Gaussian function; ιy denotes the scale parameter corresponding to kernel scale y; and a direction parameter determined by the kernel direction x sets the orientation of the kernel.
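The kernel formula image is also missing; the widely used 2-D Gabor wavelet family that fits these symbols is given below, where q_{x,y} and φ_x are assumed names for the wave vector and direction angle, so this is a sketch rather than the filed expression:

```latex
\xi_{x,y}(\mathbf{k}) = \frac{\lVert \mathbf{q}_{x,y}\rVert^{2}}{\sigma^{2}}
\exp\!\Big(-\frac{\lVert \mathbf{q}_{x,y}\rVert^{2}\,\lVert \mathbf{k}\rVert^{2}}{2\sigma^{2}}\Big)
\Big[\exp\!\big(j\,\mathbf{q}_{x,y}\!\cdot\!\mathbf{k}\big)-\exp\!\big(-\tfrac{\sigma^{2}}{2}\big)\Big],
\qquad \mathbf{q}_{x,y}=\iota_{y}\,\big(\cos\phi_{x},\,\sin\phi_{x}\big)
```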
(4) Obtain the local energy distribution profile: the local energy distribution profile of the Gabor spectrum is obtained as follows:
the Gabor spectrum is divided into (C − d + 1)(G/d) small blocks φuv,
where u = 1, ..., C − d + 1, v = 1, ..., G/d, and d denotes the block size;
the Gabor first-order invariant moment parameter of each small block φuv is then obtained, where the first-order invariant moment is computed as
β1 = γ1·y20 + γ2·y02 + γ3·y11,
wherein
β1 denotes the first-order invariant moment parameter, dL and dH denote the width and height of a block, ymn denotes the normalized central moment of order m + n, fmn denotes the central moment of order m + n, the raw moments of order m + n determine the energy centroid of the block, y20 is ymn with m = 2 and n = 0 (y02 and y11 are defined analogously), and γ1, γ2, γ3 denote preset adjustment factors;
through the above processing, the Gabor first-order invariant moment parameters obtained from the small blocks are combined to form the Gabor local energy distribution profile;
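A numpy sketch of the block-moment step, assuming the usual image-moment definitions for the central and normalized central moments (the patent's exact normalization is not reproduced here); the block size and adjustment factors are placeholders.

```python
import numpy as np

def first_order_invariant_moment(block: np.ndarray, gammas=(1.0, 1.0, 1.0)) -> float:
    """beta_1 = gamma1*y20 + gamma2*y02 + gamma3*y11 for one Gabor-spectrum block."""
    h, w = block.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    m00 = block.sum() + 1e-12
    u_bar, v_bar = (u * block).sum() / m00, (v * block).sum() / m00   # energy centroid

    def normalized_central_moment(p, q):
        f_pq = (((u - u_bar) ** p) * ((v - v_bar) ** q) * block).sum()  # central moment f_pq
        return f_pq / (m00 ** (1 + (p + q) / 2))                        # usual normalization

    g1, g2, g3 = gammas
    return (g1 * normalized_central_moment(2, 0)
            + g2 * normalized_central_moment(0, 2)
            + g3 * normalized_central_moment(1, 1))

def local_energy_profile(gabor_spec: np.ndarray, d: int = 4) -> np.ndarray:
    """Slide d x d blocks over a (C, G) Gabor spectrum and collect beta_1 per block."""
    C, G = gabor_spec.shape
    return np.array([
        [first_order_invariant_moment(gabor_spec[u:u + d, v * d:(v + 1) * d])
         for v in range(G // d)]
        for u in range(C - d + 1)
    ])
```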
(5) Obtain the affective characteristic parameters: a discrete cosine transform is used to remove the correlation between the coefficients of the Gabor local energy distribution profile, yielding the cepstrum coefficients of the Gabor local energy distribution profile, denoted ρx,y; the cepstrum coefficients ρx,y of the Gabor local energy distribution profiles of different scales and different directions are combined to form the affective characteristic parameters of the voice signal.
Preferably, the feature extraction unit 222 also extracts statistical features from the cepstrum coefficients ρx,y, denoted ωx,y, and likewise extracts statistical features from the first-order and second-order differences of ρx,y; the statistics of ρx,y and of its first- and second-order differences are combined to form the affective characteristic parameters of the voice signal.
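A rough sketch of step (5) and the statistics extraction, using SciPy's DCT; the set of statistics follows the ones named in the paragraph below (mean, standard deviation, maximum, minimum, peak, skewness, range, median), and the assembly into a single vector is an illustrative assumption.

```python
import numpy as np
from scipy.fftpack import dct
from scipy.stats import skew

def cepstrum(profile: np.ndarray, n_keep: int = 12) -> np.ndarray:
    """DCT de-correlation of one local energy distribution profile -> cepstrum coefficients rho."""
    return dct(profile.ravel(), type=2, norm="ortho")[:n_keep]

def statistics(v: np.ndarray) -> np.ndarray:
    """Mean, std, max, min, peak, skewness, range and median of a coefficient sequence."""
    return np.array([v.mean(), v.std(), v.max(), v.min(),
                     np.abs(v).max(), skew(v), v.max() - v.min(), np.median(v)])

def affective_features(profiles) -> np.ndarray:
    """Combine statistics of rho and of its 1st/2nd order differences over all scales/directions."""
    feats = []
    for profile in profiles:          # one profile per (scale y, direction x)
        rho = cepstrum(profile)
        for seq in (rho, np.diff(rho, 1), np.diff(rho, 2)):
            feats.append(statistics(seq))
    return np.concatenate(feats)
```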
Preferably, the number of Gabor wavelet kernel scales used is Y = 5 and the number of kernel directions is X = 8.
In the above embodiment of the present invention, the affective characteristic parameters of the voice signal are extracted by the above method. First, a customized logarithmic energy function is used to obtain the logarithmic energy spectrum of the voice signal, which effectively represents the energy distribution of the voice signal with high accuracy. Then the Gabor spectrum of the voice signal is obtained; the Gabor wavelet kernel function used can amplify the local emotional information of the voice signal while preserving the original information of the signal, improving the robustness of emotion recognition. Next, the Gabor spectrum is divided into small blocks and the first-order invariant moment parameters are obtained; these parameters are combined into the local energy distribution profile, which amplifies the local energy change information of the energy spectrum, highlights the characteristics of speech emotion in the voice signal, and improves the accuracy and adaptability of the extraction of speech affective characteristic parameters. Finally, the cepstrum coefficients of the Gabor local energy distribution profile and the mean, standard deviation, maximum, minimum, peak, skewness, range and median of their first- and second-order differences are used to form the affective characteristic parameters of the voice signal, which effectively reduces the redundancy of affective characteristic parameter acquisition and improves the recognition performance of the affective characteristic parameters.
Preferably, the emotion recognition unit 223 further matches the obtained affective characteristic parameters against the preset affective characteristic parameters of different speech emotion classes, and takes the matching speech emotion class as the speech emotion information;
wherein the speech emotion classes include: neutral, happy, angry and sad.
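A minimal sketch of this matching step as nearest-template classification; the patent only states that the parameters are matched against preset class parameters, so the distance metric and the placeholder templates below are assumptions.

```python
import numpy as np

EMOTION_TEMPLATES = {                      # preset affective characteristic parameters per class
    "neutral": np.zeros(8), "happy": np.ones(8),
    "angry": np.full(8, 2.0), "sad": np.full(8, -1.0),
}  # placeholder vectors; real templates would be derived from labelled speech

def recognize_emotion(features: np.ndarray) -> str:
    """Return the speech emotion class whose template is closest to the extracted parameters."""
    return min(EMOTION_TEMPLATES, key=lambda c: np.linalg.norm(features - EMOTION_TEMPLATES[c]))

print(recognize_emotion(np.full(8, 1.8)))   # -> "angry"
```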
In one scenario, when the emotion recognition unit 223 recognizes that the user's speech emotion class is angry, the reply module 24 obtains from the database 3 the relevant reply information used to soothe the user's emotion, and generates the corresponding second voice signal.
Finally, it should be noted that the above embodiments are merely illustrative of the technical solutions of the present invention and do not limit its scope of protection. Although the present invention has been explained in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent replacements may be made to the technical solutions of the present invention without departing from their essence and scope.

Claims (4)

1. A personalized robot telephone customer service system, characterized by comprising:
a communication device: for receiving a first voice signal sent by a user terminal, and sending a second voice signal generated by an intelligent robot to the user terminal;
an intelligent robot: connected to the communication device, for receiving the first voice signal, recognizing the first voice signal, looking up the corresponding reply information, and generating a corresponding second voice signal as the reply;
a database: communicatively connected to the intelligent robot, for storing the reply information and related data used for replies;
wherein the intelligent robot comprises:
a control processing module: for controlling the operation of the other modules;
a speech recognition module: for converting the received first voice signal into corresponding text, and obtaining the speech emotion information contained in the first voice signal;
a semantic recognition module: for segmenting the obtained text into words, determining the morpheme role of each segmented word, and determining the sentence type of the text;
a reply module: for querying the database for matching reply information according to the obtained segmented morphemes, sentence type and speech emotion information;
a speech synthesis module: for organizing the text of the retrieved reply information into natural language and converting it into the corresponding second voice signal.
2. The personalized robot telephone customer service system according to claim 1, characterized in that the database comprises an event base, an action base, an emotion base, a knowledge base, a think tank, a phrase base and a dictionary, which respectively store, in imitation of the human brain, event information, action information, emotion information, knowledge information, concept information, the correspondence between phrases and standard expressions, and standard words.
3. The personalized robot telephone customer service system according to claim 1, characterized in that the system further comprises a modification module for modifying the information stored in the database or adding new information to it.
4. The personalized robot telephone customer service system according to claim 2, characterized in that the speech recognition module further comprises:
a preprocessing unit, which performs denoising, framing and windowing on the first voice signal;
a feature extraction unit, which performs feature extraction on the preprocessed first voice signal to obtain the affective characteristic parameters of the first voice signal;
an emotion recognition unit, which obtains the speech emotion information of the first voice signal according to the affective characteristic parameters.
CN201810726770.2A 2018-07-04 2018-07-04 Personalized robot telephone customer service system Withdrawn CN108922512A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810726770.2A CN108922512A (en) 2018-07-04 2018-07-04 Personalized robot telephone customer service system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810726770.2A CN108922512A (en) 2018-07-04 2018-07-04 Personalized robot telephone customer service system

Publications (1)

Publication Number Publication Date
CN108922512A true CN108922512A (en) 2018-11-30

Family

ID=64423959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810726770.2A Withdrawn CN108922512A (en) 2018-07-04 2018-07-04 Personalized robot telephone customer service system

Country Status (1)

Country Link
CN (1) CN108922512A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109003364A (en) * 2018-07-04 2018-12-14 深圳市益鑫智能科技有限公司 A kind of Gate-ban Monitoring System of Home House based on speech recognition
CN109712645A (en) * 2019-01-10 2019-05-03 上海言通网络科技有限公司 Autonomous phone system and autonomous call method
CN110047517A (en) * 2019-04-24 2019-07-23 京东方科技集团股份有限公司 Speech-emotion recognition method, answering method and computer equipment
CN111590610A (en) * 2020-04-30 2020-08-28 南京智音云数字科技有限公司 Novel intelligent dialogue robot control system and method thereof

Similar Documents

Publication Publication Date Title
CN108922512A (en) Personalized robot telephone customer service system
CN111445903B (en) Enterprise name recognition method and device
WO2023273170A1 (en) Welcoming robot conversation method
KR101901920B1 (en) System and method for providing reverse scripting service between speaking and text for ai deep learning
CN111967224A (en) Method and device for processing dialog text, electronic equipment and storage medium
CN106847279A (en) Man-machine interaction method based on robot operating system ROS
CN111598485A (en) Multi-dimensional intelligent quality inspection method, device, terminal equipment and medium
CN112614510B (en) Audio quality assessment method and device
DE60214850T2 (en) FOR A USER GROUP, SPECIFIC PATTERN PROCESSING SYSTEM
CN111429157A (en) Method, device and equipment for evaluating and processing complaint work order and storage medium
CN110225210A (en) Based on call abstract Auto-writing work order method and system
CN111858854A (en) Question-answer matching method based on historical dialogue information and related device
CN101505328A (en) Network data retrieval method applying speech recognition and system thereof
CN111128240B (en) Voice emotion recognition method based on anti-semantic-erasure
CN116737883A (en) Man-machine interaction method, device, equipment and storage medium
CN111091826A (en) Intelligent voice robot system based on deep learning and finite-state machine
CN111354377A (en) Method and device for recognizing emotion through voice and electronic equipment
CN111625636B (en) Method, device, equipment and medium for rejecting man-machine conversation
CN113726942A (en) Intelligent telephone answering method, system, medium and electronic terminal
CN112347788A (en) Corpus processing method, apparatus and storage medium
CN112836053A (en) Man-machine conversation emotion analysis method and system for industrial field
CN117150338A (en) Task processing, automatic question and answer and multimedia data identification model training method
CN116468048A (en) Internet electronic commerce complaint intelligent processing method based on big data knowledge graph
CN114049875A (en) TTS (text to speech) broadcasting method, device, equipment and storage medium
CN107066567B (en) Topic detection-based user portrait modeling method and system in text conversation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20181130)