CN106782550A - An automatic speech recognition system based on a DSP chip - Google Patents

An automatic speech recognition system based on a DSP chip

Info

Publication number
CN106782550A
Authority
CN
China
Prior art keywords
module
speech recognition
unit
voice signal
pattern matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611064684.7A
Other languages
Chinese (zh)
Inventor
田丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heilongjiang Bayi Agricultural University
Original Assignee
Heilongjiang Bayi Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Heilongjiang Bayi Agricultural University filed Critical Heilongjiang Bayi Agricultural University
Priority to CN201611064684.7A priority Critical patent/CN106782550A/en
Publication of CN106782550A publication Critical patent/CN106782550A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The invention discloses an automatic speech recognition system based on a DSP chip, comprising a voice signal acquisition device, a wavelet filter, a speech signal pre-processing module, a speech signal feature extraction module, a neural network module, a pattern matching module, a speech recognition output module and a DSP chip. The voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, pattern matching module and speech recognition output module are connected in sequence, and the speech signal feature extraction module and the pattern matching module are both connected to the neural network module. The invention draws on the theory and methods of speech signal processing, wavelets and neural networks to study the dynamic recognition of voice signals, and applies wavelet and neural network theory and methods to speech recognition. It can recognize speech automatically, and is simple in structure, easy to use and low in cost.

Description

An automatic speech recognition system based on a DSP chip
Technical field
The present invention relates to the technical field of speech recognition, and in particular to an automatic speech recognition system based on a DSP chip.
Background art
Automatic speech recognition has long been a goal pursued by mankind, and it is a direction that researchers are working hard on in the near and medium term. Its ultimate aim is to let machines understand human language and perform the corresponding functions. Although considerable progress has been made in speech recognition over the past 50 years, there is clearly still a large gap between the current state and the ideal. With the rapid development of computers, speech recognition has been studied in ever greater depth and has developed into a broad interdisciplinary field, closely linked with acoustics, linguistics, psychology, signal processing, artificial intelligence, pattern recognition, information theory and computer science. It shows great application prospects in many fields, and many high-performance speech recognition systems have appeared one after another. At the same time, human-machine interaction through natural language is of far-reaching significance and has broad application prospects. First, intelligent voice input based on pattern recognition technology can have a revolutionary impact on office automation. Second, the wide use of speech recognition technology in the service industry will greatly reduce tedious and monotonous work, save a great deal of manpower and improve working efficiency. Third, speech recognition can also show its advantages in dangerous or harsh working environments and on the battlefield. Research on speech recognition is therefore of far-reaching significance for improving living standards, strengthening national defense, and other areas.
Summary of the invention
In view of the technical problems described in the background art, the present invention proposes an automatic speech recognition system based on a DSP chip.
The automatic speech recognition system based on a DSP chip proposed by the present invention comprises a voice signal acquisition device, a wavelet filter, a speech signal pre-processing module, a speech signal feature extraction module, a neural network module, a pattern matching module, a speech recognition output module and a DSP chip. The voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, pattern matching module and speech recognition output module are connected in sequence; the speech signal feature extraction module and the pattern matching module are both connected to the neural network module; and the voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, neural network module, pattern matching module and speech recognition output module are all connected to the DSP chip.
Preferably, the speech signal pre-processing module comprises a pre-emphasis unit, a windowing unit and an end-point detection unit connected in sequence; the pre-emphasis unit is connected to the wavelet filter, the end-point detection unit is connected to the speech signal feature extraction module, and the pre-emphasis unit is a pre-emphasis filter.
Preferably, the neural network module comprises a training unit, a modeling unit and an inference unit connected in sequence; the training unit is connected to the speech signal feature extraction module, and the inference unit is connected to the pattern matching module.
Preferably, the wavelet filter is used to select the useful information in the voice signal and to suppress the interference that irrelevant information causes to recognition; the speech signal pre-processing module is used to remove the non-speech segments of the voice signal; and the speech signal feature extraction module is used to extract an effective parameter sequence from the pre-processed voice signal for use by the neural network module and the pattern matching module.
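The role of the wavelet filter can be illustrated with a minimal sketch. The wavelet basis and the thresholding rule are not specified here, so the single-level Haar transform, the soft threshold and every function name below are assumptions chosen only for illustration.

```python
import math


def haar_forward(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (assumed rule)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def wavelet_filter(signal, threshold):
    """Keep the approximation band and shrink the detail band before rebuilding."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is returned unchanged; a larger threshold removes more of the fine detail, which is where much broadband noise tends to sit.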
In the present invention, the wavelet filter selects the useful information in the voice signal and suppresses the interference that irrelevant information causes to recognition. The speech signal pre-processing module removes the non-speech segments of the voice signal. The speech signal feature extraction module analyzes the pre-processed voice signal in the time and frequency domains and extracts an effective parameter sequence for use by the neural network module and the pattern matching module. The neural network module summarizes the rules of speech recognition, and the pattern matching module matches the input voice signal against those rules to achieve recognition. The invention draws on the theory and methods of speech signal processing, wavelets and neural networks to study the dynamic recognition of voice signals, and applies wavelet and neural network theory and methods to speech recognition. It can recognize speech automatically, and is simple in structure, easy to use and low in cost.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the automatic speech recognition system based on a DSP chip proposed by the present invention.
Detailed description
The present invention is further explained below with reference to a specific embodiment.
Embodiment
With reference to Fig. 1, this embodiment proposes an automatic speech recognition system based on a DSP chip, comprising a voice signal acquisition device, a wavelet filter, a speech signal pre-processing module, a speech signal feature extraction module, a neural network module, a pattern matching module, a speech recognition output module and a DSP chip. The voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, pattern matching module and speech recognition output module are connected in sequence; the speech signal feature extraction module and the pattern matching module are both connected to the neural network module; and the voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, neural network module, pattern matching module and speech recognition output module are all connected to the DSP chip. In this system, the wavelet filter selects the useful information in the voice signal and suppresses the interference that irrelevant information causes to recognition. The speech signal pre-processing module removes the non-speech segments of the voice signal. The speech signal feature extraction module analyzes the pre-processed voice signal in the time and frequency domains and extracts an effective parameter sequence for use by the neural network module and the pattern matching module. The neural network module summarizes the rules of speech recognition, and the pattern matching module matches the input voice signal against those rules to achieve recognition. The invention draws on the theory and methods of speech signal processing, wavelets and neural networks to study the dynamic recognition of voice signals, and applies wavelet and neural network theory and methods to speech recognition. It can recognize speech automatically, and is simple in structure, easy to use and low in cost.
In this embodiment, the speech signal pre-processing module comprises a pre-emphasis unit, a windowing unit and an end-point detection unit connected in sequence. The pre-emphasis unit is connected to the wavelet filter, the end-point detection unit is connected to the speech signal feature extraction module, and the pre-emphasis unit is a pre-emphasis filter. The neural network module comprises a training unit, a modeling unit and an inference unit connected in sequence; the training unit is connected to the speech signal feature extraction module, and the inference unit is connected to the pattern matching module. The wavelet filter is used to select the useful information in the voice signal and to suppress the interference that irrelevant information causes to recognition. The speech signal pre-processing module is used to remove the non-speech segments of the voice signal. The speech signal feature extraction module is used to extract an effective parameter sequence from the pre-processed voice signal, by time-domain and frequency-domain analysis, for use by the neural network module and the pattern matching module. The neural network module summarizes the rules of speech recognition, and the pattern matching module matches the input voice signal against those rules to achieve recognition. The invention draws on the theory and methods of speech signal processing, wavelets and neural networks to study the dynamic recognition of voice signals, and applies wavelet and neural network theory and methods to speech recognition. It can recognize speech automatically, and is simple in structure, easy to use and low in cost.
In this embodiment, the voice signal acquisition device acquires the voice signal and transmits it to the wavelet filter. The wavelet filter selects the useful information in the voice signal and suppresses the interference that irrelevant information causes to recognition, then passes the voice signal to the speech signal pre-processing module. The pre-emphasis unit boosts the high frequencies to compensate for the high-frequency loss that occurs when sound is radiated from the lips. It passes the digitized voice signal s(n) through a low-order digital system, which may be fixed or slowly adaptive. The pre-emphasis filter uses the most widely used fixed first-order system, whose transfer function is as follows:
Here the pre-emphasis output s'(n) is related to the system input s(n) by the following difference equation:
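As an illustration of the pre-emphasis step, the sketch below assumes the usual fixed first-order form, H(z) = 1 - a·z^(-1), which corresponds to the difference equation s'(n) = s(n) - a·s(n-1). The coefficient name `alpha`, its default value of 0.95 and the function name are assumptions for illustration, not values taken from this document.

```python
def pre_emphasis(samples, alpha=0.95):
    """Apply s'(n) = s(n) - alpha * s(n-1), taking s(-1) = 0.

    alpha is an assumed coefficient; values slightly below 1 are typical.
    """
    out = []
    prev = 0.0
    for s in samples:
        out.append(s - alpha * prev)
        prev = s
    return out
```

A constant (low-frequency) input is strongly attenuated after the first sample, while rapid sample-to-sample changes pass through largely intact, which is the high-frequency boost described above.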
Window functions commonly used by the windowing unit include the rectangular window, the Hamming window and the Hanning window. Because the frequency characteristic of the Hamming window is better suited to analyzing voice signals in practice, the system weights the signal with a Hamming window, whose function is as follows:
Its frequency characteristic is:
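A minimal sketch of the windowing step follows, assuming the common Hamming definition w(n) = 0.54 - 0.46·cos(2πn/(N-1)) for 0 ≤ n ≤ N-1. The frame length, the hop size and the function names are hypothetical choices for illustration.

```python
import math


def hamming_window(length):
    """Return the Hamming window w(n) = 0.54 - 0.46*cos(2*pi*n/(length-1))."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_and_window(samples, frame_len, hop):
    """Split samples into overlapping frames and weight each with the window.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming_window(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + i] * window[i] for i in range(frame_len)])
    return frames
```

The window tapers each frame toward its edges, which reduces the spectral leakage that a rectangular cut would introduce.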
End-point detection unit: the several seconds of recorded speech must undergo end-point detection to distinguish speech segments from silent segments. End-point decisions are possible because the short-time parameters of different kinds of speech have different probability density functions, and because adjacent frames of speech should have consistent speech characteristics. The speech signal pre-processing module then transmits the pre-processed voice signal to the speech signal feature extraction module, which extracts features using linear prediction coefficients and linear prediction cepstral coefficients. Linear prediction coefficients come from the linear prediction of speech, whose basic idea is that each sample of the voice signal can be represented by a weighted sum (a linear combination) of several past samples. Each weighting coefficient is chosen so as to minimize the mean square value of the prediction error.
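The end-point detection step described above can be sketched with a single short-time parameter. The particular parameter and decision rule are not stated here, so the short-time energy threshold, the minimum run length (a simple stand-in for the requirement that adjacent frames behave consistently) and all names below are assumptions.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def detect_endpoints(frames, energy_threshold, min_run=3):
    """Return (start, end) frame indices of the first speech segment.

    `end` is exclusive. A segment counts as speech only if at least
    `min_run` consecutive frames exceed the energy threshold.
    Returns None when no such segment exists.
    """
    run_start = None
    for i, frame in enumerate(frames):
        if short_time_energy(frame) > energy_threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                return run_start, i
            run_start = None
    if run_start is not None and len(frames) - run_start >= min_run:
        return run_start, len(frames)
    return None
```

A practical detector would usually combine energy with other short-time measures, such as the zero-crossing rate, and adapt the threshold to the background noise level.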
If the prediction uses the past p samples, it is called p-th order linear prediction. If the past p samples {x(n-1), x(n-2), ..., x(n-p)} are weighted to predict the current sample x(n), the predicted value is:
Here the weighting coefficients are denoted $-a_{pl}$ and are called prediction coefficients. The prediction error is:
The prediction coefficients are made optimal, that is,
$\varepsilon = E[e^2(n)] = \min$
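A sketch of the prediction and error expressions that this passage relies on, assuming the weights $-a_{pl}$ are indexed by $l = 1, \dots, p$ (the index range and the hat notation are assumptions):

```latex
\hat{x}(n) = -\sum_{l=1}^{p} a_{pl}\, x(n-l)

e(n) = x(n) - \hat{x}(n) = x(n) + \sum_{l=1}^{p} a_{pl}\, x(n-l)

\varepsilon = E\left[e^{2}(n)\right] = \min
```

Under this sign convention the error is the output of the filter $A_p(z) = 1 + \sum_{l=1}^{p} a_{pl} z^{-l}$ driven by $x(n)$, which agrees with the zeroth-order case $A_0(z) = 1$, $e_0(n) = x(n)$.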
The prediction coefficients can be solved with the Durbin recursive algorithm, as follows. The iteration starts from order zero, p = 0. A zeroth-order predictor makes no prediction, so the prediction polynomial is
$A_0(z) = 1$
The prediction error is
$e_0(n) = x(n)$
The prediction error power is
These are the initial conditions of the iteration. The iteration steps are as follows:
1. Initialize.
2. Assume the parameters of the p-th order predictor are known, that is, $A_p(z)$ and $\varepsilon_p$ are known.
3. Compute the reflection coefficient of the (p+1)-th order predictor:
4. Compute the prediction coefficients of the (p+1)-th order predictor:
The prediction polynomial of the corresponding (p+1)-th order predictor is:
$A_{p+1}(z) = A_p(z) - \gamma_{p+1} z^{-(p+1)} A_p(z^{-1})$
5. Compute the (p+1)-th order prediction error power:
6. Return to step 2.
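A sketch of the expressions the steps above rely on, assuming the prediction polynomial is written $A_p(z) = 1 + \sum_{l=1}^{p} a_{pl} z^{-l}$ with $a_{p0} = 1$, and writing $R(k) = E[x(n)\,x(n-k)]$ for the autocorrelation (both are notational assumptions). These forms follow from the stated polynomial update $A_{p+1}(z) = A_p(z) - \gamma_{p+1} z^{-(p+1)} A_p(z^{-1})$ together with the minimum mean square error condition:

```latex
% Initial condition (order zero)
\varepsilon_0 = E\left[x^{2}(n)\right] = R(0)

% Step 3: reflection coefficient
\gamma_{p+1} = \frac{1}{\varepsilon_p} \sum_{l=0}^{p} a_{pl}\, R(p+1-l)

% Step 4: prediction coefficients
a_{p+1,\,l} = a_{pl} - \gamma_{p+1}\, a_{p,\,p+1-l}, \quad 1 \le l \le p,
\qquad a_{p+1,\,p+1} = -\gamma_{p+1}

% Step 5: prediction error power
\varepsilon_{p+1} = \left(1 - \gamma_{p+1}^{2}\right) \varepsilon_p
```

The coefficient update is obtained by matching powers of $z^{-1}$ on both sides of the polynomial recursion, so it is consistent with that equation term by term.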
After the calculation ends, three kinds of results are obtained: the prediction coefficients of the predictor at each order, the reflection coefficients of the predictor at each order, and the prediction error power at each order.
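The recursion can be sketched in code as follows. This is a minimal illustration rather than the implementation used on the DSP chip: it assumes the convention A(z) = 1 + a_1·z^(-1) + ... + a_p·z^(-p), estimates the autocorrelation as a plain sum over one frame, and uses hypothetical function names.

```python
def autocorrelation(frame, max_lag):
    """Unnormalized autocorrelation R(0..max_lag) of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i - k] for i in range(k, n))
            for k in range(max_lag + 1)]


def durbin(r, order):
    """Durbin recursion on autocorrelation values r[0..order].

    Returns (coeffs, gammas, errors) where
      coeffs[p] = [1, a_p1, ..., a_pp] for each order p = 0..order,
      gammas[p] = reflection coefficient of the order p+1 predictor,
      errors[p] = prediction error power of the order p predictor.
    """
    if r[0] <= 0.0:
        raise ValueError("R(0) must be positive")
    coeffs = [[1.0]]        # order zero: A_0(z) = 1
    gammas = []
    errors = [r[0]]         # order zero: error power is R(0)
    for p in range(order):
        a = coeffs[-1]
        # Reflection coefficient of the order p+1 predictor.
        delta = sum(a[l] * r[p + 1 - l] for l in range(p + 1))
        gamma = delta / errors[-1]
        # A_{p+1}(z) = A_p(z) - gamma * z^-(p+1) * A_p(1/z), coefficient-wise.
        padded = a + [0.0]
        coeffs.append([padded[l] - gamma * padded[p + 1 - l]
                       for l in range(p + 2)])
        gammas.append(gamma)
        errors.append((1.0 - gamma * gamma) * errors[-1])
    return coeffs, gammas, errors
```

The three returned lists correspond to the three kinds of results listed above. For example, `durbin(autocorrelation(frame, 10), 10)` would give a tenth-order analysis of one windowed frame, where the order 10 is only an illustrative choice.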
Linear prediction cepstral coefficients: because the voice signal is short-time stationary, speech characteristics can also be represented by a short-time spectrum, and the cepstrum is one commonly used form. The cepstrum is the inverse Fourier transform of the logarithm of the power spectrum obtained by Fourier transforming the signal. It separates the periodic pulse excitation from the vocal tract response, which yields the vocal tract parameters. Cepstral coefficients can be computed directly from the definition of the cepstrum, or obtained recursively from the LPC coefficients. Compared with computing the cepstral coefficients directly, the LPC cepstrum (LPCCEP) requires less computation, so the system uses LPC cepstral coefficients. For the cepstrum based on LPC analysis there is a very simple and effective recursive algorithm:
In the formula, $C_m$ is the cepstral coefficient, $a_p$ is the prediction coefficient, m is the order of the cepstral coefficient (m = 1 to Q), and p is the order of the prediction coefficients. The extracted voice signal features are transmitted to the neural network module, which summarizes the rules of speech recognition. The extracted features are also transmitted to the pattern matching module, which performs matching and recognition on the input voice signal according to the recognition rules summarized by the neural network module.
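The LPC-to-cepstrum recursion can be sketched as below. The sketch assumes the all-pole model H(z) = 1/A(z) with A(z) = 1 + a_1·z^(-1) + ... + a_p·z^(-p), for which the recursion reads c_m = -a_m - Σ_{k=1}^{m-1} (k/m)·c_k·a_{m-k}, with a_m taken as zero for m > p. The sign convention and the function name are assumptions and may differ from the formula intended in this document.

```python
def lpc_to_cepstrum(a, num_ceps):
    """Convert LPC polynomial coefficients to cepstral coefficients c_1..c_Q.

    a        -- [1, a_1, ..., a_p], coefficients of A(z) (assumed convention)
    num_ceps -- Q, the number of cepstral coefficients to compute
    """
    p = len(a) - 1
    c = [0.0] * (num_ceps + 1)          # c[0] is unused here
    for m in range(1, num_ceps + 1):
        acc = a[m] if m <= p else 0.0   # a_m vanishes beyond the LPC order
        for k in range(1, m):
            if m - k <= p:
                acc += (k / m) * c[k] * a[m - k]
        c[m] = -acc
    return c[1:]
```

As a check, a first-order model with A(z) = 1 - 0.5·z^(-1) has cepstral coefficients 0.5^m / m, and the recursion reproduces 0.5, 0.125 and 0.125/3 for m = 1, 2, 3.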
The above is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the invention.

Claims (4)

1. An automatic speech recognition system based on a DSP chip, comprising a voice signal acquisition device, a wavelet filter, a speech signal pre-processing module, a speech signal feature extraction module, a neural network module, a pattern matching module, a speech recognition output module and a DSP chip, characterized in that the voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, pattern matching module and speech recognition output module are connected in sequence; the speech signal feature extraction module and the pattern matching module are both connected to the neural network module; and the voice signal acquisition device, wavelet filter, speech signal pre-processing module, speech signal feature extraction module, neural network module, pattern matching module and speech recognition output module are all connected to the DSP chip.
2. The automatic speech recognition system based on a DSP chip according to claim 1, characterized in that the speech signal pre-processing module comprises a pre-emphasis unit, a windowing unit and an end-point detection unit connected in sequence; the pre-emphasis unit is connected to the wavelet filter, the end-point detection unit is connected to the speech signal feature extraction module, and the pre-emphasis unit is a pre-emphasis filter.
3. The automatic speech recognition system based on a DSP chip according to claim 1, characterized in that the neural network module comprises a training unit, a modeling unit and an inference unit connected in sequence; the training unit is connected to the speech signal feature extraction module, and the inference unit is connected to the pattern matching module.
4. The automatic speech recognition system based on a DSP chip according to claim 1, characterized in that the wavelet filter is used to select the useful information in the voice signal and to suppress the interference that irrelevant information causes to recognition; the speech signal pre-processing module is used to remove the non-speech segments of the voice signal; and the speech signal feature extraction module is used to extract an effective parameter sequence from the pre-processed voice signal for use by the neural network module and the pattern matching module.
CN201611064684.7A 2016-11-28 2016-11-28 A kind of automatic speech recognition system based on dsp chip Pending CN106782550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611064684.7A CN106782550A (en) 2016-11-28 2016-11-28 A kind of automatic speech recognition system based on dsp chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611064684.7A CN106782550A (en) 2016-11-28 2016-11-28 A kind of automatic speech recognition system based on dsp chip

Publications (1)

Publication Number Publication Date
CN106782550A true CN106782550A (en) 2017-05-31

Family

ID=58902279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611064684.7A Pending CN106782550A (en) 2016-11-28 2016-11-28 A kind of automatic speech recognition system based on dsp chip

Country Status (1)

Country Link
CN (1) CN106782550A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108601180A (en) * 2018-06-04 2018-09-28 长江大学 Electric light audio-switch, control system and method based on sound groove recognition technology in e
CN110047480A (en) * 2019-04-22 2019-07-23 哈尔滨理工大学 Added Management robot head device and control for the inquiry of department, community hospital

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102789779A (en) * 2012-07-12 2012-11-21 广东外语外贸大学 Speech recognition system and recognition method thereof
CN103065629A (en) * 2012-11-20 2013-04-24 广东工业大学 Speech recognition system of humanoid robot
CN103236260A (en) * 2013-03-29 2013-08-07 京东方科技集团股份有限公司 Voice recognition system

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
关胜平: ""基于TMS320VC5509A的语音识别与控制***"", 《电子技术应用》 *
曹斌芳: ""一种采用小波变换的实时语音识别***设计"", 《INTERNATIONAL CONFERENCE ON POWER ELECTRONICS & INTELLIGENT TRANSPORTATION SYSTEM》 *
李子琳: ""语音信号识别技术及应用研究"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
李记昌: ""基于DSP的语音处理及识别算法研究"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
田丽: ""小波预处理在语音识别***中的应用"", 《科技创新导报》 *
闫文娟: ""基于TMS320C5409的语音识别***"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Similar Documents

Publication Publication Date Title
EP2695160B1 (en) Speech syllable/vowel/phone boundary detection using auditory attention cues
CN104008751A (en) Speaker recognition method based on BP neural network
CN103117059A (en) Voice signal characteristics extracting method based on tensor decomposition
CN109192200B (en) Speech recognition method
CN113012720B (en) Depression detection method by multi-voice feature fusion under spectral subtraction noise reduction
Ganapathy et al. Feature extraction using 2-d autoregressive models for speaker recognition.
Chaudhary et al. Gender identification based on voice signal characteristics
CN113077806B (en) Audio processing method and device, model training method and device, medium and equipment
CN112786059A (en) Voiceprint feature extraction method and device based on artificial intelligence
Mistry et al. Overview: Speech recognition technology, mel-frequency cepstral coefficients (mfcc), artificial neural network (ann)
CN110136726A (en) A kind of estimation method, device, system and the storage medium of voice gender
CN115510909A (en) Unsupervised algorithm for DBSCAN to perform abnormal sound features
Labied et al. An overview of automatic speech recognition preprocessing techniques
CN112183582A (en) Multi-feature fusion underwater target identification method
Labied et al. Automatic speech recognition features extraction techniques: A multi-criteria comparison
Mu et al. Voice activity detection optimized by adaptive attention span transformer
CN106782550A (en) A kind of automatic speech recognition system based on dsp chip
CN112634880A (en) Speaker identification method, device, equipment, storage medium and program product
Sundar et al. A mixture model approach for formant tracking and the robustness of student's-t distribution
CN114913859B (en) Voiceprint recognition method, voiceprint recognition device, electronic equipment and storage medium
Liu et al. Pitch-synchronous linear prediction analysis of high-pitched speech using weighted short-time energy function
Chazan et al. Efficient periodicity extraction based on sine-wave representation and its application to pitch determination of speech signals.
KR20180101057A (en) Method and apparatus for voice activity detection robust to noise
CN114550696A (en) Method and system for realizing emotion judgment through voice recognition
CN113742515A (en) Audio classification method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170531