CN107204189A - Speech recognition system and method capable of loading a personalized feature model - Google Patents
Speech recognition system and method capable of loading a personalized feature model
- Publication number
- CN107204189A (application number CN201610150095.4A)
- Authority
- CN
- China
- Prior art keywords
- speech recognition
- chip
- model
- loading
- individualized feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention provides a speech recognition system and method capable of loading a personalized feature model. The system includes: a speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal; a digital signal processor, for applying the speech recognition algorithm to the input digital audio signal; a FLASH chip, for storing the speech recognition program of the digital signal processor and the general speech model data, where after power-on startup the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip; and a DDR RAM chip, for running the speech recognition program and storing the general speech model data and the personalized feature model data. The invention can be used for voice-controlled UI technology and can load a personalized feature model, greatly improving the recognition rate and the reliability of recognition.
Description
Technical field
The present invention relates to the technical field of embedded speech recognition, and in particular to a speech recognition system and method capable of loading a personalized feature model.
Background art
Human-machine interface technology based on buttons and touch screens is highly mature and has greatly improved the convenience of operating equipment. Speech is the natural interface for humans, yet technology that uses speech recognition to control and operate equipment is only just getting started. One reason is that speech recognition technology is extremely complex; the other is that embedded computing power is insufficient, so algorithms verified on a PC are difficult to port to embedded systems.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide a speech recognition system and method capable of loading a personalized feature model, which can be used for voice-controlled UI technology and can load a personalized feature model, greatly improving the recognition rate and the reliability of recognition.
According to one aspect of the present invention, a speech recognition system capable of loading a personalized feature model is provided, the system including:
A speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal;
A digital signal processor, for applying the speech recognition algorithm to the input digital audio signal and, after recognition is complete, synthesizing the recognition result into speech as a digital audio signal that is sent to the speech codec chip for voice output;
A FLASH chip, for storing the speech recognition program of the digital signal processor and the general speech model data; after power-on startup, the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip;
A DDR RAM chip, for running the speech recognition program and storing the general speech model data and the personalized feature model data;
A serial port chip, through which the digital signal processor communicates externally and outputs the Chinese character code corresponding to the recognized word;
A network chip, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially heavy and the recognition rate obtained with the general speech model is below 95%.
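The 95% criterion for the network chip can be illustrated with a short check. This is only a sketch: the source does not say how the recognition rate is measured, so the counts `correct` and `total` (for example, from a hypothetical enrollment test with known words) and the function name are assumptions.

```python
def needs_personalized_model(correct: int, total: int, threshold: float = 0.95) -> bool:
    """Return True when the measured recognition rate is below the threshold.

    `correct` and `total` are hypothetical counts from a test session;
    the 0.95 default mirrors the 95% figure stated in the text.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return correct / total < threshold


if __name__ == "__main__":
    # A speaker recognized 94 times out of 100 would qualify for a personalized model.
    print(needs_personalized_model(94, 100))
    print(needs_personalized_model(96, 100))
```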
Preferably, the digital signal processor is a high-performance, low-power floating-point TMS320C6748 digital signal processor.
Preferably, the speech codec chip supports multiple sampling rates.
Preferably, the network chip is a LAN8710A-type chip.
Preferably, both the communication of the digital signal processor and the communication of the speech codec chip use DMA mode.
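The benefit of DMA transfers is that audio blocks arrive in memory without the processor copying each sample. One common arrangement is double ("ping-pong") buffering, simulated below. The source only states that DMA mode is used; the ping-pong scheme, the function name and the block size are assumptions made for illustration.

```python
from typing import Callable, Iterable, List


def ping_pong_process(samples: Iterable[int], block_size: int,
                      process: Callable[[List[int]], None]) -> int:
    """Simulate double-buffered DMA reception.

    One buffer is filled by the (simulated) DMA engine while the other,
    already full, is handed to `process`. Returns the number of full
    blocks delivered; a trailing partial block is not delivered.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    buffers: List[List[int]] = [[], []]
    filling = 0          # index of the buffer the DMA engine is writing to
    delivered = 0
    for sample in samples:
        buffers[filling].append(sample)
        if len(buffers[filling]) == block_size:
            ready = filling
            filling = 1 - filling        # DMA switches to the other buffer
            buffers[filling] = []        # the new fill target starts empty
            process(buffers[ready])      # the CPU works on the full buffer
            delivered += 1
    return delivered


if __name__ == "__main__":
    blocks: List[List[int]] = []
    count = ping_pong_process(range(10), 4, blocks.append)
    print(count, blocks)
```

On real hardware the fill and the processing run concurrently; the simulation only shows the buffer hand-over order.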
The present invention also provides a speech recognition method capable of loading a personalized feature model, which includes the following steps:
Step one: after the system is powered on, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the general speech model data is loaded into the DDR RAM chip, and the program starts running, ready for speech recognition;
Step two: once the speech recognition module is powered on and its program is running, the system monitors the recognition key. When the key is detected as pressed, the system starts controlling the audio codec chip to perform A/D conversion and receive the speech signal, and then performs speech recognition with the speech recognition algorithm. When the system detects that the key has been released, it outputs the Chinese character code corresponding to the recognized word through the serial port, performs speech synthesis on the recognized word, and controls the audio codec chip to D/A-convert the synthesis result into an analog speech signal for output;
Step three: while the speech recognition module is running, if the model switching key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the general speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;
Step four: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in the FLASH chip and the DDR RAM chip, and subsequent speech recognition uses the newly loaded model.
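Steps three and four can be sketched as a small model store. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, models are identified by name, personalized models are cycled in the order they were received, and plain Python attributes stand in for the FLASH and DDR RAM chips.

```python
class ModelStore:
    """Toy stand-in for model handling in FLASH (persistent) and DDR RAM (active)."""

    GENERAL = "general"

    def __init__(self, general_model: bytes) -> None:
        self.flash = {self.GENERAL: general_model}  # persistent copies
        self.personal_order = []                    # personalized names, arrival order
        self.active = self.GENERAL                  # name of the model now in RAM
        self.ram = general_model                    # model data used for recognition

    def _load(self, name: str) -> str:
        self.ram = self.flash[name]
        self.active = name
        return name

    def on_switch_key(self) -> str:
        """Step three: load the next personalized model, else the general model."""
        if self.active == self.GENERAL:
            index = 0
        else:
            index = self.personal_order.index(self.active) + 1
        if index < len(self.personal_order):
            return self._load(self.personal_order[index])
        return self._load(self.GENERAL)

    def on_network_model(self, name: str, data: bytes) -> str:
        """Step four: store a received model in FLASH and RAM and make it active."""
        if name == self.GENERAL:
            raise ValueError("name is reserved for the general model")
        if name not in self.personal_order:
            self.personal_order.append(name)
        self.flash[name] = data
        return self._load(name)


if __name__ == "__main__":
    store = ModelStore(b"general-model-data")
    store.on_network_model("speaker_a", b"model-a")
    print(store.on_switch_key())   # no further personalized model, so back to general
    print(store.on_switch_key())   # first personalized model again
```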
Compared with the prior art, the present invention has the following beneficial effect: it can be used for voice-controlled UI technology and can load a personalized feature model, greatly improving the recognition rate and the reliability of recognition.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a schematic block diagram of the speech recognition system capable of loading a personalized feature model according to the present invention.
Fig. 2 is a flow chart of the speech recognition method capable of loading a personalized feature model according to the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the inventive concept. These all fall within the protection scope of the present invention.
As shown in Fig. 1, the speech recognition system capable of loading a personalized feature model according to the present invention includes:
A speech codec chip 104, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal;
A digital signal processor (DSP) 101, for applying the speech recognition algorithm to the input digital audio signal and, after recognition is complete, synthesizing the recognition result into speech as a digital audio signal that is sent to the speech codec chip for voice output;
A FLASH chip 102, for storing the speech recognition program of the digital signal processor and the general speech model data; after power-on startup, the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip;
A DDR RAM chip 103, for running the speech recognition program and storing the general speech model data and the personalized feature model data;
A serial port chip 105, through which the DSP communicates externally and outputs the Chinese character code corresponding to the recognized word;
A network chip 106, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially heavy and the recognition rate obtained with the general speech model is below 95%.
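The serial output of a recognized word can be illustrated as follows. The source does not name the character code table, so the use of GB2312 here is an assumption, as are the function name and the example word; a real device would write the resulting bytes to its UART driver.

```python
def encode_for_serial(word: str, encoding: str = "gb2312") -> bytes:
    """Encode a recognized word as Chinese character codes for serial output.

    GB2312 is an assumed code table (two bytes per Chinese character);
    the patent text only says "Chinese character code".
    """
    return word.encode(encoding)


if __name__ == "__main__":
    payload = encode_for_serial("中文")
    # Show the bytes that would be written to the serial port.
    print(payload.hex(" "))
```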
The speech recognition system capable of loading a personalized feature model according to the present invention may also include a lithium battery 107, which powers the speech recognition system.
In one embodiment, the digital signal processor 101 may be a high-performance, low-power floating-point TMS320C6748 DSP. To reduce power consumption, the number of processor interfaces in use is kept as small as possible, and the processor operating frequency is reduced as far as the algorithm processing allows. The FLASH chip 102 and the DDR RAM chip 103 may be any commonly available chips supported by this digital signal processor. The serial port chip 105 may be a chip conforming to any one of the RS232, RS422 or RS485 standards. The speech codec chip 104 needs to support multiple sampling rates, such as 8 kHz, 16 kHz and 44.1 kHz, and sampling precisions of 16 bit and 24 bit. The network chip 106 may be a LAN8710A-type chip.
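The codec requirements above can be captured in a small configuration check. This is a sketch only: the text lists these sampling rates as examples, so treating them as the complete supported set is an assumption, and the class and constant names are hypothetical.

```python
from dataclasses import dataclass

# Example values taken from the text; a real codec may support more.
SUPPORTED_RATES_HZ = (8000, 16000, 44100)
SUPPORTED_BITS = (16, 24)


@dataclass(frozen=True)
class CodecConfig:
    """Sampling configuration for the speech codec chip."""

    sample_rate_hz: int
    bits_per_sample: int

    def __post_init__(self) -> None:
        if self.sample_rate_hz not in SUPPORTED_RATES_HZ:
            raise ValueError(f"unsupported sampling rate: {self.sample_rate_hz} Hz")
        if self.bits_per_sample not in SUPPORTED_BITS:
            raise ValueError(f"unsupported sampling precision: {self.bits_per_sample} bit")


if __name__ == "__main__":
    print(CodecConfig(sample_rate_hz=16000, bits_per_sample=24))
```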
In one embodiment, the speech codec chip is configured for a 16 kHz sampling rate and a sampling precision of 24 bit. The digital signal processor and the speech codec chip may communicate over IIS (I2S), at a transfer rate of 48K bytes per second. To reduce the load on the digital signal processor so that it mainly runs the recognition algorithm, both the communication of the digital signal processor and the communication of the speech codec chip use DMA (Direct Memory Access) mode.
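The 48K bytes per second figure follows directly from the sampling parameters. The short calculation below reproduces it; the single-channel (mono) assumption is not stated in the source but is the case in which the figure matches, and the function name is hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw PCM data rate in bytes per second (mono by default, an assumption)."""
    total_bits = sample_rate_hz * bits_per_sample * channels
    return total_bits // 8


if __name__ == "__main__":
    # 16,000 samples/s x 24 bit = 384,000 bit/s = 48,000 bytes/s
    print(pcm_bytes_per_second(16000, 24))
```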
As shown in Fig. 2, the speech recognition method capable of loading a personalized feature model according to the present invention includes the following steps:
Step one: after the system is powered on, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the general speech model data is loaded into the DDR RAM chip (if personalized speech model data exists, the personalized speech model data is loaded into the DDR RAM chip), and the program starts running, ready for speech recognition;
Step two: once the speech recognition module is powered on and its program is running, the system monitors the recognition key. When the key is detected as pressed, the system starts controlling the audio codec chip to perform A/D conversion and receive the speech signal, and then performs speech recognition with the speech recognition algorithm. When the system detects that the key has been released, it outputs the Chinese character code corresponding to the recognized word through the serial port, performs speech synthesis on the recognized word, and controls the audio codec chip to D/A-convert the synthesis result into an analog speech signal for output;
Step three: while the speech recognition module is running, if the model switching key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the general speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;
Step four: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in the FLASH chip and the DDR RAM chip, and subsequent speech recognition uses the newly loaded model.
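Steps one and two can be sketched as follows. The sketch makes several assumptions that go beyond the text: all names are hypothetical, the first stored personalized model is the one preferred at boot, recognition is run once on key release rather than during capture, and the recognizer, serial writer and synthesizer are passed in as plain callables.

```python
from typing import Callable, Dict, List, Optional


def select_boot_model(flash: Dict[str, bytes]) -> str:
    """Step one: prefer a stored personalized model, else the general model."""
    personalized = [name for name in flash if name != "general"]
    return personalized[0] if personalized else "general"


class PushToTalk:
    """Step two: capture while the recognition key is held, output on release."""

    def __init__(self, recognize: Callable[[List[int]], str],
                 send_serial: Callable[[str], None],
                 speak: Callable[[str], None]) -> None:
        self.recognize = recognize
        self.send_serial = send_serial
        self.speak = speak
        self.recording = False
        self.samples: List[int] = []

    def key_down(self) -> None:
        self.recording = True      # start A/D capture
        self.samples = []

    def on_audio(self, block: List[int]) -> None:
        if self.recording:         # audio outside a key press is ignored
            self.samples.extend(block)

    def key_up(self) -> Optional[str]:
        if not self.recording:
            return None
        self.recording = False
        word = self.recognize(self.samples)
        self.send_serial(word)     # character code out over the serial port
        self.speak(word)           # synthesized speech out through the codec
        return word


if __name__ == "__main__":
    ptt = PushToTalk(lambda s: f"{len(s)} samples", print, print)
    ptt.key_down()
    ptt.on_audio([1, 2, 3])
    ptt.key_up()
```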
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the present invention.
Claims (7)
1. A speech recognition system capable of loading a personalized feature model, characterized in that the speech recognition system includes:
A speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal;
A digital signal processor, for applying the speech recognition algorithm to the input digital audio signal and, after recognition is complete, synthesizing the recognition result into speech as a digital audio signal that is sent to the speech codec chip for voice output;
A FLASH chip, for storing the speech recognition program of the digital signal processor and the general speech model data; after power-on startup, the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip;
A DDR RAM chip, for running the speech recognition program and storing the general speech model data and the personalized feature model data;
A serial port chip, through which the digital signal processor communicates externally and outputs the Chinese character code corresponding to the recognized word;
A network chip, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially heavy and the recognition rate obtained with the general speech model is below 95%.
2. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the system further includes a lithium battery, which powers the speech recognition system.
3. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the digital signal processor is a high-performance, low-power floating-point TMS320C6748 digital signal processor.
4. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the speech codec chip supports multiple sampling rates.
5. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the network chip is a LAN8710A-type chip.
6. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that both the communication of the digital signal processor and the communication of the speech codec chip use DMA mode.
7. A speech recognition method capable of loading a personalized feature model, characterized in that it includes the following steps:
Step one: after the system is powered on, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the general speech model data is loaded into the DDR RAM chip, and the program starts running, ready for speech recognition;
Step two: once the speech recognition module is powered on and its program is running, the system monitors the recognition key. When the key is detected as pressed, the system starts controlling the audio codec chip to perform A/D conversion and receive the speech signal, and then performs speech recognition with the speech recognition algorithm. When the system detects that the key has been released, it outputs the Chinese character code corresponding to the recognized word through the serial port, performs speech synthesis on the recognized word, and controls the audio codec chip to D/A-convert the synthesis result into an analog speech signal for output;
Step three: while the speech recognition module is running, if the model switching key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the general speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;
Step four: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in the FLASH chip and the DDR RAM chip, and subsequent speech recognition uses the newly loaded model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610150095.4A | 2016-03-16 | 2016-03-16 | Speech recognition system and method capable of loading a personalized feature model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610150095.4A | 2016-03-16 | 2016-03-16 | Speech recognition system and method capable of loading a personalized feature model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107204189A | 2017-09-26 |
Family
ID=59903809
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610150095.4A (Pending) | Speech recognition system and method capable of loading a personalized feature model | 2016-03-16 | 2016-03-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107204189A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101017428A (en) * | 2006-12-22 | 2007-08-15 | 广东电子工业研究院有限公司 | Embedded voice interaction device and interaction method thereof |
CN102915731A (en) * | 2012-10-10 | 2013-02-06 | 百度在线网络技术(北京)有限公司 | Method and device for recognizing personalized speeches |
CN103761967A (en) * | 2014-01-08 | 2014-04-30 | 上海应用技术学院 | Embedded speech recognition system |
CN105096940A (en) * | 2015-06-30 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Method and device for voice recognition |
CN205582481U (en) * | 2016-03-16 | 2016-09-14 | 中航华东光电(上海)有限公司 | But speech recognition system of individualized characteristic model of loading |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110930994A (en) * | 2018-09-19 | 2020-03-27 | 三星电子株式会社 | System and method for providing voice assistant service |
US11848012B2 (en) | 2018-09-19 | 2023-12-19 | Samsung Electronics Co., Ltd. | System and method for providing voice assistant service |
CN110930994B (en) * | 2018-09-19 | 2024-04-09 | 三星电子株式会社 | System and method for providing voice assistant service |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170926 |