CN107204189A - Speech recognition system and method capable of loading a personalized feature model - Google Patents


Info

Publication number
CN107204189A
Authority
CN
China
Prior art keywords
speech recognition
chip
model
loading
individualized feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610150095.4A
Other languages
Chinese (zh)
Inventor
郎立国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Air China (shanghai) Co Ltd
Original Assignee
Air China (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Air China (shanghai) Co Ltd
Priority to CN201610150095.4A
Publication of CN107204189A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides a speech recognition system and method capable of loading a personalized feature model. The system includes: a speech codec chip, which performs A/D conversion of the received analog speech signal into a digital audio signal and D/A conversion of the digital audio signal processed by the digital signal processor back into an analog speech signal; a digital signal processor, which runs the speech recognition algorithm on the input digital audio signal; a FLASH chip, which stores the digital signal processor's speech recognition program and the general speech model data, both of which are loaded from the FLASH chip into the DDR RAM chip after power-up; and a DDR RAM chip, which runs the speech recognition program and stores the general speech model data and the personalized feature model data. The invention can be used for voice-controlled UIs and, because it can load a personalized feature model, greatly improves the recognition rate and the reliability of recognition.

Description

Speech recognition system and method capable of loading a personalized feature model
Technical field
The present invention relates to the technical field of embedded speech recognition, and in particular to a speech recognition system and method capable of loading a personalized feature model.
Background
Human-machine interface technology based on buttons and touch screens is highly mature and has greatly improved the convenience of operating equipment. Speech is the natural human interface, yet technology that uses speech recognition to control and operate equipment is only just getting started. One reason is that speech recognition is extremely complex; another is that embedded computing power is insufficient, so algorithms verified on a PC are difficult to port to embedded systems.
Summary of the invention
In view of the defects in the prior art, the object of the invention is to provide a speech recognition system and method capable of loading a personalized feature model. It can be used for voice-controlled UIs and can load a personalized feature model, which greatly improves the recognition rate and the reliability of recognition.
According to one aspect of the invention, a speech recognition system capable of loading a personalized feature model is provided, the system comprising:
Speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal;
Digital signal processor, for running the speech recognition algorithm on the input digital audio signal; after recognition completes, the recognition result is speech-synthesized into an output digital audio signal and sent to the speech codec chip for voice output;
FLASH chip, for storing the digital signal processor's speech recognition program and the general speech model data; after power-up, the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip;
DDR RAM chip, for running the speech recognition program and storing the general speech model data and the personalized feature model data;
Serial port chip, through which the digital signal processor communicates externally and outputs, over the serial port, the Chinese character code corresponding to the recognized vocabulary;
Network chip, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially strong and the recognition rate obtained with the general speech model is below 95%.
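The network chip's trigger condition can be illustrated with a minimal sketch. Only the 95% threshold comes from the text; how the recognition rate is measured is not stated, so the counting scheme and the function names below are assumptions made for illustration.

```python
# Hypothetical helper names; only the 95% threshold is taken from the text.
GENERAL_MODEL_THRESHOLD = 0.95


def recognition_rate(correct: int, total: int) -> float:
    """Fraction of test utterances recognized correctly (assumed metric)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def needs_personalized_model(correct: int, total: int,
                             threshold: float = GENERAL_MODEL_THRESHOLD) -> bool:
    """True when the general model falls below the threshold for this speaker."""
    return recognition_rate(correct, total) < threshold
```

For example, a speaker with 90 of 100 test utterances recognized would qualify for a personalized model, while a speaker at exactly 95 of 100 would not under this strict comparison.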
Preferably, the digital signal processor is a high-performance, low-power floating-point TMS320C6748 digital signal processor.
Preferably, the speech codec chip supports multiple sampling rates.
Preferably, the network chip is a LAN8710A-type chip.
Preferably, both the digital signal processor's communication and the speech codec chip's communication use DMA.
The invention also provides a speech recognition method capable of loading a personalized feature model, comprising the following steps:
Step one: after the system powers up, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the general speech model data is loaded into the DDR RAM chip, and the system starts running, ready for speech recognition;
Step two: once the speech recognition module's power-up program is running, the system polls the recognition key. When the key is detected as pressed, the system starts controlling the audio codec chip to perform A/D conversion and receive the speech signal, and then runs the speech recognition algorithm. When the key is detected as released, the system outputs the Chinese character code corresponding to the recognized vocabulary over the serial port, and at the same time speech-synthesizes the recognized vocabulary and controls the audio codec chip to D/A-convert the synthesized result into an analog speech signal for output;
Step three: while the speech recognition module is running, if the model-switch key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the general speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;
Step four: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in both the FLASH chip and the DDR RAM chip. Subsequent speech recognition uses the newly loaded model.
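The push-to-talk flow of step two can be sketched as a small event handler. This is an illustration under stated assumptions, not the patented firmware: the class and method names are hypothetical, the recognizer is a stand-in callable, the serial port is modeled as a list of byte strings, GB2312 is assumed as the Chinese character code (the text does not name an encoding), and speech synthesis is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalkSession:
    """Hypothetical model of step two: capture while the key is held, report on release."""
    recognize: Callable[[List[int]], str]                   # stand-in for the DSP algorithm
    serial_out: List[bytes] = field(default_factory=list)   # stand-in for the serial port
    _samples: List[int] = field(default_factory=list)
    _recording: bool = False

    def key_pressed(self) -> None:
        # Key down: start a fresh capture from the codec's A/D output.
        self._samples.clear()
        self._recording = True

    def on_samples(self, samples: List[int]) -> None:
        # Samples arriving while the key is up are ignored.
        if self._recording:
            self._samples.extend(samples)

    def key_released(self) -> str:
        # Key up: recognize, then emit the character code over the serial port.
        self._recording = False
        word = self.recognize(self._samples)
        self.serial_out.append(word.encode("gb2312"))  # assumed encoding
        return word
```

A caller would wire `key_pressed` and `key_released` to the key-polling loop and feed `on_samples` from the codec's receive buffer; each Chinese character in the result occupies two bytes on the serial line under the assumed GB2312 encoding.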
Compared with the prior art, the invention has the following beneficial effect: it can be used for voice-controlled UIs and can load a personalized feature model, which greatly improves the recognition rate and the reliability of recognition.
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a block diagram of the speech recognition system capable of loading a personalized feature model according to the invention.
Fig. 2 is a flow chart of the speech recognition method capable of loading a personalized feature model according to the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit the invention in any form. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the inventive concept, all of which fall within the scope of protection of the invention.
As shown in Fig. 1, the speech recognition system capable of loading a personalized feature model according to the invention includes:
Speech codec chip 104, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal;
Digital signal processor (DSP) 101, for running the speech recognition algorithm on the input digital audio signal; after recognition completes, the recognition result is speech-synthesized into an output digital audio signal and sent to the speech codec chip for voice output;
FLASH chip 102, for storing the digital signal processor's speech recognition program and the general speech model data; after power-up, the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip;
DDR RAM chip 103, for running the speech recognition program and storing the general speech model data and the personalized feature model data;
Serial port chip 105, through which the DSP communicates externally and outputs, over the serial port, the Chinese character code corresponding to the recognized vocabulary;
Network chip 106, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially strong and the recognition rate obtained with the general speech model is below 95%.
The speech recognition system capable of loading a personalized feature model may also include a lithium battery 107, which powers the system.
In one embodiment, the digital signal processor 101 may be a high-performance, low-power floating-point TMS320C6748 DSP. To reduce power consumption, as few of the processor's interfaces as possible are used, and the processor's operating frequency is lowered as far as the algorithm's processing needs allow. The FLASH chip 102 and the DDR RAM chip 103 may be any commonly available chips that this digital signal processor supports. The serial port chip 105 may be a chip for any of the RS232, RS422, or RS485 standards. The speech codec chip 104 needs to support multiple sampling rates, such as 8 kHz, 16 kHz, and 44.1 kHz, and sampling precisions of 16 bit and 24 bit. The network chip 106 may be a LAN8710A-type chip.
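The codec requirements in this embodiment can be expressed as a small configuration check. The sketch below is illustrative only: the sampling rates and precisions are the examples listed in the text (a real codec may support more), and the function and constant names are hypothetical.

```python
from typing import Tuple

# Example values from the text; names are hypothetical.
SUPPORTED_RATES_HZ = (8_000, 16_000, 44_100)
SUPPORTED_BITS = (16, 24)


def validate_codec_config(rate_hz: int, bits: int) -> Tuple[int, int]:
    """Return (rate_hz, bits) if the pair is among the listed examples, else raise."""
    if rate_hz not in SUPPORTED_RATES_HZ:
        raise ValueError(f"unsupported sampling rate: {rate_hz} Hz")
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"unsupported sampling precision: {bits} bit")
    return rate_hz, bits
```

Rejecting an unlisted configuration at start-up, rather than during capture, keeps a misconfigured codec from silently feeding the recognizer audio at the wrong rate.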
In one embodiment, the speech codec chip is configured for a 16 kHz sampling rate and 24-bit sampling precision. The digital signal processor and the speech codec chip may communicate over IIS (I2S), transferring 48K bytes per second. To lighten the load on the digital signal processor so that it mainly runs the recognition algorithm, both the digital signal processor's communication and the speech codec chip's communication use DMA (Direct Memory Access).
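The 48K bytes per second figure follows directly from the configured sampling rate and precision, as the short calculation below shows. It assumes a single audio channel and counts payload bytes only (any padding of 24-bit samples to wider bus slots is ignored); the function names and the 20 ms frame length are hypothetical choices for illustration.

```python
def payload_bytes_per_second(sample_rate_hz: int, bits_per_sample: int,
                             channels: int = 1) -> int:
    """Audio payload rate in bytes per second (mono assumed by default)."""
    return sample_rate_hz * bits_per_sample * channels // 8


def dma_buffer_bytes(sample_rate_hz: int, bits_per_sample: int,
                     frame_ms: int, channels: int = 1) -> int:
    """Bytes one DMA transfer must hold for a frame of frame_ms milliseconds."""
    return payload_bytes_per_second(sample_rate_hz, bits_per_sample, channels) * frame_ms // 1000


# 16 kHz x 24 bit x 1 channel / 8 = 48,000 bytes per second
rate = payload_bytes_per_second(16_000, 24)
```

With these numbers, a hypothetical 20 ms frame corresponds to a 960-byte DMA buffer, so the processor is interrupted once per frame instead of once per sample.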
As shown in Fig. 2, the speech recognition method capable of loading a personalized feature model according to the invention includes the following steps:
Step one: after the system powers up, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the general speech model data is loaded into the DDR RAM chip (if personalized speech model data exists, the personalized speech model data is loaded into the DDR RAM chip), and the system starts running, ready for speech recognition;
Step two: once the speech recognition module's power-up program is running, the system polls the recognition key. When the key is detected as pressed, the system starts controlling the audio codec chip to perform A/D conversion and receive the speech signal, and then runs the speech recognition algorithm. When the key is detected as released, the system outputs the Chinese character code corresponding to the recognized vocabulary over the serial port, and at the same time speech-synthesizes the recognized vocabulary and controls the audio codec chip to D/A-convert the synthesized result into an analog speech signal for output;
Step three: while the speech recognition module is running, if the model-switch key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the general speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;
Step four: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in both the FLASH chip and the DDR RAM chip. Subsequent speech recognition uses the newly loaded model.
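The model handling in steps one, three, and four can be sketched as a small manager object. This is a simplified model under stated assumptions rather than the actual firmware: FLASH is represented as an ordered dictionary of named models, the DDR RAM copy as a single attribute, and all names are hypothetical. The text does not say which personalized model is loaded at power-up when several exist, so the sketch assumes the first one stored.

```python
from typing import Dict, List, Optional


class ModelManager:
    """Hypothetical model store: FLASH holds all models, RAM holds the active one."""
    GENERAL = "general"

    def __init__(self, flash_models: Dict[str, bytes]) -> None:
        if self.GENERAL not in flash_models:
            raise ValueError("FLASH must contain the general model")
        self.flash: Dict[str, bytes] = dict(flash_models)  # insertion-ordered
        self.active_name: Optional[str] = None
        self.ram_model: Optional[bytes] = None

    def _personalized(self) -> List[str]:
        return [name for name in self.flash if name != self.GENERAL]

    def _load(self, name: str) -> None:
        # Copy the chosen model from FLASH into RAM and mark it active.
        self.ram_model = self.flash[name]
        self.active_name = name

    def boot(self) -> None:
        # Step one: prefer a personalized model if one exists (assumed: the first).
        names = self._personalized()
        self._load(names[0] if names else self.GENERAL)

    def switch_key_pressed(self) -> None:
        # Step three: next personalized model, or the general model if none is left.
        names = self._personalized()
        idx = 0 if self.active_name == self.GENERAL else names.index(self.active_name) + 1
        self._load(names[idx] if idx < len(names) else self.GENERAL)

    def network_model_received(self, name: str, data: bytes) -> None:
        # Step four: persist to FLASH and activate in RAM.
        if name == self.GENERAL:
            raise ValueError("a personalized model may not replace the general model")
        self.flash[name] = data
        self._load(name)
```

Repeated presses of the switch key therefore cycle through the personalized models and back to the general model, and a model received over the network survives a power cycle because it is written to FLASH as well as RAM.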
Specific embodiments of the invention have been described above. It should be understood that the invention is not limited to these particular embodiments; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.

Claims (7)

1. A speech recognition system capable of loading a personalized feature model, characterized in that the system comprises:
a speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal processed by the digital signal processor into an analog speech signal;
a digital signal processor, for running the speech recognition algorithm on the input digital audio signal, wherein after recognition completes the recognition result is speech-synthesized into an output digital audio signal and sent to the speech codec chip for voice output;
a FLASH chip, for storing the digital signal processor's speech recognition program and the general speech model data, wherein after power-up the program and the general speech model data are loaded from the FLASH chip into the DDR RAM chip;
a DDR RAM chip, for running the speech recognition program and storing the general speech model data and the personalized feature model data;
a serial port chip, through which the digital signal processor communicates externally and outputs, over the serial port, the Chinese character code corresponding to the recognized vocabulary;
a network chip, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially strong and the recognition rate obtained with the general speech model is below 95%.
2. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the system further comprises a lithium battery, which powers the system.
3. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the digital signal processor is a high-performance, low-power floating-point TMS320C6748 digital signal processor.
4. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the speech codec chip supports multiple sampling rates.
5. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the network chip is a LAN8710A-type chip.
6. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that both the digital signal processor's communication and the speech codec chip's communication use DMA.
7. A speech recognition method capable of loading a personalized feature model, characterized in that it comprises the following steps:
Step one: after the system powers up, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the general speech model data is loaded into the DDR RAM chip, and the system starts running, ready for speech recognition;
Step two: once the speech recognition module's power-up program is running, the system polls the recognition key. When the key is detected as pressed, the system starts controlling the audio codec chip to perform A/D conversion and receive the speech signal, and then runs the speech recognition algorithm. When the key is detected as released, the system outputs the Chinese character code corresponding to the recognized vocabulary over the serial port, and at the same time speech-synthesizes the recognized vocabulary and controls the audio codec chip to D/A-convert the synthesized result into an analog speech signal for output;
Step three: while the speech recognition module is running, if the model-switch key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the general speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;
Step four: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in both the FLASH chip and the DDR RAM chip. Subsequent speech recognition uses the newly loaded model.
CN201610150095.4A 2016-03-16 2016-03-16 Speech recognition system and method capable of loading a personalized feature model Pending CN107204189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610150095.4A CN107204189A (en) 2016-03-16 2016-03-16 Speech recognition system and method capable of loading a personalized feature model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610150095.4A CN107204189A (en) 2016-03-16 2016-03-16 Speech recognition system and method capable of loading a personalized feature model

Publications (1)

Publication Number Publication Date
CN107204189A true CN107204189A (en) 2017-09-26

Family

ID=59903809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610150095.4A Pending CN107204189A (en) 2016-03-16 2016-03-16 Speech recognition system and method capable of loading a personalized feature model

Country Status (1)

Country Link
CN (1) CN107204189A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110930994A (en) * 2018-09-19 2020-03-27 三星电子株式会社 System and method for providing voice assistant service

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101017428A (en) * 2006-12-22 2007-08-15 广东电子工业研究院有限公司 Embedded voice interaction device and interaction method thereof
CN102915731A (en) * 2012-10-10 2013-02-06 百度在线网络技术(北京)有限公司 Method and device for recognizing personalized speeches
CN103761967A (en) * 2014-01-08 2014-04-30 上海应用技术学院 Embedded speech recognition system
CN105096940A (en) * 2015-06-30 2015-11-25 百度在线网络技术(北京)有限公司 Method and device for voice recognition
CN205582481U (en) * 2016-03-16 2016-09-14 中航华东光电(上海)有限公司 But speech recognition system of individualized characteristic model of loading

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110930994A (en) * 2018-09-19 2020-03-27 三星电子株式会社 System and method for providing voice assistant service
US11848012B2 (en) 2018-09-19 2023-12-19 Samsung Electronics Co., Ltd. System and method for providing voice assistant service
CN110930994B (en) * 2018-09-19 2024-04-09 三星电子株式会社 System and method for providing voice assistant service

Similar Documents

Publication Publication Date Title
US10332524B2 (en) Speech recognition wake-up of a handheld portable electronic device
CN103310785B (en) Use the electronic installation and method of speech recognition controlled power supply
CN104038864B (en) Microphone circuit assembly and system with speech recognition
CN205582481U (en) But speech recognition system of individualized characteristic model of loading
CN103280216B (en) Improve the speech recognition device the relying on context robustness to environmental change
US8781831B2 (en) System and method for standardized speech recognition infrastructure
WO2019096056A1 (en) Speech recognition method, device and system
CN107134279A (en) A kind of voice awakening method, device, terminal and storage medium
CN110570873B (en) Voiceprint wake-up method and device, computer equipment and storage medium
CN101794576A (en) Dirty word detection aid and using method thereof
CN107886944A (en) A kind of audio recognition method, device, equipment and storage medium
CN109036395A (en) Personalized speaker control method, system, intelligent sound box and storage medium
EP3276616A1 (en) Speech recognition system, speech recognition device, speech recognition method, and control program
CN114360557B (en) Voice tone conversion method, model training method, device, equipment and medium
CN101825953A (en) Chinese character input product with combined voice input and Chinese phonetic alphabet input functions
CN112825248A (en) Voice processing method, model training method, interface display method and equipment
CN105976808A (en) Intelligent speech recognition system and method
US11250854B2 (en) Method and apparatus for voice interaction, device and computer-readable storage medium
CN107204189A (en) Speech recognition system and method capable of loading a personalized feature model
CN110503956A (en) Audio recognition method, device, medium and electronic equipment
CN111508481B (en) Training method and device of voice awakening model, electronic equipment and storage medium
CN103955149A (en) DSP voice recognition used for laser large-screen splicing control system
CN115083397A (en) Training method of lyric acoustic model, lyric recognition method, equipment and product
CN114566140A (en) Speech synthesis model training method, speech synthesis method, equipment and product
CN105719650A (en) Speech recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170926
