CN107204189A

CN107204189A - Speech recognition system and method capable of loading a personalized feature model

Info

Publication number: CN107204189A
Application number: CN201610150095.4A
Authority: CN
Inventors: 郎立国
Original assignee: Air China (shanghai) Co Ltd
Current assignee: Air China (shanghai) Co Ltd
Priority date: 2016-03-16
Filing date: 2016-03-16
Publication date: 2017-09-26

Abstract

The invention provides a speech recognition system and method capable of loading a personalized feature model. The system comprises: a speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal and for D/A-converting the digital audio signal output by the digital signal processor into an analog speech signal; a digital signal processor, for applying the speech recognition algorithm to the input digital audio signal; a FLASH chip, for storing the speech recognition program of the digital signal processor and the universal speech model data, both of which are loaded from the FLASH chip into the DDR RAM chip after power-on; and a DDR RAM chip, for running the speech recognition program and storing the universal speech model data and the personalized feature model data. The invention can be used for voice-controlled user interfaces and, because it can load a personalized feature model, greatly improves the recognition rate and the reliability of recognition.

Description

Speech recognition system and method capable of loading a personalized feature model

Technical field

The present invention relates to the technical field of embedded speech recognition, and in particular to a speech recognition system and method capable of loading a personalized feature model.

Background

Human-machine interface technology based on buttons and touch screens is highly mature and has greatly improved the convenience of operating equipment. Speech is the natural human interface, yet technology that uses speech recognition to control and operate equipment is only just getting started. One reason is that speech recognition technology is extremely complex; another is that embedded computing power is insufficient, so algorithms verified on a PC are difficult to port to embedded systems.

Summary of the invention

In view of the defects of the prior art, the object of the invention is to provide a speech recognition system and method capable of loading a personalized feature model. It can be used for voice-controlled user interfaces and, because it can load a personalized feature model, greatly improves the recognition rate and the reliability of recognition.

According to one aspect of the invention, a speech recognition system capable of loading a personalized feature model is provided, comprising:

A speech codec chip, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal output by the digital signal processor into an analog speech signal;

A digital signal processor, for applying the speech recognition algorithm to the input digital audio signal; after recognition completes, the recognition result is synthesized into speech and output as a digital audio signal, which is sent to the speech codec chip for voice output;

A FLASH chip, for storing the speech recognition program of the digital signal processor and the universal speech model data; after power-on, the program and the universal speech model data are loaded from the FLASH chip into the DDR RAM chip;

A DDR RAM chip, for running the speech recognition program and storing the universal speech model data and the personalized feature model data;

A serial port chip, through which the digital signal processor communicates with external devices and outputs the Chinese character code corresponding to each recognized word;

A network chip, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially strong and the recognition rate obtained with the universal speech model is below 95%.
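The 95% criterion for loading a personalized feature model can be illustrated with a minimal sketch. The source does not say how the recognition rate is measured, so the function names and the idea of counting correctly recognized test utterances are assumptions made here for illustration only.

```python
# Threshold taken from the text: a personalized model is warranted when the
# universal model's recognition rate falls below 95%.
RECOGNITION_RATE_THRESHOLD = 0.95


def recognition_rate(correct: int, total: int) -> float:
    """Fraction of test utterances recognized correctly (hypothetical metric)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def needs_personalized_model(correct: int, total: int) -> bool:
    """Return True when the universal model scores below the 95% threshold."""
    return recognition_rate(correct, total) < RECOGNITION_RATE_THRESHOLD
```

For example, a speaker for whom 90 of 100 test words are recognized would qualify for a personalized model, while a speaker at 97 of 100 would stay on the universal model.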

Preferably, the digital signal processor is a high-performance, low-power floating-point TMS320C6748 digital signal processor.

Preferably, the speech codec chip supports multiple sample rates.

Preferably, the network chip is an LAN8710A chip.

Preferably, both the communication of the digital signal processor and the communication of the speech codec chip use DMA mode.

The present invention also provides a speech recognition method capable of loading a personalized feature model, comprising the following steps:

Step 1: after the system is powered on, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the universal speech model data is loaded into the DDR RAM chip, and the program starts running, ready for speech recognition;

Step 2: once the program of the speech recognition module is running after power-on, the system monitors the recognition key. When the recognition key is detected as pressed, the system starts controlling the audio codec chip, which performs A/D conversion to receive the speech signal, and speech recognition is then carried out by the speech recognition algorithm. When the system detects that the recognition key has been released, it outputs the Chinese character code corresponding to the recognized word through the serial port, synthesizes speech for the recognized word, and controls the audio codec chip to D/A-convert the synthesis result into an analog speech signal for output;

Step 3: while the speech recognition module is running, if the model switching key is detected as pressed, the next personalized speech model data is loaded into the DDR RAM chip; if there is no next personalized speech model data, the universal speech model data is loaded into the DDR RAM chip. Subsequent speech recognition uses the newly loaded model;

Step 4: while the speech recognition module is running, if personalized speech model data loaded over the network is received, the received personalized speech model data is stored in the FLASH chip and the DDR RAM chip. Subsequent speech recognition uses the newly loaded model.
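The model handling in the last two steps (cycling models with the switching key, and accepting a model received over the network) can be sketched as a small state holder. This is a simplified illustration, not the patented firmware: the class `ModelStore`, its field names, and the use of in-memory lists to stand in for the FLASH and DDR RAM chips are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelStore:
    """Hypothetical stand-in for FLASH (persistent) and DDR RAM (active) storage."""
    universal: bytes
    personalized: List[bytes] = field(default_factory=list)  # kept in "FLASH"
    active: bytes = b""   # model currently loaded in "DDR RAM"
    index: int = -1       # -1 means the universal model is active

    def __post_init__(self) -> None:
        # After power-on the universal model is the one in use.
        self.active = self.universal

    def on_switch_key(self) -> None:
        """Load the next personalized model, or fall back to the universal one."""
        nxt = self.index + 1
        if nxt < len(self.personalized):
            self.index = nxt
            self.active = self.personalized[nxt]
        else:
            self.index = -1
            self.active = self.universal

    def on_network_model(self, data: bytes) -> None:
        """Persist a model received over the network and make it active."""
        self.personalized.append(data)
        self.index = len(self.personalized) - 1
        self.active = data
```

With one personalized model stored, repeated key presses alternate between that model and the universal model, which matches the fallback rule described for the switching key.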

Compared with the prior art, the invention has the following beneficial effects: it can be used for voice-controlled user interfaces and, because it can load a personalized feature model, greatly improves the recognition rate and the reliability of recognition.

Brief description of the drawings

Other features, objects, and advantages of the invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the drawings:

Fig. 1 is a schematic block diagram of the speech recognition system capable of loading a personalized feature model according to the invention.

Fig. 2 is a flow chart of the speech recognition method capable of loading a personalized feature model according to the invention.

Detailed description of the embodiments

The invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the inventive concept, all of which fall within the scope of protection of the invention.

As shown in Fig. 1, the speech recognition system capable of loading a personalized feature model according to the invention comprises:

A speech codec chip 104, for A/D-converting the received analog speech signal into a digital audio signal, and for D/A-converting the digital audio signal output by the digital signal processor into an analog speech signal;

A digital signal processor (DSP) 101, for applying the speech recognition algorithm to the input digital audio signal; after recognition completes, the recognition result is synthesized into speech and output as a digital audio signal, which is sent to the speech codec chip for voice output;

A FLASH chip 102, for storing the speech recognition program of the digital signal processor and the universal speech model data; after power-on, the program and the universal speech model data are loaded from the FLASH chip into the DDR RAM chip;

A DDR RAM chip 103, for running the speech recognition program and storing the universal speech model data and the personalized feature model data;

A serial port chip 105, through which the DSP communicates with external devices and outputs the Chinese character code corresponding to each recognized word;

A network chip 106, for loading personalized feature model data to improve the recognition rate when a speaker's accent is especially strong and the recognition rate obtained with the universal speech model is below 95%.

The speech recognition system may further include a lithium battery 107, which powers the speech recognition system capable of loading a personalized feature model.
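The source states that the Chinese character code of each recognized word is output through the serial port but does not name the character encoding or the framing. The sketch below assumes GB2312 (two bytes per Chinese character) and writes to any binary file-like object; `send_recognized_word` and the in-memory `fake_port` are hypothetical stand-ins, not part of the described system.

```python
import io


def send_recognized_word(port, word: str, encoding: str = "gb2312") -> bytes:
    """Encode a recognized word and write it to a serial-like binary stream.

    GB2312 is an assumption; the source only says "Chinese character code".
    """
    payload = word.encode(encoding)
    port.write(payload)
    return payload


fake_port = io.BytesIO()  # stands in for an opened serial port
sent = send_recognized_word(fake_port, "打开")
```

In a real device the `port` argument would be whatever object the serial driver exposes for writing bytes; only its `write` method is used here.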

In one embodiment, the digital signal processor 101 may be a high-performance, low-power floating-point TMS320C6748 DSP. To reduce power consumption, use of the processor's interfaces is kept to a minimum and the processor operating frequency is lowered as far as the algorithm processing allows. The FLASH chip 102 and the DDR RAM chip 103 may be any commonly available chips supported by this model of digital signal processor. The serial port chip 105 may be a chip for any one of the RS232, RS422, or RS485 standards. The speech codec chip 104 needs to support multiple sample rates, such as 8 kHz, 16 kHz, and 44.1 kHz, and sampling precisions of 16 bit and 24 bit. The network chip 106 may be an LAN8710A chip.
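The codec requirements can be captured as a small validated configuration object. This is only a sketch: the source lists the sample rates as examples, so treating them as the complete supported set is an assumption, and `CodecConfig` is a hypothetical name.

```python
from dataclasses import dataclass

# Values named in the text; the rate list is illustrative, not exhaustive.
SUPPORTED_SAMPLE_RATES_HZ = (8_000, 16_000, 44_100)
SUPPORTED_BIT_DEPTHS = (16, 24)


@dataclass(frozen=True)
class CodecConfig:
    """Hypothetical codec configuration checked against the listed options."""
    sample_rate_hz: int
    bit_depth: int

    def __post_init__(self) -> None:
        if self.sample_rate_hz not in SUPPORTED_SAMPLE_RATES_HZ:
            raise ValueError(f"unsupported sample rate: {self.sample_rate_hz} Hz")
        if self.bit_depth not in SUPPORTED_BIT_DEPTHS:
            raise ValueError(f"unsupported bit depth: {self.bit_depth} bit")


# A 16 kHz, 24-bit configuration is accepted.
config = CodecConfig(sample_rate_hz=16_000, bit_depth=24)
```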

In one embodiment, the speech codec chip is configured for a 16 kHz sample rate and 24-bit sampling precision. The digital signal processor and the speech codec chip can communicate over IIS (I2S), transferring 48K bytes per second. To reduce the load on the digital signal processor so that it mainly runs the recognition algorithm, both the communication of the digital signal processor and the communication of the speech codec chip use DMA (Direct Memory Access).
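The 48K bytes per second figure follows from the stated sample rate and precision. The short calculation below makes the arithmetic explicit, assuming a single (mono) channel and samples packed into whole bytes with no padding; neither assumption is stated in the source.

```python
def audio_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw audio data rate, assuming samples are packed into whole bytes."""
    if bit_depth % 8 != 0:
        raise ValueError("bit depth must be a multiple of 8")
    return sample_rate_hz * (bit_depth // 8) * channels


# 16,000 samples/s x 3 bytes/sample x 1 channel
rate = audio_bytes_per_second(16_000, 24)
print(rate)  # 48000
```

At this rate the DSP would be interrupted for every sample if it moved the data itself, which is the load that handing the transfers to DMA avoids.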

As shown in Fig. 2, the speech recognition method capable of loading a personalized feature model according to the invention comprises the following steps:

Step 1: after the system is powered on, the speech recognition program is first loaded from the FLASH chip into the DDR RAM chip, then the universal speech model data is loaded into the DDR RAM chip (if personalized speech model data exists, the personalized speech model data is loaded into the DDR RAM chip), and the program starts running, ready for speech recognition;
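The power-on loading order can be sketched as follows. The source is ambiguous about whether a stored personalized model replaces the universal model in DDR RAM or sits alongside it; this sketch assumes both are loaded and the personalized one becomes active. The dictionaries standing in for the two chips, the key names, and the `boot` function are hypothetical.

```python
def boot(flash: dict) -> dict:
    """Copy the program and models from a 'flash' dict into a new 'ddr_ram' dict."""
    # Program first, then the universal model, as the step describes.
    ddr_ram = {
        "program": flash["program"],
        "universal_model": flash["universal_model"],
    }
    personalized = flash.get("personalized_model")
    if personalized is not None:
        # Assumption: a stored personalized model is loaded and takes priority.
        ddr_ram["personalized_model"] = personalized
        ddr_ram["active_model"] = "personalized_model"
    else:
        ddr_ram["active_model"] = "universal_model"
    return ddr_ram
```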

Specific embodiments of the invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.

Claims

1. A speech recognition system capable of loading a personalized feature model, characterized in that the speech recognition system capable of loading a personalized feature model comprises:

A serial port chip, through which the digital signal processor communicates with external devices and outputs the Chinese character code corresponding to each recognized word;

2. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the speech recognition system further comprises a lithium battery, the lithium battery being used to power the speech recognition system capable of loading a personalized feature model.

3. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the digital signal processor is a high-performance, low-power floating-point TMS320C6748 digital signal processor.

4. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the speech codec chip supports multiple sample rates.

5. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that the network chip is an LAN8710A chip.

6. The speech recognition system capable of loading a personalized feature model according to claim 1, characterized in that both the communication of the digital signal processor and the communication of the speech codec chip use DMA mode.

7. A speech recognition method capable of loading a personalized feature model, characterized in that it comprises the following steps: