CN102402983A - Cloud data center speech recognition method - Google Patents

Cloud data center speech recognition method

Info

Publication number
CN102402983A
Authority
CN
China
Prior art keywords
design
speech recognition
voice
speech
chip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011103801667A
Other languages
Chinese (zh)
Inventor
吕广杰
朱锦雷
朱波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN2011103801667A
Publication of CN102402983A

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention provides a cloud data center speech recognition method. An HBR110 chip processes and analyzes speech with the dynamic time warping algorithm and identifies the access rights of the speaker, thereby realizing speech recognition. The system design comprises the overall architecture design, the hardware design, and the software design, of which the overall architecture design is the primary one. The overall architecture is designed by analyzing the system requirements and surveying mainstream speech recognition products on the market. An HBR110 human speech recognition processor chip is selected and combined with an 8031 single-chip microcontroller, an audio amplifier circuit, SPI flash memory, and the necessary peripheral circuits; speech is processed and analyzed with the dynamic time warping algorithm to realize the security functions of speech recognition and permission assignment.

Description

A cloud data center speech recognition method
Technical field
The present invention relates to the field of computer applications, and specifically to a speech recognition method for cloud data centers.
Background
With the development of information technology, cloud computing has gradually become a focus of industry development, and the cloud computing service platforms of major enterprises at home and abroad have begun to be deployed in many fields, including science, education, culture, health, government, high-performance computing, e-commerce, and the Internet of Things.
To keep cloud data centers secure, the machine rooms of most cloud data centers have a password identification system installed. However, traditional text passwords are easy to copy, easy to steal, and easy to forget, which leaves password identification systems with many security vulnerabilities.
Voice is a biometric characteristic that is unique to each person and hard to reproduce. Connecting a speech recognition system to the cloud data center allows the distinct voice information of each user to serve as a key that identifies the user and determines the user's access rights. Compared with a traditional text password, this is harder to crack and offers higher security.
In addition, speech recognition systems on the market are generally limited to either single-specific-speaker recognition or speaker-independent recognition. This system proposes a multi-specific-speaker speech recognition scheme, which solves the problem of assigning access rights among the multiple users of a cloud data center.
Summary of the invention
The purpose of the invention is to provide a cloud data center speech recognition method.
The purpose of the invention is achieved as follows: an HBR110 chip carries out speech processing and analysis with the dynamic time warping algorithm and identifies the access rights of the speaker, thereby realizing speech recognition. The system comprises: 1) overall architecture design; 2) hardware design; and 3) software design, wherein:
1) Overall architecture design is the primary design work of the system. The overall architecture is designed by analyzing the system requirements and surveying mainstream speech recognition products on the market. The HBR110 human speech recognition processor chip is selected and combined with an 8031 single-chip microcontroller, an audio amplifier circuit, SPI flash memory, and the necessary peripheral circuits; the dynamic time warping algorithm is used for speech processing and analysis, realizing the security functions of speech recognition and permission assignment;
2) Hardware design: the hardware design work comprises the system schematic design and the PCB design;
3) Software design: the software design work programs the 8031 microcontroller in assembly language to control the hardware system and to direct the HBR110 chip to complete the following operations:
S1 Preprocessing: comprises speech signal sampling, anti-aliasing band-pass filtering, and removal of the effects of individual pronunciation differences and of noise caused by equipment and the environment; it also involves the selection of speech recognition units and endpoint detection;
S2 Feature extraction: extracts the acoustic parameters that reflect the essential characteristics of the speech, including average energy, average zero-crossing rate, and formants;
S3 Training: before recognition, the speaker repeats the speech several times; redundant information is removed from the raw speech samples and the key data are retained, and the data are then clustered according to certain rules to form a pattern library;
S4 Pattern matching: this is the core of the whole speech recognition system; it computes the similarity between the input features and the stored patterns according to certain rules and expert knowledge, and determines the semantic information of the input speech.
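The pattern matching step (S4) relies on dynamic time warping, which aligns two feature sequences of different lengths before comparing them. The source does not describe how the HBR110 chip implements it, so the following is only a minimal textbook sketch; the function name `dtw_distance`, the Euclidean frame distance, and the unconstrained warping path are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two sequences of
    feature vectors (each vector is a list of floats).

    Illustrative only: the name, the Euclidean frame distance, and the
    absence of a warping-window constraint are assumptions, not details
    taken from the patent.
    """

    def frame_dist(x, y):
        # Euclidean distance between two feature vectors of equal length.
        return sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5

    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

For example, `dtw_distance([[1.0], [2.0], [3.0]], [[1.0], [2.0], [2.0], [3.0]])` evaluates to `0.0`, because the repeated frame in the second sequence is absorbed by the warping path. This tolerance to differences in speaking rate is the reason DTW suits template matching on short utterances.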
The beneficial effects of the invention are:
A) Multi-specific-speaker speech recognition: overcomes the limitation that speech recognition systems currently on the market generally support only single-specific-speaker recognition or speaker-independent recognition;
B) Permission assignment improves security: the system can assign different access rights to different users, thereby improving the security of the system;
C) Distinctive accent handling: users do not need to speak with standard pronunciation to use the system smoothly;
D) Diverse speech model samples: the speech models entered during training can be chosen according to user needs, and the specific samples provided by the system need not be used.
Experimental verification shows that the system has high accuracy and practicality, with a voice matching accuracy above 90%.
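The combination of multi-specific-speaker recognition and permission assignment can be pictured as a nearest-template lookup followed by a permission lookup. The sketch below is hypothetical: the function `authorize`, the `templates` and `permissions` dictionaries, the rejection `threshold`, and the default Euclidean distance are all assumptions for illustration and are not described in the source. Any sequence distance, such as a dynamic time warping distance, could be passed as `distance`.

```python
def euclidean(a, b):
    """Euclidean distance between two fixed-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def authorize(features, templates, permissions, threshold, distance=euclidean):
    """Return (user, permissions) for the enrolled user whose template is
    closest to `features`, or (None, empty set) if nothing is close enough.

    Hypothetical helper: `templates` maps a user name to a list of enrolled
    feature templates, and `permissions` maps a user name to a set of rights.
    """
    best_user, best_score = None, float("inf")
    for user, user_templates in templates.items():
        for template in user_templates:
            score = distance(features, template)
            if score < best_score:
                best_user, best_score = user, score
    # Reject speakers whose best match is farther than the threshold.
    if best_user is None or best_score > threshold:
        return None, frozenset()
    return best_user, permissions.get(best_user, frozenset())
```

With two enrolled users holding different rights, an utterance close to one user's template returns that user's permission set, while an utterance far from every template is rejected. Where to set the threshold is a trade-off between false acceptance and false rejection that the source does not discuss.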
Brief description of the drawings
Fig. 1 is a flow chart of the speech recognition process;
Fig. 2 is a diagram of the speech recognition hardware architecture.
Embodiment
The method of the present invention is described in detail below with reference to the accompanying drawings.
A kind of cloud of the present invention data center audio recognition method, its structure be by
The implementation flow of the system is shown in Fig. 1. As described in the Summary of the invention, the architecture of the invention mainly comprises the overall architecture design, the hardware design, and the software design.
Overall architecture design is the primary design work of the system. After an extensive survey, the hardware architecture shown in Fig. 2 was adopted. The HBR110 coprocessor is responsible for sound input, recognition, processing, and output; the 8031 main control chip is responsible for the corresponding control operations. The latter controls the operation of the whole system through its program and is the core of the system.
Hardware design is the second step of system design. Based on a comprehensive analysis of each electronic component's characteristics, heat dissipation requirements, operating environment, and so on, the peripheral circuits of the 8031 microcontroller (the main processor) and of the HBR110 chip (the coprocessor), the audio amplifier circuit, and the SPI flash memory circuit are designed separately, and the system schematic and PCB layout are completed.
Software design is the final step of system design. The 8031 microcontroller is programmed in assembly language to direct the HBR110 chip to complete the following operations:
S1 Preprocessing: comprises speech signal sampling, anti-aliasing band-pass filtering, removal of the effects of individual pronunciation differences and of noise caused by equipment and the environment, and so on; it also involves the selection of speech recognition units and endpoint detection;
S2 Feature extraction: extracts the acoustic parameters that reflect the essential characteristics of the speech, such as average energy, average zero-crossing rate, and formants;
S3 Training: before recognition, the speaker repeats the speech several times; redundant information is removed from the raw speech samples and the key data are retained, and the data are then clustered according to certain rules to form a pattern library;
S4 Pattern matching: this is the core of the whole speech recognition system; it computes the similarity between the input features and the stored patterns according to certain rules and expert knowledge, and determines the semantic information of the input speech.
Apart from the technical features described in the specification, everything else is known to those skilled in the art.
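Two of the features named in S2, average energy and average zero-crossing rate, are simple enough to sketch, and average energy is also a common basis for the endpoint detection mentioned in S1. The code below is a minimal illustration under stated assumptions: the function names, the frame size and hop, and the fixed energy threshold are hypothetical, and formant estimation is omitted. It does not reflect how the HBR110 chip computes these values, which the source does not describe.

```python
def frames(signal, size, hop):
    """Split a list of samples into frames of `size` samples, `hop` apart.
    Trailing samples that do not fill a whole frame are dropped."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Average energy of a frame: the mean of the squared samples."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(signal, size=4, hop=4, energy_threshold=0.01):
    """Energy-based endpoint detection (assumed, simplified scheme).

    Returns (start, end) sample indices spanning the first to the last
    frame whose average energy exceeds the threshold, or None if no
    frame does.
    """
    active = [i for i, f in enumerate(frames(signal, size, hop))
              if short_time_energy(f) > energy_threshold]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + size
```

For a signal made of eight zero samples, then `[1.0, -1.0]` repeated four times, then eight more zero samples, `detect_endpoints` with the default arguments returns `(8, 16)`. Practical systems usually combine energy with zero-crossing rate and adapt the thresholds to the background noise; the fixed threshold here only keeps the sketch short.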

Claims (1)

1. A cloud data center speech recognition method, characterized in that an HBR110 chip carries out speech processing and analysis with the dynamic time warping algorithm and identifies the access rights of the speaker, thereby realizing speech recognition; the system comprises: 1) overall architecture design, 2) hardware design, and 3) software design, wherein:
1) Overall architecture design is the primary design work of the system. The overall architecture is designed by analyzing the system requirements and surveying mainstream speech recognition products on the market. The HBR110 human speech recognition processor chip is selected and combined with an 8031 single-chip microcontroller, an audio amplifier circuit, SPI flash memory, and the necessary peripheral circuits; the dynamic time warping algorithm is used for speech processing and analysis, realizing the security functions of speech recognition and permission assignment;
2) Hardware design: the hardware design work comprises the system schematic design and the PCB design;
3) Software design: the software design work programs the 8031 microcontroller in assembly language to control the hardware system and to direct the HBR110 chip to complete the following operations:
S1 Preprocessing: comprises speech signal sampling, anti-aliasing band-pass filtering, and removal of the effects of individual pronunciation differences and of noise caused by equipment and the environment; it also involves the selection of speech recognition units and endpoint detection;
S2 Feature extraction: extracts the acoustic parameters that reflect the essential characteristics of the speech, including average energy, average zero-crossing rate, and formants;
S3 Training: before recognition, the speaker repeats the speech several times; redundant information is removed from the raw speech samples and the key data are retained, and the data are then clustered according to certain rules to form a pattern library;
S4 Pattern matching: this is the core of the whole speech recognition system; it computes the similarity between the input features and the stored patterns according to certain rules and expert knowledge, and determines the semantic information of the input speech.
CN2011103801667A 2011-11-25 2011-11-25 Cloud data center speech recognition method Pending CN102402983A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011103801667A CN102402983A (en) 2011-11-25 2011-11-25 Cloud data center speech recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011103801667A CN102402983A (en) 2011-11-25 2011-11-25 Cloud data center speech recognition method

Publications (1)

Publication Number Publication Date
CN102402983A true CN102402983A (en) 2012-04-04

Family

ID=45885134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011103801667A Pending CN102402983A (en) 2011-11-25 2011-11-25 Cloud data center speech recognition method

Country Status (1)

Country Link
CN (1) CN102402983A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104882140A (en) * 2015-02-05 2015-09-02 宇龙计算机通信科技(深圳)有限公司 Voice recognition method and system based on blind signal extraction algorithm
CN105489219A (en) * 2016-01-06 2016-04-13 广州零号软件科技有限公司 Indoor space service robot distributed speech recognition system and product
CN106328152A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Automatic identification and monitoring system for indoor noise pollution
CN106683677A (en) * 2015-11-06 2017-05-17 阿里巴巴集团控股有限公司 Method and device for recognizing voice

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000058942A2 (en) * 1999-03-26 2000-10-05 Koninklijke Philips Electronics N.V. Client-server speech recognition
CN1547191A (en) * 2003-12-12 2004-11-17 北京大学 Semantic and sound groove information combined speaking person identity system
CN1877697A (en) * 2006-07-25 2006-12-13 北京理工大学 Method for identifying speaker based on distributed structure
CN101452290A (en) * 2008-10-21 2009-06-10 安徽大学 Intelligent appliance control system based on speech recognition and wireless sensing net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王鹏: "基于ARM平台的智能家居***的研究与实现", 《中国优秀硕士学位论文全文数据库》, 23 July 2009 (2009-07-23) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104882140A (en) * 2015-02-05 2015-09-02 宇龙计算机通信科技(深圳)有限公司 Voice recognition method and system based on blind signal extraction algorithm
CN106328152A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Automatic identification and monitoring system for indoor noise pollution
CN106328152B (en) * 2015-06-30 2020-01-31 芋头科技(杭州)有限公司 automatic indoor noise pollution identification and monitoring system
CN106683677A (en) * 2015-11-06 2017-05-17 阿里巴巴集团控股有限公司 Method and device for recognizing voice
US11664020B2 (en) 2015-11-06 2023-05-30 Alibaba Group Holding Limited Speech recognition method and apparatus
CN105489219A (en) * 2016-01-06 2016-04-13 广州零号软件科技有限公司 Indoor space service robot distributed speech recognition system and product

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120404