WO2014182453A3 - Method and apparatus for training a voice recognition model database - Google Patents
- Publication number
- WO2014182453A3 (application PCT/US2014/035117)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- recognition model
- model database
- noise
- voice recognition
- voice input
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- User Interface Of Digital Computer (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An electronic device (102) digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input / noise sample combinations are used to train a voice recognition model database (308) without the user (104) having to repeat the voice input in each of the different environments. In one variation, the electronic device (102) transmits the user's voice input to a server (301) that maintains and trains the voice recognition model database (308).
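The combination step described in the abstract can be illustrated with a minimal sketch. This is not the patent's implementation: the function names (`mix_with_noise`, `build_training_set`), the use of a target signal-to-noise ratio, and the choice to loop each noise sample to the length of the utterance are all assumptions made for illustration. Audio is represented as plain lists of floats.

```python
import math
from itertools import cycle, islice


def rms(samples):
    """Root-mean-square level of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_with_noise(voice, noise, snr_db=10.0):
    """Add one noise sample to the voice input at an assumed target SNR (dB).

    The SNR parameter is hypothetical; the abstract only says the voice
    input and the noise sample are digitally combined.
    """
    # Assumption: repeat the noise so it covers the whole utterance.
    looped = list(islice(cycle(noise), len(voice)))
    noise_rms = rms(looped)
    if noise_rms == 0.0:
        return list(voice)  # silent noise leaves the voice unchanged
    # Scale the noise so that rms(voice) / rms(scaled noise) hits the target.
    gain = rms(voice) / (noise_rms * 10 ** (snr_db / 20.0))
    return [v + gain * n for v, n in zip(voice, looped)]


def build_training_set(voice, noise_by_environment, snr_db=10.0):
    """Return one noisy copy of the single voice input per environment."""
    return {
        env: mix_with_noise(voice, noise, snr_db)
        for env, noise in noise_by_environment.items()
    }


if __name__ == "__main__":
    # Synthetic stand-ins for a recorded utterance and environment noise.
    voice = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
    noises = {
        "street": [0.3, -0.2, 0.1],
        "babble": [0.1, 0.2, -0.3, 0.05],
        "car": [0.05, -0.05],
    }
    training_set = build_training_set(voice, noises)
    print(sorted(training_set))
```

Each entry of the returned dictionary is a voice input / noise sample combination that could then be passed to whatever training routine updates the model database, so the user records the utterance only once.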
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201480025758.9A CN105580071B (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database |
EP14725344.7A EP2994907A2 (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361819985P | 2013-05-06 | 2013-05-06 | |
US61/819,985 | 2013-05-06 | | |
US14/094,875 | 2013-12-03 | | |
US14/094,875 US9275638B2 (en) | 2013-03-12 | 2013-12-03 | Method and apparatus for training a voice recognition model database |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2014182453A2 (en) | 2014-11-13 |
WO2014182453A3 (en) | 2014-12-31 |
Family
ID=51867838
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2014/035117 WO2014182453A2 (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP2994907A2 (en) |
CN (1) | CN105580071B (en) |
WO (1) | WO2014182453A2 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232909A (en) * | 2018-03-02 | 2019-09-13 | 北京搜狗科技发展有限公司 | Audio processing method, apparatus, device, and readable storage medium |
CN109192216A (en) * | 2018-08-08 | 2019-01-11 | 联智科技(天津)有限责任公司 | Voiceprint recognition training dataset simulation acquisition method and acquisition device |
KR20200033707A (en) * | 2018-09-20 | 2020-03-30 | 삼성전자주식회사 | Electronic device, and Method of providing or obtaining data for training thereof |
CN109545195B (en) * | 2018-12-29 | 2023-02-21 | 深圳市科迈爱康科技有限公司 | Accompanying robot and control method thereof |
CN109545196B (en) * | 2018-12-29 | 2022-11-29 | 深圳市科迈爱康科技有限公司 | Speech recognition method, device and computer readable storage medium |
CN110544469B (en) * | 2019-09-04 | 2022-04-19 | 秒针信息技术有限公司 | Training method and device of voice recognition model, storage medium and electronic device |
CN110808030B (en) * | 2019-11-22 | 2021-01-22 | 珠海格力电器股份有限公司 | Voice awakening method, system, storage medium and electronic equipment |
CN111128141B (en) * | 2019-12-31 | 2022-04-19 | 思必驰科技股份有限公司 | Audio identification decoding method and device |
CN111369979B (en) * | 2020-02-26 | 2023-12-19 | 广州市百果园信息技术有限公司 | Training sample acquisition method, device, equipment and computer storage medium |
CN113099353A (en) * | 2021-04-21 | 2021-07-09 | 浙江吉利控股集团有限公司 | Integrated microphone, safety belt, steering wheel and vehicle for vehicle |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1199708A2 (en) * | 2000-10-16 | 2002-04-24 | Microsoft Corporation | Noise robust pattern recognition |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4590692B2 (en) * | 2000-06-28 | 2010-12-01 | パナソニック株式会社 | Acoustic model creation apparatus and method |
US6556971B1 (en) * | 2000-09-01 | 2003-04-29 | Snap-On Technologies, Inc. | Computer-implemented speech recognition system training |
US6889189B2 (en) * | 2003-09-26 | 2005-05-03 | Matsushita Electric Industrial Co., Ltd. | Speech recognizer performance in car and home applications utilizing novel multiple microphone configurations |
US20060149693A1 (en) * | 2005-01-04 | 2006-07-06 | Isao Otsuka | Enhanced classification using training data refinement and classifier updating |
US8762143B2 (en) * | 2007-05-29 | 2014-06-24 | At&T Intellectual Property Ii, L.P. | Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition |
US8234111B2 (en) * | 2010-06-14 | 2012-07-31 | Google Inc. | Speech and noise models for speech recognition |
TWI442384B (en) * | 2011-07-26 | 2014-06-21 | Ind Tech Res Inst | Microphone-array-based speech recognition system and method |
CN102426837B (en) * | 2011-12-30 | 2013-10-16 | 中国农业科学院农业信息研究所 | Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition |
2014
- 2014-04-23 WO PCT/US2014/035117 patent/WO2014182453A2/en active Application Filing
- 2014-04-23 CN CN201480025758.9A patent/CN105580071B/en active Active
- 2014-04-23 EP EP14725344.7A patent/EP2994907A2/en not_active Withdrawn
Non-Patent Citations (3)
Title |
---|
AKIRA SASOU ET AL: "Noise Robust Speech Recognition Applied to Voice-Driven Wheelchair", EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, vol. 20, no. 3, 1 January 2009 (2009-01-01), pages 1 - 9, XP055132340, ISSN: 1687-6180, DOI: 10.1016/j.specom.2006.03.002 * |
JI MING ET AL: "Robust Speaker Recognition in Noisy Conditions", IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, IEEE SERVICE CENTER, NEW YORK, NY, USA, vol. 15, no. 5, 1 July 2007 (2007-07-01), pages 1711 - 1723, XP011185748, ISSN: 1558-7916, DOI: 10.1109/TASL.2007.899278 * |
PEI DING ET AL: "Robust mandarin speech recognition in car environments for embedded navigation system", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 54, no. 2, 1 May 2008 (2008-05-01), pages 584 - 590, XP011229939, ISSN: 0098-3063, DOI: 10.1109/TCE.2008.4560134 * |
Also Published As
Publication number | Publication date |
---|---|
EP2994907A2 (en) | 2016-03-16 |
WO2014182453A2 (en) | 2014-11-13 |
CN105580071A (en) | 2016-05-11 |
CN105580071B (en) | 2020-08-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2014182453A3 (en) | Method and apparatus for training a voice recognition model database | |
EP3683725A4 (en) | Abstract description generation method, abstract description model training method and computer device | |
EP3611657A4 (en) | Model training method and method, apparatus, and device for determining data similarity | |
EP2781883A3 (en) | Method and apparatus for optimizing timing of audio commands based on recognized audio patterns | |
WO2014140816A3 (en) | Apparatus and method for performing actions based on captured image data | |
EP3751561A3 (en) | Hotword recognition | |
WO2014105357A3 (en) | Systems and methods for data entry in a non-destructive testing system | |
WO2015138497A3 (en) | Systems and methods for rapid data analysis | |
WO2014022659A3 (en) | Isolating vowel sounds for assessment | |
EP3968179A4 (en) | Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device | |
EP2806425A3 (en) | System and method for speaker verification | |
WO2015009586A3 (en) | Performing an operation relative to tabular data based upon voice input | |
WO2009111721A3 (en) | Voice recognition grammar selection based on context | |
EP2787449A3 (en) | Text data processing method and corresponding electronic device | |
EP2846226A3 (en) | Method and system for providing haptic effects based on information complementary to multimedia content | |
WO2014105359A3 (en) | Voice inspection guidance | |
WO2014172781A8 (en) | Electronic dental charting | |
EP2963643A3 (en) | Entity name recognition | |
EP2860672A3 (en) | Scalable cross domain recommendation system | |
EP2339576A3 (en) | Multi-modal input on an electronic device | |
WO2013145778A3 (en) | Data processing apparatus, data processing method, and program | |
WO2011011413A3 (en) | Method and apparatus for evaluation of a subject's emotional, physiological and/or physical state with the subject's physiological and/or acoustic data | |
WO2012045017A3 (en) | Choosing recognized text from a background environment | |
SG196783A1 (en) | Systems and methods for analyzing learner's roles and performance and for intelligently adapting the delivery of education | |
EP2385520A3 (en) | Method and device for generating text from spoken word |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | WIPO information: entry into national phase | Ref document number: 201480025758.9; Country of ref document: CN |
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14725344; Country of ref document: EP; Kind code of ref document: A2 |
| WWE | WIPO information: entry into national phase | Ref document number: 2014725344; Country of ref document: EP |