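WO2008117626A1 - Speaker selecting device, speaker adaptive model making device, speaker selecting method, speaker selecting program, and speaker adaptive model making program - Google Patents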

Speaker selecting device, speaker adaptive model making device, speaker selecting method, speaker selecting program, and speaker adaptive model making program

Info

Publication number
WO2008117626A1
Authority
WO
WIPO (PCT)
Prior art keywords
speaker
time
selecting
program
adaptive model
Prior art date
Application number
PCT/JP2008/053629
Other languages
English (en)
French (fr)
Inventor
Masahiro Tani
Tadashi Emori
Yoshifumi Onishi
Original Assignee
NEC Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation
Priority to JP2009506262A (patent JP5229219B2)
Priority to US12/593,414 (patent US8452596B2)
Publication of WO2008117626A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

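 The aim is to select, accurately and stably, speakers whose acoustic feature values are close to those of the uttering speaker, following the changes even when the speaker's acoustic feature values vary from moment to moment. A speaker score calculation means 22 calculates, for example, a long-time speaker score based on an arbitrary number of utterances (the log likelihood, with respect to the acoustic feature values, of each of a plurality of speaker models stored in a speaker model storage unit 31), and calculates, for example, a short-time speaker score based on a short-time utterance. A long-time speaker selection means 23 selects the speakers corresponding to a predetermined number of speaker models with high long-time speaker scores. A short-time speaker selection means 24 selects, from among the speakers selected by the long-time speaker selection means 23, the speakers corresponding to speaker models with high short-time speaker scores, in a number smaller than the predetermined number.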
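The two-stage selection described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from the application: each speaker model is reduced to a single diagonal Gaussian (the record does not specify the model form), and the names `log_likelihood`, `top_speakers`, `select_speakers`, `n_long`, and `n_short`, as well as the toy data, are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Assumption: a speaker model is one diagonal Gaussian, given as
# (per-dimension means, per-dimension variances).
Model = Tuple[Sequence[float], Sequence[float]]
Frame = Sequence[float]


def log_likelihood(frames: Sequence[Frame], model: Model) -> float:
    """Sum of per-frame log likelihoods under a diagonal Gaussian."""
    means, variances = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, means, variances):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def top_speakers(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n speaker ids with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def select_speakers(
    models: Dict[str, Model],
    long_frames: Sequence[Frame],
    short_frames: Sequence[Frame],
    n_long: int,
    n_short: int,
) -> List[str]:
    """Long-time selection first, then short-time selection among the survivors."""
    if not 0 < n_short < n_long:
        raise ValueError("n_short must be positive and smaller than n_long")
    # Stage 1: score every model on the long-time features, keep the top n_long.
    long_scores = {s: log_likelihood(long_frames, m) for s, m in models.items()}
    long_selected = top_speakers(long_scores, n_long)
    # Stage 2: re-score only those candidates on the short-time features.
    short_scores = {s: log_likelihood(short_frames, models[s]) for s in long_selected}
    return top_speakers(short_scores, n_short)


# Toy one-dimensional data (hypothetical): the long-time features sit near
# speaker A, while the most recent short utterance has drifted toward B.
models = {
    "A": ([0.0], [1.0]),
    "B": ([1.0], [1.0]),
    "C": ([5.0], [1.0]),
    "D": ([-4.0], [1.0]),
}
long_frames = [[0.2], [0.6], [0.5], [0.3], [0.4]]
short_frames = [[0.9], [1.1]]
selected = select_speakers(models, long_frames, short_frames, n_long=2, n_short=1)
```

With this toy data the long-time stage keeps A and B, and the short-time stage then picks B, which shows how the second stage can follow a recent change while the first stage keeps the candidate set stable.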
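PCT/JP2008/053629 2007-03-27 2008-02-29 Speaker selecting device, speaker adaptive model making device, speaker selecting method, speaker selecting program, and speaker adaptive model making program WO2008117626A1 (ja)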

Priority Applications (2)

Application Number Priority Date Filing Date Title
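JP2009506262A JP5229219B2 (ja) 2007-03-27 2008-02-29 Speaker selecting device, speaker adaptive model making device, speaker selecting method, speaker selecting program, and speaker adaptive model making program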
US12/593,414 US8452596B2 (en) 2007-03-27 2008-02-29 Speaker selection based at least on an acoustic feature value similar to that of an utterance speaker

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2007082230 2007-03-27
JP2007-082230 2007-03-27

Publications (1)

Publication Number Publication Date
WO2008117626A1 (ja) 2008-10-02

Family

ID=39788364

Family Applications (1)

Application Number Title Priority Date Filing Date
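PCT/JP2008/053629 WO2008117626A1 (ja) 2007-03-27 2008-02-29 Speaker selecting device, speaker adaptive model making device, speaker selecting method, speaker selecting program, and speaker adaptive model making program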

Country Status (3)

Country Link
US (1) US8452596B2 (ja)
JP (1) JP5229219B2 (ja)
WO (1) WO2008117626A1 (ja)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009057739A1 (ja) * 2007-10-31 2009-05-07 NEC Corporation Speaker selecting device, speaker adaptive model making device, speaker selecting method, and speaker selecting program
WO2011064938A1 (ja) * 2009-11-25 2011-06-03 NEC Corporation Voice data analysis device, voice data analysis method, and voice data analysis program
JP2012073361A (ja) * 2010-09-28 2012-04-12 Fujitsu Ltd Speech recognition device and speech recognition method
JP2016529567A (ja) * 2013-06-20 2016-09-23 Tencent Technology (Shenzhen) Company Limited Method, apparatus, and system for verifying payment
US9536547B2 (en) 2014-10-17 2017-01-03 Fujitsu Limited Speaker change detection device and speaker change detection method
WO2020003413A1 (ja) 2018-06-27 2020-01-02 NEC Corporation Information processing apparatus, control method, and program

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8160877B1 (en) * 2009-08-06 2012-04-17 Narus, Inc. Hierarchical real-time speaker recognition for biometric VoIP verification and targeting
US9047867B2 (en) * 2011-02-21 2015-06-02 Adobe Systems Incorporated Systems and methods for concurrent signal recognition
JP5779032B2 (ja) * 2011-07-28 2015-09-16 Toshiba Corporation Speaker classification device, speaker classification method, and speaker classification program
US8965763B1 (en) * 2012-02-02 2015-02-24 Google Inc. Discriminative language modeling for automatic speech recognition with a weak acoustic model and distributed training
US8543398B1 (en) 2012-02-29 2013-09-24 Google Inc. Training an automatic speech recognition system using compressed word frequencies
US8374865B1 (en) 2012-04-26 2013-02-12 Google Inc. Sampling training data for an automatic speech recognition system based on a benchmark classification distribution
US8571859B1 (en) 2012-05-31 2013-10-29 Google Inc. Multi-stage speaker adaptation
US8805684B1 (en) 2012-05-31 2014-08-12 Google Inc. Distributed speaker adaptation
US8880398B1 (en) 2012-07-13 2014-11-04 Google Inc. Localized speech recognition with offload
US9123333B2 (en) 2012-09-12 2015-09-01 Google Inc. Minimum bayesian risk methods for automatic speech recognition
US10249306B2 (en) * 2013-01-17 2019-04-02 Nec Corporation Speaker identification device, speaker identification method, and recording medium
US9390712B2 (en) * 2014-03-24 2016-07-12 Microsoft Technology Licensing, Llc. Mixed speech recognition
US9858922B2 (en) * 2014-06-23 2018-01-02 Google Inc. Caching speech recognition scores
JP6276132B2 (ja) * 2014-07-30 2018-02-07 Toshiba Corporation Utterance section detection device, speech processing system, utterance section detection method, and program
US9299347B1 (en) 2014-10-22 2016-03-29 Google Inc. Speech recognition using associative mapping
US10311855B2 (en) * 2016-03-29 2019-06-04 Speech Morphing Systems, Inc. Method and apparatus for designating a soundalike voice to a target voice from a database of voices
US10896682B1 (en) * 2017-08-09 2021-01-19 Apple Inc. Speaker recognition based on an inside microphone of a headphone
US20240038244A1 (en) * 2020-12-25 2024-02-01 Nec Corporation Speaker identification apparatus, method, and program

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10319988A (ja) * 1997-05-06 1998-12-04 Internatl Business Mach Corp <Ibm> Speaker identification method and speaker recognition device
JP2000250593A (ja) * 1999-03-03 2000-09-14 Fujitsu Ltd Speaker recognition device and method
JP2003202891A (ja) * 2002-01-07 2003-07-18 Matsushita Electric Ind Co Ltd Method for creating an adaptive model for speech processing
JP2004053821A (ja) * 2002-07-18 2004-02-19 Univ Waseda Speaker identification method, system therefor, and program
JP2004294755A (ja) * 2003-03-27 2004-10-21 Secom Co Ltd Speaker authentication device and speaker authentication program

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
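JP3912089B2 (ja) * 2001-12-03 2007-05-09 Seiko Epson Corporation Speech recognition method and speech recognition device
JP3756879B2 (ja) 2001-12-20 2006-03-15 Matsushita Electric Industrial Co., Ltd. Method for creating an acoustic model, apparatus for creating an acoustic model, and computer program for creating an acoustic model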

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ADAMI A.G. ET AL.: "MODELING PROSODIC DYNAMICS FOR SPEAKER RECOGNITION", PROCEEDINGS OF THE 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 4, 6 April 2003 (2003-04-06), pages IV-788 - IV-791 *
YOSHIZAWA S. ET AL.: "Jubun Tokeiryo to Washa Kyori o Mochiita On'in Model no Kyoshi Nashi Gakushuho", THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS, vol. J85-D-II, no. 3, 1 March 2002 (2002-03-01), pages 382 - 389 *
ZHENG H., YANG Y., WU Z.: "Two-stage Decision for Short Utterance Speaker Identification in Mobile Telecommunication Environment", PROCEEDINGS OF THE 2004 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, vol. 1, 10 October 2004 (2004-10-10), pages 547 - 551, XP010773657 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
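WO2009057739A1 (ja) * 2007-10-31 2009-05-07 NEC Corporation Speaker selecting device, speaker adaptive model making device, speaker selecting method, and speaker selecting program
JPWO2009057739A1 (ja) * 2007-10-31 2011-03-10 NEC Corporation Speaker selecting device, speaker adaptive model making device, speaker selecting method, and speaker selecting program
JP5626558B2 (ja) * 2007-10-31 2014-11-19 NEC Corporation Speaker selecting device, speaker adaptive model making device, speaker selecting method, and speaker selecting program
WO2011064938A1 (ja) * 2009-11-25 2011-06-03 NEC Corporation Voice data analysis device, voice data analysis method, and voice data analysis program
JP2012073361A (ja) * 2010-09-28 2012-04-12 Fujitsu Ltd Speech recognition device and speech recognition method
JP2016529567A (ja) * 2013-06-20 2016-09-23 Tencent Technology (Shenzhen) Company Limited Method, apparatus, and system for verifying payment
US9536547B2 (en) 2014-10-17 2017-01-03 Fujitsu Limited Speaker change detection device and speaker change detection method
WO2020003413A1 (ja) 2018-06-27 2020-01-02 NEC Corporation Information processing apparatus, control method, and program
JPWO2020003413A1 (ja) * 2018-06-27 2021-07-08 NEC Corporation Information processing apparatus, control method, and program
JP6996627B2 (ja) 2018-06-27 2022-01-17 NEC Corporation Information processing apparatus, control method, and program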
US11437044B2 (en) 2018-06-27 2022-09-06 Nec Corporation Information processing apparatus, control method, and program

Also Published As

Publication number Publication date
JPWO2008117626A1 (ja) 2010-07-15
JP5229219B2 (ja) 2013-07-03
US20100114572A1 (en) 2010-05-06
US8452596B2 (en) 2013-05-28

Similar Documents

Publication Publication Date Title
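WO2008117626A1 (ja) Speaker selecting device, speaker adaptive model making device, speaker selecting method, speaker selecting program, and speaker adaptive model making program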
US9514747B1 (en) Reducing speech recognition latency
Kinoshita et al. The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
EP3553773A1 (en) Training and testing utterance-based frameworks
US11056118B2 (en) Speaker identification
WO2008108232A1 (ja) Speech recognition device, speech recognition method, and speech recognition program
WO2008047339A3 (en) Method and apparatus for large population speaker identification in telephone interactions
TW200601263A (en) Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition
WO2016139670A8 (en) System and method for generating accurate speech transcription from natural speech audio signals
WO2012155079A3 (en) Adaptive voice recognition systems and methods
WO2007095277A3 (en) Communication device having speaker independent speech recognition
WO2014025682A3 (en) Acoustic data selection for training the parameters of an acoustic model
EP4235649A3 (en) Language model biasing
WO2017172632A3 (en) Characterizing, selecting and adapting audio and acoustic training data for automatic speech recognition systems
EP2573768A3 (en) Reverberation suppression device, reverberation suppression method, and computer-readable storage medium storing a reverberation suppression program
WO2012134877A3 (en) Computer-implemented systems and methods evaluating prosodic features of speech
EP2211561A3 (en) Speech signal processing apparatus with microphone signal selection
TW200627376A (en) Method and apparatus for constructing Chinese new words by the input voice
WO2020117639A3 (en) Text independent speaker recognition
WO2007047587A3 (en) Method and device for recognizing human intent
DE602004023134D1 (de) Speech recognition method and system adapted to the characteristics of non-native speakers
EP1471501A3 (en) Speech recognition apparatus, speech recognition method, and recording medium on which speech recognition program is computer-readable recorded
GB2593300A (en) Biometric user recognition
WO2007061749A3 (en) Methods, systems, and computer program products for speech assessment
WO2008126254A1 (ja) Speaker recognition device, acoustic model update method, and acoustic model update processing program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 08721049

Country of ref document: EP

Kind code of ref document: A1

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (PCT application filed from 20040101)
ENP Entry into the national phase

Ref document number: 2009506262

Country of ref document: JP

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 12593414

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 08721049

Country of ref document: EP

Kind code of ref document: A1

WWE WIPO information: entry into national phase

Ref document number: 5453/CHENP/2010

Country of ref document: IN