JP4418903B2 - Voice recognition device - Google Patents

Publication number
JP4418903B2
Authority
JP
Japan
Prior art keywords
speech
pitch
interval
utterance
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2004075488A
Other languages
Japanese (ja)
Other versions
JP2005266020A (en)
Inventor
紀子 鈴木
恭弘 片桐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ATR Advanced Telecommunications Research Institute International
Original Assignee
ATR Advanced Telecommunications Research Institute International
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ATR Advanced Telecommunications Research Institute International
Priority to JP2004075488A
Publication of JP2005266020A
Application granted
Publication of JP4418903B2
Anticipated expiration
Expired - Fee Related

Description

The present invention relates to a speech recognition device, and more particularly to a speech recognition device that recognizes the content of utterances by a subject who is interacting with, for example, a computer.

An example of a conventional speech recognition device of this kind is disclosed in Patent Document 1. In this prior art, when the user speaks too fast, the system slows down its own speech rate, which guides the user's speech rate into an appropriate range. The same kind of guidance is applied to the volume of speech and the politeness of wording.
Patent Document 1: JP 2003-150194 A

However, the prior art described above guides only the speed and volume of speech, so the improvement it can bring to speech recognition accuracy is limited.

Therefore, the main object of the present invention is to improve speech recognition accuracy by also guiding other prosodic parameters of the utterance, namely the pitch, pitch range, and utterance interval of the uttered speech, into ranges suitable for speech recognition.

A speech recognition device according to the invention of claim 1 comprises: output means for outputting pseudo speech; capture means for capturing a subject's uttered speech; pitch detection means for detecting the pitch of the uttered speech captured by the capture means; pitch raising means for raising the pitch of the pseudo speech output by the output means when the pitch detected by the pitch detection means falls below a first threshold; and pitch lowering means for lowering the pitch of the pseudo speech output by the output means when the pitch detected by the pitch detection means exceeds a second threshold that is greater than the first threshold.

When the pitch of the subject's speech falls below the first threshold, the pitch of the pseudo speech is raised, which guides the pitch of the subject's speech upward. When the pitch of the subject's speech exceeds the second threshold, the pitch of the pseudo speech is lowered, which guides the pitch of the subject's speech downward. The pitch of the subject's speech can thus be kept within the range in which speech recognition is possible, which improves recognition accuracy.

A speech recognition device according to the invention of claim 2 depends on claim 1 and further comprises: pitch range detection means for detecting the pitch range of the uttered speech captured by the capture means; pitch range expansion means for expanding the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means falls below a third threshold; and pitch range reduction means for reducing the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means exceeds a fourth threshold that is greater than the third threshold.

When the pitch range of the subject's speech falls below the third threshold, the pitch range of the pseudo speech is expanded, which guides the pitch range of the subject's speech toward expansion. When the pitch range of the subject's speech exceeds the fourth threshold, the pitch range of the pseudo speech is reduced, which guides the pitch range of the subject's speech toward reduction. The pitch range of the subject's speech can thus be kept within the range in which speech recognition is possible, which improves recognition accuracy.

A speech recognition device according to the invention of claim 3 depends on claim 1 or 2 and further comprises: utterance interval detection means for detecting the utterance interval of the uttered speech captured by the capture means; utterance interval extension means for extending the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means falls below a fifth threshold; and utterance interval shortening means for shortening the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means exceeds a sixth threshold that is greater than the fifth threshold.

When the utterance interval of the subject's speech falls below the fifth threshold, the utterance interval of the pseudo speech is extended, which guides the subject's utterance interval toward extension. When the utterance interval of the subject's speech exceeds the sixth threshold, the utterance interval of the pseudo speech is shortened, which guides the subject's utterance interval toward shortening. The utterance interval of the subject's speech can thus be kept within the range in which speech recognition is possible, which improves recognition accuracy.

A speech recognition device according to the invention of claim 4 comprises: output means for outputting pseudo speech; capture means for capturing a subject's uttered speech; pitch range detection means for detecting the pitch range of the uttered speech captured by the capture means; pitch range expansion means for expanding the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means falls below a first threshold; and pitch range reduction means for reducing the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means exceeds a second threshold that is greater than the first threshold.

A speech recognition device according to the invention of claim 5 depends on claim 4 and further comprises: utterance interval detection means for detecting the utterance interval of the uttered speech captured by the capture means; utterance interval extension means for extending the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means falls below a third threshold; and utterance interval shortening means for shortening the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means exceeds a fourth threshold that is greater than the third threshold.

A speech recognition device according to the invention of claim 6 comprises: output means for outputting pseudo speech; capture means for capturing a subject's uttered speech; utterance interval detection means for detecting the utterance interval of the uttered speech captured by the capture means; utterance interval extension means for extending the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means falls below a first threshold; and utterance interval shortening means for shortening the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means exceeds a second threshold that is greater than the first threshold.

A speech recognition program according to the invention of claim 7 is executed by a processor of a speech recognition device and comprises: a pitch detection step of detecting the pitch of a subject's uttered speech; a pitch raising step of raising the pitch of the pseudo speech output from a speaker when the pitch detected in the pitch detection step falls below a first threshold; and a pitch lowering step of lowering the pitch of the pseudo speech output from the speaker when the pitch detected in the pitch detection step exceeds a second threshold that is greater than the first threshold.

A speech recognition program according to the invention of claim 8 depends on claim 7 and further comprises: a pitch range detection step of detecting the pitch range of the subject's uttered speech; a pitch range expansion step of expanding the pitch range of the pseudo speech output from the speaker when the pitch range detected in the pitch range detection step falls below a third threshold; and a pitch range reduction step of reducing the pitch range of the pseudo speech output from the speaker when the pitch range detected in the pitch range detection step exceeds a fourth threshold that is greater than the third threshold.

A speech recognition program according to the invention of claim 9 depends on claim 7 or 8 and further comprises: an utterance interval detection step of detecting the utterance interval of the subject's uttered speech; an utterance interval extension step of extending the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step falls below a fifth threshold; and an utterance interval shortening step of shortening the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step exceeds a sixth threshold that is greater than the fifth threshold.

A speech recognition program according to the invention of claim 10 is executed by a processor of a speech recognition device and comprises: a pitch range detection step of detecting the pitch range of a subject's uttered speech; a pitch range expansion step of expanding the pitch range of the pseudo speech output from a speaker when the pitch range detected in the pitch range detection step falls below a first threshold; and a pitch range reduction step of reducing the pitch range of the pseudo speech output from the speaker when the pitch range detected in the pitch range detection step exceeds a second threshold that is greater than the first threshold.

A speech recognition program according to the invention of claim 11 depends on claim 10 and further comprises: an utterance interval detection step of detecting the utterance interval of the subject's uttered speech; an utterance interval extension step of extending the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step falls below a third threshold; and an utterance interval shortening step of shortening the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step exceeds a fourth threshold that is greater than the third threshold.

A speech recognition program according to the invention of claim 12 is executed by a processor of a speech recognition device and comprises: an utterance interval detection step of detecting the utterance interval of a subject's uttered speech; an utterance interval extension step of extending the utterance interval of the pseudo speech output from a speaker when the utterance interval detected in the utterance interval detection step falls below a first threshold; and an utterance interval shortening step of shortening the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step exceeds a second threshold that is greater than the first threshold.

According to the present invention, the pitch, pitch range, or utterance interval of the subject's speech is guided in the desired direction by raising or lowering the pitch of the pseudo speech, expanding or reducing its pitch range, or extending or shortening its utterance interval, so the accuracy of speech recognition can be improved.

The above object, other objects, features, and advantages of the present invention will become more apparent from the following detailed description of an embodiment, given with reference to the drawings.

Referring to FIG. 1, the speech recognition device 10 of this embodiment includes a microphone 18 that captures the subject's speech signal. The speech signal captured by the microphone 18 is supplied to the CPU 12 via the A/D converter 14. The CPU 12 analyzes the content of the speech data received from the A/D converter 14 and creates pseudo speech data for the dialogue with the subject. The created pseudo speech data is output from the speaker 20 via the D/A converter 16. The pseudo speech is synthesized speech produced by known speech synthesis means.

To improve recognition accuracy for the subject's speech, the CPU 12 executes the processing shown in the flowcharts of FIGS. 3 to 5 each time one phrase of speech data is captured. The control program corresponding to these flowcharts is stored in the memory 22.

Referring first to FIG. 3, in step S1 the height of the subject's voice, that is, the pitch PIh, is detected from the captured speech data. In steps S3 and S5, the detected pitch PIh is compared with thresholds PIh1 and PIh2. As shown in FIG. 2, thresholds PIh1 and PIh2 are the lower and upper limits, respectively, of the range in which speech recognition is possible.
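The description does not say how the pitch PIh is detected in step S1. As one possible illustration only, the sketch below estimates a fundamental frequency by picking the autocorrelation peak; the function name `estimate_pitch_hz`, the 50 to 500 Hz search band, and the use of hertz are all assumptions that do not come from the patent.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency of one frame via autocorrelation.

    Hypothetical stand-in for step S1; the patent does not name a method.
    Returns None when no positive autocorrelation peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (illustrative data only).
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
pitch_hz = estimate_pitch_hz(tone, rate)  # close to 200 Hz
```

Real speech would normally need framing, voicing detection, and smoothing before a value like this could serve as PIh.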

If the pitch PIh is at least threshold PIh1 and at most threshold PIh2, speech recognition is deemed possible, and processing proceeds from step S3 to step S11. When PIh is below PIh1, the determination is NO in step S3 and YES in step S5, and in step S7 the pitch PIc of the pseudo speech is raised by Δα. When PIh exceeds PIh2, the determination is NO in both step S3 and step S5, and in step S9 the pitch PIc of the pseudo speech is lowered by Δα. When step S7 or S9 is complete, processing proceeds to step S11.
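The branch structure of steps S3 to S9 can be sketched as follows. This is a minimal illustration, not the patented program: the function and parameter names are hypothetical, and the numeric values in the usage lines are invented for the example.

```python
def update_pseudo_pitch(pi_h, pi_h1, pi_h2, pi_c, delta_alpha):
    """Return the pseudo-speech pitch PIc after one pass of steps S3 to S9.

    pi_h is the detected pitch PIh; pi_h1 and pi_h2 are the lower and upper
    limits of the recognizable range; delta_alpha is the step size.
    """
    if pi_h1 <= pi_h <= pi_h2:
        # Step S3 YES: recognition is deemed possible, PIc is unchanged.
        return pi_c
    if pi_h < pi_h1:
        # Step S3 NO, step S5 YES: step S7 raises the pseudo-speech pitch.
        return pi_c + delta_alpha
    # Step S3 NO, step S5 NO: step S9 lowers the pseudo-speech pitch.
    return pi_c - delta_alpha


# Usage with illustrative numbers (units assumed to be hertz).
raised = update_pseudo_pitch(90, 100, 300, 200, 10)    # subject too low
lowered = update_pseudo_pitch(350, 100, 300, 200, 10)  # subject too high
```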

Note that the variable range of the pitch PIc is the initial value ±2α; a step S7 or S9 update that would move PIc outside this range has no effect.
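One way to read this limit is as a clamp on the update. The sketch below assumes that the α in the range "initial value ±2α" equals the step size Δα, which the description does not state explicitly; the function name and arguments are hypothetical.

```python
def bounded_step(current, initial, step, direction, max_steps=2):
    """Move `current` by one step in `direction` (+1 or -1), clamped to
    initial - max_steps * step .. initial + max_steps * step.

    Assumption: the range bound uses the same step size as the update.
    """
    lower = initial - max_steps * step
    upper = initial + max_steps * step
    candidate = current + direction * step
    # An update past either bound leaves the value at the bound.
    return min(max(candidate, lower), upper)


# Usage: PIc already at the upper bound stays there when raised again.
at_bound = bounded_step(220, 200, 10, +1)
```

With this reading, at most two consecutive raises or two consecutive lowers from the initial value change the pseudo speech.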

Thus, if the pitch PIh of the subject's speech is too low, the pitch PIc of the pseudo speech becomes higher, and if PIh is too high, PIc becomes lower. In other words, through the subject's alignment tendency, the pitch PIh of the subject's speech is guided into the range in which speech recognition is possible.

In step S11, the intonation range of the subject's voice, that is, the pitch range PRh, is detected from the captured speech data. In steps S13 and S15, the detected pitch range PRh is compared with thresholds PRh1 and PRh2. As shown in FIG. 2, thresholds PRh1 and PRh2 are likewise the lower and upper limits, respectively, of the range in which speech recognition is possible.
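The description does not state how the pitch range PRh is computed from the speech data. A simple possibility, shown here purely as an assumption, is the difference between the highest and lowest pitch over the voiced frames of one phrase; the function name and the convention that unvoiced frames are marked `None` or `0` are hypothetical.

```python
def pitch_range(contour_hz):
    """Return max minus min pitch over the voiced frames of one phrase.

    `contour_hz` is a per-frame pitch contour; None or non-positive
    entries are treated as unvoiced and ignored (an assumed convention).
    """
    voiced = [f for f in contour_hz if f is not None and f > 0]
    if not voiced:
        return None
    return max(voiced) - min(voiced)


# Usage with an invented contour containing two unvoiced frames.
pr_h = pitch_range([180.0, None, 220.0, 0.0, 205.0])
```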

If the pitch range PRh satisfies PRh1 ≦ PRh ≦ PRh2, speech recognition is deemed possible, and processing proceeds from step S13 to step S21. When PRh is below threshold PRh1, the pitch range PRc of the pseudo speech is expanded by Δβ in step S17. When PRh exceeds threshold PRh2, the pitch range PRc of the pseudo speech is reduced by Δβ in step S19. When step S17 or S19 is complete, processing proceeds to step S21.

As above, the variable range of the pitch range PRc is the initial value ±2β; a step S17 or S19 update that would move PRc outside this range has no effect.

Thus, if the pitch range PRh of the subject's speech is too narrow, the pitch range PRc of the pseudo speech is expanded, and if PRh is too wide, PRc is reduced. In other words, through the subject's alignment tendency, the pitch range PRh of the subject's speech is guided into the range in which speech recognition is possible.

In step S21, the subject's speech rate Sh is detected from the captured speech data. In steps S23 and S25, the detected speech rate Sh is compared with thresholds Sh1 and Sh2. As above, thresholds Sh1 and Sh2 are the lower and upper limits, respectively, of the range in which speech recognition is possible. The speech rate detected in step S21 is expressed in mora/sec.
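Since the speech rate is expressed in mora/sec, it can be illustrated as a mora count divided by the phrase duration. The sketch below assumes that the mora count of the phrase is already available (for example from the recognized text); the function name and inputs are hypothetical, and the patent does not describe this computation.

```python
def speech_rate_mora_per_sec(mora_count, phrase_duration_sec):
    """Return the speech rate Sh of one phrase in mora/sec."""
    if phrase_duration_sec <= 0:
        raise ValueError("phrase duration must be positive")
    return mora_count / phrase_duration_sec


# Usage: a 10-mora phrase spoken in 2.0 seconds (illustrative values).
s_h = speech_rate_mora_per_sec(10, 2.0)
```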

If the speech rate Sh satisfies Sh1 ≦ Sh ≦ Sh2, speech recognition is deemed possible, and processing proceeds from step S23 to step S31. When Sh is below threshold Sh1, the speech rate Sc of the pseudo speech is raised by Δγ in step S27. When Sh exceeds threshold Sh2, the speech rate Sc of the pseudo speech is lowered by Δγ in step S29. When step S27 or S29 is complete, processing proceeds to step S31.

As above, the variable range of the speech rate Sc is the initial value ±2γ; a step S27 or S29 update that would move Sc outside this range has no effect.

Thus, if the subject's speech rate Sh is too low, the speech rate Sc of the pseudo speech is raised, and if Sh is too high, Sc is lowered. In other words, through the subject's alignment tendency, the subject's speech rate Sh is guided into the range in which speech recognition is possible.

In step S31, the subject's utterance interval Th (the interval from the time the other party's pseudo speech ends to the time the subject's own response begins) is detected from the captured speech data. In steps S33 and S35, the detected utterance interval Th is compared with thresholds Th1 and Th2. As above, thresholds Th1 and Th2 are the lower and upper limits, respectively, of the range in which speech recognition is possible.
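Given the definition of Th above, the interval can be illustrated as a difference of two timestamps. The sketch assumes that both times are available in seconds on a common clock and that a response overlapping the pseudo speech counts as an interval of zero; neither point is specified in the description, and the function name is hypothetical.

```python
def utterance_interval(pseudo_speech_end_sec, response_start_sec):
    """Return Th: time from the end of the pseudo speech to the start of
    the subject's response, floored at zero for overlapping speech
    (the floor is an assumption, not taken from the patent)."""
    return max(0.0, response_start_sec - pseudo_speech_end_sec)


# Usage: pseudo speech ends at 12.5 s, the subject answers at 13.0 s.
t_h = utterance_interval(12.5, 13.0)
```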

If the utterance interval Th satisfies Th1 ≦ Th ≦ Th2, speech recognition is deemed possible, and processing proceeds from step S33 to step S41. When Th is below threshold Th1, the utterance interval Tc of the pseudo speech is extended by Δδ in step S37. When Th exceeds threshold Th2, the utterance interval Tc of the pseudo speech is shortened by Δδ in step S39. When step S37 or S39 is complete, processing proceeds to step S41.

As above, the variable range of the utterance interval Tc is the initial value ±2δ; a step S37 or S39 update that would move Tc outside this range has no effect.

Thus, if the subject's utterance interval Th is too short, the utterance interval Tc of the pseudo speech is extended, and if Th is too long, Tc is shortened. In other words, through the subject's alignment tendency, the subject's utterance interval Th is guided into the range in which speech recognition is possible.

In step S41, the subject's voice volume Vh is detected from the captured speech data. In steps S43 and S45, the detected volume Vh is compared with thresholds Vh1 and Vh2. As shown in FIG. 2, thresholds Vh1 and Vh2 are likewise the lower and upper limits, respectively, of the range in which speech recognition is possible.

If the volume Vh satisfies Vh1 ≦ Vh ≦ Vh2, speech recognition is deemed possible, and the processing ends. When Vh is below threshold Vh1, the volume Vc of the pseudo speech is increased by Δε in step S47. When Vh exceeds threshold Vh2, the volume Vc of the pseudo speech is decreased by Δε in step S49. When step S47 or S49 is complete, the processing ends.

As above, the variable range of the volume Vc is the initial value ±2ε; a step S47 or S49 update that would move Vc outside this range has no effect.

Thus, if the volume Vh of the subject's speech is too low, the volume Vc of the pseudo speech is increased, and if Vh is too high, Vc is decreased. In other words, through the subject's alignment tendency, the volume Vh of the subject's speech is guided into the range in which speech recognition is possible.

As can be seen from the above description, the pseudo speech is output from the speaker 20, and the subject's speech is captured by the microphone 18. The prosodic parameter values of the captured speech (pitch, pitch range, speech rate, utterance interval, and volume) are detected by the CPU 12. If a detected prosodic parameter value falls below the lower limit of the range in which speech recognition is possible (PIh1, PRh1, Sh1, Th1, Vh1), the corresponding prosodic parameter value of the pseudo speech output from the speaker 20 is increased. Conversely, if a detected prosodic parameter value exceeds the upper limit of that range (PIh2, PRh2, Sh2, Th2, Vh2), the corresponding prosodic parameter value of the pseudo speech output from the speaker 20 is decreased. The prosodic parameter values of the subject's speech are thereby guided into the range in which speech recognition is possible, which improves recognition accuracy.
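Because the same lower-limit and upper-limit rule is applied to all five prosodic parameters, the per-phrase processing can be sketched as one table-driven loop. The class name, field names, and every numeric value below are hypothetical and serve only to illustrate the rule summarized above; the limit on the variable range is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class ProsodyControl:
    lower: float   # lower limit of the recognizable range, e.g. PIh1
    upper: float   # upper limit of the recognizable range, e.g. PIh2
    step: float    # update step, e.g. the pitch increment
    pseudo: float  # current pseudo-speech value, e.g. PIc


def guide(controls, measured):
    """Apply the threshold rule to each measured parameter of one phrase
    and return the updated pseudo-speech values."""
    for name, value in measured.items():
        control = controls[name]
        if value < control.lower:
            control.pseudo += control.step   # subject too low: raise
        elif value > control.upper:
            control.pseudo -= control.step   # subject too high: lower
    return {name: control.pseudo for name, control in controls.items()}


# Usage with invented limits: pitch is too low, volume is too high.
controls = {
    "pitch": ProsodyControl(lower=100, upper=300, step=10, pseudo=200),
    "volume": ProsodyControl(lower=50, upper=70, step=2, pseudo=60),
}
updated = guide(controls, {"pitch": 90, "volume": 75})
```

Keeping the thresholds in a table keeps the rule identical across parameters, which mirrors the parallel structure of the flowcharts.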

FIG. 1 is a block diagram showing the configuration of one embodiment of the present invention.
FIG. 2 is an illustrative view showing part of the operation of the FIG. 1 embodiment.
FIG. 3 is a flowchart showing part of the operation of the CPU applied to the FIG. 1 embodiment.
FIG. 4 is a flowchart showing another part of the operation of the CPU applied to the FIG. 1 embodiment.
FIG. 5 is a flowchart showing yet another part of the operation of the CPU applied to the FIG. 1 embodiment.

Explanation of symbols

10 … Speech recognition apparatus
12 … CPU
18 … Microphone
20 … Speaker
22 … Memory

Claims (12)

1. A speech recognition apparatus comprising:
output means for outputting pseudo speech;
capturing means for capturing uttered speech of a subject;
pitch detection means for detecting the pitch of the uttered speech captured by the capturing means;
pitch raising means for raising the pitch of the pseudo speech output by the output means when the pitch detected by the pitch detection means falls below a first threshold; and
pitch lowering means for lowering the pitch of the pseudo speech output by the output means when the pitch detected by the pitch detection means exceeds a second threshold greater than the first threshold.
2. The speech recognition apparatus according to claim 1, further comprising:
pitch range detection means for detecting the pitch range of the uttered speech captured by the capturing means;
pitch range expansion means for expanding the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means falls below a third threshold; and
pitch range reduction means for reducing the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means exceeds a fourth threshold greater than the third threshold.
3. The speech recognition apparatus according to claim 1 or 2, further comprising:
utterance interval detection means for detecting the utterance interval of the uttered speech captured by the capturing means;
utterance interval extension means for extending the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means falls below a fifth threshold; and
utterance interval shortening means for shortening the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means exceeds a sixth threshold greater than the fifth threshold.
4. A speech recognition apparatus comprising:
output means for outputting pseudo speech;
capturing means for capturing uttered speech of a subject;
pitch range detection means for detecting the pitch range of the uttered speech captured by the capturing means;
pitch range expansion means for expanding the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means falls below a first threshold; and
pitch range reduction means for reducing the pitch range of the pseudo speech output by the output means when the pitch range detected by the pitch range detection means exceeds a second threshold greater than the first threshold.
5. The speech recognition apparatus according to claim 4, further comprising:
utterance interval detection means for detecting the utterance interval of the uttered speech captured by the capturing means;
utterance interval extension means for extending the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means falls below a third threshold; and
utterance interval shortening means for shortening the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means exceeds a fourth threshold greater than the third threshold.
6. A speech recognition apparatus comprising:
output means for outputting pseudo speech;
capturing means for capturing uttered speech of a subject;
utterance interval detection means for detecting the utterance interval of the uttered speech captured by the capturing means;
utterance interval extension means for extending the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means falls below a first threshold; and
utterance interval shortening means for shortening the utterance interval of the pseudo speech output by the output means when the utterance interval detected by the utterance interval detection means exceeds a second threshold greater than the first threshold.
7. A speech recognition program executed by a processor of a speech recognition apparatus, comprising:
a pitch detection step of detecting the pitch of uttered speech of a subject;
a pitch raising step of raising the pitch of pseudo speech output from a speaker when the pitch detected in the pitch detection step falls below a first threshold; and
a pitch lowering step of lowering the pitch of the pseudo speech output from the speaker when the pitch detected by the pitch detection means exceeds a second threshold greater than the first threshold.
8. The speech recognition program according to claim 7, further comprising:
a pitch range detection step of detecting the pitch range of the uttered speech of the subject;
a pitch range expansion step of expanding the pitch range of the pseudo speech output from the speaker when the pitch range detected in the pitch range detection step falls below a third threshold; and
a pitch range reduction step of reducing the pitch range of the pseudo speech output from the speaker when the pitch range detected in the pitch range detection step exceeds a fourth threshold greater than the third threshold.
9. The speech recognition program according to claim 7 or 8, further comprising:
an utterance interval detection step of detecting the utterance interval of the uttered speech of the subject;
an utterance interval extension step of extending the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step falls below a fifth threshold; and
an utterance interval shortening step of shortening the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step exceeds a sixth threshold greater than the fifth threshold.
10. A speech recognition program executed by a processor of a speech recognition apparatus, comprising:
a pitch range detection step of detecting the pitch range of uttered speech of a subject;
a pitch range expansion step of expanding the pitch range of pseudo speech output from a speaker when the pitch range detected in the pitch range detection step falls below a first threshold; and
a pitch range reduction step of reducing the pitch range of the pseudo speech output from the speaker when the pitch range detected in the pitch range detection step exceeds a second threshold greater than the first threshold.
11. The speech recognition program according to claim 10, further comprising:
an utterance interval detection step of detecting the utterance interval of the uttered speech of the subject;
an utterance interval extension step of extending the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step falls below a third threshold; and
an utterance interval shortening step of shortening the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step exceeds a fourth threshold greater than the third threshold.
12. A speech recognition program executed by a processor of a speech recognition apparatus, comprising:
an utterance interval detection step of detecting the utterance interval of uttered speech of a subject;
an utterance interval extension step of extending the utterance interval of pseudo speech output from a speaker when the utterance interval detected in the utterance interval detection step falls below a first threshold; and
an utterance interval shortening step of shortening the utterance interval of the pseudo speech output from the speaker when the utterance interval detected in the utterance interval detection step exceeds a second threshold greater than the first threshold.
JP2004075488A 2004-03-17 2004-03-17 Voice recognition device Expired - Fee Related JP4418903B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2004075488A JP4418903B2 (en) 2004-03-17 2004-03-17 Voice recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2004075488A JP4418903B2 (en) 2004-03-17 2004-03-17 Voice recognition device

Publications (2)

Publication Number Publication Date
JP2005266020A (en) 2005-09-29
JP4418903B2 (en) 2010-02-24

Family

ID=35090675

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2004075488A Expired - Fee Related JP4418903B2 (en) 2004-03-17 2004-03-17 Voice recognition device

Country Status (1)

Country Link
JP (1) JP4418903B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6819672B2 (en) * 2016-03-31 2021-01-27 ソニー株式会社 Information processing equipment, information processing methods, and programs

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0247700U (en) * 1988-09-27 1990-03-30
JPH0538700U (en) * 1991-04-11 1993-05-25 沖電気工業株式会社 Voice response device
JP2003150194A (en) * 2001-11-14 2003-05-23 Seiko Epson Corp Voice interactive device, input voice optimizing method in the device and input voice optimizing processing program in the device

Also Published As

Publication number Publication date
JP2005266020A (en) 2005-09-29

Similar Documents

Publication Publication Date Title
EP0911805B1 (en) Speech recognition method and speech recognition apparatus
JP2006251147A (en) Speech recognition method
JP5431282B2 (en) Spoken dialogue apparatus, method and program
WO2008062529A1 (en) Sentence reading-out device, method for controlling sentence reading-out device and program for controlling sentence reading-out device
JP2007316330A (en) Rhythm identifying device and method, voice recognition device and method
JP4418903B2 (en) Voice recognition device
JP4880136B2 (en) Speech recognition apparatus and speech recognition method
JP2008250236A (en) Speech recognition device and speech recognition method
JP5621786B2 (en) Voice detection device, voice detection method, and voice detection program
JPH09325798A (en) Voice recognizing device
JPH06236196A (en) Method and device for voice recognition
JP6277739B2 (en) Communication device
JP4839970B2 (en) Prosody identification apparatus and method, and speech recognition apparatus and method
JP4951422B2 (en) Speech recognition apparatus and speech recognition method
JP2005234331A (en) Voice interaction device
JP4479191B2 (en) Speech recognition apparatus, speech recognition method, and speech recognition processing program
JP4654615B2 (en) Voice effect imparting device and voice effect imparting program
JP2008216618A (en) Speech discrimination device
JP2008139573A (en) Vocal quality conversion method, vocal quality conversion program and vocal quality conversion device
JP3588929B2 (en) Voice recognition device
JP2006010739A (en) Speech recognition device
JP3846500B2 (en) Speech recognition dialogue apparatus and speech recognition dialogue processing method
JP2001042889A (en) Device for normalizing interval of inputted voice for voice recognition
JP2009175178A (en) Speech recognition device, program and utterance signal extraction method
KR100322203B1 (en) Device and method for recognizing sound in car

Legal Events

Date Code Title Description
A621 Written request for application examination; Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20061221
A977 Report on retrieval; Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20091020
TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20091027
A01 Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01
A61 First payment of annual fees (during grant procedure); Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20091028
FPAY Renewal fee payment (event date is renewal date of database); Free format text: PAYMENT UNTIL: 20121211; Year of fee payment: 3
R150 Certificate of patent or registration of utility model; Free format text: JAPANESE INTERMEDIATE CODE: R150

LAPS Cancellation because of no payment of annual fees