JPH1152981A

JPH1152981A - Method and device for voice interaction

Info

Publication number: JPH1152981A
Application number: JP9207935A
Authority: JP
Inventors: Atsushi Noguchi; 淳野口
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1997-08-01
Filing date: 1997-08-01
Publication date: 1999-02-26
Anticipated expiration: 2017-08-01
Also published as: JP3077746B2

Abstract

PROBLEM TO BE SOLVED: To provide a voice interaction device that effectively eliminates unnecessary voice guidance and still lets the user obtain sufficient voice guidance information while processing is pending. SOLUTION: A dialogue control section 104 and a voice output section 107 together act as output control means. The section 104 manages the flow of the dialogue according to the result of recognizing the user's voice input in a voice recognition section 102. When a device use detection section 108 detects the start of use of the device, the section 104 instructs output of the voice guidance; when a recognition result is sent to it, the section 104 looks up the position in the voice guidance corresponding to that result, based on the contents of an information position storage section 105, and outputs that position. Under the control of the section 104, the voice output section 107 then stops the voice guidance that is stored in a guidance storage section 106 and is being output, skips to the position indicated by those contents, and outputs the voice guidance again from there.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice interaction method and a voice interaction device that perform operation processing mainly when outputting voice guidance and when receiving a user's voice input.

[0002]

[Prior Art] In a conventional voice interaction device of this type, a user who is thoroughly familiar with how to use the device does not need to listen to all of the voice guidance. Operation processing that stops the voice guidance at an appropriate time is therefore needed so that the user can carry out the dialogue efficiently.

[0003] A well-known technique that takes such an operation processing function into account is, for example, the speech recognition response device disclosed in Japanese Patent Application Laid-Open No. 8-265440. In this device, during a voice dialogue conducted over the telephone, the user can skip the voice guidance being output by entering the PB (push-button) tone of a predetermined number.

[0004] In the information processing apparatus and control method disclosed in Japanese Patent Application Laid-Open No. 8-146991, output of the voice guidance is stopped when voice input occurs while the guidance is being output, and the apparatus decides, according to the recognition result of that voice input, whether to resume the voice guidance or to leave its output stopped.

[0005] Other well-known techniques related to voice dialogue and speech recognition include, for example, the voice dialogue system disclosed in Japanese Patent Application Laid-Open No. 5-323993, the information processing method and apparatus disclosed in Japanese Patent Application Laid-Open No. 6-208389, the voice dialogue system disclosed in Japanese Patent Application Laid-Open No. 7-72895, the response voice generation method and voice dialogue system disclosed in Japanese Patent Application Laid-Open No. 8-263092, and the speech recognition dialogue device disclosed in Japanese Patent Application Laid-Open No. 8-297498.

[0006]

[Problems to be Solved by the Invention] A device having the above-described function of stopping the voice guidance at an appropriate time only skips or stops the voice guidance currently being output. In each subsequent dialogue turn, even if the user can skip or stop the guidance after hearing its opening portion, the user is still forced to listen to unnecessary voice guidance.

[0007] There is also the drawback that sufficient voice guidance information cannot be obtained while processing is pending, for example when the user does not know how to interact with the application program (AP), or when the user is an atypical speaker whose input speech is not recognized smoothly.

[0008] The present invention has been made to solve these problems. Its technical object is to provide a voice interaction method and a voice interaction device that can perform operation processing for effectively eliminating unnecessary voice guidance and that allow sufficient voice guidance information to be obtained even while processing is pending.

[0009]

[Means for Solving the Problems] According to the present invention, there is provided a voice interaction method that outputs voice guidance to a user and outputs the corresponding voice guidance according to a recognition result obtained by speech recognition processing of voice input spoken by the user, wherein the voice guidance is output when the user starts use and, when an utterance is recognized, the voice guidance is skipped to the point where the information corresponding to the recognition result exists and is output from there.

[0010] According to the present invention, there is also provided a voice interaction device comprising output control means that outputs voice guidance to a user and outputs the corresponding voice guidance according to a recognition result obtained by speech recognition processing of voice input spoken by the user, wherein the output control means outputs the voice guidance when the user starts use and, when an utterance is recognized, skips the voice guidance to the point where the information corresponding to the recognition result exists and outputs it from there.

[0011] Further, according to the present invention, there is provided the above voice interaction device comprising guidance storage means that stores the voice guidance, information position storage means that stores the position in the voice guidance at which the information corresponding to a recognition result of the speech recognition processing exists, and device use detection means that detects the start and end of use of the device, wherein the output control means includes a dialogue control unit that manages the flow of the dialogue according to the recognition result, instructs output of the voice guidance when the device use detection means detects the start of use, and, when the recognition result is sent to it, looks up the position of the voice guidance corresponding to that recognition result based on the information content of the information position storage means and outputs that position.

[0012] Further, according to the present invention, there is provided the above voice interaction device comprising voice input means through which the user enters voice input by speaking, voice recognition means that performs speech recognition processing on the voice input, and recognition dictionary storage means that stores the recognition dictionary data used in the speech recognition processing, wherein the voice recognition means sends the recognition result to the dialogue control unit.

[0013] In addition, according to the present invention, there is provided any of the above voice interaction devices wherein the output control means includes voice output means that outputs the voice guidance held in the guidance storage means in accordance with an output instruction from the dialogue control unit.

[0014] Moreover, according to the present invention, there is provided the above voice interaction device wherein the voice output means, when the information content of the information position storage means is sent to it under the control of the dialogue control unit, stops the voice guidance being output, skips the voice guidance to the position indicated by that information content, and then outputs the voice guidance again.

[0015]

[Embodiments of the Invention] The voice interaction method and voice interaction device of the present invention are described in detail below by way of an embodiment, with reference to the drawings.

[0016] First, the voice interaction method of the present invention is outlined briefly. In this method, voice guidance is output to a user, and the corresponding voice guidance is output according to a recognition result obtained by speech recognition processing of voice input spoken by the user. The voice guidance is output when the user starts use, and, when an utterance is recognized, the guidance is skipped to the point where the information corresponding to the recognition result exists and is output from there. This effectively eliminates unnecessary voice guidance. In addition, even while processing is pending, for example when the use environment or the user makes speech recognition difficult or when no voice input is made, the user can obtain sufficient information simply by making no voice input and following the voice guidance.

[0017] A voice interaction device to which this method is applied has, as its basic configuration, output control means that outputs voice guidance to a user and outputs the corresponding voice guidance according to a recognition result obtained by speech recognition processing of voice input spoken by the user. The output control means outputs the voice guidance when the user starts use and, when an utterance is recognized, skips the voice guidance to the point where the information corresponding to the recognition result exists and outputs it from there.

[0018] FIG. 1 is a block diagram showing the basic configuration of a voice interaction device according to one embodiment to which the voice interaction method of the present invention is applied. The device comprises: a voice input unit 101 through which the user enters voice input by speaking; a voice recognition unit 102 that performs speech recognition processing on the voice input; a recognition dictionary storage unit 103 that stores the recognition dictionary data used in the speech recognition processing; a guidance storage unit 106 that stores the voice guidance; an information position storage unit 105 that stores the position in the voice guidance at which the information corresponding to a recognition result of the speech recognition processing exists; a device use detection unit 108 that detects the start and end of use of the device; a dialogue control unit 104 that manages the flow of the dialogue according to the recognition result, instructs output of the voice guidance when the device use detection unit 108 detects the start of use, and, when a recognition result is sent to it, looks up the position of the voice guidance corresponding to that recognition result based on the information content of the information position storage unit 105 and outputs that position; and a voice output unit 107 that outputs the voice guidance held in the guidance storage unit 106 in accordance with output instructions from the dialogue control unit 104 and that, when the information content of the information position storage unit 105 is sent to it under the control of the dialogue control unit 104, stops the voice guidance being output, skips the voice guidance to the position indicated by that information content, and then outputs the voice guidance again.

[0019] Of these, the dialogue control unit 104 and the voice output unit 107 together serve as the output control means, which outputs the voice guidance when the user starts use and, when an utterance is recognized, skips the voice guidance to the point where the information corresponding to the recognition result exists and outputs it from there.

[0020] The operation of this voice interaction device is described next. First, when the device use detection unit 108 detects that the user has started using the device, the dialogue control unit 104 instructs the voice output unit 107 to output the voice guidance from the beginning. The voice output unit 107 then outputs the voice guidance stored in the guidance storage unit 106 from its beginning.

[0021] Next, when the user makes a voice input to the voice input unit 101 after output of the guidance has started, the voice input unit 101 sends the voice input to the voice recognition unit 102. The voice recognition unit 102 performs speech recognition processing using the speech recognition dictionary stored in the recognition dictionary storage unit 103 and sends the recognition result to the dialogue control unit 104. The dialogue control unit 104 looks up the information position of the voice guidance corresponding to the received recognition result based on the contents of the information position storage unit 105, and sends the voice output unit 107 an output instruction for the voice guidance together with that information position. On receiving the output instruction and the information position from the dialogue control unit 104, the voice output unit 107 stops the voice guidance from the guidance storage unit 106 that it is currently outputting, skips to the voice guidance corresponding to the recognition result based on the information position and the contents of the guidance storage unit 106, and then continues outputting the guidance to the user from there. In this state, when the device use detection unit 108 detects that the user has finished using the device and reports this to the dialogue control unit 104, the voice output unit 107 stops the re-output of the voice guidance.
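The flow in paragraphs [0020] and [0021] can be sketched as three events handled by the dialogue control unit. The following is a minimal illustration, not the patent's implementation: the class and method names (`VoiceOutput`, `DialogueController`, `play_from`, `on_use_start` and so on) are hypothetical, audio playback is replaced by a log of actions, and ignoring an unrecognized result is an assumption the text does not address.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceOutput:
    """Stand-in for voice output unit 107; logs actions instead of playing audio."""
    playing: bool = False
    position: float = 0.0  # current offset into the guidance, in seconds
    log: list = field(default_factory=list)

    def play_from(self, position: float) -> None:
        # Stop the guidance being output, skip to `position`, then output again.
        if self.playing:
            self.log.append("stop")
        self.position = position
        self.playing = True
        self.log.append(f"play@{position}")

    def stop(self) -> None:
        if self.playing:
            self.playing = False
            self.log.append("stop")


@dataclass
class DialogueController:
    """Stand-in for dialogue control unit 104."""
    positions: dict  # plays the role of information position storage unit 105
    output: VoiceOutput

    def on_use_start(self) -> None:
        # Device use detection unit 108 reported the start of use.
        self.output.play_from(0.0)

    def on_recognition_result(self, notation: str) -> None:
        # Voice recognition unit 102 sent a recognition result.
        position = self.positions.get(notation)
        if position is not None:  # assumption: unknown results are ignored
            self.output.play_from(position)

    def on_use_end(self) -> None:
        # Device use detection unit 108 reported the end of use.
        self.output.stop()


out = VoiceOutput()
controller = DialogueController({"Orix": 17.6, "Lotte": 23.6}, out)
controller.on_use_start()
controller.on_recognition_result("Orix")
controller.on_use_end()
print(out.log)
```

Keeping the position lookup in the controller and the stop, skip and re-output sequence in the output object mirrors the split between units 104 and 107 described above.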

[0022] The operation of this voice interaction device is now described with reference to a more concrete example. Here, the device is assumed to provide a service through which the user can obtain information on the progress of professional baseball games. The infrastructure for such a service may be, for example, a configuration in which the user calls a dedicated telephone number or a configuration that uses a street information terminal.

[0023] FIG. 2 shows an example of the voice guidance stored in the guidance storage unit 106, FIG. 3 shows an example of the contents of the speech recognition dictionary stored in the recognition dictionary storage unit 103, and FIG. 4 shows an example of the contents of the information position storage unit 105.

[0024] In the speech recognition dictionary shown in FIG. 3, the left column holds the registered notation and the right column holds the registered reading. The voice recognition unit 102 performs pattern matching between the input speech and these readings and outputs the notation as the recognition result. FIG. 4 shows, for example, that the information on Lotte begins at the 23.6-second point of the voice guidance in FIG. 2.
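The two lookups described in paragraph [0024] can be sketched as two small tables. This is an illustration under stated assumptions: the romanized readings are invented for the example (the contents of FIG. 3 are not reproduced in the text), exact string comparison stands in for acoustic pattern matching, and the function names are hypothetical. Only the positions 23.6 s (Lotte) and 17.6 s (Orix) come from the description.

```python
from typing import Optional

# Recognition dictionary (unit 103): notation -> reading.
# The readings below are illustrative placeholders, not taken from FIG. 3.
RECOGNITION_DICTIONARY = {
    "Orix": "orikkusu",
    "Lotte": "rotte",
}

# Information position storage (unit 105): notation -> start offset in seconds.
INFO_POSITIONS = {"Orix": 17.6, "Lotte": 23.6}


def recognize(heard_reading: str) -> Optional[str]:
    """Return the notation whose reading matches the input, or None.

    Exact string comparison is a stand-in for acoustic pattern matching.
    """
    for notation, reading in RECOGNITION_DICTIONARY.items():
        if reading == heard_reading:
            return notation
    return None


def guidance_position(heard_reading: str) -> Optional[float]:
    """Map an input reading to the guidance offset for its notation."""
    notation = recognize(heard_reading)
    if notation is None:
        return None
    return INFO_POSITIONS.get(notation)
```

For example, `guidance_position("rotte")` returns the Lotte offset, while an input with no dictionary entry returns `None`, in which case the guidance would presumably just keep playing.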

[0025] Suppose now that the user has started a dialogue with this service (that is, the call has been connected in the configuration that uses a dedicated telephone number, or the START button has been pressed in the configuration that uses a street information terminal). The device use detection unit 108 then sends the dialogue control unit 104 information that the dialogue has started. The dialogue control unit 104 instructs the voice output unit 107 to output the voice guidance from the beginning. In response, the voice output unit 107 outputs to the user the voice guidance stored in the guidance storage unit 106, as shown in FIG. 2 ("This is professional baseball information. The game being played at Tokyo Dome ..., Nippon Ham is in the lead."). The voice output unit 107 keeps outputting the voice guidance until the user makes a voice input. Once the voice guidance shown in FIG. 2 has been output to the end, it may be output again from the beginning.

[0026] Suppose that the user says "Orix" while the voice guidance is being output and that the voice recognition unit 102 recognizes it correctly. The dialogue control unit 104 then looks up, in the contents of the information position storage unit 105, the information position for "Orix", the recognition result sent from the voice recognition unit 102. FIG. 4 shows that the information on Orix begins at the 17.6-second point, so the dialogue control unit 104 instructs the voice output unit 107 to output from the 17.6-second point. On receiving this instruction, the voice output unit 107 stops the voice guidance it is currently outputting and outputs the voice guidance again from the 17.6-second point ("The Seibu vs. Orix game at Seibu Stadium is ..., Nippon Ham is in the lead."). It then keeps outputting this voice guidance until the user speaks again or finishes using the device.
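To make the skip in paragraph [0026] concrete, the guidance can be modeled as segments with start offsets. The sketch below is an assumption-laden illustration: only the offsets 17.6 s (Orix) and 23.6 s (Lotte) come from the description, the start times of the earlier games are not given and are lumped into a single opening segment, the segments after the Lotte game are omitted, and the function names are hypothetical.

```python
from bisect import bisect_right

# (start offset in seconds, label). Only 17.6 and 23.6 come from the text;
# earlier games are lumped together and later segments are omitted.
SEGMENTS = [
    (0.0, "opening and earlier games"),
    (17.6, "Seibu vs Orix"),
    (23.6, "Lotte vs Kintetsu"),
]
STARTS = [start for start, _ in SEGMENTS]


def segment_index(position: float) -> int:
    """Index of the segment that contains `position` (clamped to the first)."""
    return max(bisect_right(STARTS, position) - 1, 0)


def segment_at(position: float) -> str:
    """Label of the segment being output at `position`."""
    return SEGMENTS[segment_index(position)][1]


def remaining_after_skip(position: float) -> list:
    """Labels output, in order, from `position` to the end of the list."""
    return [label for _, label in SEGMENTS[segment_index(position):]]
```

With this model, skipping to 17.6 s means the user hears the Seibu vs. Orix segment first and never hears the opening segment, which is the saving the paragraph describes.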

[0027] FIG. 5 schematically shows an example dialogue in which the user obtains only specific information, here only the information on Hanshin and Kintetsu. After the user starts using the device, the system begins its voice guidance ("This is professional baseball information. The game being played at Tokyo Dome ..., Nippon Ham is in the lead."). The user says "Hanshin" as the information wanted and hears the corresponding voice guidance ("The Hanshin vs. Chunichi game at Koshien Stadium ended 2 to 3 with a win for Chunichi. At Seibu Stadium, the"). When that information has been heard, the user says "Kintetsu", hears the corresponding voice guidance ("The Lotte vs. Kintetsu game at Marine Stadium is tied 1 to 1 in the top of the ninth inning. At Fukuoka Dome, the"), and then ends use of the device. In this way the user avoids listening to voice guidance that is not needed; that is, unnecessary voice guidance is effectively eliminated.

[0028] A state in which processing is pending can also occur, for example when speech recognition is difficult because of ambient noise or other factors that depend on the use environment or the user, or when no voice input is made because the user does not know that voice input is possible (does not know how to use the device) or is unable to speak. Even in such cases, if the voice guidance is output again from the beginning once it has been output to the end, a user who makes no voice input has to listen to some unnecessary information but eventually hears the voice guidance that is needed. Sufficient voice guidance information can therefore be obtained even while processing is pending.
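The looping behaviour in paragraph [0028] can be checked with a small model: if the guidance wraps around to the beginning, a user who says nothing hears every segment within one full pass, wherever playback currently stands. This is a sketch only; the segment labels and the function name `looped_output` are hypothetical.

```python
from itertools import cycle, islice

# Hypothetical segment labels standing in for the guidance of FIG. 2.
SEGMENTS = ["opening", "game A", "game B", "game C"]


def looped_output(segments: list, start_index: int, count: int) -> list:
    """Return `count` segments starting at `start_index`, wrapping at the end."""
    return list(islice(cycle(segments), start_index, start_index + count))


# Starting from the third segment, one full pass still covers everything.
one_pass = looped_output(SEGMENTS, 2, len(SEGMENTS))
print(one_pass)
```

The cost of staying silent is bounded by one pass through the guidance, which matches the paragraph's point that the user hears some unnecessary information but eventually obtains the needed part.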

[0029]

[Effects of the Invention] As described above, the voice interaction method and voice interaction device of the present invention output the voice guidance when the user starts use and, when spoken voice input is recognized, skip the voice guidance to the point where the information corresponding to the recognition result exists and output it from there. The user can therefore obtain the needed voice guidance information immediately by speaking, without listening to unneeded voice guidance. Furthermore, even while processing is pending, for example when the use environment or the user makes speech recognition difficult or when no voice input is made, the user can obtain sufficient voice guidance information by doing nothing and listening to all of the voice guidance in order. As a result, voice dialogue becomes easier for the user to use, which is of great industrial benefit.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the basic configuration of a voice interaction device according to one embodiment to which the voice interaction method of the present invention is applied.

FIG. 2 shows an example of the voice guidance stored in the guidance storage unit, used to explain concretely the operation of the voice interaction device shown in FIG. 1.

FIG. 3 shows an example of the contents of the speech recognition dictionary stored in the recognition dictionary storage unit, used to explain concretely the operation of the voice interaction device shown in FIG. 1.

FIG. 4 shows an example of the contents of the information position storage unit, used to explain concretely the operation of the voice interaction device shown in FIG. 1.

FIG. 5 schematically shows an example dialogue in which the user obtains only specific information, used to explain concretely the operation of the voice interaction device shown in FIG. 1.

[Explanation of Reference Signs]

101 Voice input unit
102 Voice recognition unit
103 Recognition dictionary storage unit
104 Dialogue control unit
105 Information position storage unit
106 Guidance storage unit
107 Voice output unit
108 Device use detection unit

Claims

[Claims]

1. A voice interaction method that outputs voice guidance to a user and outputs the corresponding voice guidance according to a recognition result obtained by speech recognition processing of voice input spoken by the user, characterized in that the voice guidance is output when the user starts use and, when the utterance is recognized, the voice guidance is skipped to the point where the information corresponding to the recognition result exists and is output from there.

2. A voice interaction device comprising output control means that outputs voice guidance to a user and outputs the corresponding voice guidance according to a recognition result obtained by speech recognition processing of voice input spoken by the user, characterized in that the output control means outputs the voice guidance when the user starts use and, when the utterance is recognized, skips the voice guidance to the point where the information corresponding to the recognition result exists and outputs it from there.

3. The voice interaction device according to claim 2, comprising guidance storage means that stores the voice guidance, information position storage means that stores the position in the voice guidance at which the information corresponding to the recognition result of the speech recognition processing exists, and device use detection means that detects the start and end of use of the device, characterized in that the output control means includes a dialogue control unit that manages the flow of the dialogue according to the recognition result, instructs output of the voice guidance when the device use detection means detects the start of use, and, when the recognition result is sent to it, looks up the position of the voice guidance corresponding to the recognition result based on the information content of the information position storage means and outputs that position.

4. The voice interaction device according to claim 3, comprising voice input means through which the user enters the voice input by speaking, voice recognition means that performs the speech recognition processing on the voice input, and recognition dictionary storage means that stores the recognition dictionary data used in the speech recognition processing, characterized in that the voice recognition means sends the recognition result to the dialogue control unit.

5. The voice interaction device according to claim 3 or 4, characterized in that the output control means includes voice output means that outputs the voice guidance held in the guidance storage means in accordance with an output instruction from the dialogue control unit.

6. The voice interaction device according to claim 5, characterized in that the voice output means, when the information content of the information position storage means is sent to it under the control of the dialogue control unit, stops the voice guidance being output, skips the voice guidance to the position indicated by the information content, and then outputs the voice guidance again.