JP2009036998A

JP2009036998A - Interactive method using computer, interactive system, computer program and computer-readable storage medium

Info

Publication number: JP2009036998A
Application number: JP2007201254A
Authority: JP
Inventors: Hiroshi Aihara; 博合原; Hideo Nakano; 英雄中野
Original assignee: GENGO RIKAI KENKYUSHO KK; Infocom Corp
Current assignee: GENGO RIKAI KENKYUSHO KK; Infocom Corp
Priority date: 2007-08-01
Filing date: 2007-08-01
Publication date: 2009-02-19

Abstract

PROBLEM TO BE SOLVED: To provide a method of switching situations based on a user's speech while maintaining the speech recognition rate.

SOLUTION: An interactive method using a computer includes the steps of: providing situation language models, each consisting of a set of vocabulary related to one of a plurality of situations, and a switching language model, which is a set of vocabulary; interpreting the intention of the user's speech with reference to the situation language model and the switching language model; and generating speech based on the intention of the user's speech and the situation language model. When the user's speech contains vocabulary that is included in the switching language model but not in the current situation language model, speech is generated according to the situation corresponding to that vocabulary instead of the current situation.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a computer-based dialogue method, a dialogue system, a computer program for executing the method, and a computer-readable storage medium storing the program, and in particular to a dialogue method and the like that can smoothly follow a change of topic by the user.

When a user inputs conversation to a computer, the computer identifies the situation of the conversation from, for example, the content of the conversation so far, and interprets the conversation by referring to the vocabulary used specifically in that situation. Identifying the situation makes the computer's interpretation of the user's input more accurate, so the questions and other responses the computer returns to the user become more appropriate.
Such a system enables, for example, the following dialogue between a computer and a user through an input/output interface.
Computer: “Where did you play golf yesterday?”
User: “At XX Country Club.”
Computer: “How was your score?”
User: “Not so good.”

The above dialogue between the computer and the user is an example conducted in the situation “golf”. As long as the questions and answers stay within “golf”, there is no problem. However, if the user tries to change the situation, for example to talk about plans for the next vacation in the situation “travel”, the conventional method does not change the topic smoothly. Even if the user brings up travel after the above dialogue, the computer keeps the situation “golf” and tries to continue the dialogue within that scope.

The present invention has been devised to remedy the above problems of the prior art. Its object is to provide a dialogue method, a dialogue system, a computer program for executing the method, and a computer-readable storage medium storing the program that can smoothly follow a change of topic by the user, by changing the situation as needed based on the user's utterances.

To achieve the above object, the present invention proposes a computer-based dialogue method that is provided with situation language models, each consisting of a set of vocabulary related to one of a plurality of situations, and a switching language model, which is a set of vocabulary,
interprets the intention of a user's utterance with reference to the situation language model and the switching language model, and
generates an utterance based on the intention of the user's utterance and the situation language model,
wherein, when the user's utterance contains vocabulary that is included in the switching language model but not in the current situation language model, the method generates an utterance according to the situation corresponding to that vocabulary instead of the current situation.
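The switching rule stated above can be sketched in a few lines of Python. This is only an illustrative sketch: the word lists, the dictionary names `SITUATION_MODELS` and `SWITCHING_MODEL`, and the function `next_situation` are assumptions made for the example and do not appear in the specification, and the utterance is assumed to be already split into words.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each situation language model is a set of words.
SITUATION_MODELS: Dict[str, Set[str]] = {
    "golf": {"golf", "course", "score", "driver", "iron", "putter"},
    "illness": {"headache", "fever", "cold", "throat"},
    "work": {"work", "deadline", "meeting"},
}

# Hypothetical switching language model: trigger word -> situation it points to.
SWITCHING_MODEL: Dict[str, str] = {
    "headache": "illness",
    "fever": "illness",
    "work": "work",
    "golf": "golf",
}


def next_situation(current: str, utterance_words: List[str]) -> str:
    """Return the situation to use for the computer's next utterance."""
    current_vocab = SITUATION_MODELS[current]
    for word in utterance_words:
        # Switch only on words that are in the switching model
        # but not in the current situation language model.
        if word in SWITCHING_MODEL and word not in current_vocab:
            return SWITCHING_MODEL[word]
    # No qualifying word was found, so the current situation is kept.
    return current
```

With this toy data, `next_situation("golf", ["score", "headache"])` returns `"illness"`, while `next_situation("golf", ["golf", "score"])` stays at `"golf"` because "golf" already belongs to the current situation language model.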

Here, a situation is a superordinate concept that encompasses a plurality of topics, such as “golf clubs”, “golf courses”, and “golf swing”. In this example, the situation language model is the set of vocabulary related to each of “golf clubs”, “golf courses”, “golf swing”, and so on. For instance, the topic “golf clubs” includes vocabulary such as “driver”, “iron”, “putter”, and “wood”.

According to the dialogue method of the present invention, the situation language model and the switching language model are referred to when interpreting the intention of the user's utterance. General-language vocabulary may be referred to as well. In that case, it is desirable to give the current situation language model priority over the general-language vocabulary. The situation language model is referred to both when interpreting the intention of the user's utterance and when generating the computer's utterance.
In this specification, “utterance” is used in the general sense of presenting text. It covers the user entering characters with a keyboard or speaking into a microphone, and the computer displaying a character string on a screen or producing sound through a speaker.

The switching language model is a set of vocabulary typically used in particular situations, for example words such as “headache”, “movie”, and “lover”. The switching language model used together with a given situation language model may or may not include vocabulary contained in that situation language model. In the former case, the same switching language model can be used regardless of the situation.

According to the method of the present invention, when the user's utterance contains vocabulary that is included in the switching language model but not in the current situation language model, an utterance is generated according to the situation corresponding to that vocabulary instead of the current situation. In other words, while the user keeps speaking within a particular situation, the computer produces utterances based on that situation's language model. When the user's utterance contains vocabulary that is in the switching language model but not in the current situation language model, the computer judges that the user is trying to change the topic, and speaks based on the situation language model associated with the switching-language-model vocabulary found in the user's utterance.

The situation may be switched when a single “word included in the switching language model but not in the current situation language model” is recognized, or only on condition that several such words are recognized. Alternatively, the switch may be based on a recognition ratio, selecting the situation with which the largest number of the recognized words are associated. Furthermore, a situation that has been set recently may be selected preferentially.
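The alternative switching conditions described above (a single trigger word, a minimum number of trigger words, a majority among the recognized words, and a preference for recently used situations) could be combined as in the following sketch. The function `choose_situation` and its parameters `min_hits` and `recent` are hypothetical names introduced here, and using recency only as a tie-breaker is one possible reading of the text, not something the specification prescribes.

```python
from collections import Counter
from typing import Optional, Sequence


def choose_situation(
    candidate_situations: Sequence[str],
    min_hits: int = 1,
    recent: Sequence[str] = (),
) -> Optional[str]:
    """Pick a target situation from the situations mapped to trigger words.

    candidate_situations: one entry per recognized trigger word.
    min_hits: how many trigger words are required before switching.
    recent: recently used situations, most recent first (tie-breaker).
    Returns None when the current situation should be kept.
    """
    if not candidate_situations or len(candidate_situations) < min_hits:
        return None  # not enough evidence to switch
    counts = Counter(candidate_situations)
    best = max(counts.values())
    # Situations associated with the largest number of trigger words.
    tied = [s for s in counts if counts[s] == best]
    # Prefer the most recently used situation among the tied candidates.
    for situation in recent:
        if situation in tied:
            return situation
    # Otherwise fall back to the candidate that was seen first.
    return tied[0]
```

Setting `min_hits=1` reproduces the single-word condition, and a larger value requires several trigger words before any switch is made.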

Based on the dialogue given above for the prior art, a topic change according to the present invention typically proceeds as follows.
Computer: “Where did you play golf yesterday?”
User: “At XX Country Club.”
Computer: “How was your score?”
User: “Not so good. I got a headache partway through and could not concentrate.”
Computer: “That is too bad. You may have caught a cold. Do you have a sore throat?”
User: “My throat does not hurt today, but I think I have a fever.”

In the above dialogue, the situation was set to “golf” up to a certain point. Because the user's utterance contained the word “headache”, which is included in the switching language model but not in the current situation language model, the computer switched the situation to “illness” and, based on the situation language model for illness, produced the next utterance: “That is too bad. You may have caught a cold. Do you have a sore throat?”
In this way, the computer detects from the user's utterance that the situation of the dialogue has changed and automatically speaks under the new situation, so the dialogue is very smooth and feels natural.
If the user had said “I was worried about work” instead of “I got a headache”, the situation would switch to “work” and the dialogue would continue under the situation “work”.

In the present invention, the switching language model is preferably a vocabulary organized hierarchically as an inverted tree, with a situation associated with at least the topmost word in the vocabulary structure. FIG. 1 illustrates the hierarchical structure of a switching language model according to the present invention. In this example, the concept “healthcare” has the attributes “impair” and “maintain”, and the concepts “disease”, “diet”, and “exercise” are associated with those attributes. The concept “case” has “fever”, “cough”, and “headache” as its instances. Each instance is identical to a headword in the recognition dictionary. The note-shaped figures enclosed in lenticular brackets (【 】) represent situations.
As illustrated in FIG. 1, one or more words in the next lower layer are associated with a word in an upper layer, whereas each word in a lower layer is associated with only one word in the layer above. This structure is referred to here as an inverted-tree hierarchical structure.

In the vocabulary structure, when no situation is associated with the “word included in the switching language model but not in the current situation language model” found in the user's utterance, the vocabulary structure is traced upward, and an utterance is generated according to the situation of the first word found that has a situation associated with it. In the example of FIG. 1, suppose the computer asks the user about a golf course and the keyword “fever” is recognized in the user's answer. No situation is directly associated with “fever”, so the inverted-tree hierarchy is traced upward through “case” and then “disease”, and the situation finally switches to “health”, which is associated with “healthcare”.
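The upward traversal described above can be illustrated with a child-to-parent map, which captures the property that each word has at most one word above it. The sketch below is a simplified reading of the FIG. 1 example: the dictionaries `PARENT` and `SITUATION_OF` and the function `find_situation` are hypothetical names, the attribute layer (“impair”, “maintain”) is omitted, and the parents assigned to “diet” and “exercise” are assumed for illustration.

```python
from typing import Dict, Optional

# Child -> parent links; each word has at most one parent (inverted tree).
PARENT: Dict[str, str] = {
    "fever": "case",
    "cough": "case",
    "headache": "case",
    "case": "disease",
    "disease": "healthcare",
    "diet": "healthcare",
    "exercise": "healthcare",
}

# Situations attached to words; only the topmost word has one in this toy tree.
SITUATION_OF: Dict[str, str] = {"healthcare": "health"}


def find_situation(word: str) -> Optional[str]:
    """Walk upward from `word` and return the first situation found."""
    node: Optional[str] = word
    while node is not None:
        if node in SITUATION_OF:
            return SITUATION_OF[node]
        # Move to the single parent, or stop once the top has been passed.
        node = PARENT.get(node)
    return None  # reached the top without finding any situation
```

Here `find_situation("fever")` visits “fever”, “case”, “disease”, and “healthcare” in that order and returns `"health"`. A word that is not in the tree at all yields `None`, which a caller could treat as an error case.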

At least one of the user's utterance and the computer-generated utterance, and preferably both, is desirably voice information.

The present invention further proposes a dialogue system comprising:
a storage medium storing situation language models, each consisting of a set of vocabulary related to one of a plurality of situations, and a switching language model, which is a set of vocabulary;
an intention interpretation unit that interprets the intention of a user's utterance with reference to the situation language model and the switching language model;
an utterance generation unit that generates an utterance based on the intention of the user's utterance and the situation language model; and
a situation setting unit that, when the user's utterance contains vocabulary that is included in the switching language model but not in the current situation language model, switches from the current situation to the situation corresponding to that vocabulary,
the dialogue system generating utterances according to the situation and switching the situation based on the user's utterances.

Here, the intention interpretation unit, the utterance generation unit, and the situation setting unit may be physically independent components, or may be functions, corresponding to the respective units, of a single physical computer. The situation setting unit may consist of a situation search unit, which refers to the correspondence between situations and vocabulary to search for the situation corresponding to the “word included in the switching language model but not in the current situation language model”, and a dialogue situation control unit, which controls the situation of the dialogue.

In this case as well, the switching language model is preferably a vocabulary organized hierarchically as an inverted tree, with a situation associated with at least the topmost word in the vocabulary structure. Given such a vocabulary structure, when no situation is associated with a word in the user's utterance that is included in the switching language model but not in the current situation language model, it is possible, and preferable, to trace the vocabulary structure upward and switch to the situation of the first word found that has a situation associated with it. The user's utterance and the computer-generated utterance are preferably both voice information.

The intention interpretation unit may consist of a speech recognition processing unit, which converts the user's utterance into a character string and then refers to the situation language model and the switching language model, and an intention interpretation processing unit, which interprets the intention based on the speech recognition result. The speech recognition processing unit, the intention interpretation processing unit, the situation search unit, and the dialogue situation control unit may each correspond to separate hardware, or some or all of them may be combined into a single piece of hardware, with each unit corresponding to a function executed by that hardware.

Preferably, the method is executed by a computer program written so as to be readable by a computer. A computer-readable storage medium storing such a computer program is also within the scope of the present invention.

According to the computer-based dialogue method, dialogue system, computer program for executing the method, and computer-readable storage medium storing the program of the present invention, the computer can smoothly follow a change of topic by the user during a dialogue, so the dialogue is very natural and the user rarely feels stress.
In addition, with the inverted-tree hierarchical vocabulary structure proposed by the present invention, the response to a new situation is quick and appropriate, and the dialogue becomes even quicker and more natural. Other effects of the present invention will be apparent to those skilled in the art from the description in the specification.

Embodiment of the Invention

FIG. 2 shows an example of the system configuration of the present invention. It is given to explain the concept of a system according to the present invention, and the present invention is not limited to this embodiment.
In the system configuration of the embodiment shown in FIG. 2, the speech recognition processing unit 100 recognizes the user's speech by referring to the speech recognition dictionary 600, which consists of a topic language model (situation language model) and a switching language model, and passes the result to the intention understanding processing unit (intention interpretation processing unit) 200. The intention understanding processing unit 200 interprets the intention of the user's utterance and determines whether vocabulary in the utterance that belongs to the switching language model requires a situation switch. If no switch is needed, processing moves to the dialogue situation control unit 400 and the next utterance is produced. If the intention understanding processing unit 200 determines that a switch is needed, the situation search unit 300 refers to the situation/concept-relation correspondence data 700, sets a new situation, and passes this information to the dialogue situation control unit 400. Finally, based on the information from the dialogue situation control unit 400, the response/question sentence generation processing unit 500 generates a response or question sentence and outputs it as speech.
Even when the intention understanding processing unit 200 determines that a switch is needed, the new situation set by the situation search unit 300 with reference to the situation/concept-relation correspondence data 700 may turn out to be the same as the previous one. This poses no particular problem in the present invention.
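The flow of data between the numbered units can be pictured with a short text-only Python sketch. Everything in it is a stand-in under stated assumptions: real speech recognition and speech output are replaced by plain strings, the word lists and canned prompts are invented, and the function names (`recognize`, `needs_switch`, `search_situation`, `respond`) are hypothetical labels for units 100, 200, 300, and 400/500 rather than anything defined in the specification.

```python
from typing import Dict, List, Optional, Set, Tuple

# Stand-in for speech recognition dictionary 600 (hypothetical contents).
TOPIC_MODELS: Dict[str, Set[str]] = {
    "golf": {"golf", "score", "course"},
    "health": {"headache", "fever", "throat"},
}
SWITCHING_WORDS: Set[str] = {"headache", "fever", "golf"}

# Stand-in for situation/concept-relation correspondence data 700.
WORD_TO_SITUATION: Dict[str, str] = {
    "headache": "health",
    "fever": "health",
    "golf": "golf",
}

# Canned prompts per situation (hypothetical).
PROMPTS: Dict[str, str] = {
    "golf": "How was your score?",
    "health": "Do you have a sore throat?",
}


def recognize(text: str) -> List[str]:
    """Unit 100 stand-in: keep only words found in dictionary 600."""
    known = SWITCHING_WORDS.union(*TOPIC_MODELS.values())
    words = [w.strip(".,!?").lower() for w in text.split()]
    return [w for w in words if w in known]


def needs_switch(words: List[str], current: str) -> Optional[str]:
    """Unit 200 stand-in: first trigger word outside the current model."""
    for w in words:
        if w in SWITCHING_WORDS and w not in TOPIC_MODELS[current]:
            return w
    return None


def search_situation(trigger: str) -> str:
    """Unit 300 stand-in: look up correspondence data 700."""
    return WORD_TO_SITUATION[trigger]


def respond(text: str, current: str) -> Tuple[str, str]:
    """Run one turn and return (situation, reply)."""
    words = recognize(text)
    trigger = needs_switch(words, current)
    # Unit 400 stand-in: keep the current situation or adopt the new one.
    situation = search_situation(trigger) if trigger else current
    # Unit 500 stand-in: produce the reply for the chosen situation.
    return situation, PROMPTS[situation]
```

Calling `respond("I had a headache on the way.", "golf")` moves the dialogue to `"health"`, whereas `respond("The score was poor.", "golf")` keeps `"golf"`. Keeping the lookup data separate from the functions mirrors the separation of dictionary 600 and data 700 from the processing units in FIG. 2.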

The situation switching process performed based on an utterance is described with reference to FIG. 3, which illustrates one embodiment.
Speech recognition is performed and, having understood the intention of the recognized user utterance, the intention understanding processing unit determines whether the current situation should be continued (intention understanding processing). If it determines that the current situation should be continued, processing moves to the situation control unit (dialogue situation control), and the situation control unit generates a question/response sentence (response/question sentence generation).

When the intention understanding processing unit determines that the current situation should not be maintained, it refers to the concept of the recognized word and determines whether a situation related to that concept exists. If one exists, that situation is adopted and the situation control unit performs situation control. If none exists, the inverted-tree hierarchical vocabulary structure is traced upward to determine whether a situation related to the parent concept (the higher-level word) exists. If one exists, it is adopted and the situation control unit performs situation control. If not, the vocabulary structure is traced further upward until a related situation is found. When a related situation is found, it is adopted and situation control is performed; when none is found even after reaching the top of the vocabulary structure, error processing is performed (situation search).

The above clarifies the configuration of the present invention on the basis of one embodiment, but the present invention should be understood with reference to the claims and the entire description of the specification.

FIG. 1 is a diagram showing a vocabulary structure according to one embodiment of the present invention.
FIG. 2 is a diagram showing a system configuration according to one embodiment of the present invention.
FIG. 3 is a flow diagram showing situation setting processing according to one embodiment of the present invention.

Claims

1. A computer-based dialogue method that is provided with situation language models, each consisting of a set of vocabulary related to one of a plurality of situations, and a switching language model, which is a set of vocabulary,
interprets the intention of a user's utterance with reference to the situation language model and the switching language model, and
generates an utterance based on the intention of the user's utterance and the situation language model,
wherein, when the user's utterance contains vocabulary that is included in the switching language model but not in the current situation language model, an utterance is generated according to the situation corresponding to that vocabulary instead of the current situation.

2. The computer-based dialogue method according to claim 1, wherein the switching language model is a vocabulary organized hierarchically as an inverted tree, and a situation is associated with at least the topmost word in the vocabulary structure.

3. The computer-based dialogue method according to claim 2, wherein, when no situation is associated with a word in the user's utterance that is included in the switching language model but not in the current situation language model, the vocabulary structure is traced upward and an utterance is generated according to the situation of the first word found that has a situation associated with it.

4. The computer-based dialogue method according to any one of claims 1 to 3, wherein the user's utterance and the computer-generated utterance are both voice information.

5. A dialogue system comprising:
a storage medium storing situation language models, each consisting of a set of vocabulary related to one of a plurality of situations, and a switching language model, which is a set of vocabulary;
an intention interpretation unit that interprets the intention of a user's utterance with reference to the situation language model and the switching language model;
an utterance generation unit that generates an utterance based on the intention of the user's utterance and the situation language model; and
a situation setting unit that, when the user's utterance contains vocabulary that is included in the switching language model but not in the current situation language model, switches from the current situation to the situation corresponding to that vocabulary,
the dialogue system generating utterances according to the situation and switching the situation based on the user's utterances.

6. The dialogue system according to claim 5, wherein the switching language model is a vocabulary organized hierarchically as an inverted tree, and a situation is associated with at least the topmost word in the vocabulary structure.

7. The dialogue system according to claim 6, wherein, when no situation is associated with a word in the user's utterance that is included in the switching language model but not in the current situation language model, the vocabulary structure is traced upward and the situation is switched to that of the first word found that has a situation associated with it.

8. The dialogue system according to any one of claims 5 to 7, wherein the user's utterance and the computer-generated utterance are both voice information.

9. The dialogue system according to claim 8, wherein the intention interpretation unit converts the user's utterance into a character string and then interprets it with reference to the situation language model and the switching language model.

10. A computer program written so as to be readable by a computer and causing the computer to execute the method according to any one of claims 1 to 4.

11. A computer-readable storage medium storing the computer program according to claim 10.