JP7212888B2

JP7212888B2 - Automatic dialogue device, automatic dialogue method, and program

Info

Publication number: JP7212888B2
Application number: JP2019094260A
Authority: JP
Inventors: 竜一郎東中; 準二富田; 雄一郎吉川; 和紀酒井; 浩石黒
Original assignee: Nippon Telegraph and Telephone Corp; Osaka University NUC
Current assignee: Nippon Telegraph and Telephone Corp; Osaka University NUC
Priority date: 2019-05-20
Filing date: 2019-05-20
Publication date: 2023-01-26
Anticipated expiration: 2039-05-20
Also published as: JP2020190585A

Description

Application of Article 30, Paragraph 2 of the Patent Act. Website posting date: July 9, 2018. Website address: Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue (https://www.sigdial.org/files/workshops/conference19/proceedings/)

The present invention relates to automatic dialogue technology for conversing with a user, and in particular to technology for conducting a dialogue on a specific topic.

Argumentation is a procedure for building consensus by presenting grounds and counterarguments (Non-Patent Document 1). In recent years, work has progressed on modeling arguments so that they can be handled on a computer, and on techniques for automatically extracting claims and the sentences that support them from text. Work is also under way on discussion dialogue systems that hold a discussion with a user by using a database of arguments on a given topic, called a discussion structure (Non-Patent Document 2). For each user utterance, such a system determines its own utterance by selecting from the discussion structure an utterance that expresses support or non-support for the user's utterance.

Non-Patent Document 1: Toulmin, S. E., "The uses of argument," Cambridge University Press, 1958.
Non-Patent Document 2: R. Higashinaka, K. Sakai, H. Sugiyama, H. Narimatsu, T. Arimoto, T. Fukutomi, K. Matsui, Y. Ijima, H. Ito, S. Araki, Y. Yoshikawa, H. Ishiguro, and Y. Matsuo, "Argumentative dialogue system based on argumentation structures," In Proceedings of the 21st Workshop on the Semantics and Pragmatics of Dialogue, pp. 154-155, 2017.

A conventional dialogue system abruptly begins the dialogue on its target topic as soon as the dialogue starts, so it is difficult for the user to immediately keep up with the dialogue partner or the topic.

The present invention has been made in view of this point, and its object is to provide a technique that enables a user to enter smoothly into a dialogue on a target specific topic.

An automatic dialogue device that conducts a dialogue with a user has a first dialogue processing unit that outputs a first utterance sentence, uttered in a dialogue that serves as an introduction to the dialogue on a target specific topic, and a second dialogue processing unit that outputs a second utterance sentence, uttered in the dialogue on the specific topic. The first dialogue processing unit selects and outputs one or more first utterance sentences from a set of utterance candidate sentences based on their relevance to the specific topic, and the second dialogue processing unit outputs the second utterance sentence after the first dialogue processing unit has output the first utterance sentence.

This allows the user to enter smoothly into the dialogue on the target specific topic.

FIG. 1 is a block diagram illustrating the functional configuration of the automatic dialogue device of the embodiment.
FIG. 2 is a flowchart illustrating the processing of the automatic dialogue device of the embodiment.
FIG. 3 is a diagram illustrating a PDB (Person DataBase), which is the set of utterance candidate sentences of the embodiment.
FIG. 4 is a diagram illustrating a discussion structure.
FIG. 5 is a diagram illustrating learning data, consisting of pairs of an utterance sentence and a dialogue act label, used for supervised learning of the classifier.
FIG. 6 is a diagram illustrating the relationship between input utterance sentences and the dialogue act labels obtained by classifying them with the classifier.
FIG. 7 is a diagram illustrating the similarity between a specific discussion topic and the question sentences of the PDB.
FIG. 8 is a diagram illustrating the process of selecting nodes from the discussion structure.
FIG. 9 is a diagram illustrating the dialogue content in the greeting dialogue phase, the question dialogue phase, and the discussion dialogue phase.
FIG. 10 is a diagram illustrating the dialogue content in the greeting dialogue phase.
FIG. 11 is a diagram illustrating the dialogue content in the question dialogue phase.
FIG. 12 is a diagram illustrating the dialogue content in the discussion dialogue phase.
FIG. 13 is a block diagram illustrating the functional configuration of the automatic dialogue device of a modification of the embodiment.

Embodiments of the present invention will be described below with reference to the drawings.
[First embodiment]
First, a first embodiment of the present invention will be described.
The first embodiment illustrates a case in which the "dialogue on the target specific topic" is a discussion on a specific topic, and the "set of utterance candidate sentences" is a PDB containing pairs of a question sentence, representing a question about personal matters, and an answer sentence to that question. However, this does not limit the present invention. An "answer sentence to the question" is a sentence that the automatic dialogue device can use as its answer when it responds to a particular question. For example, when the user answers a question from the automatic dialogue device and the device then responds in turn, the sentence representing the content of the device's response is the "answer sentence to the question." However, the content of the device's answer may be fixed regardless of how the user answers. For example, a user answer may or may not be present, and the device may simply give its own answer (that is, self-disclose) on a particular question (something like a kind of topic).

<Configuration>
As illustrated in FIG. 1, the automatic dialogue device 1 of this embodiment has a dialogue processing unit 11 (first dialogue processing unit), a dialogue processing unit 12 (second dialogue processing unit), a word vector dictionary storage unit 13, a dialogue management unit 14, and an utterance sentence generation unit 15. The automatic dialogue device 1 may or may not additionally have a dialogue processing unit 10. The dialogue processing unit 11 has an utterance candidate sentence DB storage unit 111 and an utterance generation unit 112. The dialogue processing unit 12 has a discussion structure storage unit 121, a discussion management unit 122, a dialogue act estimation unit 123, and a discussion utterance estimation unit 124. Although not described further below, data input to each processing unit (dialogue processing units 11 and 12, dialogue management unit 14, and utterance sentence generation unit 15) and data obtained by each processing unit are stored in a memory (not shown) and read out as needed for use in other processing.

<Preprocessing>
This embodiment shows an example in which, in preprocessing performed before dialogue processing, an utterance candidate sentence database (DB) is stored in the utterance candidate sentence DB storage unit 111, a discussion structure is stored in the discussion structure storage unit 121, and a word vector dictionary is stored in the word vector dictionary storage unit 13. However, this does not limit the present invention. After dialogue processing has started, the storage described above may be performed for some or all of the utterance candidate sentence DB storage unit 111, the discussion structure storage unit 121, and the word vector dictionary storage unit 13, or the information already stored in some or all of these units may be updated. In addition, a classifier is built that estimates which dialogue act is represented by an input utterance sentence, that is, the text representing the content of an input utterance. The utterance candidate sentence DB, the discussion structure, the word vector dictionary, and the classifier are illustrated below.

≪Example of utterance candidate sentence DB (set of utterance candidate sentences)≫
The utterance candidate sentence DB illustrated in this embodiment is a PDB containing pairs of a question sentence, representing a question about personal matters asked during a dialogue, and an answer sentence, representing an answer (example answer) to that question. Examples of questions about personal matters include questions about a person's attributes, preferences, behavior, daily life, and character. For the PDB, see, for example, Reference 1.
Reference 1: H. Sugiyama, T. Meguro, and R. Higashinaka, "Large-scale collection and analysis of personal question-answer pairs for conversational agents," In Proceedings of Intelligent Virtual Agents, pp. 420-433, 2014.
For example, a large number of pairs are collected, each consisting of a question sentence representing a question about personal matters asked during a dialogue and an answer sentence representing an example answer to it, and the PDB is generated from those pairs that correspond to questions frequently asked in dialogue. In the collection process, the workers who create the data write questions that they would ask during a dialogue with partners who differ in attributes such as gender and age. The large number of questions produced is analyzed, the questions asked frequently across multiple partners are extracted, and the pairs of a question sentence representing an extracted question and an answer sentence representing an example answer to it are assembled into the PDB. Information such as the category and topic of a question may also be generated when the question is written, and in that case this information may be attached as well. FIG. 3 shows an example of a PDB. In the PDB illustrated in FIG. 3, a question sentence representing a question, an answer sentence representing an example answer to the question, and the question category of the question are associated with one another.

≪Example of discussion structure≫
The discussion structure illustrated in this embodiment is a graph structure that extends a conventional argumentation model (see, for example, Reference 2), and is data that specifies the hierarchical structure of a discussion.
Reference 2: Walton, D., "Methods of argumentation," Cambridge University Press, 2013.
As illustrated in FIG. 4, the discussion structure has a plurality of nodes, each representing the sentence (utterance text) of one proposition in the hierarchy that makes up the discussion, and links, each representing a "support" or "non-support" relationship between the two propositions of two nodes. A link is drawn as an arrow whose direction indicates which proposition is supported (or not supported). That is, when a "support" link is set from a first node (for example, a node representing the proposition "Automated driving has already caused accidents") to a second node (for example, a node representing the proposition "Automated driving is dangerous"), the proposition of the first node supports the proposition of the second node. Likewise, when a "non-support" link is set from a third node (for example, a node representing the proposition "Automated driving is dangerous") to a fourth node (for example, a node representing the proposition "I am in favor of automated driving"), the proposition of the third node does not support the proposition of the fourth node. Of two nodes connected by a link, one is the parent node and the other is the child node, and the link points from the child node to the parent node.
To use the discussion structure in dialogue processing, each node also carries three flags representing its internal state: flag f1, indicating whether the proposition has been asserted; flag f2, indicating whether it has been asked about; and flag f3, indicating whether the claim has been accepted, has been defeated, or is still undecided. Flags f1 and f2 each take one of two values, and flag f3 takes one of three values.
Such a discussion structure is built, for example, by the following procedure. First, an initial node (for example, "I am in favor of automated driving") is chosen. A worker then hierarchically adds nodes that stand in a "support" or "non-support" relationship to that node and sets the corresponding "support" or "non-support" links. Each time a layer of nodes is added, the worker checks the consistency of the logical relationship between each pair of linked nodes.
The discussion structure storage unit 121 may store only one discussion structure for one specific topic, or may store a plurality of discussion structures for different discussion topics. For example, the topic of the discussion structure illustrated in FIG. 4 is "For or against automated driving," but discussion structures for other topics, such as "Countryside or city as a place to live," "Hokkaido or Okinawa for a trip," "Rice or bread for breakfast," and "Amusement park or zoo for an outing," may also be prepared and stored in the discussion structure storage unit 121. The topic of each discussion structure stored in the discussion structure storage unit 121 can be identified. For example, the claim represented by one of the nodes may represent the topic, or each discussion structure may be stored in the discussion structure storage unit 121 in association with its topic.
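The node, link, and flag layout described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class and field names (`DiscussionStructure`, `Node`, `Relation`, `Status`, `asserted`, `questioned`) and the helper `build_example` are hypothetical and do not appear in the publication, and storing each link on the child node is one possible encoding of "links point from child to parent."

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Relation(Enum):
    """Link type from a child node to its parent node."""
    SUPPORT = "support"
    NON_SUPPORT = "non-support"


class Status(Enum):
    """Three-valued flag f3."""
    UNDECIDED = "undecided"
    ACCEPTED = "accepted"
    DEFEATED = "defeated"


@dataclass
class Node:
    text: str                             # proposition (utterance text)
    parent: Optional[str] = None          # id of the parent node, None for the root
    relation: Optional[Relation] = None   # link from this node to its parent
    asserted: bool = False                # flag f1 (hypothetical field name)
    questioned: bool = False              # flag f2 (hypothetical field name)
    status: Status = Status.UNDECIDED     # flag f3


@dataclass
class DiscussionStructure:
    topic: str
    nodes: Dict[str, Node] = field(default_factory=dict)

    def add(self, node_id: str, text: str,
            parent: Optional[str] = None,
            relation: Optional[Relation] = None) -> None:
        # A non-root node needs both a parent and a relation to that parent.
        if (parent is None) != (relation is None):
            raise ValueError("parent and relation must be given together")
        if parent is not None and parent not in self.nodes:
            raise KeyError(f"unknown parent node: {parent}")
        self.nodes[node_id] = Node(text, parent, relation)

    def children(self, node_id: str, relation: Relation) -> List[str]:
        """Ids of child nodes linked to node_id with the given relation."""
        return [nid for nid, node in self.nodes.items()
                if node.parent == node_id and node.relation is relation]


def build_example() -> DiscussionStructure:
    """Rebuild the three propositions used as examples in the text."""
    ds = DiscussionStructure(topic="For or against automated driving")
    ds.add("root", "I am in favor of automated driving")
    ds.add("danger", "Automated driving is dangerous",
           parent="root", relation=Relation.NON_SUPPORT)
    ds.add("accident", "Automated driving has already caused accidents",
           parent="danger", relation=Relation.SUPPORT)
    return ds
```

Keeping the link on the child keeps every node self-describing, at the cost of a linear scan in `children`; an adjacency index would be a natural refinement for large structures.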

≪Example of word vector dictionary≫
The word vector dictionary is a dictionary for converting words into vectors, and may be of any kind as long as it can be used with a known method for vectorizing natural language, such as word2vec.

≪Example of classifier≫
The classifier of this embodiment estimates which dialogue act a given input utterance sentence represents, and outputs a dialogue act label, that is, a label representing the estimated dialogue act. In other words, the classifier sorts input utterance sentences by dialogue act and outputs the dialogue act label corresponding to each input utterance sentence. A dialogue act is the act that an utterance in a dialogue signifies. The types of dialogue act are not limited, but this embodiment shows an example that defines five types: an assertion act (Inform), in which the speaker makes a claim (conveys an intention); a question act (Question), in which the speaker asks a question; an agreement act (Agree), in which the speaker agrees; a disagreement act (Disagree), in which the speaker objects; and other acts (Other). A dialogue act label is a label that identifies a dialogue act. The labels for the assertion, question, agreement, disagreement, and other acts are defined as Inform, Question, Agree, Disagree, and Other, respectively. The classifier is generated by supervised learning using learning data that contains pairs of an utterance sentence and its corresponding dialogue act label, as illustrated in FIG. 5. One example of a classifier is an SVM (support vector machine), whose training method is well known. Alternatively, a classifier based on a well-known probabilistic model, a neural network, or the like may be used to obtain and output the dialogue act label for a given input utterance sentence.
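To make the input and output of such a classifier concrete, the following Python sketch trains on (utterance sentence, dialogue act label) pairs and returns one of the five labels. It is not an SVM: as a deliberately simple stand-in it scores each label by word overlap with that label's training sentences. The training sentences, the whitespace tokenizer, and all function names are assumptions made for illustration and are not taken from FIG. 5.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List, Tuple


class DialogueAct(Enum):
    INFORM = "Inform"
    QUESTION = "Question"
    AGREE = "Agree"
    DISAGREE = "Disagree"
    OTHER = "Other"


def tokenize(text: str) -> List[str]:
    # Toy tokenizer: lower-case, split on whitespace, keep "?" as its own token.
    return text.lower().replace("?", " ?").split()


def train(examples: List[Tuple[str, DialogueAct]]) -> Dict[DialogueAct, Counter]:
    """Collect a word-count profile for each dialogue act label."""
    profiles: Dict[DialogueAct, Counter] = {}
    for text, label in examples:
        profiles.setdefault(label, Counter()).update(tokenize(text))
    return profiles


def classify(profiles: Dict[DialogueAct, Counter], text: str) -> DialogueAct:
    """Return the label whose profile overlaps most with the input sentence."""
    tokens = tokenize(text)

    def score(label: DialogueAct) -> float:
        counts = profiles[label]
        total = sum(counts.values()) or 1
        return sum(counts[t] for t in tokens) / total

    best = max(profiles, key=score)
    # Fall back to Other when no training word matches at all.
    return best if score(best) > 0 else DialogueAct.OTHER


# Hypothetical training pairs in the spirit of (utterance sentence, label) data.
TRAINING_EXAMPLES = [
    ("Automated driving is dangerous", DialogueAct.INFORM),
    ("Why do you think so?", DialogueAct.QUESTION),
    ("I agree with you", DialogueAct.AGREE),
    ("I do not think that is right", DialogueAct.DISAGREE),
    ("Hello there", DialogueAct.OTHER),
]
```

In a real system the overlap scorer would be replaced by a trained model such as the SVM mentioned in the text; the `train`/`classify` interface is the part this sketch is meant to show.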

<Initial processing>
Before dialogue processing starts, three items are input to the dialogue management unit 14: a topic sentence, which is a sentence representing the specific topic of the discussion (dialogue); an integer M of 1 or more, which specifies the number of exchanges in the introductory dialogue leading into the discussion of the specific topic; and an integer N of 1 or more, which specifies the number of exchanges in the discussion (dialogue) on the target specific topic.

<Dialogue processing>
The dialogue processing of this embodiment will be described with reference to FIG. 2. The dialogue management unit 14 manages the dialogue processing of this embodiment. When the automatic dialogue device 1 does not have the dialogue processing unit 10, the dialogue management unit 14 executes the question dialogue phase (PH1) by the dialogue processing unit 11 and then executes the discussion dialogue phase (PH2) by the dialogue processing unit 12. When the automatic dialogue device 1 has the dialogue processing unit 10, the dialogue management unit 14 executes the greeting dialogue phase (PH0) by the dialogue processing unit 10, then the question dialogue phase by the dialogue processing unit 11, and then the discussion dialogue phase by the dialogue processing unit 12. Even when the automatic dialogue device 1 has the dialogue processing unit 10, it may be possible to choose whether or not to execute the greeting dialogue phase. The greeting dialogue phase (PH0), the question dialogue phase (PH1), and the discussion dialogue phase (PH2) are described in detail below.
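The phase ordering and the initial inputs can be summarized in a short Python sketch. The names `SessionConfig` and `plan_phases`, and the idea of returning the phases as a list of strings, are assumptions made for illustration only; the publication does not specify how the dialogue management unit 14 represents this internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SessionConfig:
    """Inputs given to the dialogue manager before the dialogue starts."""
    topic_sentence: str   # sentence representing the discussion topic
    m: int                # number of introductory exchanges (PH1)
    n: int                # number of discussion exchanges (PH2)

    def __post_init__(self) -> None:
        # Both counts are integers of 1 or more.
        if self.m < 1 or self.n < 1:
            raise ValueError("M and N must be integers of 1 or more")


def plan_phases(has_greeting_unit: bool, run_greeting: bool = True) -> List[str]:
    """Return the phases in execution order.

    PH0 (greeting) runs only when the greeting unit exists and is enabled;
    PH1 (question) always precedes PH2 (discussion).
    """
    phases: List[str] = []
    if has_greeting_unit and run_greeting:
        phases.append("PH0")
    phases.extend(["PH1", "PH2"])
    return phases
```

For instance, a device without the greeting unit would be planned with `plan_phases(False)`, and a device that has the unit but skips the greeting with `plan_phases(True, run_greeting=False)`.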

≪Greeting dialogue phase (PH0)≫
In the greeting dialogue phase (PH0), the automatic dialogue device 1 makes simple greeting utterances to the user. Examples of greeting utterances include a greeting such as "Hello," a self-introduction such as "I am XX," and a question about the user's name such as "What is your name?" The automatic dialogue device 1 may make a greeting utterance only once or several times. When the automatic dialogue device 1 makes a greeting utterance, the dialogue processing unit 10 sends the content of the greeting utterance to the dialogue management unit 14, and the dialogue management unit 14 sends that content to the utterance sentence generation unit 15. The utterance sentence generation unit 15 generates and outputs an output utterance sentence, which is text representing the content of the greeting utterance. The utterance sentence generation unit 15 may use the received sentence as the output utterance sentence unchanged, or may add to or modify the received content so that the dialogue does not become unnatural. Greeting utterances may be made by the automatic dialogue device 1 alone, or alternately by the automatic dialogue device 1 and the user. In the latter case, the user may make a greeting utterance after the automatic dialogue device 1 does, or the automatic dialogue device 1 may make a greeting utterance after the user does. When the user makes a greeting utterance, an input utterance sentence representing its content is input to the dialogue management unit 14 and sent to the dialogue processing unit 10. When the automatic dialogue device 1 makes a greeting utterance in response, the dialogue processing unit 10 sends the content of that greeting utterance to the dialogue management unit 14, and the greeting utterance is made as described above. The dialogue processing unit 10 may output predetermined greeting content for the automatic dialogue device 1, or may determine and output the greeting content according to the user's input greeting utterance (step S10).

≪Question dialogue phase (PH1)≫
When the greeting dialogue phase is executed, the question dialogue phase (PH1) is executed after it ends. When the greeting dialogue phase is not executed, the question dialogue phase is executed first. In the question dialogue phase, the dialogue processing unit 11 outputs to the dialogue management unit 14 the utterance sentences (first utterance sentences) uttered in the dialogue that serves as an introduction to the discussion on the specific topic (the dialogue on the target specific topic) held in the discussion dialogue phase (PH2). The dialogue management unit 14 sends the utterance sentences (first utterance sentences) to the utterance sentence generation unit 15, and the utterance sentence generation unit 15 outputs the output utterance sentences corresponding to them. This is described in detail below.

The dialogue management unit 14 sends the topic sentence representing the discussion topic and the integer M, both input in the initial processing, to the utterance generation unit 112 of the dialogue processing unit 11. On receiving them, the utterance generation unit 112 selects and outputs M (one or more) utterance sentences (first utterance sentences) from the utterance candidate sentence DB (set of utterance candidate sentences) stored in the utterance candidate sentence DB storage unit 111, based on their relevance to the topic sentence (specific topic). For example, the utterance generation unit 112 selects M utterance sentences from the utterance candidate sentence DB in descending order of similarity to the topic sentence. Each utterance sentence (first utterance sentence) in this embodiment contains a pair of a question sentence and an answer sentence. An example of this processing is shown below.
(Step 1-I) The utterance generation unit 112 acquires all question sentences from the utterance candidate sentence DB (for example, FIG. 3) stored in the utterance candidate sentence DB storage unit 111, and performs morphological analysis on all question sentences and on the topic sentence to split them into words. As one example, the morphological analyzer JTAG developed by the applicant is used, but a commonly known tool such as MeCab may be used instead.
(Step 1-II) Using the morphological analysis results and the word vector dictionary stored in the word vector dictionary storage unit 13, the utterance generation unit 112 vectorizes each word of the question sentences and the topic sentence with a known word vectorization method such as Word2vec to obtain word vectors, and then uses those word vectors to obtain a sentence vector for each question sentence and for the topic sentence. In this embodiment, a sentence vector is obtained by averaging the word vectors of the sentence. However, another method may be used instead of averaging, such as obtaining the sentence vector from the word vectors with a recurrent neural network. The sentence vector of a question sentence is called a "question sentence vector," and the sentence vector of the topic sentence is called the "topic sentence vector."
(Step 1-III) The utterance generation unit 112 calculates the similarity between each question sentence vector and the topic sentence vector, and sorts the question sentence vectors in descending order of similarity. The utterance generation unit 112 then selects the M question sentence vectors with the highest similarity and obtains the M question sentences corresponding to them together with the M answer sentences corresponding to those question sentences. The utterance generation unit 112 outputs these M pairs of a question sentence and an answer sentence as the utterance sentences. Cosine similarity, which is widely used as a measure of similarity between vectors, can be used as the similarity, but this does not limit the present invention, and another measure of similarity between vectors may be used.
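Steps 1-II and 1-III can be sketched in plain Python as follows. The sketch assumes that morphological analysis (step 1-I) has already produced a token list for each sentence, and it uses a tiny hand-made two-dimensional "word vector dictionary" whose values are invented purely for illustration; the names `QAPair`, `sentence_vector`, `cosine`, and `select_top_m` are likewise hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


@dataclass
class QAPair:
    question: str
    answer: str
    tokens: List[str]   # question already split into words (step 1-I)


def sentence_vector(tokens: Sequence[str], word_vectors: Dict[str, Vector]) -> Vector:
    """Step 1-II: average the vectors of the words found in the dictionary."""
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        raise ValueError("no token has a word vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_top_m(topic_tokens: Sequence[str], pairs: Sequence[QAPair],
                 word_vectors: Dict[str, Vector], m: int) -> List[Tuple[float, QAPair]]:
    """Step 1-III: rank QA pairs by similarity to the topic and keep the top M."""
    topic_vec = sentence_vector(topic_tokens, word_vectors)
    scored = [(cosine(sentence_vector(p.tokens, word_vectors), topic_vec), p)
              for p in pairs]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:m]


# Invented toy dictionary and PDB entries (illustration only).
WORD_VECTORS: Dict[str, Vector] = {
    "automated": [0.8, 0.2], "driving": [1.0, 0.0], "car": [0.9, 0.1],
    "train": [0.6, 0.4], "food": [0.0, 1.0],
}
PDB = [
    QAPair("What food do you like?", "I like curry.", ["food"]),
    QAPair("Do you commute by train?", "Yes, I do.", ["train"]),
    QAPair("Do you like driving a car?", "Yes, I do.", ["car", "driving"]),
]
```

Returning the similarity alongside each pair keeps the ranking information available to the caller, which is useful when the output order or the relevance score itself needs to be passed on.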

FIG. 7 illustrates the similarities between the question sentence vectors and the topic sentence vector obtained in steps 1-I and 1-II when the topic sentence representing the topic of discussion is "Do you agree or disagree with automated driving?". With M = 3, the utterance generation unit 112 selects in step 1-III the three question sentences with the highest similarity, "Do you drive a car?", "Do you like driving a car?", and "Do you commute by train?", and outputs them together with their corresponding answer sentences.

When M is 2 or more, the utterance generation unit 112 may output the M utterance sentences (first utterance sentences) in a way that allows them to be ranked by their relevance to the topic sentence (the specific topic). For example, it may output the M utterance sentences in descending order of relevance to the topic sentence, in ascending order of relevance, or together with information indicating each utterance sentence's relevance to the topic sentence. When similarity is used as the relevance, for instance, the utterance generation unit 112 may output the M utterance sentences in descending order of the similarity described above, in ascending order of that similarity, or together with the similarity corresponding to each utterance sentence (step S11a).

The M utterance sentences (first utterance sentences) output from the utterance generation unit 112 are input to the dialogue management unit 14. The dialogue management unit 14 sends the question sentences and answer sentences contained in the utterance sentences to the utterance sentence generation unit 15 in a predetermined order, and the utterance sentence generation unit 15 generates and outputs output utterance sentences representing the content of each question sentence and answer sentence it receives. The utterance sentence generation unit 15 may use the received sentence as the output utterance sentence as is, or may add to or modify the content of the received utterance sentence (question sentence or answer sentence) so that the dialogue does not become unnatural. For example, to express a reaction to the user's utterance, a backchannel expression such as "I see" may be prepended to the sentence representing the content of the answer sentence. Details are described below.

The dialogue management unit 14 selects one utterance sentence that has not yet been selected from among the M utterance sentences. For example, when the M utterance sentences can be ranked by relevance to the topic sentence, the dialogue management unit 14 selects one unselected utterance sentence in ascending order of relevance to the topic sentence (ascending order of relevance to the specific topic), that is, least relevant first. For example, it selects one unselected utterance sentence in ascending order of the similarity described above. Alternatively, the dialogue management unit 14 may select one unselected utterance sentence in random order (step S11b).

The dialogue management unit 14 sends the question sentence contained in the utterance sentence selected in step S11b to the utterance sentence generation unit 15, and the utterance sentence generation unit 15 generates and outputs an output utterance sentence representing the content of that question sentence (step S11c).

The user's utterance made in response to the output utterance sentence output in step S11c is input to the dialogue management unit 14 as an input utterance sentence. This input utterance sentence may be discarded or may be stored in a memory (not shown) (step S11d).

After step S11d, the dialogue management unit 14 sends the answer sentence contained in the utterance sentence selected in step S11b to the utterance sentence generation unit 15, and the utterance sentence generation unit 15 generates and outputs an output utterance sentence representing the content of that answer sentence (step S11e).

The user's utterance made in response to the output utterance sentence output in step S11e is input to the dialogue management unit 14 as an input utterance sentence. This input utterance sentence may be discarded or may be stored in a memory (not shown) (step S11f).

The dialogue management unit 14 determines whether the processing of steps S11b to S11f has been repeated M times, that is, whether all of the M utterance sentences have been selected (step S11g). If the processing of steps S11b to S11f has not been repeated M times, the processing returns to step S11b. If it has been repeated M times, the processing proceeds to the discussion dialogue phase (PH2) described below.
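The loop of steps S11b to S11g can be sketched as below. This is a minimal illustration, assuming the utterance sentences arrive as `(similarity, question, answer)` triples and that user input and system output are supplied as callables; the names `run_question_phase`, `get_user_input`, and `emit` are hypothetical.

```python
def run_question_phase(scored_utterances, get_user_input, emit):
    """Run the question dialogue phase once over all M utterance sentences.

    scored_utterances: list of (similarity, question, answer) triples.
    get_user_input:    callable returning the user's next utterance.
    emit:              callable that outputs one system utterance.
    Returns the list of input utterance sentences (they could also be discarded).
    """
    # S11b: visit utterances in ascending order of similarity,
    # so the least topic-related question is asked first.
    ordered = sorted(scored_utterances, key=lambda item: item[0])
    user_inputs = []
    for _, question, answer in ordered:
        emit(question)                        # S11c: output the question
        user_inputs.append(get_user_input())  # S11d: user's reply
        emit(answer)                          # S11e: output the system's own answer
        user_inputs.append(get_user_input())  # S11f: user's reply
    # S11g: the loop ends after all M utterance sentences have been used.
    return user_inputs
```

Passing `emit` and `get_user_input` as parameters keeps the sketch independent of any particular speech or text interface.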

≪Discussion Dialogue Phase (PH2)≫
After output utterance sentences corresponding to all the utterance sentences have been output in the question dialogue phase (PH1), the discussion dialogue phase (PH2) is performed. When the automatic dialogue device 1 speaks in the discussion dialogue phase, the dialogue processing unit 12 outputs to the dialogue management unit 14 an utterance sentence (second utterance sentence) to be uttered in the target discussion (the dialogue on the specific topic). The dialogue management unit 14 sends the utterance sentence (second utterance sentence) to the utterance sentence generation unit 15, and the utterance sentence generation unit 15 outputs the output utterance sentence corresponding to it. When the user speaks, on the other hand, the content is input to the dialogue management unit 14 as an input utterance sentence, and the next utterance of the automatic dialogue device 1 is determined according to that content. Details are described below.

When the automatic dialogue device 1 speaks (step S12a):
The discussion management unit 122 selects and outputs an utterance sentence (second utterance sentence) representing the content to be uttered by the automatic dialogue device 1, based on the discussion structure stored in the discussion structure storage unit 121. For example, when only one discussion structure is stored in the discussion structure storage unit 121, the discussion management unit 122 selects and outputs the utterance sentence (second utterance sentence) based on that discussion structure. When the discussion structure storage unit 121 stores a plurality of discussion structures in association with their corresponding topics, the discussion management unit 122 acquires the topic sentence (a sentence representing the specific topic of the discussion) from the dialogue management unit 14, and selects and outputs the utterance sentence (second utterance sentence) based on the discussion structure associated with the topic corresponding to that topic sentence. In this embodiment, the discussion management unit 122 follows a predetermined rule to select, from the discussion structure stored in the discussion structure storage unit 121, a node representing the text (utterance text) that the automatic dialogue device 1 is to utter next, selects its dialogue act, updates the internal-state flag of the selected node to the value corresponding to the selected dialogue act (that is, sets the flag for that dialogue act), and outputs the utterance text of the selected node and the updated flag information.
For example, when no utterance has yet been made by the user in the discussion dialogue phase (for example, at the start of the discussion dialogue phase), the discussion management unit 122 selects a predetermined node of the discussion structure, selects an assertion act (Inform) as its dialogue act, updates the flag f1 of that node to a value indicating that an assertion act has occurred, and outputs the utterance text of the selected node and the updated flag f1 information. An example of this predetermined node is the root node (e.g., the node representing the claim "in favor of automated driving" in FIG. 4). When utterances by the user or by the automatic dialogue device 1 have already been made in the discussion dialogue phase, the discussion management unit 122 selects a node determined from the most recently updated node, selects its dialogue act, updates the internal-state flag of the selected node to the value corresponding to the selected dialogue act, and outputs the utterance text of the selected node and the updated flag information. For example, the discussion management unit 122 selects a child node of the most recently updated node, selects an assertion act (Inform) as its dialogue act, updates the internal-state flag f1 of the selected node to a value indicating that an assertion act has occurred, and outputs the utterance text of the selected child node and the updated flag f1 information. Alternatively, the discussion management unit 122 selects the most recently updated node itself, selects a question act (Question) as its dialogue act, updates the internal-state flag f2 of the selected node to a value indicating that a question act has occurred, and outputs the utterance text of the selected node and the updated flag f2 information. The utterance text of the node and the updated flag information output from the discussion management unit 122 are sent to the dialogue management unit 14 and then to the utterance sentence generation unit 15.
The utterance sentence generation unit 15 generates and outputs an output utterance sentence from the received utterance text and flag information using, for example, conversion rules based on regular expressions. This does not limit the present invention, however. The utterance sentence generation unit 15 is not restricted to regular expressions and may, for example, generate the output utterance sentence from the received utterance text and flag information according to rewriting rules based on some kind of IF-THEN rules. For example, when the flag information indicates that a question act has occurred and the utterance text is an affirmative sentence, the utterance sentence generation unit 15 generates and outputs an output utterance sentence in which the utterance text is converted into interrogative form. When the flag of the user's immediately preceding utterance and the flag information of the automatic dialogue device 1's utterance both indicate that an assertion act has occurred and the two utterance texts represent the same stance, the utterance sentence generation unit 15 may use as the output utterance sentence the utterance text with a conjunction expressing continuation (e.g., "moreover") prepended. When both flags indicate that an assertion act has occurred and the two utterance texts represent different stances, the utterance text with an adversative conjunction (e.g., "but") prepended may be used as the output utterance sentence (step S12a).
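One possible reading of step S12a is sketched below. The specification only says that a "predetermined rule" picks the next node, so the rule used here (assert the first child that has not yet been asserted, otherwise ask a question about the current node) is an assumption, as are the `Node` class, the function names, and the simple English rewriting rules, which stand in for the regular-expression conversion rules mentioned above.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a discussion structure (hypothetical representation)."""
    node_id: int
    text: str
    children: List["Node"] = field(default_factory=list)
    f1: bool = False  # True once an assertion act (Inform) has occurred
    f2: bool = False  # True once a question act (Question) has occurred


def next_system_move(root: Node, last_updated: Optional[Node]) -> Tuple[Node, str]:
    """Choose the next node and dialogue act, then set the matching flag."""
    if last_updated is None:
        node, act = root, "Inform"  # start of the phase: assert the root claim
    else:
        # Assumed rule: prefer a child that has not been asserted yet.
        unasserted = [c for c in last_updated.children if not c.f1]
        if unasserted:
            node, act = unasserted[0], "Inform"
        else:
            node, act = last_updated, "Question"
    if act == "Inform":
        node.f1 = True
    else:
        node.f2 = True
    return node, act


def realize(text: str, act: str,
            prev_user_act: Optional[str] = None,
            same_stance: Optional[bool] = None) -> str:
    """Turn utterance text plus flag information into an output utterance."""
    lowered = text[0].lower() + text[1:]
    if act == "Question":
        # Toy interrogative conversion for English declarative sentences.
        return re.sub(r"\.$", "?", "Is it true that " + lowered)
    if act == "Inform" and prev_user_act == "Inform" and same_stance is not None:
        # Same stance: continuation conjunction; different stance: adversative.
        prefix = "Moreover, " if same_stance else "But "
        return prefix + lowered
    return text
```

Keeping node selection (`next_system_move`) separate from surface generation (`realize`) mirrors the split between the discussion management unit 122 and the utterance sentence generation unit 15.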

When the user speaks (step S12b):
When the user speaks, the content of the utterance is input to the dialogue management unit 14 as an input utterance sentence. The dialogue management unit 14 sends the input utterance sentence to the discussion management unit 122, and the discussion management unit 122 sends it to the dialogue act estimation unit 123 and the discussion utterance estimation unit 124. The discussion management unit 122 also extracts the discussion structure used in step S12a from the discussion structure storage unit 121 and sends it to the discussion utterance estimation unit 124.

The dialogue act estimation unit 123 inputs the received input utterance sentence to the classifier described above, estimates which dialogue act the input utterance sentence represents, and outputs a dialogue act label, which is a label representing the estimated dialogue act. FIG. 6 illustrates the dialogue act labels that the classifier outputs for given input utterance sentences. The dialogue act label is input to the discussion management unit 122.

The discussion utterance estimation unit 124 receives the input utterance sentence and the discussion structure, identifies, among the utterance texts represented by the nodes of the discussion structure, the one that is semantically most similar to the input utterance sentence, and outputs the node ID identifying the node that represents that utterance text. The node ID is input to the discussion management unit 122. An example of this processing is shown below.
(Step 2-I) The discussion utterance estimation unit 124 performs morphological analysis, word by word, on the input utterance sentence and on the utterance texts represented by the nodes of the discussion structure. MeCab, JTAG, or the like may be used for the morphological analysis.
(Step 2-II) Using the morphological analysis result and the word vector dictionary stored in the word vector dictionary storage unit 13, the discussion utterance estimation unit 124 vectorizes each word of the input utterance sentence and of each utterance text with a known word-vectorization technique such as Word2vec to obtain word vectors, and then obtains a sentence vector for the input utterance sentence and for each utterance text from those word vectors. The sentence vector corresponding to the input utterance sentence is called the "input utterance sentence vector", and a sentence vector corresponding to an utterance text is called an "utterance text vector".
(Step 2-III) The discussion utterance estimation unit 124 calculates the similarity between the input utterance sentence vector and each utterance text vector, identifies the utterance text vector with the highest similarity to the input utterance sentence vector, and outputs the node ID of the corresponding node. Cosine similarity can be used as the similarity, but this does not limit the present invention.
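Step 2-III amounts to a nearest-neighbour lookup over the node vectors. The sketch below assumes the sentence vectors have already been computed as in step 2-II; the function names and the use of a plain dictionary from node ID to vector are illustrative choices, not part of the specification.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def estimate_node_id(input_vector, node_vectors):
    """Return the ID of the node whose utterance text vector is most
    similar to the input utterance sentence vector.

    node_vectors: dict mapping node ID -> utterance text vector.
    """
    return max(node_vectors, key=lambda nid: cosine(input_vector, node_vectors[nid]))
```

For instance, with hypothetical node vectors `{"n1": [1, 0], "n2": [0, 1], "n3": [1, 1]}`, an input vector of `[0.9, 0.8]` points in nearly the same direction as `n3`, so `n3` is the node that would be reported.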

The discussion management unit 122 takes the dialogue act label and the node ID as input, selects the node identified by the node ID from the discussion structure stored in the discussion structure storage unit 121, and updates the internal-state flag of that node to the value indicated by the dialogue act label. For example, if the dialogue act label is Inform, the flag f1 of the node identified by the node ID is updated to a value indicating that an assertion act has occurred, and if the dialogue act label is Question, the flag f2 of that node is updated to a value indicating that a question act has occurred (step S12b).
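The flag update in step S12b can be illustrated with a small function. The dictionary layout of a node and the choice to leave the flags untouched for labels other than Inform and Question are assumptions made for this sketch; the specification only describes the Inform and Question cases.

```python
def update_flags(nodes, node_id, act_label):
    """Update the internal-state flags of the node identified by node_id.

    nodes:     dict mapping node ID -> {"text": str, "f1": bool, "f2": bool}
    act_label: dialogue act label estimated for the user's utterance.
    Returns the updated node.
    """
    node = nodes[node_id]
    if act_label == "Inform":
        node["f1"] = True   # an assertion act has occurred on this node
    elif act_label == "Question":
        node["f2"] = True   # a question act has occurred on this node
    # Assumption: any other label leaves the flags unchanged.
    return node
```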

Next, the dialogue management unit 14 determines whether the processing of steps S12a and S12b has been repeated N times (step S12c). If it is determined that the processing has not been repeated N times, the processing returns to step S12a. If it is determined that the processing has been repeated N times, the processing ends. At this point, the utterance sentence generation unit 15 may output an output utterance sentence representing an utterance that closes the dialogue (for example, "That is all.") and end the dialogue.

<Specific examples of the greeting dialogue phase (PH0), question dialogue phase (PH1), and discussion dialogue phase (PH2)>
FIGS. 9 to 12 show specific examples of the greeting dialogue phase (PH0), the question dialogue phase (PH1), and the discussion dialogue phase (PH2) carried out with the automatic dialogue technique of this embodiment. Here, "S" denotes an utterance sentence (output utterance sentence) representing the content of an utterance made by the automatic dialogue device 1, and "U" denotes an utterance sentence (input utterance sentence) representing the content of an utterance made by the user.

<Characteristics of this embodiment>
Conventional automatic dialogue technology starts the dialogue on the target specific topic abruptly at the beginning of the dialogue, so it is difficult for the user to keep up with the topic right away. In the automatic dialogue technique described in this embodiment, by contrast, before the discussion on the target specific topic (discussion dialogue phase) starts, one or more utterance sentences (first utterance sentences) are selected from a set of utterance candidate sentences based on their relevance to the specific topic, and a dialogue that introduces the discussion (question dialogue phase) is carried out. Compared with abruptly starting the discussion on the target specific topic, this embodiment therefore has the advantage that the user can be led into the discussion smoothly and can follow it more easily.

Conventional automatic dialogue technology also has the problem that no relationship has been built between the discussion dialogue system and the user at the start of the dialogue, and the user must suddenly engage with such a partner in a discussion that goes beyond likes and dislikes, so the dialogue does not proceed smoothly. In a situation where no relationship has been built, such as a conversation at a first meeting, a user who receives remarks touching on ways of thinking and values beyond likes and dislikes finds it hard to understand the background from which those remarks arise. Furthermore, users find it hard to be highly motivated to learn the way of thinking and values of a partner with whom they have no relationship, and therefore find it hard to move the dialogue forward actively.
In this embodiment, by contrast, a dialogue that introduces the discussion (question dialogue phase) is carried out before the discussion on the target specific topic (discussion dialogue phase) starts, so some relationship between the automatic dialogue device 1 and the user can be expected to form in the question dialogue phase before the target discussion begins. In particular, when the set of utterance candidate sentences, such as a PDB, includes pairs of a question sentence asking about personal matters and an answer sentence answering that question, and each utterance sentence (first utterance sentence) in the question dialogue phase includes such a question-answer pair, the relationship between the automatic dialogue device 1 and the user is easier to build. That is, in a dialogue that asks about the user's personal matters, the user speaks about his or her own personality, so the user readily feels close to the automatic dialogue device 1. When the user engages in questions about personality and self-disclosure in this way before discussing the target specific topic, the user's sense of closeness to and interest in the automatic dialogue device 1, the dialogue partner, increases, and only then can the dialogue proceed smoothly. This also increases the user's interest in getting to know the partner (the automatic dialogue device 1) and motivates the user toward the discussion. Furthermore, when a plurality of utterance sentences (first utterance sentences) are output in the question dialogue phase in ascending order of relevance to the specific topic, and the utterance sentences of the discussion dialogue phase (second utterance sentences) are output only after all of the first utterance sentences have been output, the dialogue content gradually approaches the topic of discussion, so the topic no longer seems abrupt and the user can enter the discussion more smoothly.

In this embodiment, the target "dialogue on a specific topic" has been illustrated as a discussion on a specific topic, but the "dialogue on a specific topic" may be a dialogue other than a discussion. The effects of executing the question dialogue phase described above are greater, however, when the "dialogue on a specific topic" is a dialogue that draws out the user's thinking by repeating questions and answers on the specific topic multiple times. Examples of such a "dialogue on a specific topic" include a "dialogue for digging deeper into a specific topic" and a "dialogue about ways of thinking and values that goes beyond likes and dislikes". Such dialogues include, for example, interview dialogues, counseling, and task-oriented dialogues. A task-oriented dialogue is a dialogue aimed at accomplishing a specific task. For example, a dialogue aimed at accomplishing the task of "playing music recommended for the user" is a task-oriented dialogue. More specifically, one example is a dialogue in which, in response to the user's utterance "Play some recommended music", the automatic dialogue device 1 conducts a dialogue to learn the user's personality, selects music taking the personality thus obtained into account, and replies "How about XX?".
The method by which the second dialogue processing unit selects the second utterance sentence uttered in the dialogue on the specific topic is likewise not limited to the embodiment described above, and methods used in other known automatic dialogue techniques (e.g., Non-Patent Document 2) may be used.

In this embodiment, the "set of utterance candidate sentences" is a PDB containing pairs of a question sentence asking about personal matters and an answer sentence answering that question, and the dialogue processing unit 11 (first dialogue processing unit) selects and outputs one or more utterance sentences (first utterance sentences) from the PDB (the set of utterance candidate sentences) based on their relevance to the specific topic. However, a known chat dialogue system or the like may be used as the first dialogue processing unit.

前述のように、対話処理部（第１の対話処理部）が導入のための発話文（第１の発話文）を出力した後に、対話処理部１２（第２の対話処理部）が目的とする特定の話題の対話のための発話文（第２の発話文）を出力するが、この切り替えの制御方法は本実施形態のものに限定されない。すなわち、本実施形態では、質問対話フェーズ（ＰＨ１）での対話をＭ回行い、議論対話フェーズ（ＰＨ２）での対話をＮ回行うといったように（Ｍ，Ｎは１以上の任意の整数）、それぞれの対話の繰り返し回数を条件としてこの切り替えの制御を行った（図２）。ただし、この切り替えの制御は一例であり、その他の制御によってこの切り替えが行われてもよい。例えば、第１の対話処理部による第１の発話文の出力回数が繰り返し回数Ｍに達したかや、第２の対話処理部による第２の発話文の出力回数が繰り返し回数Ｎに達したかを判断するのではなく、第１の発話文を選択する際の基準となる「目的とする特定の話題」と第１の発話文との関連性の高さまたは類似度を基準として、第１の対話処理部による第１の発話文の出力を行うフェーズから、第２の対話処理部による第２の発話文の出力を行うフェーズへ遷移する時点が決定されてもよい。また、入力発話文と任意条件とのパタンマッチによって、第１の対話処理部による第１の発話文の出力を行うフェーズから、第２の対話処理部による第２の発話文の出力を行うフェーズへ遷移する時点や、第２の対話処理部による第２の発話文の出力を行うフェーズの終了時点が決定されてもよい。例えば、以下のような制御が行われてもよい。
・選択された第１の発話文と「目的とする特定の話題」との類似度が閾値以下になったときに第１の対話処理部による第１の発話文の出力を行うフェーズを終了する。
・第２の対話処理部による第２の発話文に対応する発話に対して利用者が納得したときに（例えば、入力発話文に対応する対話行為ラベルがDisagreeからAgreeになったとき）、第２の対話処理部による第２の発話文を出力するフェーズを終了する。
As described above, after the dialogue processing unit 11 (first dialogue processing unit) outputs an introductory utterance sentence (first utterance sentence), the dialogue processing unit 12 (second dialogue processing unit) outputs an utterance sentence (second utterance sentence) for dialogue on the specific topic; however, the method of controlling this switching is not limited to that of the present embodiment. In this embodiment, the dialogue in the question dialogue phase (PH1) is performed M times and the dialogue in the discussion dialogue phase (PH2) is performed N times (M and N are arbitrary integers of 1 or more), and the switching is controlled based on the number of repetitions of each dialogue (FIG. 2). This switching control is only an example, and the switching may be performed by other control. For example, instead of judging whether the number of times the first dialogue processing unit has output the first utterance sentence has reached the number of repetitions M, or whether the number of times the second dialogue processing unit has output the second utterance sentence has reached the number of repetitions N, the transition point from the phase in which the first dialogue processing unit outputs the first utterance sentence to the phase in which the second dialogue processing unit outputs the second utterance sentence may be determined by a different criterion. Also, by pattern matching between the input utterance sentence and an arbitrary condition, the transition point from the phase in which the first dialogue processing unit outputs the first utterance sentence to the phase in which the second dialogue processing unit outputs the second utterance sentence, or the end of the phase in which the second dialogue processing unit outputs the second utterance sentence, may be determined. For example, the following control may be performed.
- When the degree of similarity between the selected first utterance sentence and the "target specific topic" becomes equal to or less than a threshold, the phase in which the first dialogue processing unit outputs the first utterance sentence ends.
- When the user agrees with the utterance corresponding to the second utterance sentence output by the second dialogue processing unit (for example, when the dialogue act label corresponding to the input utterance sentence changes from Disagree to Agree), the phase in which the second dialogue processing unit outputs the second utterance sentence ends.
- When the node selected by the discussion management unit 122 reaches a specific node in the discussion structure, the phase in which the second dialogue processing unit outputs the second utterance sentence ends.
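The switching controls listed above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `Phase`, `TurnInfo`, and `next_phase`, the default values of M and N, and the way the conditions are combined are all hypothetical. The similarity comparison follows the wording of the text (the question phase ends when the similarity is equal to or less than the threshold).

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    QUESTION = "PH1"    # first dialogue processing unit (introductory questions)
    DISCUSSION = "PH2"  # second dialogue processing unit (dialogue on the topic)
    END = "END"


@dataclass
class TurnInfo:
    """Hypothetical per-turn observations; field names are illustrative only."""
    similarity_to_topic: float = 1.0  # similarity of the selected first utterance to the topic
    prev_act: str = ""                # dialogue act label of the previous input utterance
    curr_act: str = ""                # dialogue act label of the current input utterance
    node_id: str = ""                 # node selected by the discussion management unit


def next_phase(phase, turns_in_phase, info, *, m=3, n=5,
               sim_threshold=None, terminal_nodes=frozenset()):
    """Return the phase to use for the next turn.

    Count-based switching (M and N repetitions) is the default. The optional
    conditions correspond to the three alternatives listed in the text.
    """
    if phase is Phase.QUESTION:
        if sim_threshold is not None:
            # Alternative 1: end PH1 once similarity is at or below the threshold.
            if info.similarity_to_topic <= sim_threshold:
                return Phase.DISCUSSION
        elif turns_in_phase >= m:
            # Default: PH1 has been repeated M times.
            return Phase.DISCUSSION
        return Phase.QUESTION
    if phase is Phase.DISCUSSION:
        # Alternative 2: the dialogue act label changes from Disagree to Agree.
        if info.prev_act == "Disagree" and info.curr_act == "Agree":
            return Phase.END
        # Alternative 3: the discussion reaches a designated node.
        if info.node_id in terminal_nodes:
            return Phase.END
        # Default: PH2 has been repeated N times.
        if turns_in_phase >= n:
            return Phase.END
        return Phase.DISCUSSION
    return Phase.END
```

In this sketch the count-based rule and the alternative rules coexist in the discussion phase, whereas in the question phase the similarity rule replaces the count when a threshold is supplied; either choice could be made differently.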

[Modification of First Embodiment]
In the first embodiment, the dialogue act estimation unit inputs the input utterance sentence to the classifier and estimates what kind of dialogue act it represents, thereby obtaining a dialogue act label, only in the discussion dialogue phase (PH2), which is the phase in which the second dialogue processing unit outputs the second utterance sentence. However, the dialogue act estimation unit may also input the input utterance sentence to the classifier and obtain a dialogue act label in the question dialogue phase (PH1), which is the phase in which the first dialogue processing unit outputs the first utterance sentence, and in the preceding greeting dialogue phase (PH0). The following description focuses on the differences from the first embodiment; parts common to the first embodiment are given the same reference numerals and their description is omitted.

As illustrated in FIG. 13, the automatic dialogue device 2 of this modification includes a dialogue processing unit 11 (first dialogue processing unit), a dialogue processing unit 22 (second dialogue processing unit), a word vector dictionary storage unit 23, a dialogue management unit 14, an utterance sentence generation unit 15, and a dialogue act estimation unit 123. The automatic dialogue device 2 may or may not further include a dialogue processing unit 10. The dialogue processing unit 22 includes a discussion structure storage unit 121, a discussion management unit 122, and a discussion utterance estimation unit 124.

The difference from the first embodiment is that not only the input utterance sentence input in the discussion dialogue phase (PH2) but also the input utterance sentence input in the question dialogue phase (PH1) is input to the dialogue act estimation unit 223, which inputs it to the classifier, estimates what kind of dialogue act it represents, and obtains a dialogue act label. The input utterance sentence input in the greeting dialogue phase (PH0) may also be input to the dialogue act estimation unit 223, which estimates what kind of dialogue act it represents and obtains a dialogue act label. The dialogue act label obtained in the discussion dialogue phase (PH2) is used as described in the first embodiment. For example, the dialogue act label obtained in the question dialogue phase (PH1) may be sent to the dialogue processing unit 11, which uses it to determine the utterance content of the automatic dialogue device 2 in the question dialogue phase (PH1). Similarly, the dialogue act label obtained in the greeting dialogue phase (PH0) may be sent to the dialogue processing unit 10, which uses it to determine the utterance content of the automatic dialogue device 2 in the greeting dialogue phase (PH0).
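The routing described in this modification can be sketched as follows. This is an illustration only: the keyword-based `estimate_dialogue_act` is a toy stand-in for the classifier (a real system would presumably use a trained model), the label "Other" and the names `HANDLER_BY_PHASE` and `route` are hypothetical, and only the labels Agree and Disagree appear in the text.

```python
# Toy vocabularies standing in for a trained dialogue act classifier (assumption).
AGREE_WORDS = {"yes", "agree", "right"}
DISAGREE_WORDS = {"no", "disagree", "wrong"}

# Which unit receives the label in each phase, following the description above.
HANDLER_BY_PHASE = {
    "PH0": "dialogue_processing_unit_10",  # greeting dialogue phase
    "PH1": "dialogue_processing_unit_11",  # question dialogue phase
    "PH2": "dialogue_processing_unit_22",  # discussion dialogue phase
}


def estimate_dialogue_act(utterance):
    """Return a dialogue act label for the input utterance (toy keyword rule)."""
    cleaned = utterance.lower().replace(",", " ").replace(".", " ")
    words = set(cleaned.split())
    if words & DISAGREE_WORDS:
        return "Disagree"
    if words & AGREE_WORDS:
        return "Agree"
    return "Other"


def route(phase, utterance):
    """Estimate the label in every phase and name the unit that should receive it."""
    label = estimate_dialogue_act(utterance)
    return HANDLER_BY_PHASE[phase], label
```

The point of the sketch is that label estimation is no longer tied to PH2: the same estimator runs in every phase, and only the recipient of the label changes.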

[Other Modifications, etc.]
The present invention is not limited to the above-described embodiments. For example, the various types of processing described above need not be executed in chronological order as described; they may be executed in parallel or individually, according to the processing capacity of the device that executes them or as necessary. It goes without saying that other modifications may be made as appropriate without departing from the gist of the present invention.

Each of the above devices is configured, for example, by a general-purpose or dedicated computer that includes a processor (hardware processor) such as a CPU (central processing unit) and memory such as RAM (random-access memory) and ROM (read-only memory) executing a predetermined program. The computer may have a single processor and memory, or multiple processors and memories. The program may be installed on the computer or recorded in advance in ROM or the like. Some or all of the processing units may be configured using electronic circuitry that implements the processing functions without using a program, rather than circuitry such as a CPU that implements a functional configuration by reading a program. The electronic circuitry constituting one device may include a plurality of CPUs.

When the above configuration is implemented by a computer, the processing content of the functions that each device should have is described by a program. Executing this program on a computer implements the above processing functions on the computer. The program describing this processing content can be recorded on a computer-readable recording medium. An example of a computer-readable recording medium is a non-transitory recording medium. Examples of such recording media include magnetic recording devices, optical discs, magneto-optical recording media, and semiconductor memories.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in the storage device of a server computer and transferring it from the server computer to other computers via a network.

A computer that executes such a program, for example, first temporarily stores the program recorded on the portable recording medium, or the program transferred from the server computer, in its own storage device. When executing the processing, the computer reads the program stored in its own storage device and executes processing according to the read program. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it, or it may execute processing according to the received program each time the program is transferred to it from the server computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to this computer and the processing functions are implemented only through execution instructions and acquisition of results.

Instead of implementing the processing functions of the present device by executing a predetermined program on a computer, at least some of these processing functions may be implemented in hardware.

1, 2 automatic dialogue device
10, 11, 12, 22 dialogue processing unit

Claims

An automatic dialogue device that conducts a dialogue with a user, comprising:
a first dialogue processing unit that outputs a first utterance sentence uttered in an introductory dialogue leading into a dialogue on a target specific topic; and
a second dialogue processing unit that outputs a second utterance sentence uttered in the dialogue on the specific topic,
wherein the first dialogue processing unit selects and outputs one or more of the first utterance sentences from a set of utterance candidate sentences based on relevance to the specific topic, and
the second dialogue processing unit outputs the second utterance sentence after the first dialogue processing unit outputs the first utterance sentence.

The automatic dialogue device of claim 1,
wherein the set of utterance candidate sentences includes a pair of a question sentence representing a question about personal matters and an answer sentence representing an answer to the question, and
each of the first utterance sentences includes a pair of the question sentence and the answer sentence.

The automatic dialogue device of claim 1 or 2,
wherein the dialogue on the specific topic is a dialogue for drawing out the user's thoughts by repeating questions and answers on the specific topic a plurality of times.

The automatic dialogue device of claim 2,
wherein the dialogue on the specific topic is a discussion on the specific topic.

The automatic dialogue device of any one of claims 1 to 4,
further comprising a dialogue management unit,
wherein the first dialogue processing unit selects a plurality of the first utterance sentences from the set of utterance candidate sentences based on relevance to the specific topic, and
the dialogue management unit outputs the plurality of first utterance sentences in order from lowest to highest relevance to the specific topic, and outputs the second utterance sentence after all of the first utterance sentences have been output.

An automatic dialogue method for conducting a dialogue with a user, comprising:
a first dialogue processing step of outputting a first utterance sentence uttered in an introductory dialogue leading into a dialogue on a target specific topic; and
a second dialogue processing step of outputting a second utterance sentence uttered in the dialogue on the specific topic,
wherein the first dialogue processing step is a step of selecting and outputting one or more of the first utterance sentences from a set of utterance candidate sentences based on relevance to the specific topic, and
the second dialogue processing step is a step of outputting the second utterance sentence after the first utterance sentence is output in the first dialogue processing step.

A program for causing a computer to function as the automatic dialogue device of any one of claims 1 to 5.