JP6452420B2 - Electronic device, speech control method, and program

Info

Publication number: JP6452420B2
Application number: JP2014247827A
Authority: JP
Inventors: 孝之永松
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2014-12-08
Filing date: 2014-12-08
Publication date: 2019-01-16
Anticipated expiration: 2034-12-08
Also published as: JP2016109897A

Description

The present invention relates to an electronic device capable of speaking, an utterance control method for the electronic device, and a program for controlling the electronic device.

Techniques for analyzing feature amounts related to tone of voice from a phrase are known. For example, Patent Document 1 discloses a user profile extraction device equipped with such an analysis technique. The device extracts the voice signal of a voiced section from a speaker's voice signal as a phrase and analyzes feature amounts related to tone from the phrase. The device then reads the region-specific tone feature amounts stored in a first storage means and determines the region corresponding to the tone feature amount closest to the analyzed feature amount.

Patent Document 2 also discloses an information processing system equipped with such an analysis technique. In this system, a voice situation determination unit analyzes, based on tone, the dialect being spoken in the user's voice signal, determines the region where the dialect is used, and determines the user's utterance situation based on the number of dialects analyzed and the length of time each dialect was used. Further, when an information search request is transmitted from the user within a fixed time after the previous conversation, a service gateway adds to the request additional information corresponding to the determined regional information and the user's utterance situation, and provides the user with an information search response corresponding to the added information.

Robots capable of conversing with a user have also been developed. For example, Patent Document 3 discloses a communication robot of this kind. The communication robot has voice detection means for detecting words spoken by the user and reaction expression means for expressing a predetermined response reaction. The communication robot expresses one of a plurality of predetermined patterns of response reaction in reaction to the user's spoken words. Specifically, the communication robot makes at least one of the following determinations: whether the spoken words are a normal expression, whether they are an expression requesting agreement, and whether they are an assertive expression. It then expresses a different response reaction depending on the determination result.

Patent Document 4 discloses a robot for keeping a child company. The robot acquires information indicating the child's situation. Based on the acquired situation, the robot determines an action by referring to an action storage unit that stores situations in association with the actions the robot should take when those situations arise.

Patent Document 1: JP 2010-256765 A
Patent Document 2: JP 2010-277388 A
Patent Document 3: JP 2013-86226 A
Patent Document 4: JP 2005-305631 A

The user profile extraction device of Patent Document 1 does not speak using the identified dialect. The same applies to the information processing system of Patent Document 2.

The response reactions in Patent Document 3 are actions such as giving back-channel responses and shaking the head. They are therefore unrelated to the spoken language the robot uses when it talks to the user.

The robot of Patent Document 4 determines its action based on the voice of the child in front of it; it does not determine its action through conversation with the child.

The present invention has been made in view of the above problems, and its object is to provide an electronic device, an utterance control method, and a program capable of speaking in a manner suited to the conversation partner.

According to one aspect of the present invention, an electronic device determines a feature of a person based on a conversation with the person and speaks according to the determined feature.

Preferably, the feature is at least one of a dialect type, an age group, and a mood during the conversation.

Preferably, the feature is a dialect type. The electronic device includes voice output means for outputting voice for speaking, first determination means for determining the dialect type, and utterance control means for causing the voice output means to speak according to the determined feature. The utterance control means causes the voice output means to speak in the dialect of the determined type.

Preferably, the feature is a dialect type. The electronic device includes voice output means for outputting voice for speaking, first determination means for determining the dialect type, and utterance control means for causing the voice output means to speak according to the determined feature. The utterance control means causes the voice output means to speak about information on the region where the dialect of the determined type is used.

Preferably, the feature is an age group. The electronic device includes voice output means for outputting voice for speaking, second determination means for determining the person's age group, and utterance control means for causing the voice output means to speak according to the determined feature. The utterance control means causes the voice output means to speak in a manner suited to the determined age group.

Preferably, the utterance control means causes the voice output means to utter content suited to the determined age group.

Preferably, the feature is a mood. The electronic device includes voice output means for outputting voice for speaking, third determination means for determining the person's mood, and utterance control means for causing the voice output means to speak according to the determined feature. The utterance control means causes the voice output means to speak according to the determined mood.

Preferably, the electronic device determines a feature of each of a plurality of people based on conversations with those people. Among the determined features, the electronic device identifies the feature shared by the largest number of people. The electronic device speaks according to the identified feature.

Preferably, when speaking to a person among the plurality of people whose feature differs from the identified feature, the electronic device speaks according to that person's feature rather than the identified feature.
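The selection described in the two preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the sample people are hypothetical and do not come from the patent, which does not specify an implementation.

```python
from collections import Counter


def most_common_feature(features_by_person):
    """Return the feature value shared by the largest number of people."""
    counts = Counter(features_by_person.values())
    feature, _ = counts.most_common(1)[0]
    return feature


def feature_for_utterance(features_by_person, addressee=None):
    """Choose the feature to speak with.

    With no particular addressee, the group's most common feature is used.
    When a specific person is addressed, that person's own feature is used,
    which covers the case where it differs from the group's feature.
    """
    group_feature = most_common_feature(features_by_person)
    if addressee is None:
        return group_feature
    return features_by_person.get(addressee, group_feature)


# Hypothetical determination results for three people.
people = {"A": "Kansai dialect", "B": "Kansai dialect", "C": "standard language"}
```

Here `feature_for_utterance(people)` selects the Kansai dialect for the group, while `feature_for_utterance(people, "C")` selects the standard language for person C alone.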

Preferably, the electronic device is a self-propelled robot.

According to another aspect of the present invention, an utterance control method is executed in an electronic device. The method includes a step of determining a feature of a person based on a conversation with the person and a step of speaking according to the determined feature.

According to still another aspect of the present invention, a program controls an electronic device. The program causes a processor of the electronic device to execute a step of determining a feature of a person based on a conversation with the person and a step of speaking according to the determined feature.

According to the above invention, it is possible to speak in a manner suited to the conversation partner.

FIG. 1 is a diagram showing the appearance of the robot 1.
FIG. 2 is a diagram showing the robot 1 and a person 901 in conversation.
FIG. 3 is a diagram for explaining the schematic configuration of the database D3 used by the robot 1.
FIG. 4 is a functional block diagram for explaining the functional configuration of the robot 1.
FIG. 5 is a flowchart for explaining the flow of processing in the robot 1.
FIG. 6 is a diagram showing the hardware configuration of the robot 1.
FIG. 7 is a diagram showing the robot 1 and a plurality of people forming a circle.
FIG. 8 is a flowchart for explaining the flow of processing in the robot 1.
FIG. 9 is a flowchart for explaining exception handling in the processing of steps S110 and S118 of FIG. 8.
FIG. 10 is a schematic diagram of a communication system including a robot and a server.
FIG. 11 is a diagram showing the robot 2 and the person 901 in conversation.

Hereinafter, electronic devices according to embodiments of the present invention are described with reference to the drawings. In the following description, identical members are given identical reference numerals. Their names and functions are also the same, so detailed descriptions of them are not repeated.

Embodiments 1 to 4 are described using the example of a self-propelled humanoid robot as the electronic device. Embodiment 5 is described using the example of a self-propelled vacuum-cleaner robot. The electronic device need not be a robot, and the robot need not be self-propelled.

To simplify the description, Embodiment 1 describes the processing performed when the robot converses with one person. Embodiment 2 and later embodiments then describe the processing performed when the robot converses with a plurality of people.

[Embodiment 1]
<A. Appearance>
FIG. 1 is a diagram showing the appearance of the robot 1. Referring to FIG. 1, the robot 1 can propel itself on wheels 111. The robot 1 includes a touch screen 109. The robot 1 has not only a speaking function but also a function for conversing with people. For this purpose, the robot 1 has a microphone and a speaker (not shown) in its housing. The robot 1 can be used not only in ordinary homes but also in public facilities such as stations and airports.

<B. Outline of processing>
The robot 1 determines a feature of a person based on a conversation with the person and speaks according to the determined feature. Typically, based on the content and voice of the conversation, the robot 1 determines the type of dialect the person uses, the person's age group, the person's mood, and the like, and speaks based on the determination result.

As described above, the features include the dialect type, the age group, and the mood during the conversation, although the features are not limited to these. The processing for each of these typical features is described below with examples.

(b1. Dialect type)
Through conversation with a person, the robot 1 determines the dialect type as a feature of the person and speaks based on the determination result. For example, the robot 1 speaks in the dialect of the determined type.

FIG. 2 is a diagram showing the robot 1 and a person (a man) 901 in conversation. Referring to FIG. 2, when the robot 1 determines that the words spoken by the person 901 are, for example, in the Kansai dialect, it speaks to the person 901 in the Kansai dialect.

Speech based on the determination result is not limited to speech in the dialect of the determined type. For example, it may include information on the region where that dialect is used.

(b2. Age)
Through conversation with the person 901, the robot 1 determines the age group as a feature of the person 901 and speaks based on the determination result. Typically, the robot 1 speaks in a manner suited to the determined age group. The robot 1 also utters content suited to the determined age group.

(b3. Mood)
Through conversation with the person 901, the robot 1 determines the mood of the person 901 during the conversation as a feature of the person 901. The robot 1 speaks according to the determined mood. For example, when the robot 1 determines that the person 901 is feeling down, it speaks in a gentle tone.

(b4. Advantages)
As described above, the robot 1 determines a feature of the person 901 based on the conversation with the person 901 and speaks according to the determined feature, so the robot 1 can speak in a manner suited to its conversation partner. The person 901 can therefore come to feel close to the robot 1 through conversation with it. As a result, conversation between the robot 1 and the person 901 becomes livelier, and the robot 1 can become more useful to the person 901.

The details of the robot 1 having these functions are described below.

<C. Data>
FIG. 3 is a diagram for explaining the schematic configuration of the database D3 used by the robot 1. Referring to FIG. 3, the database D3 associates the features described above with utterance formats of the robot. The features are divided into a plurality of major categories, and each major category is further divided into a plurality of minor categories. In the database D3, an utterance format of the robot is associated with each minor category.

As described above, the major categories typically include the dialect type, the age group, and the mood at the time of speaking. Minor categories of the dialect type include "standard language", "Kansai dialect", "Kyushu dialect", and the like. Minor categories of the age group include "young", "student", "elderly", and the like. Minor categories of the mood include "good mood", "angry", "irritated", "depressed", and the like.

The robot 1 refers to the database D3 to decide the phrases, tone, and so on to use in conversation with the person 901. Specifically, the robot 1 refers to the database D3 and determines the features of the person 901 through conversation with the person 901. The robot 1 then speaks according to the determined features. As an example, when the robot 1 determines that the speech of the person 901 is in the Kansai dialect and is that of an elderly person, it speaks in the Kansai dialect and also speaks slowly and carefully, one word at a time.

When the utterance format column for a minor category lists a plurality of utterance formats (for example, "speak in standard language" and "use recent topics in Japan"), the robot 1 may speak to the person 901 using all of them or using only one of them. In the latter case, the robot 1 decides which utterance format to select based on a predetermined rule.
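A minimal sketch of how a table like the database D3 might be represented and queried is given below. The two-level dictionary, the function name `speech_formats`, and most of the table entries are assumptions made for illustration; the patent names only the categories and a few example formats, and it does not say what the predetermined selection rule is (here it defaults to "take the first entry").

```python
# Hypothetical stand-in for database D3:
# major category -> minor category -> utterance formats of the robot.
DATABASE_D3 = {
    "dialect": {
        "standard language": [
            "speak in standard language",
            "use recent topics in Japan",
        ],
        "Kansai dialect": ["speak in Kansai dialect"],
    },
    "age group": {
        "elderly": ["speak slowly and carefully, one word at a time"],
    },
    "mood": {
        "depressed": ["speak in a gentle tone"],
    },
}


def speech_formats(major, minor, use_all=True, rule=None):
    """Return the utterance formats to apply for one determined feature.

    use_all=True  -> apply every format listed for the minor category.
    use_all=False -> apply a single format chosen by `rule`, a callable that
                     picks one entry from the list (assumed default: first).
    """
    formats = DATABASE_D3[major][minor]
    if use_all:
        return list(formats)
    choose = rule if rule is not None else (lambda options: options[0])
    return [choose(formats)]
```

For example, `speech_formats("dialect", "standard language", use_all=False)` yields only "speak in standard language", while the default call returns both listed formats.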

<D. Functional configuration>
FIG. 4 is a functional block diagram for explaining the functional configuration of the robot 1. Referring to FIG. 4, the robot 1 includes a control unit 151, a voice input unit 152, a storage unit 153, a voice output unit 154, and a communication unit 155.

The control unit 151 includes a feature determination unit 1510 and an utterance control unit 1520. The feature determination unit 1510 has a dialect determination unit 1511, an age group determination unit 1512, and a mood determination unit 1513.

The control unit 151 controls the overall operation of the robot 1. Specifically, it does so by executing the operating system and various programs stored in the storage unit 153. More specifically, the robot 1 refers to the database D3 (FIG. 3) stored in the storage unit 153, determines a feature, and performs control for speaking according to the determined feature.

The voice input unit 152 corresponds to the microphone 108 (FIG. 6). The voice input unit 152 receives the voice of the person 901 and others, ambient noise, and the like.

The voice output unit 154 corresponds to the speaker 106 (FIG. 6). It typically outputs voice for speaking.

The communication unit 155 corresponds to the wireless communication IF (interface) 112 and the antenna 113. It is provided for communicating with other communication devices (not shown).

Next, the processing of the feature determination unit 1510 and the utterance control unit 1520 of the control unit 151 is described.

The feature determination unit 1510 determines a feature of the person 901 based on the conversation with the person 901. The utterance control unit 1520 causes the voice output unit 154 to speak according to the determined feature.

When the person 901 speaks in a dialect, the dialect determination unit 1511 determines the type of the dialect. Specifically, the storage unit 153 stores the dialect of each region (a plurality of words) in association with that region (more precisely, with identification information of the region). When the person 901 speaks in a dialect, the dialect determination unit 1511 determines the dialect type by referring to these stored associations and sends the determination result to the utterance control unit 1520.
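One way the word-to-region association described above could drive the determination is sketched below. The patent does not state the matching rule, so counting matched words and picking the region with the most matches is an assumption, as are the identifiers and the small romanized word lists, which are illustrative only.

```python
# Hypothetical dialect vocabulary keyed by region identifier.
DIALECT_WORDS = {
    "kansai": {"ookini", "akan", "nandeyanen"},
    "kyushu": {"batten", "yoka"},
}


def determine_dialect(utterance_words):
    """Return the region whose stored words match the utterance most often.

    Returns None when no stored dialect word appears, which corresponds to
    the case where the feature could not be determined.
    """
    best_region, best_hits = None, 0
    for region, words in DIALECT_WORDS.items():
        hits = sum(1 for word in utterance_words if word in words)
        if hits > best_hits:
            best_region, best_hits = region, hits
    return best_region
```

An utterance containing "ookini" and "akan" is classified as `"kansai"`, while an utterance with no stored dialect word returns `None`, so the caller can keep listening.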

The utterance control unit 1520 typically performs control that causes the voice output unit 154 to speak in the dialect of the determined type. Specifically, by referring to the robot utterance formats in the database D3, the utterance control unit 1520 causes the voice output unit 154 to speak, for example, in the dialect of the determined type.

Alternatively, by referring to the robot utterance formats in the database D3, the utterance control unit 1520 may cause the voice output unit 154 to make an utterance that includes information on the region where the dialect of the determined type is used.

When the person 901 speaks, the age group determination unit 1512 determines the age group of the person 901. Specifically, the storage unit 153 stores data (not shown) for determining the age group. This data holds, among other things, information on words characteristic of each age group. When the person 901 speaks, the age group determination unit 1512 determines the age group based on the content of the utterance (the words used and so on). The age group determination unit 1512 sends the determination result to the utterance control unit 1520.

The utterance control unit 1520 typically performs control that causes the voice output unit 154 to speak in a manner suited to the determined age group. Specifically, by referring to the robot utterance formats in the database D3, the utterance control unit 1520 causes the voice output unit 154 to speak, for example, in a manner of speaking suited to the determined age group.

Alternatively, by referring to the robot utterance formats in the database D3, the utterance control unit 1520 may cause the voice output unit 154 to utter content suited to the determined age group.

When the person 901 speaks, the mood determination unit 1513 determines the mood of the person 901. Specifically, the storage unit 153 stores words expressing various moods in association with those moods (more precisely, with identification information of the moods). When the person 901 speaks, the mood determination unit 1513 determines the mood associated with the words in the utterance and sends the determination result to the utterance control unit 1520. The mood determination unit 1513 may also determine the mood based on, for example, the intonation of the voice in the utterance of the person 901. In that case, to improve determination accuracy, the robot 1 preferably learns the usual intonation of the person 901 in advance.
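The two cues mentioned above, mood words and intonation relative to the person's usual speech, might be sketched as follows. Everything here is an assumption made for illustration: the vocabulary, the use of pitch range as a stand-in for intonation, and the function names. The patent does not describe how intonation is measured or how a deviation would map to a mood, so the second function only reports the deviation.

```python
# Hypothetical mood vocabulary keyed by mood identifier.
MOOD_WORDS = {
    "good mood": {"fun", "great"},
    "depressed": {"tired", "sad"},
}


def mood_from_words(words):
    """Return the first mood whose stored words appear in the utterance, or None."""
    for mood, vocabulary in MOOD_WORDS.items():
        if any(word in vocabulary for word in words):
            return mood
    return None


def intonation_deviation(pitch_hz, usual_pitch_hz):
    """Relative change in pitch range versus the person's learned usual speech.

    0.0 means the same range as usual; positive values mean livelier
    intonation than usual, negative values flatter intonation.
    """
    usual_range = max(usual_pitch_hz) - min(usual_pitch_hz)
    if usual_range == 0:
        return 0.0  # no usable baseline, so report no deviation
    current_range = max(pitch_hz) - min(pitch_hz)
    return (current_range - usual_range) / usual_range
```

Comparing against a per-person baseline rather than a fixed threshold reflects the remark that learning the usual intonation in advance improves accuracy.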

The utterance control unit 1520 typically performs control that causes the voice output unit 154 to speak according to the determined mood. Specifically, by referring to the robot utterance formats in the database D3, the utterance control unit 1520 causes the voice output unit 154 to speak, for example, according to the determined mood.

<E. Control structure>
FIG. 5 is a flowchart for explaining the flow of processing in the robot 1. Referring to FIG. 5, in step S2, the robot 1 converses with the person 901 using the default setting for the feature. In step S4, the robot 1 determines a feature of the person 901 through the conversation with the person 901. In step S6, the robot 1 judges whether the feature has been determined.

If the feature has been determined (YES in step S6), the robot 1 converses using the determined feature in step S8. For example, if the default setting is the standard language and the feature of the person 901 is determined to be the Kansai dialect, the robot 1 stops speaking in the standard language and speaks in the Kansai dialect. If the feature has not been determined (NO in step S6), the robot 1 returns the processing to step S4.

In step S10, the robot 1 judges whether an event indicating the end of the conversation has occurred. Examples of such an event include the person 901 no longer being near the robot 1, no utterance from the person 901 for a predetermined time, and receipt of a predetermined input from the person 901.

If the event has occurred (YES in step S10), the robot 1 sets the determined feature as the default in step S12. This allows the robot 1 to begin the next conversation with the determined feature from the start. For example, the robot 1 can start the conversation in the Kansai dialect in step S2. If no event has occurred (NO in step S10), the robot 1 returns the processing to step S8.

The default setting for the feature need not be changed. For example, the robot 1 may be configured to always start a conversation in the standard language in step S2.
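The flow of steps S2 to S12 can be sketched as a small simulation. This is a simplified illustration, not the patented implementation: the function name, its parameters, and the use of a finite list of utterances (whose end stands in for the conversation-end event of step S10) are all assumptions.

```python
def run_conversation(turns, determine_feature, default_feature, keep_default=False):
    """Simulate steps S2 to S12 over a finite list of user utterances.

    Returns (feature used for each reply, default for the next conversation).
    """
    current = default_feature          # S2: start with the default setting
    determined = None
    used = []
    for utterance in turns:            # end of list plays the role of the S10 event
        if determined is None:
            determined = determine_feature(utterance)   # S4: try to determine
            if determined is not None:                  # S6: YES
                current = determined                    # S8: switch feature
        used.append(current)
    # S12: carry the determined feature over as the new default, unless the
    # robot is configured to keep a fixed default or nothing was determined.
    if keep_default or determined is None:
        next_default = default_feature
    else:
        next_default = determined
    return used, next_default


# Hypothetical determiner: one Kansai word is enough to decide.
def toy_determiner(utterance):
    return "Kansai dialect" if "ookini" in utterance else None
```

With the turns `["hello", "ookini", "bye"]` and a default of "standard language", the robot replies once in the standard language and then in the Kansai dialect, and the Kansai dialect becomes the next default unless `keep_default=True`.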

＜Ｆ．ハードウェア構成＞
図６は、ロボット１のハードウェア構成を表した図である。図６を参照して、ロボット１は、プログラムを実行するＣＰＵ１０１と、ＲＯＭ（Read Only Memory）１０２と、ＲＡＭ（Random Access Memory）１０３と、フラッシュメモリ１０４と、操作キー１０５と、スピーカ１０６と、カメラ１０７と、マイク１０８と、タッチスクリーン１０９と、モータ１１０と、車輪１１１と、無線通信ＩＦ（Interface）１１２と、アンテナ１１３とを、少なくとも含んで構成されている。タッチスクリーン１０９は、ディスプレイ１０９１と、タッチパネル１０９２とを含む。各構成要素１０１〜１１０，１１２は、相互にデータバスによって接続されている。 <F. Hardware configuration>
FIG. 6 is a diagram illustrating a hardware configuration of the robot 1. Referring to FIG. 6, the robot 1 includes a CPU 101 that executes a program, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, a flash memory 104, an operation key 105, a speaker 106, It includes at least a camera 107, a microphone 108, a touch screen 109, a motor 110, wheels 111, a wireless communication IF (Interface) 112, and an antenna 113. The touch screen 109 includes a display 1091 and a touch panel 1092. The components 101 to 110 and 112 are connected to each other by a data bus.

アンテナ１１３は、無線通信ＩＦ１１２に接続されている。アンテナ１１３および無線通信ＩＦ１１２は、たとえば、基地局を介した、他の移動体端末、固定電話、およびＰＣ（Personal Computer）との間における無線通信に用いられる。 The antenna 113 is connected to the wireless communication IF 112. The antenna 113 and the wireless communication IF 112 are used for wireless communication with other mobile terminals, fixed telephones, and PCs (Personal Computers) via a base station, for example.

ＲＯＭ１０２は、不揮発性の半導体メモリである。ＲＯＭ１０２は、ロボット１のブートプログラムが予め格納されている。フラッシュメモリ１０４は、不揮発性の半導体メモリである。フラッシュメモリ１０４は、一例としてＮＡＮＤ型で構成してもよい。フラッシュメモリ１０４は、ロボット１のオペレーティングシステム、ロボット１を制御するための各種のプログラム、並びに、ロボット１が生成したデータ、ロボット１の外部装置から取得したデータ等の各種データを不揮発的に格納する。 The ROM 102 is a nonvolatile semiconductor memory. The ROM 102 stores a boot program for the robot 1 in advance. The flash memory 104 is a nonvolatile semiconductor memory. The flash memory 104 may be of the NAND type, for example. The flash memory 104 stores, in a nonvolatile manner, the operating system of the robot 1, various programs for controlling the robot 1, and various data such as data generated by the robot 1 and data acquired from devices external to the robot 1.

ロボット１における処理は、各ハードウェアおよびＣＰＵ１０１により実行されるソフトウェアによって実現される。このようなソフトウェアは、フラッシュメモリ１０４に予め記憶されている場合がある。また、ソフトウェアは、図示しないメモリカードその他の記憶媒体に格納されて、プログラムプロダクトとして流通している場合もある。あるいは、ソフトウェアは、いわゆるインターネットに接続されている情報提供事業者によってダウンロード可能なプログラムプロダクトとして提供される場合もある。このようなソフトウェアは、アンテナ１１３および無線通信ＩＦ１１２を介してダウンロードされた後、フラッシュメモリ１０４に一旦格納される。そのソフトウェアは、ＣＰＵ１０１によってフラッシュメモリ１０４から読み出され、さらにフラッシュメモリ１０４に実行可能なプログラムの形式で格納される。ＣＰＵ１０１は、そのプログラムを実行する。 The processing in the robot 1 is realized by each hardware and software executed by the CPU 101. Such software may be stored in the flash memory 104 in advance. The software may be stored in a memory card or other storage medium (not shown) and distributed as a program product. Alternatively, the software may be provided as a program product that can be downloaded by an information provider connected to the so-called Internet. Such software is downloaded via the antenna 113 and the wireless communication IF 112 and then temporarily stored in the flash memory 104. The software is read from the flash memory 104 by the CPU 101 and further stored in the flash memory 104 in the form of an executable program. The CPU 101 executes the program.

本発明の本質的な部分は、フラッシュメモリ１０４その他の記憶媒体に格納されたソフトウェア、あるいはネットワークを介してダウンロード可能なソフトウェアであるともいえる。なお、記録媒体としては、ＤＶＤ-ＲＯＭ、ＣＤ−ＲＯＭ、ＦＤ、ハードディスクに限られず、磁気テープ、カセットテープ、光ディスク、光カード、マスクＲＯＭ、ＥＰＲＯＭ、ＥＥＰＲＯＭ、フラッシュＲＯＭなどの半導体メモリ等の固定的にプログラムを担持する媒体でもよい。また、記録媒体は、当該プログラム等をコンピュータが読取可能な一時的でない媒体である。また、ここでいうプログラムとは、ＣＰＵにより直接実行可能なプログラムだけでなく、ソースプログラム形式のプログラム、圧縮処理されたプログラム、暗号化されたプログラム等を含む。
［実施の形態２］
本実施の形態では、上述したとおり、ロボット１が複数の人間と会話するときの処理を説明する。 An essential part of the present invention can be said to be software stored in the flash memory 104 or another storage medium, or software that can be downloaded via a network. The recording medium is not limited to a DVD-ROM, CD-ROM, FD, or hard disk; it may be any medium that carries the program in a fixed manner, such as a magnetic tape, a cassette tape, an optical disc, an optical card, or a semiconductor memory such as a mask ROM, EPROM, EEPROM, or flash ROM. The recording medium is a non-transitory medium from which a computer can read the program. The program here includes not only a program directly executable by the CPU but also a program in source form, a compressed program, an encrypted program, and the like.
[Embodiment 2]
In the present embodiment, as described above, processing when the robot 1 has a conversation with a plurality of humans will be described.

図７は、ロボット１と複数の人間とが輪になっている状況を表した図である。図７を参照して、ロボット１は、２名の大人の男性９０１，９０２と、２名の大人の女性９０３，９０４と、１名の男の子（子供）と会話が可能な状態となっている。 FIG. 7 is a diagram illustrating a situation in which the robot 1 and a plurality of people form a circle. Referring to FIG. 7, the robot 1 is able to converse with two adult men 901 and 902, two adult women 903 and 904, and one boy (a child).

この場合、ロボット１は、５人と会話を行なう。会話の内容としては、５人全員に対するもの、特定の一人に対するもの、５人を１つのグループとしてとらえた場合における一部のサブグループ（たとえば、２人の大人の女性で構成されるサブグループ、４人の大人で構成されるサブグループ）に対するもの等がある。 In this case, the robot 1 converses with five people. An utterance may be addressed to all five people, to one specific person, or to a subgroup when the five people are regarded as one group (for example, the subgroup consisting of the two adult women, or the subgroup consisting of the four adults).

図８は、ロボット１における処理の流れを説明するためのフローチャートである。図８を参照して、図５のフローチャートとの相違は、ステップＳ１０２〜Ｓ１２０が追加されている点である。したがって、以下では、図５のフローチャートと異なる点を主として説明する。 FIG. 8 is a flowchart for explaining the flow of processing in the robot 1. Referring to FIG. 8, the flowchart differs from that of FIG. 5 in that steps S102 to S120 are added. The following description therefore focuses on the differences from the flowchart of FIG. 5.

ステップＳ２の後、ロボット１は、ステップＳ１０２において、会話の相手が複数人か否かを判断する。複数人ではないと判断された場合（ステップＳ１０２においてＮＯ）、ロボット１は、処理をステップＳ４に進める。複数人であると判断された場合（ステップＳ１０２においてＹＥＳ）、ロボット１は、ステップＳ１０４において、会話を通じて各人の特徴の判定を開始する。ステップＳ１０６において、ロボット１は、全員分の特徴が判定されたか否かを判断する。 After step S2, the robot 1 determines whether or not there are a plurality of conversation partners in step S102. If it is determined that there are not a plurality of persons (NO in step S102), robot 1 advances the process to step S4. If it is determined that there are a plurality of persons (YES in step S102), the robot 1 starts determining characteristics of each person through conversation in step S104. In step S106, the robot 1 determines whether or not the characteristics for all the members have been determined.

全員の特徴が判定されていないと判断された場合（ステップＳ１０６においてＮＯ）、ロボット１は、ステップＳ１０８において、判定がなされた人数分の判定結果に基づいて、会話に用いる特徴を決定する。ステップＳ１１０において、ロボット１は、決定された特徴で会話する。 If it is determined that the features of all the people have not yet been determined (NO in step S106), the robot 1 decides, in step S108, the feature to be used for the conversation based on the determination results for those people whose features have been determined so far. In step S110, the robot 1 converses using the decided feature.

ステップＳ１１２において、ロボット１は、会話の終了を示すイベントが発生したか否かを判断する。イベントが発生したと判断された場合（ステップＳ１１２においてＹＥＳ）、ロボット１は、ステップＳ１１４において、決定された特徴をデフォルトに設定する。イベントが発生していないと判断された場合（ステップＳ１１２においてＮＯ）、ロボット１は、処理をステップＳ１０４に戻す。 In step S112, the robot 1 determines whether an event indicating the end of the conversation has occurred. If it is determined that the event has occurred (YES in step S112), the robot 1 sets the decided feature as the default in step S114. If it is determined that the event has not occurred (NO in step S112), the robot 1 returns the process to step S104.

全員の特徴が判定されたと判断された場合（ステップＳ１０６においてＹＥＳ）、ロボット１は、ステップＳ１１６において、全員の判定結果に基づいて、会話に用いる特徴を決定する。ステップＳ１１８において、ロボット１は、決定された特徴で会話する。 If it is determined that the characteristics of all members have been determined (YES in step S106), the robot 1 determines the characteristics to be used for the conversation in step S116 based on the determination results of all members. In step S118, the robot 1 has a conversation with the determined characteristics.

ステップＳ１２０において、ロボット１は、会話の終了を示すイベントが発生したか否かを判断する。イベントが発生したと判断された場合（ステップＳ１２０においてＹＥＳ）、ロボット１は、処理をステップＳ１１４に進める。イベントが発生していないと判断された場合（ステップＳ１２０においてＮＯ）、ロボット１は、処理をステップＳ１１８に戻す。 In step S120, the robot 1 determines whether an event indicating the end of the conversation has occurred. If it is determined that the event has occurred (YES in step S120), the robot 1 advances the process to step S114. If it is determined that the event has not occurred (NO in step S120), the robot 1 returns the process to step S118.

ロボット１は、ステップＳ１０８，Ｓ１１６において、典型的には、最も人数の多い特徴（つまり、特徴を共通にする人が最も多い特徴）を、会話に用いる特徴として決定する。たとえば、ロボット１は、全員（５人）のうち、３人の特徴が関西弁で、かつ２人の特徴が標準語であると判断した場合、ロボット１は、ステップＳ１１６において、会話に用いる特徴を関西弁とする。 In steps S108 and S116, the robot 1 typically decides on the feature held by the largest number of people (that is, the feature shared by the most people) as the feature to be used for the conversation. For example, if the robot 1 determines that, of all five people, three have the Kansai dialect as their feature and two have standard Japanese, the robot 1 selects the Kansai dialect as the feature to be used for the conversation in step S116.
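The selection rule of steps S108 and S116 amounts to a majority (plurality) vote over the features determined so far. The sketch below illustrates it under stated assumptions: the function name `choose_group_feature`, the use of `None` for a person whose feature is not yet determined (the step S108 case), and the tie-breaking behaviour are all choices made here, since the specification does not say how ties are resolved.

```python
from collections import Counter
from typing import Optional, Sequence


def choose_group_feature(determined: Sequence[Optional[str]]) -> Optional[str]:
    """Return the feature shared by the most people.

    `None` entries represent people whose feature has not been determined yet,
    so the same function covers step S108 (partial results) and step S116
    (results for everyone). Ties go to the feature seen first, which is an
    assumption of this sketch.
    """
    counts = Counter(feature for feature in determined if feature is not None)
    if not counts:
        return None  # nobody has been characterised yet
    return counts.most_common(1)[0][0]


# Step S116 with the example from the text: three Kansai speakers, two standard speakers.
all_determined = ["kansai", "standard", "kansai", "standard", "kansai"]

# Step S108: only three of the five people have been characterised so far.
partially_determined = ["standard", None, "kansai", "kansai", None]
```

With these inputs, both calls select the Kansai label, and the result can change from round to round as more people are characterised.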

以上のように、ロボット１は、複数の前記人との会話に基づき当該複数の人の各々の特徴を判定する。ロボット１は、判定された複数の特徴のうち、特徴を共通にする人が最も多い特徴を特定する。ロボット１は、特定された特徴に応じた発話を行なう。これにより、ロボット１は、複数人と会話する場合であっても、全体最適の観点から全体（グループ）に適した発話をすることができる。
［実施の形態３］
本実施の形態でも、実施の形態２と同様、ロボットが複数の人間と会話するときの処理を説明する。 As described above, the robot 1 determines the characteristics of each of the plurality of people based on conversations with the plurality of people. The robot 1 identifies a feature having the largest number of people who share a common feature among the determined features. The robot 1 speaks according to the specified feature. Thereby, even when the robot 1 is talking with a plurality of people, the robot 1 can make an utterance suitable for the whole (group) from the viewpoint of overall optimization.
[Embodiment 3]
In the present embodiment, similarly to the second embodiment, processing when the robot has a conversation with a plurality of humans will be described.

実施の形態２においては、ステップＳ１０８およびステップＳ１１６において決定された特徴でのみ会話が行われる構成を説明した。 In the second embodiment, the configuration in which the conversation is performed only with the feature determined in step S108 and step S116 has been described.

本実施の形態では、ロボット１が、ステップＳ１０８およびステップＳ１１６で決定された特徴に基づき、ステップＳ１１０およびステップＳ１１８において会話（発話）をすることを原則としつつも、ステップＳ１１０およびステップＳ１１８において例外処理を設ける。以下、例外処理について説明する。 In the present embodiment, the robot 1 in principle converses (utters) in steps S110 and S118 based on the feature decided in steps S108 and S116, but exception processing is provided in steps S110 and S118. The exception processing is described below.

図９は、図８のステップＳ１１０，Ｓ１１８の処理における例外処理を説明するためのフローチャートである。図９を参照して、ステップＳ２０２において、ロボット１は、発話するフレーズを決定する。ステップＳ２０４において、ロボット１は、決定されたフレーズは、全員向けのフレーズであるか否かを判断する。なお、フレーズが全員向けであるか否かは、フレーズ毎に予め識別情報を付加しておくことにより、ロボット１が当該識別情報を参照して判断すればよい。 FIG. 9 is a flowchart for explaining the exception processing in steps S110 and S118 of FIG. 8. Referring to FIG. 9, in step S202, the robot 1 decides on a phrase to utter. In step S204, the robot 1 determines whether the decided phrase is a phrase intended for everyone. Whether a phrase is intended for everyone can be determined by attaching identification information to each phrase in advance and having the robot 1 refer to that identification information.

全員向けのフレーズであると判断された場合（ステップＳ２０４においてＹＥＳ）、ロボット１は、ステップＳ１０８またはステップＳ１１６で決定された特徴で会話する。全員向けのフレーズであると判断されなかった場合（ステップＳ２０４においてＮＯ）、ロボット１は、特定の人、または特定のサブグループ向けの特徴で会話する。 If it is determined that the phrase is intended for everyone (YES in step S204), the robot 1 converses using the feature decided in step S108 or step S116. If it is not determined that the phrase is intended for everyone (NO in step S204), the robot 1 converses using the feature for the specific person or specific subgroup.

本実施の形態における処理の一例を、具体例を挙げて説明すれば以下のとおりである。たとえば、図７においては、５人のうち４人が中年の大人であるため、ロボット１は、ステップＳ１１６において、会話に用いる特徴として、年齢層のうち“中年”を選択する。 An example of the processing in the present embodiment will be described with a specific example as follows. For example, in FIG. 7, since 4 out of 5 are middle-aged adults, the robot 1 selects “middle-aged” from the age group as a feature used for conversation in step S116.

このため、ロボット１は、ステップＳ１１８において、“中年”に対応したロボットの発話形式（図３参照）にて、会話する。しかしながら、フレーズが子供向けである場合には、会話に用いる特徴として、年齢層のうち“幼い”を選択する。つまり、フレーズが子供向けの場合には、ロボット１は、発話形式を“中年”に対応した形式から“幼い”に対応した形式に一時的に切り替えて、発話を行なう。 Accordingly, in step S118, the robot 1 converses in the utterance format corresponding to “middle-aged” (see FIG. 3). However, when the phrase is intended for the child, the robot 1 selects “young” from among the age groups as the feature to be used for the conversation. In other words, when the phrase is intended for the child, the robot 1 temporarily switches the utterance format from the one corresponding to “middle-aged” to the one corresponding to “young” and then utters the phrase.
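The exception processing of FIG. 9 can be sketched as a per-phrase lookup. The example below is illustrative only: the `Phrase` class, its `target` field (standing in for the identification information attached to each phrase), the person keys, and the age-group labels are hypothetical. It handles a single target person; a subgroup target, and the fallback to the group feature when the target's feature is unknown, are assumptions not spelled out in the text.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Phrase:
    """A phrase to utter, tagged in advance with who it is intended for."""

    text: str
    target: Optional[str] = None  # None means the phrase is intended for everyone


def feature_for_phrase(
    phrase: Phrase,
    group_feature: str,
    features_by_person: Mapping[str, str],
) -> str:
    """Pick the feature to use for one phrase (the decision made in step S204).

    A phrase for everyone uses the group feature decided in step S108 or S116.
    A phrase for a specific person uses that person's own feature; if it is
    unknown, this sketch falls back to the group feature.
    """
    if phrase.target is None:
        return group_feature
    return features_by_person.get(phrase.target, group_feature)


# The FIG. 7 situation: four middle-aged adults and one child.
features = {
    "901": "middle_aged",
    "902": "middle_aged",
    "903": "middle_aged",
    "904": "middle_aged",
    "boy": "young",
}
group = "middle_aged"

for_everyone = Phrase("Shall we take a break?")
for_child = Phrase("Do you like trains?", target="boy")
```

The switch is temporary by construction: the group feature itself is never modified, so the next phrase for everyone is again uttered in the group format.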

以上のように、ロボット１は、複数の人のうち特定された特徴とは異なる特徴を有する人に対して発話する場合には、特定された特徴に応じた発話を行なわずに、当該人の特徴に応じた発話を行なう。これにより、ロボット１は、会話相手に対して一層柔軟な発話を行なうことが可能となる。
［実施の形態４］
上記の実施の形態１から３においては、ロボット１が上述したデータテーブルＤ３等の各種のデータ、プログラムを格納し、上述した全ての処理を単独で実行する構成を例に挙げて説明した。しかしながら、これに限定されず、ロボット１は、サーバと共同して、上述した各処理を実行してもよい。 As described above, when the robot 1 speaks to a person, among the plurality of people, whose feature differs from the identified feature, the robot 1 utters according to that person's feature instead of uttering according to the identified feature. This allows the robot 1 to speak to its conversation partners even more flexibly.
[Embodiment 4]
In Embodiments 1 to 3 above, a configuration in which the robot 1 stores various data, such as the data table D3 described above, and programs, and executes all of the processes described above by itself has been described as an example. However, the present invention is not limited to this, and the robot 1 may execute the processes described above in cooperation with a server.

図１０は、ロボットとサーバとを備えた通信システムの概略図である。図１０を参照して、通信システムは、ロボット１Ａと、サーバ７００と、ルータ９００とを備える。ロボット１Ａは、ルータ９００を介して、サーバ７００と通信可能に接続されている。なお、ロボット１Ａは、ロボット１と同様のハードウェア構成を有するため、ここでは、ロボット１Ａのハードウェア構成については繰り返し説明は行わない。 FIG. 10 is a schematic diagram of a communication system including a robot and a server. Referring to FIG. 10, the communication system includes a robot 1A, a server 700, and a router 900. The robot 1A is communicably connected to the server 700 via the router 900. Since the robot 1A has the same hardware configuration as the robot 1, the description of the hardware configuration is not repeated here.

このような通信システムでは、たとえば、ロボット１Ａの代わりにサーバ７００がデータベースＤ３を備えていてもよい。また、ロボット１Ａの代わりにサーバ７００が図４に示した特徴判定部１５１０を備えていてもよい。 In such a communication system, for example, the server 700 may include the database D3 instead of the robot 1A. Further, the server 700 may include the feature determination unit 1510 illustrated in FIG. 4 instead of the robot 1A.
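One way to picture this division of roles is to hide the feature determination behind an interface, so that the robot's utterance logic does not depend on where the determination runs. The sketch below is a rough illustration under several assumptions: all class and function names are hypothetical, `toy_classifier` is a placeholder rather than real dialect analysis, and the network path through the router 900 is replaced by a plain function call.

```python
from typing import Callable, Protocol


class FeatureDeterminer(Protocol):
    """Role played by the feature determination unit 1510, wherever it runs."""

    def determine(self, utterance: str) -> str: ...


def toy_classifier(utterance: str) -> str:
    """Hypothetical stand-in for real dialect analysis."""
    return "kansai" if "ookini" in utterance.lower() else "standard"


class OnRobotDeterminer:
    """Embodiments 1 to 3: the robot determines the feature by itself."""

    def determine(self, utterance: str) -> str:
        return toy_classifier(utterance)


class Server:
    """Stands in for the server 700 when it holds the determination logic."""

    def handle(self, utterance: str) -> str:
        return toy_classifier(utterance)


class ServerDeterminer:
    """Embodiment 4: the robot forwards the utterance and receives the result."""

    def __init__(self, send: Callable[[str], str]) -> None:
        self._send = send  # stands in for the link through the router 900

    def determine(self, utterance: str) -> str:
        return self._send(utterance)


class Robot:
    """Uses whichever determiner it is given; the rest of its logic is unchanged."""

    def __init__(self, determiner: FeatureDeterminer) -> None:
        self._determiner = determiner

    def feature_of(self, utterance: str) -> str:
        return self._determiner.determine(utterance)


standalone = Robot(OnRobotDeterminer())
networked = Robot(ServerDeterminer(Server().handle))
```

Both robots report the same feature for the same utterance; only the place where the determination is carried out differs.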

このような構成であっても、ロボット１Ａは、会話の相手に適した発話をすることが可能となる。
［実施の形態５］
実施の形態１から４では、電子機器の一例として、人間型のロボット１，１Ａを例に挙げて説明したが、これに限定されるものではない。 Even with such a configuration, the robot 1A can make an utterance suitable for a conversation partner.
[Embodiment 5]
In the first to fourth embodiments, the humanoid robots 1 and 1A have been described as an example of the electronic apparatus. However, the present invention is not limited to this.

図１１は、ロボット２と人９０１とが会話をしている状態を表した図である。図１１を参照して、ロボット１，１Ａの代わりに掃除機型のロボット２に、ロボット１，１Ａと同様の発話処理（発話制御方法）を行なわせてもよい。 FIG. 11 is a diagram illustrating a state in which the robot 2 and the person 901 are having a conversation. Referring to FIG. 11, a vacuum-cleaner-type robot 2 may be made to perform the same utterance processing (utterance control method) as the robots 1 and 1A, in place of the robots 1 and 1A.

＜まとめ＞
（１）以上のように、ロボット１，１Ａ，２は、人との会話に基づき人の特徴を判定し、判定された特徴に応じた発話を行なう。上記の構成によれば、会話の相手に適した発話をすることが可能となる。 <Summary>
(1) As described above, the robots 1, 1A, and 2 determine a feature of a person based on a conversation with the person, and utter according to the determined feature. With this configuration, the robot can make utterances suited to its conversation partner.

（２）たとえば、特徴は、方言の種別、年齢層、および会話時の気分のうちの少なくとも１つである。 (2) For example, the feature is at least one of a dialect type, an age group, and a mood during conversation.

（３）特徴は、方言の種別である。ロボット１，１Ａ，２は、発話のために音声を出力する音声出力部１５４と、方言の種別を判定する方言判定部１５１１と、判定された特徴に応じた発話を音声出力部１５４に行なわせる発話制御部１５２０とを備える。発話制御部１５２０は、音声出力部１５４に判定された種別の方言で発話させる。上記の構成によれば、ロボット１，１Ａ，２は、人が話している方言と同じ方言で発話することが可能となるため、会話が弾む。 (3) The feature is the type of dialect. The robots 1, 1A, and 2 each include a voice output unit 154 that outputs voice for utterance, a dialect determination unit 1511 that determines the type of dialect, and an utterance control unit 1520 that causes the voice output unit 154 to utter according to the determined feature. The utterance control unit 1520 causes the voice output unit 154 to utter in the determined type of dialect. With this configuration, the robots 1, 1A, and 2 can speak in the same dialect as the person, which makes the conversation livelier.

（４）特徴は、方言の種別である。ロボット１，１Ａ，２は、発話のために音声を出力する音声出力部１５４と、方言の種別を判定する方言判定部１５１１と、判定された特徴に応じた発話を音声出力部１５４に行なわせる発話制御部１５２０とを備える。発話制御部１５２０は、音声出力部１５４に判定された種別の方言が用いられている地域の情報に関する発話を行なわせる。上記の構成によれば、ロボット１，１Ａ，２は、人の出身地に関する情報を発話するため、会話が弾む。 (4) The feature is the type of dialect. The robots 1, 1A, and 2 each include a voice output unit 154 that outputs voice for utterance, a dialect determination unit 1511 that determines the type of dialect, and an utterance control unit 1520 that causes the voice output unit 154 to utter according to the determined feature. The utterance control unit 1520 causes the voice output unit 154 to utter information about the region where the determined type of dialect is used. With this configuration, the robots 1, 1A, and 2 talk about the person's home region, which makes the conversation livelier.

（５）好ましくは、特徴は、年齢層である。ロボット１，１Ａ，２は、発話のために音声を出力する音声出力部１５４と、人の年齢層を判定する年齢層判定部１５１２と、判定された特徴に応じた発話を音声出力部１５４に行なわせる発話制御部１５２０とを備える。発話制御部１５２０は、音声出力部１５４に判定された年齢層に見合った話し方で発話を行わせる。上記の構成によれば、ロボット１，１Ａ，２は、同じような話し方で発話をするため、人は違和感を感じることなくロボット１，１Ａ，２と会話ができる。 (5) Preferably, the feature is an age group. The robots 1, 1A, and 2 each include a voice output unit 154 that outputs voice for utterance, an age group determination unit 1512 that determines the person's age group, and an utterance control unit 1520 that causes the voice output unit 154 to utter according to the determined feature. The utterance control unit 1520 causes the voice output unit 154 to utter in a manner of speaking suited to the determined age group. With this configuration, the robots 1, 1A, and 2 speak in a manner similar to the person's own, so the person can converse with the robots 1, 1A, and 2 without feeling uncomfortable.

（６）発話制御部１５２０は、音声出力部１５４に判定された年齢層に見合った内容を発話させる。上記の構成によれば、ロボット１，１Ａ，２は、会話の内容に興味を抱きやすいため、人との会話が弾む。 (6) The utterance control unit 1520 causes the voice output unit 154 to utter content suited to the determined age group. With this configuration, the person is more likely to take an interest in what the robots 1, 1A, and 2 say, which makes the conversation with the person livelier.

（７）特徴は、気分である。ロボット１，１Ａ，２は、発話のために音声を出力する音声出力部１５４と、人の気分を判定する気分判定部１５１３と、判定された特徴に応じた発話を音声出力部１５４に行なわせる発話制御部１５２０とを備える。発話制御部１５２０は、音声出力部１５４に判定された気分に応じた発話を行なわせる。上記の構成によれば、ロボット１は、ユーザの気分に応じた発話を行なうため、ユーザは、ロボット１と心地の良い会話ができる。 (7) The feature is mood. The robots 1, 1A, and 2 each include a voice output unit 154 that outputs voice for utterance, a mood determination unit 1513 that determines the person's mood, and an utterance control unit 1520 that causes the voice output unit 154 to utter according to the determined feature. The utterance control unit 1520 causes the voice output unit 154 to utter according to the determined mood. With this configuration, the robot 1 utters according to the user's mood, so the user can have a pleasant conversation with the robot 1.

（８）ロボット１，１Ａ，２は、複数の人との会話に基づき複数の人の各々の特徴を判定する。ロボット１，１Ａ，２は、判定された複数の特徴のうち、特徴を共通にする人が最も多い特徴を特定する。ロボット１，１Ａ，２は、特定された特徴に応じた発話を行なう。上記の構成によれば、ロボット１，１Ａ，２は、複数人と会話する場合であっても、全体最適の観点から全体（グループ）に適した発話をすることができる。 (8) The robots 1, 1A, and 2 determine the feature of each of a plurality of people based on conversations with the plurality of people. The robots 1, 1A, and 2 identify, among the determined features, the feature shared by the largest number of people. The robots 1, 1A, and 2 utter according to the identified feature. With this configuration, even when conversing with a plurality of people, the robots 1, 1A, and 2 can make utterances suited to the group as a whole from the viewpoint of overall optimization.

（９）ロボット１，１Ａ，２は、複数の人のうち特定された特徴とは異なる特徴を有する人に対して発話する場合には、特定された特徴に応じた発話を行なわずに、当該人の特徴に応じた発話を行なう。上記の構成によれば、ロボット１，１Ａ，２は、会話相手に対して一層柔軟な発話を行なうことが可能となる。
（１０）好ましくは、ロボット１，１Ａ，２は自走式である。 (9) When the robot 1, 1A, 2 speaks to a person having a feature different from the specified feature among a plurality of people, the robot 1, 1A, 2 does not perform the utterance according to the specified feature. Speak according to the characteristics of the person. According to the above configuration, the robots 1, 1 A, 2 can perform more flexible speech to the conversation partner.
(10) Preferably, the robots 1, 1A, 2 are self-propelled.

今回開示された実施の形態は例示であって、上記内容のみに制限されるものではない。本発明の範囲は特許請求の範囲によって示され、特許請求の範囲と均等の意味および範囲内でのすべての変更が含まれることが意図される。 The embodiment disclosed this time is an exemplification, and the present invention is not limited to the above contents. The scope of the present invention is defined by the terms of the claims, and is intended to include any modifications within the scope and meaning equivalent to the terms of the claims.

１，１Ａ，２ロボット、１０６スピーカ、１０８マイク、１０９タッチスクリーン、１１１車輪、１５１制御部、１５２音声入力部、１５３記憶部、１５４音声出力部、１５５通信部、７００サーバ、９００ルータ、１５１０特徴判定部、１５１１方言判定部、１５１２年齢層判定部、１５１３気分判定部、１５２０発話制御部、Ｄ３データベース。 1, 1A, 2 robot; 106 speaker; 108 microphone; 109 touch screen; 111 wheels; 151 control unit; 152 voice input unit; 153 storage unit; 154 voice output unit; 155 communication unit; 700 server; 900 router; 1510 feature determination unit; 1511 dialect determination unit; 1512 age group determination unit; 1513 mood determination unit; 1520 utterance control unit; D3 database.

Claims

人との会話に基づき前記人の特徴を判定するとともに前記判定された特徴に応じた発話を行なう電子機器であって、
複数の前記人との会話に基づき前記複数の人の各々の特徴を判定し、
前記判定された複数の前記特徴のうち、前記特徴を共通にする人が最も多い前記特徴を特定し、
前記特定された特徴に応じた発話を行ない、
前記複数の人のうち前記特定された特徴とは異なる特徴を有する人に対して発話する場合には、前記特定された特徴に応じた発話を行なわずに、当該人の特徴に応じた発話を行なう、電子機器。 An electronic device that determines a characteristic of a person based on a conversation with the person and performs an utterance according to the determined characteristic, the electronic device being configured to:
determine a characteristic of each of a plurality of persons based on conversations with the plurality of persons;
identify, among the plurality of determined characteristics, the characteristic shared by the largest number of persons;
perform an utterance according to the identified characteristic; and
when speaking to a person, among the plurality of persons, having a characteristic different from the identified characteristic, perform an utterance according to that person's characteristic instead of an utterance according to the identified characteristic.

前記特徴は、方言の種別、年齢層、および前記会話時の気分のうちの少なくとも１つである、請求項１に記載の電子機器。 The electronic device according to claim 1, wherein the characteristic is at least one of a dialect type, an age group, and a mood during the conversation.

前記特徴は、前記方言の種別であって、
前記発話のために音声を出力する音声出力手段と、
前記方言の種別を判定する第１の判定手段と、
前記判定された特徴に応じた発話を前記音声出力手段に行なわせる発話制御手段とを備え、
前記発話制御手段は、前記音声出力手段に前記判定された種別の方言で発話させる、請求項２に記載の電子機器。 The feature is a type of the dialect,
Voice output means for outputting voice for the utterance;
First determining means for determining the type of the dialect;
Utterance control means for causing the voice output means to utter according to the determined characteristics,
The electronic device according to claim 2, wherein the utterance control unit causes the voice output unit to utter in the dialect of the determined type.

前記特徴は、前記方言の種別であって、
前記発話のために音声を出力する音声出力手段と、
前記方言の種別を判定する第１の判定手段と、
前記判定された特徴に応じた発話を前記音声出力手段に行なわせる発話制御手段とを備え、
前記発話制御手段は、前記音声出力手段に前記判定された種別の方言が用いられている地域の情報に関する発話を行なわせる、請求項２に記載の電子機器。 The feature is a type of the dialect,
Voice output means for outputting voice for the utterance;
First determining means for determining the type of the dialect;
Utterance control means for causing the voice output means to utter according to the determined characteristics,
The electronic device according to claim 2, wherein the utterance control unit causes the voice output unit to perform utterance regarding information on a region where the determined type of dialect is used.

前記特徴は、前記年齢層であって、
前記発話のために音声を出力する音声出力手段と、
前記人の年齢層を判定する第２の判定手段と、
前記判定された特徴に応じた発話を前記音声出力手段に行なわせる発話制御手段とを備え、
前記発話制御手段は、前記音声出力手段に前記判定された年齢層に見合った話し方で発話を行わせる、請求項２に記載の電子機器。 The characteristic is the age group,
Voice output means for outputting voice for the utterance;
Second determination means for determining the age group of the person;
Utterance control means for causing the voice output means to utter according to the determined characteristics,
The electronic device according to claim 2, wherein the utterance control unit causes the audio output unit to utter in a manner suitable for the determined age group.

前記発話制御手段は、前記音声出力手段に前記判定された年齢層に見合った内容を発話させる、請求項５に記載の電子機器。 The electronic device according to claim 5, wherein the utterance control unit causes the voice output unit to utter content corresponding to the determined age group.

前記特徴は、前記気分であって、
前記発話のために音声を出力する音声出力手段と、
前記人の気分を判定する第３の判定手段と、
前記判定された特徴に応じた発話を前記音声出力手段に行なわせる発話制御手段とを備え、
前記発話制御手段は、前記音声出力手段に前記判定された気分に応じた発話を行なわせる、請求項２に記載の電子機器。 The characteristic is the mood,
Voice output means for outputting voice for the utterance;
Third determination means for determining the mood of the person;
Utterance control means for causing the voice output means to utter according to the determined characteristics,
The electronic device according to claim 2, wherein the utterance control unit causes the audio output unit to perform an utterance according to the determined mood.

前記電子機器は自走式のロボットである、請求項１から７のいずれか１項に記載の電子機器。 The electronic device according to any one of claims 1 to 7, wherein the electronic device is a self-propelled robot.

人との会話に基づき前記人の特徴を判定するとともに前記判定された特徴に応じた発話を行なう電子機器における発話制御方法であって、
前記電子機器が、複数の前記人との会話に基づき前記複数の人の各々の特徴を判定するステップと、
前記電子機器が、前記判定された複数の前記特徴のうち、前記特徴を共通にする人が最も多い前記特徴を特定するステップと、
前記電子機器が、前記特定された特徴に応じた発話を行なうステップと、
前記電子機器が、前記複数の人のうち前記特定された特徴とは異なる特徴を有する人に対して発話する場合には、前記特定された特徴に応じた発話を行なわずに、当該人の特徴に応じた発話を行なうステップとを備える、発話制御方法。 An utterance control method in an electronic device that determines a characteristic of a person based on a conversation with the person and performs an utterance according to the determined characteristic, the method comprising:
a step in which the electronic device determines a characteristic of each of a plurality of persons based on conversations with the plurality of persons;
a step in which the electronic device identifies, among the plurality of determined characteristics, the characteristic shared by the largest number of persons;
a step in which the electronic device performs an utterance according to the identified characteristic; and
a step in which the electronic device, when speaking to a person, among the plurality of persons, having a characteristic different from the identified characteristic, performs an utterance according to that person's characteristic instead of an utterance according to the identified characteristic.

人との会話に基づき前記人の特徴を判定するとともに前記判定された特徴に応じた発話を行なう電子機器を制御するためのプログラムであって、
複数の前記人との会話に基づき前記複数の人の各々の特徴を判定するステップと、
前記判定された複数の前記特徴のうち、前記特徴を共通にする人が最も多い前記特徴を特定するステップと、
前記特定された特徴に応じた発話を行なうステップと、
前記複数の人のうち前記特定された特徴とは異なる特徴を有する人に対して発話する場合には、前記特定された特徴に応じた発話を行なわずに、当該人の特徴に応じた発話を行なうステップとを、前記電子機器のプロセッサに実行させる、プログラム。 A program for controlling an electronic device that determines a characteristic of a person based on a conversation with the person and performs an utterance according to the determined characteristic, the program causing a processor of the electronic device to execute:
a step of determining a characteristic of each of a plurality of persons based on conversations with the plurality of persons;
a step of identifying, among the plurality of determined characteristics, the characteristic shared by the largest number of persons;
a step of performing an utterance according to the identified characteristic; and
a step of, when speaking to a person, among the plurality of persons, having a characteristic different from the identified characteristic, performing an utterance according to that person's characteristic instead of an utterance according to the identified characteristic.