JP6286289B2

JP6286289B2 - Management device, conversation system, conversation management method and program

Info

Publication number: JP6286289B2
Application number: JP2014122331A
Authority: JP
Inventors: 秀行窪田
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2014-06-13
Filing date: 2014-06-13
Publication date: 2018-02-28
Anticipated expiration: 2034-06-13
Also published as: JP2016005017A

Description

The present invention relates to a technique that enables a plurality of users to converse via communication devices, such as headsets, that input and output audio.

At venues for events where many people gather, such as meetings and lectures, a conversation system is sometimes used in which participants talk with one another via communication devices, such as headsets, that input and output audio. In this type of conversation system, the participants' utterances may be translated by an interpreter (translator) or a translation engine so that conversation is not hindered even when the participants speak different languages (see, for example, Patent Documents 1 and 2).
As a system in which people in the same virtual space converse, Patent Document 3 discloses a multipoint voice communication system (chat system) in which a conversation group is formed on the condition that the user of one voice communication terminal has spoken to the user of another voice communication terminal. Patent Document 3 also discloses detecting the orientation of the face of a user of a voice communication terminal and determining the conversation partner based on the sound source placed, in the virtual space, in the direction in which the face is pointing.

Patent Document 1: JP 2012-170059 A
Patent Document 2: JP 2005-197595 A
Patent Document 3: JP 2012-108587 A

In the technique described in Patent Document 1, the headset user must input in advance information identifying himself or herself, or the group to which he or she belongs. In the technique described in Patent Document 2, users of communication terminals converse with each other via predetermined communication terminals. In other words, with the techniques of Patent Documents 1 and 2, each user's utterances are heard only by conversation partners set in advance. These techniques therefore have the problem that users must set their conversation partner every time, even when, for example, users who happen to meet in passing want to talk.

In the technique described in Patent Document 3, the conversation partner is identified from the orientation of the user's face, but no particular consideration is given to the intention of the other party. The technique of Patent Document 3 therefore has the problem that the other user is included in a conversation group regardless of his or her own intention. Patent Document 3 also describes identifying the conversation partner when the user utters the partner's name or a predetermined keyword, but in that case the user must consciously make an utterance for the purpose of designating the partner.
In view of this, an object of the present invention is to realize flexible management of the conversation group to which a user belongs, without the user having to consciously perform an action that designates a conversation partner.

To solve the above problem, a management device according to the present invention is a management device of a conversation system that divides into groups a plurality of users, each of whom uses one of a plurality of communication devices that transmit and receive audio data to input and output audio, and that realizes conversation by audio input and output between users belonging to the same group. The management device comprises: direction data acquisition means for acquiring, from the plurality of communication devices, direction data indicating the direction in which the face or body of each of the plurality of users is facing; and group management means for classifying, based on the acquired direction data, two or more users whose faces or bodies face each other into the same group.

The management device of the present invention may further comprise specifying means for specifying the positions of the plurality of users or the distances between the users, and the group management means may classify into the same group two or more users whose specified positions or distances satisfy a predetermined condition and whose faces or bodies face each other.
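As a non-limiting illustration, the grouping condition described above (users within a certain distance whose faces point at each other) might be sketched as follows. The planar coordinates, the angular tolerance of 20 degrees, the 3 m distance limit, and all names such as `UserState` and `form_groups` are assumptions made for this sketch and are not taken from the specification.

```python
import math
from dataclasses import dataclass


@dataclass
class UserState:
    user_id: str
    x: float            # assumed planar venue coordinates, in metres
    y: float
    heading_deg: float  # face direction, degrees counter-clockwise from the +x axis


def bearing_deg(src, dst):
    """Direction from src to dst, in degrees within [0, 360)."""
    return math.degrees(math.atan2(dst.y - src.y, dst.x - src.x)) % 360.0


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def is_facing(src, dst, tolerance_deg=20.0):
    """True if src's face points at dst within the assumed tolerance."""
    return angle_diff(src.heading_deg, bearing_deg(src, dst)) <= tolerance_deg


def facing_each_other(a, b, max_distance=3.0, tolerance_deg=20.0):
    """Distance condition plus mutual facing (both thresholds are assumptions)."""
    close = math.hypot(a.x - b.x, a.y - b.y) <= max_distance
    return close and is_facing(a, b, tolerance_deg) and is_facing(b, a, tolerance_deg)


def form_groups(users):
    """Merge users linked by mutual facing; return groups of two or more."""
    parent = {u.user_id: u.user_id for u in users}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, a in enumerate(users):
        for b in users[i + 1:]:
            if facing_each_other(a, b):
                parent[find(a.user_id)] = find(b.user_id)

    groups = {}
    for u in users:
        groups.setdefault(find(u.user_id), set()).add(u.user_id)
    return [g for g in groups.values() if len(g) >= 2]
```

In this sketch a user who looks at someone who is not looking back is never grouped, which reflects the stated aim of taking the other party's intention into account.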

The management device of the present invention may further comprise state data acquisition means for acquiring state data indicating the movement state of each of the plurality of users, and the group management means may, based on the acquired state data, exclude from a group any user, among the two or more users belonging to that group, whose movement state satisfies a predetermined condition.
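By way of example only, one possible form of the predetermined condition on the movement state is a speed threshold. The sketch below assumes that a per-user speed has already been derived from the state data; the function name `remaining_members` and the 0.5 m/s threshold are hypothetical.

```python
def remaining_members(group, speed_by_user, max_speed=0.5):
    """Return the members who stay in the group.

    A member is excluded when the assumed condition holds, namely that the
    member's speed (m/s) exceeds max_speed. Users with no speed entry are
    treated as stationary.
    """
    return {u for u in group if speed_by_user.get(u, 0.0) <= max_speed}
```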

In the management device of the present invention, when, within the same group, the face or body of one user has not faced the face or body of at least some of the other users continuously for a predetermined time, the group management means may exclude that user from the group.
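The time-based exclusion above could, for instance, be implemented by recording when each member last faced another member. The following is a minimal sketch under that assumption; the name `members_to_exclude`, the timestamp dictionary, and the 30-second timeout are illustrative only.

```python
def members_to_exclude(last_faced_at, now, timeout_s=30.0):
    """Return members whose last mutual-facing time is older than timeout_s.

    last_faced_at maps a user ID to the time (in seconds) at which that user
    last faced another member of the same group.
    """
    return {u for u, t in last_faced_at.items() if now - t > timeout_s}
```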

In the management device of the present invention, when the face or body of any of the two or more users belonging to the same group and the face or body of another user face each other, the group management means may classify that other user into the group.

In the management device of the present invention, the plurality of users may include a first user and a plurality of second users, and the group management means may classify the first user and a second user into the same group when, after the face or body of the first user has turned toward that second user, the face or body of that second user turns toward the first user.

In the management device of the present invention, the group management means may determine the two or more users to be classified into the same group based on the level of the users' voices input to the communication devices.

A conversation system according to the present invention comprises: a plurality of communication devices, each used by one of a plurality of users, that transmit and receive audio data to input and output audio; and a management server that divides the plurality of users into groups and realizes conversation by audio input and output between users belonging to the same group. Each of the plurality of communication devices has direction detection means for detecting the direction in which the face or body of the user using that device is facing, and direction data transmission means for transmitting direction data indicating the detected direction to the management server. The management server has direction data acquisition means for acquiring the direction data transmitted by the direction data transmission means, and group management means for classifying, based on the acquired direction data, two or more users whose faces or bodies face each other into the same group.

A conversation management method according to the present invention is a method that divides into groups a plurality of users, each of whom uses one of a plurality of communication devices that transmit and receive audio data to input and output audio, and that realizes conversation by audio input and output between users belonging to the same group. The method comprises: a step of detecting the direction in which the face or body of each of the plurality of users is facing; and a step of classifying, based on direction data indicating the detected direction in which the face is facing, two or more users whose faces or bodies face each other into the same group.

A program according to the present invention causes a computer that manages a conversation system, which divides into groups a plurality of users, each of whom uses one of a plurality of communication devices that transmit and receive audio data to input and output audio, and which realizes conversation by audio input and output between users belonging to the same group, to execute: a step of acquiring, from the plurality of communication devices, direction data indicating the direction in which the face or body of each of the plurality of users is facing; and a step of classifying, based on the acquired direction data, two or more users whose faces or bodies face each other into the same group.

According to the present invention, flexible management of the conversation group to which a user belongs can be realized without the user having to consciously perform an action that designates a conversation partner.

FIG. 1 is a diagram showing the overall configuration of a conversation system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the conversation system according to the embodiment.
FIG. 3 is an explanatory diagram of the information stored in the group DB according to the embodiment.
FIG. 4 is an explanatory diagram of a specific example of conversation groups according to the embodiment.
FIG. 5 is a sequence diagram showing the process of forming a conversation group in the conversation system.
FIG. 6 is a sequence diagram showing the process of adding a user to a conversation group in the conversation system.
FIG. 7 is an explanatory diagram of a specific example of the process of adding a user to a conversation group.
FIG. 8 is a sequence diagram of the process of managing conversation groups based on the movement state of users of the conversation system.
FIG. 9 is an explanatory diagram of conversation group management based on the movement state of users of the conversation system.
FIG. 10 is a sequence diagram showing the process of managing conversation groups based on the direction in which the faces of users of the conversation system are facing.
FIG. 11 is an explanatory diagram of conversation group management based on the direction in which the faces of users of the conversation system are facing.
FIG. 12 is a sequence diagram showing the process of forming a conversation group in a conversation system according to Modification 1 of the present invention.
FIG. 13 is an explanatory diagram of a specific example of the process of forming a conversation group in that conversation system.
FIG. 14 is a sequence diagram showing the process of forming a conversation group in a conversation system according to Modification 3 of the present invention.
FIG. 15 is an explanatory diagram of a specific example of the process of forming a conversation group in that conversation system.

An embodiment of the present invention is described below with reference to the drawings.
FIG. 1 is a diagram showing the overall configuration of the conversation system according to the present embodiment. The conversation system 1 includes a management server 10, a plurality of headsets 20 (20A, 20B, 20C, 20D, 20E), and a translation device 30. The users of the headsets 20A, 20B, 20C, 20D, and 20E are referred to as users A, B, C, D, and E, respectively. The management server 10 and each of the headsets 20 connect to a network 100. Each headset 20 connects to the network 100 via a wireless communication terminal P (for example, a smartphone or tablet terminal) that serves as a gateway to the network 100. Although FIG. 1 shows only the wireless communication terminal P used (carried) by user A, users B, C, D, and E each use (carry) a wireless communication terminal P in the same way as user A. The network 100 is, for example, a public communication line including a mobile communication network, gateway devices, and the Internet.
Note that the number of headsets 20 in the conversation system 1 is not limited to five; it may be two to four, or six or more.

The management server 10 is a server device that realizes conversation between a plurality of users via the plurality of headsets 20. The management server 10 functions as a management device that divides the users of the headsets 20 into groups and manages the data needed to realize conversation between users belonging to the same group (hereinafter, a "conversation group").
The headset 20 is a communication device that is worn (fixed) on the user's head or ear and that transmits and receives audio data to input and output audio. The headset 20 is a kind of so-called wearable computer.

The translation device 30 performs speech recognition on the audio data transmitted by a headset 20, converts the audio data into character codes (text data), and performs translation processing that translates the text into another language. The translation processing performed by the translation device 30 may be the same as that performed by a known translation engine.
Here, the translation device 30 is realized as a device provided separately from the management server 10 (for example, a translation server), but it may instead be incorporated in the management server 10. Also, the translation device 30 here exchanges audio data with the headsets 20 via the management server 10. However, if the translation device 30 is connected to the network 100, it may exchange audio data with the headsets 20 without going through the management server 10.

FIG. 2 is a block diagram showing the configuration of the conversation system 1. In FIG. 2, solid arrows indicate the direction in which signals flow.
As its hardware configuration, the headset 20 includes a control unit 21, an audio input unit 22, an audio output unit 23, a communication unit 24, a direction sensor 25, an acceleration sensor 26, a positioning unit 27, a light emitting unit 28, and an operation unit 29.
The control unit 21 is a microcomputer that includes a CPU (Central Processing Unit) as an arithmetic processing device and memory including ROM (Read Only Memory) and RAM (Random Access Memory). The CPU controls each unit of the headset 20 by reading a control program stored in the ROM into the RAM and executing it.

The audio input unit 22 has, for example, a microphone and an A/D (analog-to-digital) conversion circuit, and generates audio data representing the input audio. The audio input unit 22 uses the A/D conversion circuit to convert the analog audio signal representing the audio input to the microphone into digital form.
The audio output unit 23 has, for example, a speaker and a D/A (digital-to-analog) conversion circuit, and outputs audio based on audio data. The audio output unit 23 uses the D/A conversion circuit to convert digital audio data into analog form, and outputs audio from the speaker based on the converted audio signal.

The communication unit 24 has, for example, a wireless communication circuit and an antenna, and is communication means that connects to the network 100 and performs wireless communication. The communication unit 24 connects to the network 100 via the wireless communication terminal P by performing short-range wireless communication with the wireless communication terminal P. The short-range wireless communication is, for example, wireless communication conforming to Bluetooth (registered trademark), but may conform to another scheme such as Zigbee (registered trademark).

The direction sensor 25 has, for example, a gyro sensor (angular velocity sensor) and detects the direction in which the face of the user using (wearing) the headset 20 is facing. The direction sensor 25 functions as direction detection means that detects the direction of the user's face by detecting the change in that direction from a reference direction.
The direction sensor 25 may have a sensor other than a gyro sensor, for example a two-axis or three-axis geomagnetic sensor. When the direction sensor 25 has a geomagnetic sensor, the headset 20 can also use the detection result of the geomagnetic sensor to determine the reference direction for the user's face direction, or to determine the compass bearing in which the user's face is pointing.
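For illustration, detecting the face direction as a change from a reference direction can be sketched as a simple integration of gyro readings. The sketch assumes yaw-rate samples in degrees per second taken at a fixed interval; the function name `integrate_heading` and its parameters are hypothetical and do not appear in the specification.

```python
def integrate_heading(reference_deg, yaw_rates_dps, dt_s):
    """Estimate the face direction by accumulating angular velocity.

    reference_deg: reference direction in degrees (for example, one obtained
                   from a geomagnetic sensor)
    yaw_rates_dps: yaw-rate samples from the gyro, in degrees per second
    dt_s:          sampling interval in seconds (assumed constant)
    """
    heading = reference_deg
    for rate in yaw_rates_dps:
        heading = (heading + rate * dt_s) % 360.0  # keep within [0, 360)
    return heading
```

Integrated gyro readings generally accumulate drift over time, which is one reason an absolute reference such as a geomagnetic sensor can be useful for periodically resetting the reference direction.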

The acceleration sensor 26 is, for example, a two-axis or three-axis acceleration sensor that detects acceleration acting on the headset 20. The acceleration sensor 26 functions as state detection means that detects the movement state of the user of the headset 20. The movement state of the user covers, for example, whether the user is moving and, if so, the direction and speed of movement.
Instead of the direction sensor 25 and the acceleration sensor 26, the headset 20 may use, for example, a nine-axis motion sensor (three-axis acceleration, three-axis angular velocity, and three-axis geomagnetism) to detect the direction of the user's face and the user's movement state.

The positioning unit 27 is means for measuring (positioning) the position of the user of the headset 20. The positioning unit 27 determines the user's current position indoors using a known indoor positioning technique. The specific indoor positioning technique is not limited; for example, the positioning unit 27 determines the user's position by performing three-point positioning (trilateration) based on the strength and arrival time of radio waves received from a plurality of wireless access points.
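As one illustrative possibility, positioning from three access points can be written as a small linear system. The sketch assumes that the distance to each access point has already been estimated from signal strength or arrival time, and that the access point coordinates are known; the function name `trilaterate` is hypothetical.

```python
def trilaterate(anchors, distances):
    """Estimate a 2-D position from three anchor points and distances.

    anchors:   three (x, y) access point coordinates (assumed known)
    distances: estimated distance from the user to each anchor

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves two linear equations in x and y.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances

    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")

    # Cramer's rule for the 2x2 system
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy distance estimates the three circles rarely meet in a single point, so a practical implementation would more likely use additional access points and a least-squares fit.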

The light emitting unit 28 has, for example, a light emitting diode (LED) and is means for emitting predetermined light. The light emitting unit 28 is provided at a position where persons other than the user of the headset 20 can perceive the emitted light.
The operation unit 29 has, for example, various physical keys (for example, push buttons) and is operation means that accepts user operations.

As its hardware configuration, the management server 10 includes a control unit 11, a communication unit 12, and a group DB (database) 13.
The control unit 11 is a microcomputer that includes a CPU as an arithmetic processing device and memory including ROM and RAM. The CPU controls each unit of the management server 10 by reading a control program stored in the ROM into the RAM and executing it. The communication unit 12 is an interface for connecting to the network 100.

The group DB 13 is realized by a storage device such as a hard disk device, and is a database in which information relating to the management of conversation groups is stored (accumulated). In the present embodiment the management server 10 includes the group DB 13, but it may instead access a group DB 13 provided as an external device.

FIG. 3 is a diagram explaining the information stored in the group DB 13. FIG. 4 is a diagram explaining a specific example of conversation groups. The present embodiment describes a case in which users converse at an indoor venue Q.
As shown in FIG. 3, the group DB 13 is a database that stores, for each user of a headset 20, the following items in association with one another: "user ID", "terminal ID", "language information", "direction data", "state data", "position information", and "group information".
The user ID is a user identifier that identifies the user of a headset 20. The terminal ID is a terminal identifier that identifies the headset 20 used by the user. The terminal ID is, for example, a telephone number or an individual identification number, but may be a communication address (destination information) used for transmitting information to the headset 20.
Note that the letters "A" to "E" at the end of the user IDs and terminal IDs shown in FIG. 3 correspond to the letters of the headsets 20A to 20E and the users A to E described with reference to FIG. 1. For example, the user ID "UID-A" is the user ID of user A, and the terminal ID "MID-A" is the terminal ID of the headset 20A.

The language information indicates the language used by the user of the headset 20, for example the language the user uses on a daily basis (such as a native language) or a language the user can understand. The language information is specified in advance by the user, for example using the headset 20 or the wireless communication terminal P. The direction data is data indicating the direction in which the face of the user of the headset 20 is facing. The face direction indicated by the direction data is specified, for example, as the amount of change from a reference direction (for example, a compass bearing) common to users A to E. The state data is data indicating the movement state of the user of the headset 20. The position information indicates the position of the user of the headset 20. The group information indicates the conversation group to which the user of the headset 20 belongs. In the example of FIGS. 3 and 4, there is a conversation group G1 to which users A and B belong and a conversation group G2 to which users C and D belong. User E does not belong to any conversation group here.
For each of the language information, direction data, state data, position information, and group information, the latest information for each user is stored in the group DB 13, for example.
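For illustration only, one record of the group DB 13 could be modeled as follows. The field types, the language codes, and the direction, state, and position values are assumptions made for this sketch; only the user IDs, terminal IDs, and the group assignments (G1 for users A and B, G2 for users C and D, none for user E) follow the example of FIGS. 3 and 4.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class GroupDbRecord:
    user_id: str                    # e.g. "UID-A"
    terminal_id: str                # e.g. "MID-A"
    language: str                   # language information (code format assumed)
    direction_deg: float            # direction data (representation assumed)
    state: str                      # state data (representation assumed)
    position: Tuple[float, float]   # position information (planar, assumed)
    group: Optional[str]            # group information; None if ungrouped


def members_of(records, group):
    """Return the user IDs of all users belonging to the given group."""
    return [r.user_id for r in records if r.group == group]


# Hypothetical contents mirroring the grouping of FIGS. 3 and 4.
records = [
    GroupDbRecord("UID-A", "MID-A", "ja", 0.0, "stationary", (0.0, 0.0), "G1"),
    GroupDbRecord("UID-B", "MID-B", "en", 180.0, "stationary", (2.0, 0.0), "G1"),
    GroupDbRecord("UID-C", "MID-C", "en", 90.0, "stationary", (8.0, 1.0), "G2"),
    GroupDbRecord("UID-D", "MID-D", "zh", 270.0, "stationary", (8.0, 3.0), "G2"),
    GroupDbRecord("UID-E", "MID-E", "ja", 45.0, "moving", (5.0, 9.0), None),
]
```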

Returning to FIG. 2, the functional configurations of the headset 20 and the management server 10 are described next.
By executing the control program, the control unit 21 of the headset 20 realizes functions corresponding to audio data transmission means 211, audio data acquisition means 212, notification means 213, direction data transmission means 214, state data transmission means 215, and position information transmission means 216.
The audio data transmission means 211 transmits audio data representing the audio input to the audio input unit 22 (for example, the user's conversational speech) to the management server 10 via the communication unit 24.
The audio data acquisition means 212 acquires audio data received from the management server 10 by the communication unit 24. This audio data is, for example, audio data obtained after the translation device 30 has performed translation processing on audio data transmitted by the audio data transmission means 211 of another headset 20. The audio output unit 23 outputs audio based on the audio data acquired by the audio data acquisition means 212.
The notification means 213 controls the light emitting unit 28 to give notice that the audio output unit 23 has output audio based on the audio data acquired by the audio data acquisition means 212.

The direction data transmission means 214 transmits direction data indicating the direction of the user's face detected by the direction sensor 25 to the management server 10 via the communication unit 24.
The state data transmission means 215 transmits state data indicating the movement state of the user detected by the acceleration sensor 26 to the management server 10 via the communication unit 24.
The position information transmission means 216 transmits position information indicating the position of the user measured by the positioning unit 27 to the management server 10 via the communication unit 24.

By executing the control program, the management server 10 implements functions corresponding to the voice data acquisition unit 111, the translation control unit 112, the voice data transmission unit 113, the direction data acquisition unit 114, the state data acquisition unit 115, the specifying unit 116, and the group management unit 117.
The voice data acquisition unit 111 acquires voice data when the communication unit 12 receives it from a headset 20, that is, the voice data transmitted by the voice data transmission unit 211.
The translation control unit 112 causes the translation device 30 to perform translation processing on the voice data acquired by the voice data acquisition unit 111. It controls the translation device 30 so that the translation is performed according to the language information stored in the group DB 13.
The voice data transmission unit 113 transmits the voice data translated by the translation device 30 to a headset 20 via the communication unit 12. However, when the users of the sending and receiving headsets 20 have the same language information, the voice data transmission unit 113 transmits the voice data without translation processing. When translation processing is performed on voice data from one user's headset 20, the voice data transmission unit 113 transmits the voice data to the headsets 20 of all other users belonging to the same conversation group.

The direction data acquisition unit 114 acquires the direction data transmitted by the direction data transmission unit 214 when it is received by the communication unit 12.
The state data acquisition unit 115 acquires the state data transmitted by the state data transmission unit 215 when it is received by the communication unit 12.
The specifying unit 116 specifies the positions of the users of the headsets 20 or the distances between those users. In the present embodiment, when the communication unit 12 receives position information transmitted by the position information transmitting unit 216, the specifying unit 116 specifies the user's position indicated by that position information.

The group management unit 117 manages the conversation groups to which the users of the headsets 20 belong, based on the group DB 13. It groups the users of the headsets 20 by updating the group DB 13 based on the information supplied by the direction data acquisition unit 114, the state data acquisition unit 115, and the specifying unit 116.

Specifically, the group management unit 117 identifies two or more users whose faces are turned toward each other, based on the direction data acquired by the direction data acquisition unit 114. When two users face each other, their faces point in exactly opposite directions. That is, if the face directions indicated by the two users' direction data are expressed as vectors, the angle between the two vectors is 180 degrees. However, even if the directions are not exactly opposite, the two users are regarded as facing each other when the directions are close to opposite. In that case, the angle between the two vectors falls within the range of 180 ± α degrees (α is a constant). Three or more users are said to face each other when each user's face is turned toward the face of at least one other user.
It is assumed here that the faces of the two or more users are turned toward each other at the same time, but this is not a requirement, and some time difference is permissible.
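The facing test described above can be illustrated with a minimal Python sketch. It assumes each face direction is reported as an angle in degrees measured from the common reference direction; the function names and the default value of α (20 degrees) are hypothetical and do not appear in the embodiment. Because the angle between two vectors never exceeds 180 degrees, the range 180 ± α reduces to a check that the angle is at least 180 - α.

```python
def angle_between_deg(dir_a_deg, dir_b_deg):
    """Return the angle in [0, 180] between two face directions given in degrees."""
    diff = abs(dir_a_deg - dir_b_deg) % 360.0
    # Fold angles above 180 back into the [0, 180] range.
    return 360.0 - diff if diff > 180.0 else diff


def faces_opposed(dir_a_deg, dir_b_deg, alpha_deg=20.0):
    """True when the two directions are opposite within the tolerance alpha.

    alpha_deg is an assumed value; the text only states that alpha is a constant.
    """
    return angle_between_deg(dir_a_deg, dir_b_deg) >= 180.0 - alpha_deg
```

For example, `faces_opposed(10.0, 200.0)` holds because the two directions are 170 degrees apart. The embodiment additionally applies a position or distance condition, which this sketch does not cover.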

For two or more users whose faces are turned toward each other, the group management unit 117 determines that those users satisfy the face-to-face condition when the user positions or inter-user distances specified by the specifying unit 116 satisfy a predetermined condition. This predetermined condition is, for example, that the distance between the users is equal to or less than a threshold. The group management unit 117 classifies two or more users who satisfy the face-to-face condition into the same conversation group.

Even after a conversation group has been formed, the group management unit 117 continues to manage it based on the user positions or inter-user distances specified by the specifying unit 116 and on the movement states indicated by the state data acquired by the state data acquisition unit 115.

Next, the operation of this embodiment will be described.
<A: Formation of conversation group>
FIG. 5 is a sequence diagram showing the process by which the conversation system 1 forms conversation groups.
Each of the headsets 20A to 20E detects the direction in which its user's face is pointing using the direction sensor 25 (step S1). Each headset detects this direction, for example, as the amount of change from a common reference direction. The reference direction is detected by the direction sensor 25, but it may instead be designated by the user. Next, each of the headsets 20A to 20E measures the position of its user using the positioning unit 27 (step S2). Each headset then transmits direction data indicating the detected face direction and position information indicating the measured position to the management server 10 via the communication unit 24 (step S3).
Each of the headsets 20A to 20E repeats steps S1 to S3, for example at a predetermined interval (for example, every 5 seconds). The headsets 20A to 20E do not need to execute steps S1 to S3 at the same (synchronized) timing.

The management server 10 acquires the direction data and position information transmitted in step S3 via the communication unit 12 and updates the group DB 13 (step S4). Based on the direction data and position information, the management server 10 updates the group DB 13 so that two or more users determined to satisfy the face-to-face condition are classified into the same conversation group.
In the example shown in FIG. 4, user A and user B satisfy the face-to-face condition, so the management server 10 updates the group DB 13 to classify them into the same conversation group G1. Likewise, user C and user D satisfy the face-to-face condition, so the management server 10 updates the group DB 13 to classify them into the same conversation group G2. Since user E is not facing any other user, the management server 10 leaves user E's group information as "−" (blank). The group DB 13 after the update in step S4 is as shown in FIG. 3.
Note that the management server 10 may classify only users who do not yet belong to any conversation group, or it may reclassify users who already belong to a conversation group into a different one. In the latter case, the management server 10 removes the user from the previous conversation group so that each user belongs to only one conversation group.
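The group DB update of step S4 can be sketched in Python as follows. This is only an illustration under stated assumptions: positions are hypothetical planar coordinates in metres, the tolerance `alpha_deg` and the distance threshold `max_distance` are invented values, and `UserState`, `meets_face_to_face`, and `form_groups` are names introduced here rather than anything defined in the embodiment. The sketch follows the first option mentioned above, in which only users without a group are newly classified.

```python
import math
from dataclasses import dataclass


@dataclass
class UserState:
    user_id: str
    heading_deg: float  # face direction relative to the common reference direction
    x: float            # assumed planar position in metres
    y: float


def meets_face_to_face(a, b, alpha_deg=20.0, max_distance=2.0):
    """Facing test (180 +/- alpha) combined with a distance threshold."""
    diff = abs(a.heading_deg - b.heading_deg) % 360.0
    angle = 360.0 - diff if diff > 180.0 else diff
    distance = math.hypot(a.x - b.x, a.y - b.y)
    return angle >= 180.0 - alpha_deg and distance <= max_distance


def form_groups(users, alpha_deg=20.0, max_distance=2.0):
    """Return {user_id: group label}, with "-" for users in no group."""
    group_of = {u.user_id: "-" for u in users}
    next_id = 1
    for i, a in enumerate(users):
        for b in users[i + 1:]:
            if not meets_face_to_face(a, b, alpha_deg, max_distance):
                continue
            ga, gb = group_of[a.user_id], group_of[b.user_id]
            if ga == "-" and gb == "-":
                # Neither user has a group yet: create a new one.
                group_of[a.user_id] = group_of[b.user_id] = f"G{next_id}"
                next_id += 1
            elif ga == "-":
                group_of[a.user_id] = gb  # a joins b's existing group
            elif gb == "-":
                group_of[b.user_id] = ga  # b joins a's existing group
            # Both already grouped: left unchanged in this sketch.
    return group_of
```

With users A and B facing each other one metre apart, users C and D likewise, and user E facing no one, `form_groups` yields G1 for A and B, G2 for C and D, and "-" for E, which matches the example of FIG. 4.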

Suppose that, after the conversation groups have been formed, user A's speech is input to the headset 20A (step S5). Here, user A speaks in Japanese. The headset 20A transmits voice data representing the speech to the management server 10 via the communication unit 24 (step S6). When the management server 10 receives (acquires) the voice data through the communication unit 12, it performs translation control that causes the translation device 30 to execute translation processing (step S7). Specifically, based on the group DB 13, the management server 10 identifies the language information of the other users belonging to the same conversation group as user A (here, user B). As shown in FIG. 3, user B's language information is "English". The management server 10 therefore causes the translation device 30 to translate the speech from "Japanese" to "English" based on the received voice data.

When the management server 10 receives (acquires) the translated voice data from the translation device 30, it transmits that voice data via the communication unit 12 to the headset 20B of user B, who belongs to the same conversation group as user A (step S8). The headset 20B outputs sound based on the voice data acquired via the communication unit 24 (step S9). The headset 20B also notifies the user that it has output sound based on the voice data by causing the light emitting unit 28 to emit light (step S10). The headset 20B causes the light emitting unit 28 to emit light, for example, while the sound is being output, or at the start or end of the sound output.
The notification in step S10 informs user A, the speaker, that user B has heard the speech. If user A perceives light from the light emitting unit 28 of the headset 20B shortly after speaking, user A knows that user B has heard what was said. Conversely, if the light emitting unit 28 does not emit light, user A knows that the speech may not have been heard and can respond, for example, by repeating it.
In the notification in step S10, the headset 20 may vary the manner of light emission (for example, the color or the emission pattern) according to the speaker or the conversation group to which the speaker belongs. This lets other people tell who spoke or in which conversation group the conversation took place.

When user B speaks, the conversation system 1 operates in the same flow as steps S5 to S10. Briefly, the headset 20B transmits voice data representing the English speech to the management server 10. Based on the group DB 13, the management server 10 causes the translation device 30 to translate user B's speech into Japanese. The management server 10 then transmits the translated voice data to the headset 20A. The headset 20A outputs sound based on the translated voice data and causes the light emitting unit 28 to emit light.
Conversation between user C and user D in the conversation group G2 is carried out by the same procedure.

<B: Update conversation group / add user>
The conversation system 1 has a function of adding a new user to a conversation group after the group has been formed. For example, after the conversation group G1 described with reference to FIG. 4 has been formed, user E may join the conversation between users A and B partway through. The following describes how the conversation system 1 adds user E to the conversation group G1.

FIG. 6 is a sequence diagram showing the process of adding a user to a conversation group in the conversation system 1. Processing steps in FIG. 6 that are the same as those in FIG. 5 are denoted by the same reference numerals. FIG. 7 is a diagram illustrating a specific example of adding a user to a conversation group.
Even after the conversation groups G1 and G2 have been formed, each of the headsets 20A to 20E continues to execute steps S1 to S3. When the management server 10 acquires the direction data and position information transmitted in step S3 via the communication unit 12, it updates the group DB 13 (step S4). If user E satisfies the face-to-face condition with user A or user B, who belong to the conversation group G1, the management server 10 updates the group DB 13 to add user E to the conversation group G1. As a result, the conversation group G1 consists of the three users A, B, and E, as shown in FIG. 7(a). In the group DB 13, "G1" is stored as the group information associated with user E's user ID "UID-E", as shown in FIG. 7(b).

After the conversation group has been formed, the conversation system 1 operates largely as described in section <A: Formation of conversation group>. However, when user A's speech is input to the headset 20A, the management server 10 transmits the translated voice data generated by the translation processing to the headset 20B (step S8a) and also transmits voice data to user E's headset 20E (step S8b). As shown in FIG. 7(b), user E's language information is Japanese, the same as user A's. The management server 10 therefore transmits the voice data received from the headset 20A to the headset 20E without having the translation device 30 perform translation processing. Each of the headsets 20B and 20E then outputs sound based on the acquired voice data and causes the light emitting unit 28 to emit light (steps S9 and S10).
The operation of the conversation system 1 when user B or user E in the conversation group G1 speaks can be readily inferred from the above description and is therefore omitted.
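The per-recipient handling in steps S7, S8a, and S8b can be sketched in Python as follows. This is a hedged illustration, not the server's actual implementation: the group DB is modeled as a plain dictionary, `route_utterance` and `fake_translate` are hypothetical names, the `translate` callable merely stands in for the translation device 30, and the entry for user C is an invented example.

```python
def route_utterance(speaker_id, voice_data, group_db, translate):
    """Return {recipient user ID: voice data to deliver} for one utterance."""
    speaker = group_db[speaker_id]
    deliveries = {}
    if speaker["group"] == "-":
        return deliveries  # the speaker belongs to no conversation group
    translated = {}  # one translation per target language
    for user_id, record in group_db.items():
        if user_id == speaker_id or record["group"] != speaker["group"]:
            continue
        target = record["language"]
        if target == speaker["language"]:
            # Same language: forward the original voice data untranslated.
            deliveries[user_id] = voice_data
        else:
            if target not in translated:
                translated[target] = translate(voice_data, speaker["language"], target)
            deliveries[user_id] = translated[target]
    return deliveries


def fake_translate(voice_data, source, target):
    """Placeholder for the translation device 30 (assumption for illustration)."""
    return f"{voice_data} [{source}->{target}]"


GROUP_DB = {
    "UID-A": {"language": "ja", "group": "G1"},
    "UID-B": {"language": "en", "group": "G1"},
    "UID-E": {"language": "ja", "group": "G1"},
    "UID-C": {"language": "ja", "group": "-"},  # invented entry
}
```

Calling `route_utterance("UID-A", "konnichiwa", GROUP_DB, fake_translate)` delivers a translated copy to UID-B and the untranslated original to UID-E, mirroring the case of FIG. 7.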

The face-to-face condition for adding a user to a conversation group need not be that the user satisfies the face-to-face condition with any one user belonging to the group. It may instead be that the user satisfies the face-to-face condition with every user belonging to the group. In that case, the user to be added satisfies the condition by acting so as to face each of the users in the conversation group.

<C: Exclusion of user from conversation group / movement state>
The conversation system 1 has a function of excluding some users from a conversation group after the group has been formed. When some users in a conversation group of multiple users walk away or the like, the conversation system 1 excludes those users from the group.
When a user is excluded from a conversation group consisting of two users, the conversation group is dissolved (ceases to exist).

FIG. 8 is a sequence diagram showing the process by which the conversation system 1 excludes a user from a conversation group based on the user's movement state. The following describes the operation of excluding user A or user B from the conversation group G1 consisting of users A and B.
Each of the headsets 20A and 20B belonging to the conversation group detects its user's movement state based on the acceleration detected by the acceleration sensor 26 (step S11). As described above, the movement state includes whether the user is moving and, if so, the direction and speed of movement. Next, each of the headsets 20A and 20B transmits state data indicating the detected movement state to the management server 10 via the communication unit 24 (step S12).
Each of the headsets 20A and 20B repeats steps S11 and S12, for example at a predetermined interval (for example, every 5 seconds). The headsets 20A and 20B do not need to execute steps S11 and S12 at the same (synchronized) timing.
When state data is received, the management server 10 acquires it and updates the group DB 13 (step S13).

Next, based on the state data in the updated group DB 13, the management server 10 determines whether a user's movement state satisfies the condition for exclusion from the conversation group (step S14). The exclusion condition describes a movement state indicating that the user is no longer taking part in the conversation among the users. For example, the exclusion condition is that the movement state, specified by whether the user is moving and by the direction and speed of any movement, differs between one user and the other users in the same conversation group. The management server 10 determines that the exclusion condition is satisfied, for example, when one or more of the presence or absence of movement, the movement direction, and the movement speed differ. When the management server 10 determines that a user's movement state satisfies the exclusion condition (step S14; YES), it excludes that user from the conversation group (step S15). When the movement states of user A and user B in the conversation group G1 differ, the management server 10 excludes both user A and user B from the conversation group G1.
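The comparison in step S14 might be sketched in Python as below. The text states only that the movement states "differ"; the tolerances `direction_tol_deg` and `speed_tol`, the units, and the names `MovementState` and `movement_differs` are assumptions introduced here so that small sensor variations are not treated as a difference.

```python
from dataclasses import dataclass


@dataclass
class MovementState:
    moving: bool
    direction_deg: float = 0.0  # movement direction, degrees (assumed unit)
    speed: float = 0.0          # movement speed, m/s (assumed unit)


def movement_differs(a, b, direction_tol_deg=30.0, speed_tol=0.5):
    """True when two users' movement states differ beyond the assumed tolerances."""
    if a.moving != b.moving:
        return True   # one user is moving and the other is not
    if not a.moving:
        return False  # both stationary: no difference
    diff = abs(a.direction_deg - b.direction_deg) % 360.0
    diff = 360.0 - diff if diff > 180.0 else diff
    return diff > direction_tol_deg or abs(a.speed - b.speed) > speed_tol
```

Under these assumptions, two users moving in nearly the same direction at similar speeds are not flagged, while a user who starts moving away from a stationary partner is.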

As shown in FIG. 9, when user A and user B, who belong to the same conversation group, are talking while walking, both are moving but the conversation is presumed to be continuing. In this case, the management server 10 determines "NO" in step S14 and maintains the conversation group G1.

<D: Exclude user from conversation group / Direction of face>
The conversation system 1 has a function of continuing to detect the direction of each user's face after a conversation group has been formed and excluding from the group any user whose face has not been turned toward another user in the same group for a long time. In general, people who are conversing face each other most of the time, if not always. In other words, people in conversation basically face each other, but there may be periods when they do not, for example when one of them briefly looks away. If such a period becomes long enough, however, the people may no longer be conversing.
The conversation system 1 therefore manages conversation groups, either maintaining a group or excluding a user from it, based on how long the faces of users in the same conversation group have not been turned toward each other.

FIG. 10 is a sequence diagram showing the process by which the conversation system 1 excludes a user from a conversation group based on the direction of the user's face. The following describes the operation of excluding user A or user B from the conversation group G1 consisting of users A and B.
Each of the headsets 20A and 20B belonging to the conversation group repeats steps S1 to S3, for example at a predetermined interval (for example, every 5 seconds). When the management server 10 acquires the direction data and position information transmitted in step S3 via the communication unit 12, it updates the group DB 13 (step S4).

Next, the management server 10 determines whether the time during which a user's face has not been turned toward another user in the same conversation group has reached a set time (for example, 20 seconds) (step S21). The set time is determined, for example, at the design stage or by user designation, and may be any length suitable for gauging whether each user intends to converse.
When the management server 10 determines that this time has not reached the set time (step S21; NO), it maintains the conversation group. As shown on the left side of FIG. 11, when user A briefly looks away, there is a period in which users A and B are not facing each other. If the conversation is continuing, however, this period is shorter than the set time, so the management server 10 maintains the conversation group G1.

On the other hand, when the management server 10 determines that the time during which a user's face has not been turned toward another user in the same conversation group has reached the set time (step S21; YES), it excludes that user from the conversation group (step S22). As shown on the right side of FIG. 11, when the time during which users A and B of the conversation group G1 are not facing each other reaches or exceeds the set time, the management server 10 regards them as having no intention to converse and excludes user A and user B from the conversation group G1. When a conversation group consists of three or more users, the management server 10 excludes a user once the time during which that user has not faced any other user in the group reaches the set time, and leaves the remaining users in the conversation group.
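The timing check of steps S21 and S22 can be illustrated with a small Python sketch. It assumes the server evaluates, at each update, whether a user is currently facing any other member of the group and passes that result in with a timestamp; the class name `FacingTimer` and its method are hypothetical, and the 20-second default simply reuses the example value given in the text.

```python
class FacingTimer:
    """Tracks how long each user has gone without facing a group member."""

    def __init__(self, timeout_s=20.0):
        self.timeout_s = timeout_s
        self.last_facing = {}  # user ID -> time the user last faced a member

    def update(self, user_id, facing_any_member, now_s):
        """Record one observation; return True if the user should be excluded."""
        if facing_any_member or user_id not in self.last_facing:
            # Facing someone (or first observation): restart the clock.
            self.last_facing[user_id] = now_s
            return False
        return now_s - self.last_facing[user_id] >= self.timeout_s
```

A brief glance away therefore leaves the group intact, because the clock restarts as soon as the user faces a member again, whereas 20 seconds or more without facing anyone triggers exclusion.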

According to the conversation system 1 described above, a conversation group is formed when users of the headsets 20 turn their faces toward each other, so users are not burdened with entering information about their conversation partners in advance. Even when users who happen to pass each other strike up a conversation, they do not need to enter information about the other party each time.
In addition, the conversation system 1 manages conversation groups based on the face orientation of the users of the headsets 20, the positional relationship between users (or the inter-user distance), and the users' movement states. The conversation system 1 thus makes it easier to manage conversation groups flexibly in line with each user's intention to converse.

The present invention can be implemented in forms different from the embodiment described above, for example in the forms described below. The modifications described below may also be combined as appropriate.
(Modification 1)
In addition to using the face-to-face condition described above, the conversation system 1 may form a conversation group when it recognizes a keyword representing an utterance that triggers a conversation (for example, a greeting or a call to another person). In this case, the conversation system 1 determines which users to classify into the same conversation group based on the level of the keyword's voice. The level here is the volume level, but it may be the sound pressure level in a predetermined frequency band (for example, the audible range), or any other measure that indicates the magnitude of the input voice.

図１２は、会話システム１のキーワードに基づいて会話グループを形成する処理を示すシーケンス図である。図１３は、キーワードに基づいて、ユーザＡ，Ｂ，Ｃで会話グループを形成する処理の具体例を説明する図である。以下、ユーザＡ，Ｂ，Ｃが使用するヘッドセット２０Ａ，２０Ｂ，２０Ｃの動作を例に挙げて説明する。
ヘッドセット２０Ａ〜２０Ｃの各々は、ステップＳ１〜Ｓ３の処理を、例えば所定間隔で繰り返し実行する。そして、管理サーバ１０は、ステップＳ３の処理で送信された方向データ及び位置情報を、通信部１２を介して取得すると、グループＤＢ１３を更新する（ステップＳ４）。ここで、図１３（ａ）に示すように、ユーザＡが、ユーザＢ，Ｃの各々と対面条件を満たしている場合を考える。ここでは、ユーザＡから見て、ユーザＢの方がより近い位置に居て、ユーザＣの方がより遠い位置に居るものとする。 FIG. 12 is a sequence diagram illustrating a process of forming a conversation group based on the keywords of the conversation system 1. FIG. 13 is a diagram illustrating a specific example of the process of forming a conversation group with users A, B, and C based on keywords. Hereinafter, the operation of the headsets 20A, 20B, and 20C used by the users A, B, and C will be described as an example.
Each of the headsets 20A to 20C repeatedly executes the processes of steps S1 to S3, for example at predetermined intervals. When the management server 10 acquires, via the communication unit 12, the direction data and position information transmitted in step S3, it updates the group DB 13 (step S4). Consider the case shown in FIG. 13(a), where the user A satisfies the face-to-face condition with each of the users B and C. As viewed from the user A, the user B is assumed to be nearer and the user C farther away.

ここで、ヘッドセット２０Ａにおいて、音声入力部２２にユーザの音声が入力されると、入力音声からキーワードを認識する（ステップＳ３１）。そして、ヘッドセット２０Ａは、認識したキーワードを示す入力音声のレベルを検知する（ステップＳ３２）。そして、ヘッドセット２０Ａは、検出したレベルを示すレベル情報を、キーワードを認識したことを通知する通知信号とともに、管理サーバ１０へ送信する（ステップＳ３３）。 Here, in the headset 20A, when a user's voice is input to the voice input unit 22, a keyword is recognized from the input voice (step S31). Then, the headset 20A detects the level of the input voice indicating the recognized keyword (step S32). Then, the headset 20A transmits level information indicating the detected level to the management server 10 together with a notification signal notifying that the keyword has been recognized (step S33).

管理サーバ１０は、通知信号及びレベル情報が受信されると、レベル情報が示す入力音声のレベルに基づいて、会話の相手を決定する（ステップＳ３４）。ここで、管理サーバ１０は、入力音声のレベルが低いほど、ユーザから見て近い位置のユーザを会話の相手に決定し、入力音声のレベルが高いほど、ユーザから見て遠い位置のユーザを会話の相手に決定する。例えば、管理サーバ１０は、入力音声のレベルが閾値未満である場合、図１３（ｂ−１）に示すように、ユーザＢを会話の相手に決定し、ユーザＡとユーザＢを同じ会話グループに分類する。他方、管理サーバ１０は、入力音声のレベルが閾値以上である場合、図１３（ｂ−２）に示すように、ユーザＣを会話の相手に決定し、ユーザＡとユーザＣを同じ会話グループに分類する。一般に、人物が他人に声を掛けるとき、近くに居る人物に対してはさほど大きくない声で話し、遠くに居る人物に対しては大きな声で話す。会話システム１では、このような人物の習慣に基づいて会話グループを形成するので、仮に多数のユーザが存在する場所であっても、ユーザの意図した相手と会話グループを形成しやすくなる。
なお、ヘッドセット２０が入力音声のレベルを検知するのではなく、管理サーバ１０が、ヘッドセット２０から取得した音声データに基づいて、入力音声の音声レベルを検知してもよい。 When the management server 10 receives the notification signal and the level information, it determines the conversation partner based on the level of the input voice indicated by the level information (step S34). The lower the level of the input voice, the nearer to the speaking user is the user whom the management server 10 selects as the conversation partner; the higher the level, the farther away is the selected user. For example, when the level of the input voice is less than a threshold, the management server 10 determines the user B as the conversation partner, as shown in FIG. 13(b-1), and classifies the user A and the user B into the same conversation group. On the other hand, when the level of the input voice is equal to or higher than the threshold, the management server 10 determines the user C as the conversation partner, as shown in FIG. 13(b-2), and classifies the user A and the user C into the same conversation group. In general, a person calling out to someone nearby speaks in a fairly quiet voice, and speaks loudly to someone far away. Because the conversation system 1 forms conversation groups based on this habit, a user can readily form a conversation group with the intended partner even in a place where many users are present.
Instead of the headset 20 detecting the level of the input voice, the management server 10 may detect the voice level of the input voice based on the voice data acquired from the headset 20.
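
The partner selection of step S34 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `Candidate` and `choose_partner`, the numeric levels, and the use of a single threshold are hypothetical. The description only states that a keyword level below the threshold selects the nearer user and a level at or above it selects the farther user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A user who satisfies the face-to-face condition with the speaker."""
    user_id: str
    distance_m: float  # distance from the speaker (hypothetical unit)


def choose_partner(candidates, keyword_level, threshold):
    """Pick the conversation partner from the keyword's input level.

    Below the threshold the nearest candidate is chosen; at or above it
    the farthest candidate is chosen (assumed two-way split).
    """
    if not candidates:
        return None
    if keyword_level < threshold:
        return min(candidates, key=lambda c: c.distance_m)
    return max(candidates, key=lambda c: c.distance_m)


# User B is nearer to the speaker (user A) than user C.
candidates = [Candidate("B", 1.5), Candidate("C", 6.0)]
quiet = choose_partner(candidates, keyword_level=55.0, threshold=65.0)
loud = choose_partner(candidates, keyword_level=70.0, threshold=65.0)
```

With more than two candidates, a real system would presumably need several thresholds or a mapping from level to distance; the description does not specify one.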

（変形例２）
上述した変形例１に係る構成を変形し、管理サーバ１０は、対面条件に基づいて、ユーザＡとユーザＢとからなる会話グループと、ユーザＡとユーザＣとからなる会話グループとの両方を形成してもよい。そして、管理サーバ１０は、この会話グループを形成した後に、ユーザＡの入力音声のレベルに基づいて、どちらの会話グループで会話を行わせるか決定する。ここでは、管理サーバ１０は、入力音声のレベルが低いほどユーザから見て近い位置のユーザが属する会話グループを選択し、入力音声のレベルが高いほどユーザから見て遠い位置のユーザが属する会話グループを選択する。例えば、管理サーバ１０は、入力音声のレベルが閾値未満である場合、ユーザＢの居る会話グループで会話を実現させ、入力音声のレベルが閾値以上である場合、ユーザＣの居る会話グループで会話を実現させる。ユーザは近い場所に居るユーザに対しては小さな声で話し、遠くに居るユーザに対しては大きな声で話すことが一般的である。よって、ユーザに会話グループを選択させることなく、管理サーバ１０は、どの会話グループで会話させるかを制御することができる。 (Modification 2)
In a variation of Modification 1, the management server 10 may form, based on the face-to-face condition, both a conversation group consisting of the user A and the user B and a conversation group consisting of the user A and the user C. After forming these conversation groups, the management server 10 determines, based on the level of the input voice of the user A, in which conversation group the conversation is to take place. Here, the lower the input voice level, the nearer to the user is the member whose conversation group the management server 10 selects; the higher the level, the farther away. For example, when the input voice level is less than a threshold, the management server 10 carries out the conversation in the conversation group containing the user B, and when the input voice level is equal to or higher than the threshold, it carries out the conversation in the conversation group containing the user C. Users generally speak quietly to a user nearby and loudly to a user far away. The management server 10 can therefore control which conversation group is used without requiring the user to select one.

（変形例３）
会話システム１では、一のユーザと、互いに同じ会話グループに属しない２以上のユーザと（即ち、１対多のユーザで）会話グループを形成する機能を有してもよい。 (Modification 3)
The conversation system 1 may have a function of forming a conversation group between one user and two or more users who do not belong to the same conversation group as one another (that is, among one-to-many users).

図１４は、会話システム１の１対多のユーザにより会話グループを形成する処理を示すシーケンス図である。図１５は、１対多のユーザにより会話グループを形成する処理の具体例を説明する図である。ここでは、図１５（ａ）に示すように、講演会において、講演者である１人のユーザＡ（第１のユーザ）と、聴衆に相当するユーザからなるユーザ群（複数の第２のユーザ）の各ユーザとで会話グループを構成する場合を考える。以下、聴衆に相当するユーザ群のうち、ユーザＢを代表させて、会話システム１の動作を説明する。 FIG. 14 is a sequence diagram illustrating a process in which the conversation system 1 forms a conversation group of one-to-many users. FIG. 15 is a diagram explaining a specific example of that process. Consider the case shown in FIG. 15(a), in which, at a lecture, a conversation group is formed between one user A (first user), who is the lecturer, and each user of a user group (a plurality of second users) corresponding to the audience. In the following, the operation of the conversation system 1 is described with the user B as a representative of the audience.

ヘッドセット２０Ａは、自機のユーザの顔が向く方向を、方向センサ２５を用いて検知する（ステップＳ４１）。次に、ヘッドセット２０Ａは、自機のユーザの位置を、測位部２７を用いて測位する（ステップＳ４２）。ヘッドセット２０Ａは、検知した顔の向く方向を示す方向データ、及び、測位したユーザの位置を示す位置情報を、通信部２４を介して、管理サーバ１０へ送信する（ステップＳ４３）。管理サーバ１０は、ステップＳ３の処理で送信された方向データ及び位置情報を、通信部１２を介して取得すると、グループＤＢ１３を更新する（ステップＳ４４）。
ここで、図１５（ｂ）に示すように、ユーザＡが、聴衆であるユーザ群を見渡すように、矢印Ｒ方向の顔の向く方向を変化させる。これにより、ユーザＡの顔が、ユーザ群の各ユーザに向けられたことになる。管理サーバ１０は、ユーザＡが顔を向けたユーザを特定する情報を、例えばグループＤＢ１３に格納しておく。 The headset 20 A detects the direction in which the user's face of the device is facing using the direction sensor 25 (step S 41). Next, the headset 20A measures the position of the user of the own device using the positioning unit 27 (step S42). The headset 20A transmits the direction data indicating the detected direction of the face and the position information indicating the position of the measured user to the management server 10 via the communication unit 24 (step S43). When the management server 10 acquires the direction data and position information transmitted in the process of step S3 via the communication unit 12, the management server 10 updates the group DB 13 (step S44).
Here, as shown in FIG. 15 (b), the user A changes the direction of the face in the direction of the arrow R so as to overlook the user group as the audience. As a result, the face of the user A is directed to each user in the user group. The management server 10 stores information that identifies the user to whom the user A turned his face, for example, in the group DB 13.

次に、ヘッドセット２０Ｂは、自機のユーザの顔が向く方向を、方向センサ２５を用いて検知する（ステップＳ４５）。次に、ヘッドセット２０Ｂは、自機のユーザの位置を、測位部２７を用いて測位する（ステップＳ４６）。ヘッドセット２０Ｂは、検知した顔の向く方向を示す方向データ、及び、測位したユーザの位置を示す位置情報を、通信部２４を介して、管理サーバ１０へ送信する（ステップＳ４７）。ここで、図１５（ｃ）に示すように、ユーザＢが、講演者であるユーザＡに顔を向けたとする。この場合、管理サーバ１０は、ユーザＡとユーザＢを同じ会話グループに分類するように、グループＤＢ１３を更新する（ステップＳ４８）。即ち、管理サーバ１０は、２人のユーザが同時に顔を向けなくとも、各ユーザの顔が他方のユーザに向けられれば、これらを同じ会話グループに分類する。管理サーバ１０は、聴衆の他のユーザがユーザＡに顔を向けた場合も、当該ユーザを同じ会話グループに分類する。この変形例の会話システム１によれば、１対多のユーザにより会話グループを形成する場合であっても、一のユーザが、ユーザ群の各ユーザと顔を向け合う動作をしなくてもよいので、各ユーザの負担が抑制される。 Next, the headset 20B detects, using the direction sensor 25, the direction in which its user's face is facing (step S45). The headset 20B then measures the position of its user using the positioning unit 27 (step S46). The headset 20B transmits direction data indicating the detected direction of the face and position information indicating the measured position of the user to the management server 10 via the communication unit 24 (step S47). Suppose that, as shown in FIG. 15(c), the user B turns his or her face toward the user A, the lecturer. In this case, the management server 10 updates the group DB 13 so as to classify the user A and the user B into the same conversation group (step S48). That is, even if two users do not face each other at the same time, the management server 10 classifies them into the same conversation group once each user's face has been turned toward the other. When another user in the audience turns toward the user A, the management server 10 likewise classifies that user into the same conversation group. According to the conversation system 1 of this modification, even when a conversation group is formed among one-to-many users, the one user does not have to exchange a mutual face-to-face look with each user of the user group, so the burden on each user is reduced.
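
The non-simultaneous facing rule of steps S41 to S48 can be sketched as follows. This is a minimal illustration under assumptions: the class `OneToManyGroup`, its method `record_facing`, and the in-memory sets standing in for the group DB 13 are hypothetical, and the determination of who is facing whom (from direction data and position information) is assumed to have been made elsewhere.

```python
class OneToManyGroup:
    """Groups a first user (e.g. a lecturer) with second users who look back."""

    def __init__(self, first_user):
        self.first_user = first_user
        self.looked_at = set()       # second users the first user has faced
        self.members = {first_user}  # current conversation group

    def record_facing(self, source, target):
        """Register that `source` turned toward `target`.

        Returns True when a second user joins the group, i.e. when that
        user faces the first user after having been faced by the first
        user at some earlier time.
        """
        if source == self.first_user:
            self.looked_at.add(target)
            return False
        if target == self.first_user and source in self.looked_at:
            self.members.add(source)
            return True
        return False


group = OneToManyGroup("A")
for listener in ("B", "C", "D"):
    group.record_facing("A", listener)   # A looks across the audience
joined_b = group.record_facing("B", "A")  # B later turns toward A
joined_e = group.record_facing("E", "A")  # E was never faced by A
```

The two facing events are stored separately, so they do not need to overlap in time, which is the point of this modification.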

（変形例４）
会話グループの除外条件は、上述した実施形態で説明した例に限られない。会話システム１において、所定時間継続していずれのユーザのヘッドセット２０に会話の音声が入力されなかった会話グループについては解除してもよい。 (Modification 4)
The exclusion condition for a conversation group is not limited to the examples described in the above embodiment. The conversation system 1 may dissolve a conversation group when no conversation voice has been input to the headset 20 of any of its users continuously for a predetermined time.
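
The silence-based dissolution described above can be sketched as follows. This is a minimal illustration under assumptions: the class `SilenceTracker`, the group identifiers, and the 30-second timeout are hypothetical, and timestamps are passed in explicitly so that the logic does not depend on a real clock.

```python
class SilenceTracker:
    """Records the last voice input per conversation group."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_voice = {}  # group id -> time of the last voice input (s)

    def on_voice(self, group_id, now):
        """Call whenever any member's headset receives conversation voice."""
        self.last_voice[group_id] = now

    def expired_groups(self, now):
        """Groups with no voice input for at least the timeout."""
        return [g for g, t in self.last_voice.items()
                if now - t >= self.timeout_s]


tracker = SilenceTracker(timeout_s=30.0)
tracker.on_voice("group-1", now=0.0)
tracker.on_voice("group-2", now=20.0)
to_dissolve = tracker.expired_groups(now=35.0)
```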

（変形例５）
会話システム１において、ユーザ属性が所定の関係を満たすユーザ同士を、同一の会話グループに分類してもよい。ユーザ属性は、例えば、言語情報、年齢、性別、出身地、職業及び趣味等のユーザの属性であるが、他の属性であってもよい。ユーザ属性については、予めグループＤＢ１３に情報を格納しておき、管理サーバ１０はこれに従えばよい。 (Modification 5)
In the conversation system 1, users whose user attributes satisfy a predetermined relationship may be classified into the same conversation group. User attributes are, for example, user attributes such as language information, age, sex, birthplace, occupation, and hobbies, but may be other attributes. As for user attributes, information is stored in the group DB 13 in advance, and the management server 10 may follow this.
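
One possible reading of the "predetermined relationship" is equality of a chosen attribute. The sketch below assumes that reading; the function `group_by_attribute` and the attribute names are hypothetical, and the dictionary of users stands in for the attribute information stored in the group DB 13.

```python
from collections import defaultdict


def group_by_attribute(users, attribute):
    """Group users who share the same value of one attribute.

    `users` maps a user id to a dict of attributes. Values held by only
    one user do not form a group.
    """
    groups = defaultdict(list)
    for user_id, attrs in users.items():
        if attribute in attrs:
            groups[attrs[attribute]].append(user_id)
    return {value: members for value, members in groups.items()
            if len(members) >= 2}


users = {
    "A": {"language": "ja", "hobby": "tennis"},
    "B": {"language": "en", "hobby": "tennis"},
    "C": {"language": "en", "hobby": "chess"},
}
by_hobby = group_by_attribute(users, "hobby")
```

Other relationships (for example, an age range or complementary occupations) would need a different predicate; the description leaves the relationship open.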

（変形例６）
上述した実施形態で説明した構成又は動作の一部が省略されてもよい。
例えば、会話システム１において、＜Ｂ：会話グループの更新／ユーザの追加＞、＜Ｃ：会話グループからのユーザの除外／移動状態＞及び＜Ｄ：会話グループからのユーザの除外／顔の向く方向＞の１つ以上が省略されてもよい。
会話システム１において、発光部２８の発光以外の方法（例えば、音声出力）でユーザへの報知が行われてもよいし、ユーザへの報知が省略されてもよい。
会話システム１において、各ユーザの使用する言語が同じである場合には、翻訳処理に係る構成（例えば翻訳装置３０や翻訳制御手段１１２）が省略されてもよい。また、会話システム１において、翻訳装置３０ではなく、通訳者によって翻訳が行われてもよい。
会話システム１において、ヘッドセット２０は、ユーザの移動状態を検知する機能、又は、ユーザの位置を測定する機能を有しなくてもよい。この場合、無線通信端末Ｐがユーザの移動状態を検知する機能、又は、ユーザの位置を測定する機能を有していれば、管理サーバ１０は、上述した実施形態と同じ方法で会話グループを管理することができる。 (Modification 6)
A part of the configuration or operation described in the above-described embodiment may be omitted.
For example, in the conversation system 1, one or more of <B: Update of conversation group / addition of user>, <C: Exclusion of user from conversation group / movement state>, and <D: Exclusion of user from conversation group / direction the face is facing> may be omitted.
In the conversation system 1, notification to the user may be performed by a method (for example, voice output) other than the light emission of the light emitting unit 28, or notification to the user may be omitted.
In the conversation system 1, when the language used by each user is the same, the configuration related to the translation processing (for example, the translation device 30 and the translation control unit 112) may be omitted. In the conversation system 1, the translation may be performed by an interpreter instead of the translation device 30.
In the conversation system 1, the headset 20 need not have the function of detecting the movement state of the user or the function of measuring the position of the user. In this case, if the wireless communication terminal P has the function of detecting the movement state of the user or the function of measuring the position of the user, the management server 10 can manage conversation groups by the same method as in the above-described embodiment.

（変形例７）
上述した実施形態の会話システム１では、複数の無線アクセスポイントから受信した電波の強度及び到達時間に基づいて三点測量を行うことにより、ユーザの位置を測定していたが、適用可能な屋内測位技術はこの例に限られない。会話システム１では、例えば、出発点の位置を確定後、加速度センサやジャイロセンサ等を組み合わせて現在位置を測定する自律航法を採用してもよいし、Ｂｌｕｅｔｏｏｔｈ発信機からの電波を受信して、受信した電波に含まれる発信機の識別情報及び受信した電波の強度に基づいて、現在位置を測定してもよい。また、測位部２７は、超音波等の音波や可視光又は赤外光等の光を用いて測位してもよい。測位精度については、ユーザ同士が対面していることを検知するに足りる精度であることが望ましいが、例えば人物が多い場所、又は、狭い場所での会話を管理する場合ほど、測位精度は高い方が望ましい、と考えられる。
会話システム１において、ユーザが他のユーザと対面したときに操作部２９を操作した場合に、対面条件を満たすユーザが検知されてもよい。これにより、より高い精度でユーザの意図する対面相手と会話グループを形成しやすくなる。
管理サーバ１０は、ユーザ同士の位置関係やユーザ間の距離を用いないで、顔が向き合う２以上のユーザを同一の会話グループに分類してもよい。 (Modification 7)
In the conversation system 1 of the above-described embodiment, the position of the user is measured by three-point positioning based on the strength and arrival time of radio waves received from a plurality of wireless access points, but the applicable indoor positioning technology is not limited to this example. The conversation system 1 may, for example, adopt dead reckoning (autonomous navigation), in which the current position is measured by combining an acceleration sensor, a gyro sensor, and the like after the starting position has been fixed, or may receive radio waves from a Bluetooth transmitter and measure the current position based on the transmitter identification information contained in the received radio waves and their strength. The positioning unit 27 may also perform positioning using sound waves such as ultrasonic waves, or light such as visible or infrared light. The positioning accuracy should be sufficient to detect that users are facing each other; higher accuracy is considered desirable when, for example, managing conversations in a crowded or confined place.
In the conversation system 1, a user who satisfies the face-to-face condition may be detected when a user operates the operation unit 29 while facing another user. This makes it easier to form a conversation group, with higher accuracy, with the face-to-face partner the user intends.
The management server 10 may classify two or more users facing each other into the same conversation group without using the positional relationship between the users or the distance between the users.
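
When positions and distances are not used, one conceivable test is whether two users' facing directions are roughly opposite. The sketch below assumes that test; the function names, the use of compass headings in degrees, and the 20-degree tolerance are hypothetical and are not taken from the embodiment.

```python
def heading_difference(a_deg, b_deg):
    """Smallest absolute angle between two compass headings, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def headings_oppose(a_deg, b_deg, tolerance_deg=20.0):
    """True when the two headings are opposite within the tolerance."""
    return abs(heading_difference(a_deg, b_deg) - 180.0) <= tolerance_deg


facing = headings_oppose(10.0, 190.0)    # roughly north vs. roughly south
sideways = headings_oppose(0.0, 90.0)    # north vs. east
wrapped = headings_oppose(355.0, 185.0)  # handles wrap-around at 360
```

Opposite headings alone cannot distinguish two users facing each other from two users standing back to back, which may be why the embodiment combines direction data with position information.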

（変形例８）
ヘッドセット２０は、ユーザの頭又は耳に装着して使用される通信機器であったが、本発明の通信機器は、他の形態の通信端末で実現されてもよい。本発明の通信機器は、頭部又は顔に装着されるヘッドマウントディスプレイで例示される眼鏡型の通信機器（ウェアラブルコンピュータの一例）あってもよいし、ユーザが手に持って耳に当てて使用するハンドセットにより実現されてもよい。本発明の通信機器は、更に別の形態の通信機器であってもよいが、ユーザの顔が向く方向を検知するのに適した形態であることが望ましい。 (Modification 8)
The headset 20 is a communication device worn on the user's head or ear, but the communication device of the present invention may be realized by a communication terminal of another form. It may be a glasses-type communication device exemplified by a head-mounted display worn on the head or face (an example of a wearable computer), or it may be realized by a handset that the user holds in the hand and places against the ear. The communication device of the present invention may take yet another form, but a form suited to detecting the direction in which the user's face is facing is desirable.

また、通信機器が情報を表示する機能を有する場合、当該通信機器は、会話グループに属するユーザの情報を表示してもよい。このユーザの情報は、例えば同じ会話グループに属するユーザの言語情報であるが、氏名等の情報を含んでもよい。報知手段２１３による報知も、情報の表示によって行われてもよい。更に、報知手段２１３は、会話グループの人数や、会話グループを構成するユーザが変化したことを報知してもよい。また、報知手段２１３は、音声出力部２３を介した音声出力により、ユーザに情報を報知してもよい。
ヘッドセット２０は、自機の機能によりネットワーク１００に接続（無線接続）可能である場合には、無線通信端末Ｐを介さずに、ネットワーク１００に接続してもよい。 Further, when the communication device has a function of displaying information, the communication device may display information on the users belonging to the conversation group. This user information is, for example, language information of the users belonging to the same conversation group, but may also include information such as names. The notification by the notification means 213 may also be performed by displaying information. Furthermore, the notification means 213 may notify the user that the number of members of the conversation group, or the users constituting the conversation group, have changed. The notification means 213 may also notify the user of information by voice output via the voice output unit 23.
The headset 20 may be connected to the network 100 without going through the wireless communication terminal P when the headset 20 can be connected (wireless connection) to the network 100 by the function of the own device.

また、管理サーバ１０は、ヘッドセット２０のユーザ同士の位置関係に応じて音声データを加工してから、ヘッドセット２０へ送信してもよい。管理サーバ１０は、例えば、ユーザ間の距離が大きいほど音声のレベル（音量レベル）を低くし、ユーザ間の距離が小さいほど音声のレベルを高くする。また、ヘッドセット２０がユーザの左右の耳にステレオ音声を出力可能な場合、当該ユーザから見た会話相手の居る方向に基づいて、ステレオ音声の出力を制御してもよい。この場合、右に居るユーザからは右耳から音声が聞こえるというようなサラウンド効果を、管理サーバ１０が与えるとよい。
また、無線通信端末Ｐは、ヘッドセット２０から受信した音声データに基づいて音声認識を行うことにより、当該音声データを文字コードに変換してから送信してもよい。この場合、翻訳装置３０は、無線通信端末Ｐから受信した文字コードに基づいて翻訳処理を行う。 In addition, the management server 10 may process the audio data according to the positional relationship between the users of the headset 20 and then transmit the audio data to the headset 20. For example, the management server 10 decreases the sound level (volume level) as the distance between users increases, and increases the sound level as the distance between users decreases. Further, when the headset 20 can output stereo sound to the left and right ears of the user, the output of the stereo sound may be controlled based on the direction of the conversation partner as seen from the user. In this case, the management server 10 may provide a surround effect such that a user who is on the right can hear sound from the right ear.
Further, the wireless communication terminal P may perform voice recognition on the voice data received from the headset 20, convert the voice data into character codes, and then transmit the character codes. In this case, the translation device 30 performs the translation process based on the character codes received from the wireless communication terminal P.
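
The position-dependent processing of voice data described above (lower level for distant users, and stereo output that follows the direction of the conversation partner) can be sketched as follows. This is a minimal illustration under assumptions: the inverse-distance gain, the constant-power panning law, the coordinate conventions, and the names `distance_gain` and `stereo_gains` are all hypothetical choices, not taken from the embodiment.

```python
import math


def distance_gain(distance_m, reference_m=1.0, min_gain=0.1):
    """Gain that decreases as the distance between users increases.

    Assumed inverse-distance law, clamped to [min_gain, 1.0].
    """
    if distance_m <= reference_m:
        return 1.0
    return max(min_gain, reference_m / distance_m)


def stereo_gains(listener_xy, listener_heading_deg, speaker_xy):
    """Return (left, right) gains for the listener's headset.

    Coordinates are (east, north) in metres; the heading is a compass
    angle in degrees, clockwise from north (assumed conventions).
    """
    dx = speaker_xy[0] - listener_xy[0]
    dy = speaker_xy[1] - listener_xy[1]
    bearing_deg = math.degrees(math.atan2(dx, dy))
    relative = math.radians(bearing_deg - listener_heading_deg)
    pan = math.sin(relative)             # -1 = fully left, +1 = fully right
    angle = (pan + 1.0) * math.pi / 4.0  # constant-power panning
    gain = distance_gain(math.hypot(dx, dy))
    return gain * math.cos(angle), gain * math.sin(angle)


# Listener faces north; the speaker stands 2 m to the listener's right.
left, right = stereo_gains((0.0, 0.0), 0.0, (2.0, 0.0))
```

Because the pan value is the sine of the relative angle, a speaker directly behind the listener is rendered in the centre, the same as one directly ahead; a fuller surround effect would need more than two gains.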

（変形例９）
本発明において、ユーザの顔が向く方向に基づいて会話のグループを管理する構成に代えて又は組み合わせて、ユーザの身体の向く方向に基づいて会話のグループが管理されてもよい。ユーザの身体（例えば上半身又は下半身）が向く方向を検知するための方法として、例えば、ユーザが着用する衣服や履物、ベルト等の身体への装着物に、ユーザの身体が向く方向を検知するためのセンサを設ける方法がある。このセンサは、例えば、上述した実施形態で説明した方向センサ２５と同じセンサである。そして、ユーザが使用するヘッドセット等の通信機器は、センサで検知されたユーザの身体が向く方向を示す方向データを管理装置（例えば、実施形態の管理サーバ１０）へ送信する。そして、管理装置は、通信機器から受信した方向データに基づいて、身体が向き合う２以上のユーザを同一のグループに分類するグループ管理を行う。この変形例のグループ管理は、上述した実施形態で説明した顔が向く方向を、身体が向く方向に読み替えた方法により実現されてよい。 (Modification 9)
In the present invention, conversation groups may be managed based on the direction in which the user's body faces, instead of or in combination with the configuration that manages conversation groups based on the direction in which the user's face faces. One method of detecting the direction in which the user's body (for example, the upper body or the lower body) faces is to provide a sensor for that purpose in an item worn on the body, such as clothing, footwear, or a belt. This sensor is, for example, the same as the direction sensor 25 described in the above embodiment. A communication device used by the user, such as a headset, transmits direction data indicating the body direction detected by the sensor to a management device (for example, the management server 10 of the embodiment). Based on the direction data received from the communication device, the management device performs group management that classifies two or more users whose bodies face each other into the same group. The group management of this modification may be realized by the method described in the above embodiment, with the direction in which the face faces read as the direction in which the body faces.

（変形例１０）
上述した実施形態で管理サーバ１０が実現していた会話システムの管理装置としての機能を、ユーザが使用するヘッドセット２０又は無線通信端末Ｐが実現してもよい。この場合に、ヘッドセット２０又は無線通信端末Ｐが、翻訳処理を実行する機能を有してもよい。この場合、会話システム１において、管理サーバ１０が不要である。例えば、マスタとなるヘッドセット２０又は無線通信端末Ｐが、検知した顔の方向と、他のヘッドセット２０の顔の検知結果に基づいて、会話グループを形成する。ヘッドセット２０又は無線通信端末Ｐの各々が、マスタ又はスレーブのいずれとなるかについては、ユーザにより設定されてもよいし、自動で設定されてもよい。 (Modification 10)
The headset 20 or the wireless communication terminal P used by the user may realize the function as the management device of the conversation system realized by the management server 10 in the embodiment described above. In this case, the headset 20 or the wireless communication terminal P may have a function of executing a translation process. In this case, the management server 10 is unnecessary in the conversation system 1. For example, the headset 20 or the wireless communication terminal P as a master forms a conversation group based on the detected face direction and the face detection result of the other headset 20. Whether each of the headset 20 or the wireless communication terminal P becomes a master or a slave may be set by a user or may be set automatically.

（変形例１１）
上述した実施形態において、管理サーバ１０の制御部１１やヘッドセット２０の制御部２１が実現する各機能は、複数のプログラムの組み合わせによって実現され、又は、複数のハードウェア資源の連係によって実現されうる。制御部１１，２１の機能がプログラムを用いて実現される場合、このプログラムは、磁気記録媒体（磁気テープ、磁気ディスク（ＨＤＤ（Hard Disk Drive）、ＦＤ（Flexible Disk））等）、光記録媒体（光ディスク等）、光磁気記録媒体、半導体メモリ等のコンピュータ読み取り可能な記録媒体に記憶した状態で提供されてもよいし、ネットワークを介して配信されてもよい。また、本発明は、クラウドコンピューティングを用いて実現されてもよい。また、本発明は、会話システムの管理方法として把握することも可能である。 (Modification 11)
In the embodiment described above, each function realized by the control unit 11 of the management server 10 and the control unit 21 of the headset 20 can be realized by a combination of a plurality of programs or by the cooperation of a plurality of hardware resources. When the functions of the control units 11 and 21 are realized using a program, the program may be provided stored on a computer-readable recording medium such as a magnetic recording medium (magnetic tape, magnetic disk (HDD (Hard Disk Drive), FD (Flexible Disk)), etc.), an optical recording medium (optical disc, etc.), a magneto-optical recording medium, or a semiconductor memory, or may be distributed via a network. The present invention may also be realized using cloud computing, and can also be understood as a method for managing a conversation system.

１…会話システム、１０…管理サーバ、１１…制御部、１１１…音声データ取得手段、１１２…翻訳制御手段、１１３…音声データ送信手段、１１４…方向データ取得手段、１１５…状態データ取得手段、１１６…特定手段、１１７…グループ管理手段、１２…通信部、１３…グループＤＢ、２０，２０Ａ〜２０Ｅ…ヘッドセット、２１…制御部、２１１…音声データ送信手段、２１２…音声データ取得手段、２１３…報知手段、２１４…方向データ送信手段、２１５…状態データ送信手段、２１６…位置情報送信手段、２２…音声入力部、２３…音声出力部、２４…通信部、２５…方向センサ、２６…加速度センサ、２７…測位部、２８…発光部、２９…操作部、３０…翻訳装置、１００…ネットワーク DESCRIPTION OF SYMBOLS 1 ... Conversation system, 10 ... Management server, 11 ... Control unit, 111 ... Voice data acquisition means, 112 ... Translation control means, 113 ... Voice data transmission means, 114 ... Direction data acquisition means, 115 ... State data acquisition means, 116 ... Specifying means, 117 ... Group management means, 12 ... Communication unit, 13 ... Group DB, 20, 20A to 20E ... Headset, 21 ... Control unit, 211 ... Voice data transmission means, 212 ... Voice data acquisition means, 213 ... Notification means, 214 ... Direction data transmission means, 215 ... State data transmission means, 216 ... Position information transmission means, 22 ... Voice input unit, 23 ... Voice output unit, 24 ... Communication unit, 25 ... Direction sensor, 26 ... Acceleration sensor, 27 ... Positioning unit, 28 ... Light emitting unit, 29 ... Operation unit, 30 ... Translation device, 100 ... Network

Claims

音声データを送受信して音声の入出力を行う複数の通信機器の各々を使用する複数のユーザを、グループ分けし、
同一のグループに属する前記ユーザ間で、前記音声の入出力による会話を実現させる会話システムの管理装置であって、
前記複数の通信機器から、前記複数のユーザの各々の顔又は身体が向く方向を示す方向データを取得する方向データ取得手段と、
取得された前記方向データに基づいて、前記顔又は身体が向き合った２以上の前記ユーザを、前記同一のグループに分類するグループ管理手段と
を備える管理装置。 A management device for a conversation system that divides into groups a plurality of users, each using one of a plurality of communication devices that transmit and receive voice data to input and output voice, and that realizes conversation by the voice input and output between the users belonging to the same group, the management device comprising:
direction data acquisition means for acquiring, from the plurality of communication devices, direction data indicating the direction in which the face or body of each of the plurality of users faces; and
group management means for classifying, based on the acquired direction data, two or more of the users whose faces or bodies face each other into the same group.

前記複数のユーザの位置又は前記ユーザ間の距離を特定する特定手段を備え、
前記グループ管理手段は、
特定された前記位置又は前記距離が所定条件を満たし、且つ、前記顔又は身体が向き合った２以上の前記ユーザを、前記同一のグループに分類する
ことを特徴とする請求項１に記載の管理装置。 The management device according to claim 1, further comprising specifying means for specifying the positions of the plurality of users or the distance between the users,
wherein the group management means classifies into the same group two or more of the users for whom the specified positions or distance satisfy a predetermined condition and whose faces or bodies face each other.

前記複数のユーザの各々の移動状態を示す状態データを取得する状態データ取得手段を備え、
前記グループ管理手段は、
取得された前記状態データに基づいて、前記同一のグループに属する２以上の前記ユーザのうち、前記移動状態が所定条件を満たす前記ユーザを、当該グループから除外する
ことを特徴とする請求項１又は請求項２に記載の管理装置。 The management device according to claim 1 or claim 2, further comprising state data acquisition means for acquiring state data indicating the movement state of each of the plurality of users,
wherein the group management means excludes from the group, based on the acquired state data, any user among the two or more users belonging to the same group whose movement state satisfies a predetermined condition.

前記グループ管理手段は、
前記同一のグループ内で、一の前記ユーザの顔又は身体が、所定時間継続して他の少なくとも一部の前記ユーザの顔又は身体と向き合わなかった場合、当該一の前記ユーザを当該グループから除外する
ことを特徴とする請求項１から請求項３のいずれか１項に記載の管理装置。 The management device according to any one of claims 1 to 3, wherein, when the face or body of one user in the same group has not faced the face or body of at least some of the other users for a continuous predetermined time, the group management means excludes that one user from the group.

前記グループ管理手段は、
前記同一のグループに属する２以上の前記ユーザのいずれかと、他の前記ユーザとの前記顔又は身体が向き合った場合、当該他の前記ユーザを当該グループに分類する
ことを特徴とする請求項１から請求項４のいずれか１項に記載の管理装置。 The management device according to any one of claims 1 to 4, wherein, when the face or body of another user faces that of any of the two or more users belonging to the same group, the group management means classifies the other user into the group.

前記複数のユーザは、第１のユーザと、複数の第２のユーザとを含み、
前記グループ管理手段は、
前記第１のユーザの顔又は身体が前記第２のユーザの方向を向いた後、当該第２のユーザの顔又は身体が当該第１のユーザの方向を向いた場合、当該第１のユーザ及び当該第２のユーザを前記同一のグループに分類する
ことを特徴とする請求項１から請求項５のいずれか１項に記載の管理装置。 The management device according to any one of claims 1 to 5, wherein the plurality of users include a first user and a plurality of second users, and
wherein, when the face or body of a second user turns toward the first user after the face or body of the first user has turned toward that second user, the group management means classifies the first user and that second user into the same group.

前記グループ管理手段は、
前記通信機器に入力された前記ユーザの音声のレベルに基づいて、前記同一のグループに分類する２以上の前記ユーザを決定する
ことを特徴とする請求項１から請求項６のいずれか１項に記載の管理装置。 The management device according to any one of claims 1 to 6, wherein the group management means determines the two or more users to be classified into the same group based on the level of the user's voice input to the communication device.

複数のユーザの各々に使用され、音声データを送受信して音声の入出力を行う複数の通信機器と、
前記複数のユーザをグループ分けし、同一のグループに属する前記ユーザ間で前記音声の入出力による会話を実現させる管理サーバと
を備える会話システムであって、
前記複数の通信機器の各々は、
自機を使用する前記ユーザの顔又は身体が向く方向を検知する方向検知手段と、
検知された前記顔又は身体が向く方向を示す方向データを、前記管理サーバへ送信する方向データ送信手段と
を有し、
前記管理サーバは、
前記方向データ送信手段により送信された前記方向データを取得する方向データ取得手段と、
取得された前記方向データに基づいて、前記顔又は身体が向き合った２以上の前記ユーザを、前記同一のグループに分類するグループ管理手段と
を有する会話システム。 A conversation system comprising:
a plurality of communication devices, each used by one of a plurality of users, that transmit and receive voice data to input and output voice; and
a management server that divides the plurality of users into groups and realizes conversation by the voice input and output between the users belonging to the same group,
wherein each of the plurality of communication devices has:
direction detecting means for detecting the direction in which the face or body of the user using that device faces; and
direction data transmission means for transmitting, to the management server, direction data indicating the detected direction in which the face or body faces, and
wherein the management server has:
direction data acquisition means for acquiring the direction data transmitted by the direction data transmission means; and
group management means for classifying, based on the acquired direction data, two or more of the users whose faces or bodies face each other into the same group.

音声データを送受信して音声の入出力を行う複数の通信機器の各々を使用する複数のユーザを、グループ分けし、
同一のグループに属する前記ユーザ間で、前記音声の入出力による会話を実現させる会話管理方法であって、
前記複数のユーザの各々の顔又は身体が向く方向を検知するステップと、
検知した前記顔が向く方向を示す方向データに基づいて、前記顔又は身体が向き合った２以上の前記ユーザを、前記同一のグループに分類するステップと
を備える会話管理方法。 A conversation management method that divides into groups a plurality of users, each using one of a plurality of communication devices that transmit and receive voice data to input and output voice, and that realizes conversation by the voice input and output between the users belonging to the same group, the method comprising:
a step of detecting the direction in which the face or body of each of the plurality of users faces; and
a step of classifying, based on direction data indicating the detected direction in which the face faces, two or more of the users whose faces or bodies face each other into the same group.

音声データを送受信して音声の入出力を行う複数の通信機器の各々を使用する複数のユーザを、グループ分けし、
同一のグループに属する前記ユーザ間で、前記音声の入出力による会話を実現させる会話システムを管理するコンピュータに、
前記複数の通信機器から、前記複数のユーザの各々の顔又は身体が向く方向を示す方向データを取得するステップと、
取得した前記方向データに基づいて、前記顔又は身体が向き合った２以上の前記ユーザを、前記同一のグループに分類するステップと
を実行させるためのプログラム。 A program for causing a computer that manages a conversation system, the conversation system dividing into groups a plurality of users, each using one of a plurality of communication devices that transmit and receive voice data to input and output voice, and realizing conversation by the voice input and output between the users belonging to the same group, to execute:
a step of acquiring, from the plurality of communication devices, direction data indicating the direction in which the face or body of each of the plurality of users faces; and
a step of classifying, based on the acquired direction data, two or more of the users whose faces or bodies face each other into the same group.