JP2022054294A - Interactive guiding device

Info

Publication number: JP2022054294A
Application number: JP2020161400A
Authority: JP
Inventors: 和教荒井; Kazunori Arai; 遼小関; Ryo Koseki; 翔平今田; Shohei Imada; 秀行青木; Hideyuki Aoki
Original assignee: Secom Co Ltd
Current assignee: Secom Co Ltd
Priority date: 2020-09-25
Filing date: 2020-09-25
Publication date: 2022-04-06

Abstract

To smoothly provide guidance for a user. SOLUTION: An interactive guidance device includes: a display that displays an image of a character; a detection unit that detects a person in a prescribed space; a voice input/output unit that outputs an utterance made by the character as voice and accepts an utterance made by the detected person as voice input; and a response unit that controls the voice input/output unit and the display to cause the character to talk with the detected person, thereby responding to the person. In the response, the response unit outputs guidance information for the person from the voice input/output unit as an utterance by the character, and displays text information that supplements the guidance information on the display. SELECTED DRAWING: Figure 6

Description

The present invention relates to an interactive guidance device.

Conventionally, security guards have been stationed at the entrances of facilities such as office buildings, commercial facilities, and hospitals to guard the facility and to guide its users. In addition, as described in Patent Document 1, a security device has been proposed that provides facility security and user guidance without stationing a guard, by having a display function that shows an image of a security guard and a call function that connects to a monitoring room.

Patent Document 1: Japanese Patent No. 6259890

In the security device described in Patent Document 1, guidance is given through spoken dialogue with the user. In spoken dialogue, however, trying to convey a large amount of information makes the exchange long-winded, whereas narrowing the information down risks omitting what the user needs. Displaying drawings or photographs instead of speech can present a lot of information at once, but the information is then no longer conveyed naturally within a conversational exchange. There has therefore been a need for such a device to make guidance to the user smoother while keeping spoken dialogue as the basis of the guidance.

The present invention has been made to solve the above problem, and its object is to provide an interactive guidance device that makes guidance to a user smoother.

The interactive guidance device according to the present invention includes: a display unit that displays an image of a character; a detection unit that detects a person in a predetermined space; a voice input/output unit that outputs utterances by the character as voice and accepts utterances by the detected person as voice input; and a response unit that controls the voice input/output unit and the display unit to make the character converse with the detected person, thereby responding to the person. In the response, the response unit outputs guidance information for the person as an utterance by the character and displays text information that supplements the guidance information.

Preferably, the interactive guidance device according to the present invention further includes a storage unit that stores a plurality of response scenarios and a selection unit that selects, from among the plurality of response scenarios, a response scenario for the person based on the person's utterance, and the response unit, in the response, outputs the guidance information by voice and displays the text information based on the selected response scenario.

Preferably, in the interactive guidance device according to the present invention, the response unit displays, as the text information, related information that is related to the guidance information and concerns a guidance target different from the guidance target of the guidance information.

Preferably, in the interactive guidance device according to the present invention, the response unit displays, as the text information, detailed information that elaborates on the guidance information.

An interactive guidance device according to the present invention includes: a display unit that displays an image of a character; a detection unit that detects a person in a predetermined space; a voice input/output unit that outputs utterances by the character as voice and accepts utterances by the detected person as voice input; and a response unit that controls the voice input/output unit and the display unit to make the character converse with the detected person, thereby responding to the person. In the response, the response unit outputs guidance information for the person from the voice input/output unit as an utterance by the character when the person's utterance is not continuing, and displays text information corresponding to the guidance information for the person as text when the person's utterance is continuing.

The interactive guidance device according to the present invention makes guidance to a user smoother.

FIG. 1 is a front view of the guidance device 1.
FIG. 2 is a diagram showing an example of the schematic configuration of the guidance device 1.
FIG. 3 is a diagram showing an example of the data structure of the response scenario table 151.
FIG. 4 is a flow chart showing an example of the flow of the guidance process.
FIG. 5 is a flow chart showing an example of the flow of the response process.
FIG. 6 is a diagram for explaining the text information.
FIG. 7 is a diagram showing an example of the data structure of the response scenario table 151a.
FIG. 8 is a flow chart showing an example of the flow of the second response process.

Various embodiments of the present invention are described below with reference to the drawings. Note that the technical scope of the present invention is not limited to these embodiments and extends to the inventions described in the claims and their equivalents.

FIG. 1 is a front view of the guidance device 1, and FIG. 2 is a diagram showing an example of the schematic configuration of the guidance device 1. The guidance device 1 is placed near the entrance of a facility such as an office building, a commercial facility, or a hospital. The guidance device 1 guards the facility by displaying an image of a character modeled on a security guard and having the character perform monitoring operations such as casting glances at its surroundings. The guidance device 1 also guides users by having the character converse with a user visiting the facility in accordance with preset response scenarios. To this end, the guidance device 1 includes a display unit 11, a distance measuring unit 12, an imaging unit 13, a voice input/output unit 14, a storage unit 15, and a processing unit 16.

The display unit 11 displays images and includes, for example, a liquid crystal display or an organic EL (Electro-Luminescence) display. The display unit 11 displays an image based on image data supplied from the processing unit 16.

The display unit 11 is provided on the front of the guidance device 1 and displays the image of the character C. The character C is, for example, a character modeled on a security guard, but is not limited to this and may be any character, such as the facility's mascot.

The distance measuring unit 12 measures the distance to objects around the guidance device 1 and includes, for example, a TOF (Time of Flight) laser distance sensor. The distance measuring unit 12 emits pulse-modulated laser light in each direction within its measurement range and calculates the distance to the reflection point in each direction from the time taken until the reflected light is detected. Based on the direction in which the laser light was emitted and the calculated distance, the distance measuring unit 12 generates distance image data in which distance information is associated with each pixel, and supplies the data to the processing unit 16.
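As a non-limiting illustration of the time-of-flight calculation described above, the distance to a reflection point can be sketched as half the round-trip path of the laser pulse. The function name `tof_distance_m` is hypothetical, and the sketch assumes propagation at the vacuum speed of light; the description does not specify the sensor's actual computation.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value, used here as an approximation for air


def tof_distance_m(round_trip_time_s: float) -> float:
    """Return the distance to the reflection point for one laser pulse.

    The pulse travels to the object and back, so the one-way distance
    is half of (speed of light x round-trip time).
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


# A 20 ns round trip corresponds to roughly 3 m.
print(f"{tof_distance_m(20e-9):.3f} m")
```

Repeating this calculation for each emission direction would yield the per-pixel distance values of the distance image.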

The imaging unit 13 captures images and includes, for example, a camera. The imaging unit 13 uses an optical lens or the like to focus light from subjects within its imaging range, generates an image signal of the captured image corresponding to the subject image thus formed, converts the signal into image data in a predetermined format, and supplies the data to the processing unit 16.

The distance measuring unit 12 and the imaging unit 13 are provided at the upper part of the front of the guidance device 1. The measurement range of the distance measuring unit 12 and the imaging range of the imaging unit 13 are both set so as to include the area in front of the guidance device 1. The correspondence between each pixel of the distance image data generated by the distance measuring unit 12 and each pixel of the image data generated by the imaging unit 13 is stored in advance as data in the storage unit 15 described later.

The voice input/output unit 14 accepts voice input and outputs voice, and includes, for example, a microphone and a speaker. The voice input/output unit 14 converts the accepted voice input into a voice signal, which is digital data, and supplies it to the processing unit 16; it also outputs, as voice, the voice signal supplied from the processing unit 16.

The voice input/output unit 14 is preferably provided on the front of the guidance device 1 at the height of the face of the character C, but is not limited to this and may be provided at any position on the guidance device 1.

The storage unit 15 stores programs and data and includes, for example, a semiconductor memory. The storage unit 15 stores the operating system program, driver programs, application programs, data, and the like used in processing by the processing unit 16. The programs are installed in the storage unit 15 from a computer-readable, non-transitory portable storage medium such as a CD-ROM (Compact Disc Read Only Memory) using a setup program or the like. The storage unit 15 stores the response scenario table 151 as data.

The processing unit 16 performs overall control of the operation of the guidance device 1 and includes one or more processors and their peripheral circuits. The processing unit 16 includes, for example, a CPU (Central Processing Unit), an LSI (Large Scale Integration), or an ASIC (Application Specific Integrated Circuit). The processing unit 16 may also include a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an FPGA (Field Programmable Gate Array), or the like. Based on the programs stored in the storage unit 15, the processing unit 16 controls the operation of each component of the guidance device 1 so that the processing of the guidance device 1 is executed appropriately, and also executes that processing.

The processing unit 16 includes a monitoring unit 161, a detection unit 162, a response unit 163, and a selection unit 164 as its functional blocks. Each of these units is a functional module realized by a program executed by the processing unit 16. Each of these units may instead be implemented in the processing unit 16 as firmware.

FIG. 3 is a diagram showing an example of the data structure of the response scenario table 151, which manages information on response scenarios. The response scenario table 151 stores a scenario ID, a description, keywords, guidance information, and text information in association with one another.

The scenario ID is information that identifies each response scenario. The description is information that summarizes each response scenario. The keywords are words related to each response scenario and are referenced in order to select the response scenario for a user. The guidance information is information that is output as an utterance of the character C when responding to the user. The text information is information that is displayed on the display unit 11 when responding to the user. Both the guidance information and the text information are presented in order to guide the user.

The text information is information that supplements the guidance information: for example, related information that is related to the guidance information and concerns a guidance target different from that of the guidance information, or detailed information that elaborates on the guidance information. In the example shown in FIG. 3, the response scenario with scenario ID "001" is a scenario for "restroom guidance". This response scenario stores "The restroom is on your left after you exit the door." as guidance information, and its guidance target is the "(regular) restroom". As text information, it stores the related information "The multi-purpose restroom is one floor up." and the detailed information "The restroom is 20 meters to your left after you exit the door." The guidance target of the related information is the "multi-purpose restroom", which differs from the "(regular) restroom". In this way, when a plurality of guidance targets can be expected as answers in a guidance exchange, an answer concerning a guidance target different from that of the guidance information is stored as related information. For example, the representative answer is stored as guidance information and the other answers are stored as related information.

The response scenario with scenario ID "002" is a scenario for "meeting reception". This response scenario stores "Please proceed to meeting room ○○ on floor ○." as guidance information and, as text information, the detailed information "Take the elevator on your left to floor ○; meeting room ○○ is directly in front of you when you get off the elevator." In this way, at least one of related information and detailed information is stored as the text information.
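As a non-limiting illustration, the table described above could be held in memory as follows. The class name `ResponseScenario`, the field names, and the keyword values are assumptions made for this sketch; the description quotes the guidance and text information of FIG. 3 but not its keywords.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResponseScenario:
    scenario_id: str
    description: str
    keywords: Tuple[str, ...]            # assumed values, not quoted in the description
    guidance: str                        # output as an utterance of character C
    related_text: Optional[str] = None   # related information shown on the display
    detailed_text: Optional[str] = None  # detailed information shown on the display

    def __post_init__(self) -> None:
        # At least one of related and detailed information is stored.
        if self.related_text is None and self.detailed_text is None:
            raise ValueError("a scenario needs related or detailed text")


SCENARIO_TABLE = {
    s.scenario_id: s
    for s in (
        ResponseScenario(
            scenario_id="001",
            description="restroom guidance",
            keywords=("restroom", "toilet"),
            guidance="The restroom is on your left after you exit the door.",
            related_text="The multi-purpose restroom is one floor up.",
            detailed_text="The restroom is 20 meters to your left after you exit the door.",
        ),
        ResponseScenario(
            scenario_id="002",
            description="meeting reception",
            keywords=("meeting", "appointment"),
            guidance="Please proceed to meeting room ○○ on floor ○.",
            detailed_text=(
                "Take the elevator on your left to floor ○; meeting room ○○ "
                "is directly in front of you when you get off the elevator."
            ),
        ),
    )
}
```

Keying the table by scenario ID keeps lookup simple once a scenario has been selected; the ○ placeholders are kept exactly as they appear in the example.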

Note that the guidance information of a response scenario may be stored together with a question (preliminary guidance information) for obtaining the information needed to complete the guidance information. For example, in the "meeting reception" scenario, the user's name, the visit destination, and similar information are needed to determine what guidance to give. Therefore, the information required to execute the guidance of each scenario ID is also stored in advance; in the guidance process described later, the guidance device 1 outputs the preliminary guidance information when that information has not been obtained, and executes the guidance based on the scenario ID once the information is acquired. Thus, there are cases in which the guidance information is fixed for the response scenario, as with scenario ID "001", and cases in which the guidance information varies with the situation, as with scenario ID "002". In the former, the guidance information is determined once the scenario is selected; in the latter, it is determined once the scenario is selected and the information needed to complete the guidance information has been acquired.

The data in the response scenario table 151 is set in advance by the administrator of the guidance device 1. Of that data, the guidance information or the text information may be generated by machine learning or the like.

FIG. 4 shows an example of the flow of the guidance process executed by the guidance device 1. The guidance process is executed while the character C is displayed on the display unit 11. The guidance process is realized by the processing unit 16 cooperating with each component of the guidance device 1 based on the programs stored in the storage unit 15.

First, the monitoring unit 161 executes a monitoring process (S101). The monitoring process causes the character C displayed on the display unit 11 to perform a monitoring operation. The monitoring operation is an operation in which the character C directs its gaze toward a user.

The monitoring unit 161 causes the character C to perform the monitoring operation based on a captured image. For example, the monitoring unit 161 acquires a captured image from the imaging unit 13 and identifies a person region in the captured image. The person region may be identified based on feature values extracted from the pixel values of the captured image, or by using a trained model that has learned the relationship between images and person regions in advance. The feature values are, for example, HOG (Histograms of Oriented Gradients) features. The trained model is, for example, YOLO (You Only Look Once) or SSD (Single Shot Detector). The monitoring unit 161 performs the monitoring operation by controlling the display unit 11 so that the character C directs its gaze in the direction corresponding to the identified person region.

The monitoring unit 161 may cause the character C to perform the monitoring operation based on a distance image. For example, the monitoring unit 161 acquires a distance image from the distance measuring unit 12 and identifies a person region in the distance image. The person region may be identified based on feature values extracted from the three-dimensional shape of the objects represented by the distance image, or by using a trained model that has learned the relationship between distance images and person regions in advance. The monitoring unit 161 performs the monitoring operation by controlling the display unit 11 so that the character C directs its gaze in the direction corresponding to the identified person region.

The monitoring unit 161 may cause the character C to perform the monitoring operation based on both the captured image and the distance image. For example, the monitoring unit 161 identifies a person region in the captured image and determines the distance between the guidance device 1 and the user by acquiring, from the distance image, the distance information associated with the pixels corresponding to the identified person region. When the determined distance is less than or equal to a predetermined distance, the monitoring unit 161 causes the character C to direct its gaze in the direction corresponding to the identified person region.

When a plurality of person regions are identified in the captured image or the distance image, the monitoring unit 161 may have the character C direct its gaze to the directions corresponding to those person regions in turn, or may have it preferentially direct its gaze toward the user closest to the guidance device 1.
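As a non-limiting illustration of the distance-based gaze decision and the nearest-user priority described above, the following sketch picks the gaze target from a list of person regions. All names are hypothetical. It assumes that person regions are bounding boxes already mapped into distance-image pixel coordinates and that a user's distance is taken as the median over the region; the description only says that the distance information of the corresponding pixels is used.

```python
from statistics import median
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def person_distance_m(distance_image: Sequence[Sequence[float]], box: Box) -> float:
    """Median distance over the pixels of one person region (an assumed statistic)."""
    top, left, bottom, right = box
    values = [distance_image[r][c] for r in range(top, bottom) for c in range(left, right)]
    return median(values)


def choose_gaze_target(
    distance_image: Sequence[Sequence[float]],
    boxes: List[Box],
    max_distance_m: float,
) -> Optional[Box]:
    """Return the region of the nearest user within max_distance_m, or None."""
    candidates = [(person_distance_m(distance_image, b), b) for b in boxes]
    in_range = [c for c in candidates if c[0] <= max_distance_m]
    if not in_range:
        return None
    return min(in_range, key=lambda c: c[0])[1]


# Two users: one about 1 m away (left half), one about 3 m away (right half).
image = [[1.0, 1.0, 3.0, 3.0],
         [1.0, 1.0, 3.0, 3.0]]
print(choose_gaze_target(image, [(0, 0, 2, 2), (0, 2, 2, 4)], max_distance_m=2.0))
```

The alternative mentioned above, looking at each user in turn, would instead iterate over `in_range` rather than taking the minimum.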

The monitoring unit 161 may cause the character C to perform the monitoring operation based on a plurality of sequentially acquired captured images and distance images. For example, the monitoring unit 161 identifies a person region in each of a plurality of captured images. The monitoring unit 161 tracks a user by associating, across the plurality of captured images, person regions that are highly likely to correspond to the same user, based on the position and size of the identified person regions, the pixel values of the pixels within them, and the like. The monitoring unit 161 determines, based on a plurality of distance images, whether the tracked user is approaching the guidance device 1, and when it determines that the user is approaching, causes the character C to direct its gaze toward the user.
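As a non-limiting illustration of the tracking and approach determination described above, the following sketch uses intersection-over-union (IoU) of bounding boxes as the association measure and a simple decrease in distance as the approach test. Both choices, the function names, and the 0.2 m margin are assumptions; the description names position, size, and pixel values as association cues without fixing a method.

```python
from typing import Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def iou(a: Box, b: Box) -> float:
    """Overlap of two person regions; higher values suggest the same user."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, bottom - top) * max(0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_approaching(distances_m: Sequence[float], min_decrease_m: float = 0.2) -> bool:
    """True if the tracked user's distance fell by at least min_decrease_m."""
    if len(distances_m) < 2:
        return False
    return distances_m[0] - distances_m[-1] >= min_decrease_m


# Regions from two consecutive frames overlap strongly, so they are
# associated; the distances measured for that user then decrease.
same_user = iou((0, 0, 4, 4), (0, 1, 4, 5)) > 0.5
print(same_user, is_approaching([3.0, 2.5, 2.0]))
```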

As the monitoring operation, the monitoring unit 161 may have the character C turn not only its gaze but its whole body toward the user. The monitoring unit 161 may also control the voice input/output unit 14 so that, as the monitoring operation, the character C greets the user. As the monitoring operation, the monitoring unit 161 may have the character C perform any other motion or utterance that makes it appear to be guarding the facility.

Next, the detection unit 162 detects a user in the predetermined space (S102). The predetermined space is the space in which a user to be attended to is present, for example, the space in front of the guidance device 1 and within a predetermined distance (for example, 2 meters) of it. The predetermined space may instead be a space separated from the guidance device 1 (for example, the space whose distance from the guidance device 1 is at least 1 meter and at most 3 meters).

As described above, the detection unit 162 identifies a person region in the captured image. When no person region is identified, the detection unit 162 determines that no user has been detected in the predetermined space.

When a person region is identified, the detection unit 162 determines, as described above, the distance between the guidance device 1 and the user corresponding to the identified person region based on the distance image. The detection unit 162 determines the position of the user based on the direction of the identified person region and the distance to the user. When the position of the user is inside the predetermined space, the detection unit 162 determines that a user has been detected in the predetermined space; when the position is outside the predetermined space, it determines that no user has been detected in the predetermined space.
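As a non-limiting illustration of the position determination described above, the following sketch converts a direction and a distance into a position in front of the device and tests it against the predetermined space. The function names and the coordinate convention (bearing measured from the device's forward axis) are assumptions; the 2 m limit and the 1 m to 3 m band are the example values given for the predetermined space.

```python
import math
from typing import Tuple


def user_position(bearing_rad: float, distance_m: float) -> Tuple[float, float]:
    """Return (x, z): x is sideways offset, z is distance along the forward axis."""
    return (distance_m * math.sin(bearing_rad), distance_m * math.cos(bearing_rad))


def in_predetermined_space(
    position: Tuple[float, float], min_m: float = 0.0, max_m: float = 2.0
) -> bool:
    """True if the position is in front of the device and within [min_m, max_m]."""
    x, z = position
    return z > 0 and min_m <= math.hypot(x, z) <= max_m


# Straight ahead at 1.5 m: inside the default space (within 2 m).
print(in_predetermined_space(user_position(0.0, 1.5)))
# Straight ahead at 2.5 m: outside the default space, inside the 1 m to 3 m band.
print(in_predetermined_space(user_position(0.0, 2.5)))
print(in_predetermined_space(user_position(0.0, 2.5), min_m=1.0, max_m=3.0))
```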

When a user is detected (S102-Yes), the response unit 163 executes a response process that responds to the user by controlling the voice input/output unit 14 and the display unit 11 so that the character C converses with the user (S103). The details of the response process are described later. When no user is detected (S102-No), or when the response process has ended, the monitoring unit 161 resumes the monitoring process (S101).
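As a non-limiting illustration, one pass through steps S101 to S103 can be sketched as follows. The function `guidance_cycle` and its callable parameters are hypothetical stand-ins for the monitoring unit 161, the detection unit 162, and the response unit 163.

```python
from typing import Callable, Optional


def guidance_cycle(
    monitor: Callable[[], None],
    detect: Callable[[], Optional[str]],
    respond: Callable[[str], None],
) -> bool:
    """Run S101 -> S102 -> (S103) once; return True if a user was attended to."""
    monitor()               # S101: character C performs the monitoring operation
    user = detect()         # S102: look for a user in the predetermined space
    if user is None:        # S102-No: the caller simply runs the cycle again
        return False
    respond(user)           # S103: response process
    return True


log = []
attended = guidance_cycle(
    monitor=lambda: log.append("monitor"),
    detect=lambda: "visitor",
    respond=lambda user: log.append("respond:" + user),
)
print(attended, log)
```

Calling `guidance_cycle` in a loop reproduces the return to S101 after either branch.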

FIG. 5 is a flow chart showing an example of the flow of the response process.

First, the selection unit 164 selects a response scenario for the detected user based on the user's utterance (S201). For example, the selection unit 164 acquires, as a voice signal, the utterance made by the detected user to the voice input/output unit 14. The selection unit 164 converts the acquired voice signal into text using a known speech recognition technique, and obtains the words contained in the user's utterance by morphological analysis or the like. The selection unit 164 refers to the response scenario table 151 and selects the response scenario with which an obtained word is associated as a keyword.
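As a non-limiting illustration of the keyword-based selection described above, the following sketch matches the words of an already recognized utterance against per-scenario keywords. The keyword sets and names are hypothetical, speech recognition is assumed to have been done upstream, and a simple regular-expression word split stands in for morphological analysis, which Japanese text would actually require.

```python
import re
from typing import Dict, Optional, Set

# Assumed keywords; the description does not quote the keywords of FIG. 3.
SCENARIO_KEYWORDS: Dict[str, Set[str]] = {
    "001": {"restroom", "toilet"},
    "002": {"meeting", "appointment"},
}


def select_scenario(utterance_text: str, scenario_keywords: Dict[str, Set[str]]) -> Optional[str]:
    """Return the ID of the first scenario whose keyword appears in the utterance."""
    words = set(re.findall(r"[a-z]+", utterance_text.lower()))
    for scenario_id, keywords in scenario_keywords.items():
        if words & keywords:
            return scenario_id
    return None  # no scenario matched


print(select_scenario("Where is the restroom?", SCENARIO_KEYWORDS))
```

Returning the first match is a simplification; an utterance that matches several scenarios would need a tie-breaking rule that the description does not specify.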

The method of selecting a response scenario is not limited to the above example. The response scenario may be selected using a trained model, such as a neural network, that has learned in advance the relationship between the text corresponding to users' utterances and response scenarios.

When the information needed to complete the guidance information has not been obtained, the selection unit 164 outputs a question about that information (preliminary guidance information) from the voice input/output unit 14 as an utterance by the character C. For example, when the response scenario with scenario ID "002" has been selected and the user's name or the visit destination has not been obtained, the selection unit 164 outputs by voice "Please tell me your name and who you are visiting." The selection unit 164 then determines the guidance information based on the information obtained from the user. For example, the guidance device 1 stores in advance, in the storage unit 15, a list of meeting information (user, meeting place, and so on), and the selection unit 164 checks the information obtained from the user against the list to determine the meeting place to which the user is guided. This allows guidance suited to each user even when the guidance information differs from user to user.
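As a non-limiting illustration of the preliminary question and the list lookup described above, the following sketch returns either the question or the completed guidance. The meeting list contents, the key structure, and the `"not_found"` outcome are assumptions; the description does not say what happens when the user is not on the list.

```python
from typing import Dict, Optional, Tuple

PRELIMINARY_QUESTION = "Please tell me your name and who you are visiting."

# Hypothetical meeting list: (user name, visit destination) -> meeting place.
MEETINGS: Dict[Tuple[str, str], str] = {
    ("Taro Yamada", "Sales Department"): "meeting room A on floor 3",
}


def meeting_guidance(
    collected: Dict[str, str], meetings: Dict[Tuple[str, str], str]
) -> Tuple[str, Optional[str]]:
    """Return ("ask", question), ("guide", guidance) or ("not_found", None)."""
    name = collected.get("name")
    destination = collected.get("destination")
    if not name or not destination:
        # Required information is missing: output the preliminary question.
        return ("ask", PRELIMINARY_QUESTION)
    place = meetings.get((name, destination))
    if place is None:
        return ("not_found", None)  # assumed outcome, not covered by the description
    return ("guide", f"Please proceed to {place}.")


print(meeting_guidance({}, MEETINGS))
print(meeting_guidance({"name": "Taro Yamada", "destination": "Sales Department"}, MEETINGS))
```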

Next, the response unit 163 outputs the guidance information from the voice input/output unit 14 as an utterance by the character C and displays text information supplementing the guidance information on the display unit 11 (S202). For example, the response unit 163 acquires the guidance information and the related information associated with the selected response scenario. The response unit 163 converts the acquired guidance information into an audio signal, outputs it from the voice input/output unit 14 as an utterance by the character C, and displays the acquired related information on the display unit 11. The response unit 163 may display detailed information instead of the related information.

FIG. 6 is a diagram for explaining the text information displayed on the display unit 11. In the example shown in FIG. 6, the text information 111 is displayed in the upper part of the display unit 11. The display mode of the text information 111 is not limited to this example. For example, the text information 111 may be displayed at any position on the display unit 11, or may be displayed in a speech balloon or the like so that the character C appears to be speaking it.

As described above, when responding to a user in the predetermined space, the guidance device 1 outputs guidance information for the user as an utterance by the character C and displays text information supplementing the guidance information. This allows the guidance device 1 to guide the user smoothly. If the related information or detailed information supplementing the guidance information were output as an utterance by the character C, the user would have to wait until all of it had been spoken. By displaying the related information or detailed information as text instead, the guidance device 1 shortens the time the user must wait while still communicating through dialogue, and thus achieves smooth guidance.

The guidance device 1 also selects a response scenario for the user from among a plurality of response scenarios based on the user's utterance, and outputs the guidance information and displays the text information based on the selected response scenario. This allows the guidance device 1 to provide smooth guidance on the matter the user is asking about.

The guidance device 1 also outputs, as the guidance information, a simplified version of the text information. Detailed information is easier to understand when displayed as text than when spoken, so this makes the guidance easier for the user to understand and achieves smooth guidance.

In the above description, the guidance device 1 selects a response scenario in S201, but the invention is not limited to this example, and the guidance device 1 need not select a response scenario. For example, in the response process, the guidance device 1 may output general guidance information such as "The reception is on your right." as an utterance to a user detected in the predetermined space, and display related information or detailed information supplementing that guidance information as text information.

The guidance device 1 may also include a component that measures the user's body temperature, such as a thermal imaging camera. In this case, when a user detected in the predetermined space is determined in the response process to have a body temperature at or above a predetermined value (for example, 37.5 degrees or higher), the guidance device 1 may output guidance information such as "You may have a fever. Please proceed to the inspection area." as an utterance and display, as text information, detailed information that supplements this guidance information, such as "The partitioned area on your left is the inspection area."

In the above description, a single piece of text information is displayed, but the invention is not limited to this example. In S202, the response unit 163 may display on the display unit 11 a plurality of pieces of text information, each being related information or detailed information associated with the selected response scenario.

The response unit 163 may also select one of the plurality of pieces of text information based on an attribute of the user and display it on the display unit 11. In this case, in the response scenario table 151, each piece of text information is associated in advance with the user attribute it targets. For example, when the user attributes are male, female, accompanied by children, and so on, the "male" attribute is associated with the text information that guides the user to the men's restroom, and the "accompanied by children" attribute is associated with the text information that guides the user to the multi-purpose restroom.

In S202, the response unit 163 identifies the user's attribute based on the captured image acquired from the imaging unit 13. The attribute may be identified, for example, based on features extracted from the pixel values of the captured image, or by using a trained model that has learned in advance the relationship between images and attributes. The attribute may also be identified based on a distance image acquired from the distance measuring unit 12 or an audio signal acquired from the voice input/output unit 14.

The response unit 163 refers to the response scenario table 151 and selects the text information associated with the identified attribute. The response unit 163 outputs the guidance information associated with the selected response scenario as an utterance and displays the selected text information. In this way, the guidance device 1 can smoothly provide guidance suited to the user's attribute.
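The attribute-based selection of text information can be illustrated with a small lookup. The entries, the attribute labels, and the `default` fallback are assumptions for the example; how the attribute itself is identified from the image, distance, or audio data is outside this sketch.

```python
# Hypothetical text entries for one scenario, each tagged with a target attribute.
RESTROOM_TEXTS = [
    {"attribute": "male", "text": "Men's restroom: 1F, east side"},
    {"attribute": "female", "text": "Women's restroom: 1F, west side"},
    {"attribute": "with_children", "text": "Multi-purpose restroom: 1F, by the elevators"},
]


def select_text(attribute, entries, default=None):
    """Return the text whose target attribute matches the identified attribute."""
    for entry in entries:
        if entry["attribute"] == attribute:
            return entry["text"]
    return default  # assumed fallback when no entry targets the attribute
```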

In the above description, the guidance device 1 returns to S101 and executes the monitoring process after the response process ends, but it may instead return to S102 and detect a user in the predetermined space. This allows the guidance device 1 to provide smooth guidance when, for example, there are several users in the predetermined space or a particular user wants further guidance.

In the above description, the response unit 163, in the response process of S103, outputs the guidance information as an utterance by the character C and displays text information supplementing the guidance information, but the invention is not limited to this example. When the content to be uttered by the character C has been determined based on the user's utterance (for example, when a response scenario has been selected or the guidance information has been determined), the response unit 163 may output the guidance information as an utterance by the character C if the user's utterance is not continuing, and may display text corresponding to the guidance information as text information, without having the character C speak, if the user's utterance is continuing.

In this case, the storage unit 15 stores a response scenario table 151a, and the response unit 163 executes, in S103, the second response process described below.

FIG. 7 is a diagram showing an example of the data structure of the response scenario table 151a. The response scenario table 151a stores a scenario ID, a description, keywords, and guidance information in association with one another. That is, the response scenario table 151a differs from the response scenario table 151 of FIG. 3 in that it need not store text information.

FIG. 8 is a flowchart showing an example of the flow of the second response process.

First, the selection unit 164 selects a response scenario for the detected user in the same manner as in S201 (S301).

Next, the response unit 163 determines whether the user's utterance is continuing (S302). For example, the response unit 163 determines that the utterance is continuing when voice input to the voice input/output unit 14 is continuing or when the interruption of voice input has lasted less than a predetermined time (for example, 3 seconds), and determines that the utterance is not continuing when the interruption has lasted the predetermined time or longer. Here, voice input is "continuing" when the volume of the voice input indicated by the audio signal supplied by the voice input/output unit 14 is continuously at or above a predetermined value, and is "interrupted" when the volume is below the predetermined value.
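The determination in S302 can be sketched as a check on how long the input volume has stayed below the threshold. The 3-second pause comes from the description; the sample format, the volume threshold of 0.1, and the handling of input with no loud sample at all are assumptions.

```python
def is_utterance_continuing(samples, now, volume_threshold=0.1, max_pause_s=3.0):
    """Decide whether the user is still speaking at time `now`.

    `samples` is an iterable of (timestamp_seconds, volume) pairs. The utterance
    counts as continuing if the most recent sample at or above the threshold is
    less than `max_pause_s` seconds old.
    """
    last_loud = None
    for timestamp, volume in samples:
        if timestamp <= now and volume >= volume_threshold:
            if last_loud is None or timestamp > last_loud:
                last_loud = timestamp
    if last_loud is None:
        return False  # assumed: no voice input at all means not continuing
    return (now - last_loud) < max_pause_s
```

With samples `[(0.0, 0.5), (1.0, 0.4), (2.0, 0.02)]`, the last loud sample is at 1.0 s, so the utterance counts as continuing at 2.5 s but not at 4.0 s, when the pause reaches the 3-second limit.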

If the user's utterance is determined to be continuing (S302: Yes), the response unit 163 displays, as text on the display unit 11, the text information corresponding to the guidance information associated with the selected response scenario (S303) and ends the response process. If the utterance is determined not to be continuing (S302: No), the response unit 163 outputs the guidance information associated with the selected response scenario from the voice input/output unit 14 as an utterance by the character C (S304) and ends the response process.
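The branch from S302 to S303 or S304 reduces to a single conditional. In this sketch the `display` and `speak` callables are hypothetical stand-ins for the display unit 11 and the voice input/output unit 14, and the returned strings are only for illustration.

```python
def second_response(guidance, user_is_speaking, display, speak):
    """Show the guidance as text while the user talks; otherwise speak it."""
    if user_is_speaking:
        display(guidance)  # S303: text on the display, no speech by the character
        return "displayed"
    speak(guidance)  # S304: utterance by the character
    return "spoken"


# Usage: collect outputs in lists instead of driving real hardware.
shown, spoken = [], []
second_response("The reception is on your right.", True, shown.append, spoken.append)
second_response("The reception is on your right.", False, shown.append, spoken.append)
```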

Alternatively, the response process need not end once the guidance information has been displayed as text information on the display unit 11; when the user's utterance is subsequently determined not to be continuing, the guidance information may be output from the voice input/output unit 14 as an utterance by the character C, after which the response process ends. This conveys the guidance information promptly while still allowing communication by voice dialogue.

In this way, the guidance device 1 can present guidance information to the user without interrupting the user's utterance, and can therefore guide the user smoothly.

In the above description, the text information displayed in S303 and the guidance information output in S304 are the same information, but the invention is not limited to this example, and the two may differ. For example, the text information displayed in S303 may be related information or detailed information for the guidance information output in S304.

When information needed to complete the guidance information is missing, it is preferable for the guidance device 1 to display, as text information, preliminary guidance information, that is, a question requesting that information (for example, "Please tell us your name and whom you are visiting."). This makes it possible to obtain the information needed for guidance without interrupting the user's utterance, and thus to guide the user smoothly.

Although the text information has been described as being displayed on the display unit 11, the invention is not limited to this. For example, a projection device such as a projector may be provided to project the selected text information onto a wall, the floor, or the like.

The guidance device 1 may also indicate to the user, at the moment the user approaches the guidance device 1, that it is ready to accept voice input. For example, when the monitoring process switches to the response process, the display unit 11 displays an icon indicating the ambient environmental sound or the sound-collection state, a schematic waveform indicating the sound-collection level, or the like. Alternatively, an icon related to speech recognition may be displayed at all times, and its appearance may be changed when the device switches to the response process to show that voice input is being accepted. This lets the user see both that the guidance device 1 has recognized them and that the character C is accepting voice input, so the user can speak to it naturally.

Those skilled in the art will understand that various changes, substitutions, and modifications can be made without departing from the spirit and scope of the present invention. For example, the processes of the units described above may be executed in a different order as appropriate within the scope of the present invention. The embodiments and modifications described above may also be combined as appropriate within the scope of the present invention.

1 Guidance device
11 Display unit
12 Distance measuring unit
13 Imaging unit
14 Voice input/output unit
15 Storage unit
161 Monitoring unit
162 Detection unit
163 Response unit
164 Selection unit

Claims

An interactive guidance device comprising:
a display unit that displays an image of a character;
a detection unit that detects a person in a predetermined space;
a voice input/output unit that outputs utterances by the character as voice and accepts utterances by the detected person as voice input; and
a response unit that controls the voice input/output unit and the display unit to cause the character to converse with the detected person, thereby responding to the person,
wherein, in the response, the response unit outputs guidance information for the person from the voice input/output unit as an utterance by the character and displays, on the display unit, text information that supplements the guidance information.

The interactive guidance device according to claim 1, further comprising:
a storage unit that stores a plurality of response scenarios; and
a selection unit that selects, from among the plurality of response scenarios, a response scenario for the person based on an utterance by the person,
wherein, in the response, the response unit outputs the guidance information by voice and displays the text information based on the selected response scenario.

The interactive guidance device according to claim 1 or 2, wherein the response unit displays, as the text information, related information that is related to the guidance information and concerns a guidance target different from the guidance target of the guidance information.

The interactive guidance device according to claim 1 or 2, wherein the response unit displays, as the text information, detailed information that elaborates on the guidance information.

An interactive guidance device comprising:
a display unit that displays an image of a character;
a detection unit that detects a person in a predetermined space;
a voice input/output unit that outputs utterances by the character as voice and accepts utterances by the detected person as voice input; and
a response unit that controls the voice input/output unit and the display unit to cause the character to converse with the detected person, thereby responding to the person,
wherein, in the response, the response unit outputs guidance information for the person from the voice input/output unit as an utterance by the character when an utterance by the person is not continuing, and displays, as text on the display unit, text information corresponding to the guidance information for the person when an utterance by the person is continuing.