WO2024135001A1 - Remote control equipment and remote control method - Google Patents

Remote control equipment and remote control method

Info

Publication number
WO2024135001A1
WO2024135001A1 PCT/JP2023/031933 JP2023031933W WO2024135001A1 WO 2024135001 A1 WO2024135001 A1 WO 2024135001A1 JP 2023031933 W JP2023031933 W JP 2023031933W WO 2024135001 A1 WO2024135001 A1 WO 2024135001A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
user
target device
voice
registered
Application number
PCT/JP2023/031933
Other languages
French (fr)
Japanese (ja)
Inventor
秀紀 天花寺
Original Assignee
株式会社Jvcケンウッド
Priority claimed from JP2023116692A (JP2024091246A)
Application filed by 株式会社Jvcケンウッド
Publication of WO2024135001A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]

Definitions

  • This disclosure relates to a remote control device and a remote control method.
  • For example, a remote control device may be used to operate two televisions as target devices.
  • In this case, each of the two televisions must be given a unique name and registered in the remote control device, and the user must identify the television to be operated by speaking its name and then give voice instructions on the control content.
  • Such voice instructions are cumbersome, and there is a need to simplify them.
  • Patent Document 1 describes a technology for identifying a target device by detecting a user's line of sight from image information captured by an imaging device.
  • However, when multiple users are present within the imaging range of the imaging device, the imaging device may detect the line of sight of a user other than the user who wishes to operate the target device, and the target device may not be identified properly. Therefore, a remote control device and a remote control method that can more reliably identify a target device are desirable.
  • One or more embodiments aim to provide a remote control device and a remote control method that can properly identify the device to be operated even when multiple users are present.
  • A first aspect of one or more embodiments provides a remote control device including a user identification unit that identifies a user, an operation target device registration unit in which position information of multiple operation target devices is registered, a gaze detection unit that detects the gaze of the user identified by the user identification unit, an operation target device identification unit that identifies an operation target device located in the gaze direction detected by the gaze detection unit as an operation target device to be controlled, among the multiple operation target devices registered in the operation target device registration unit, and a voice response control unit that controls the operation target device identified by the operation target device identification unit in accordance with a voice picked up by a microphone.
  • A second aspect of one or more embodiments provides a remote control method in which a user identification unit identifies a user, a gaze detection unit detects the gaze direction of the user identified by the user identification unit, an operation target device identification unit identifies an operation target device located in the gaze direction detected by the gaze detection unit as the operation target device to be controlled among a plurality of operation target devices registered in an operation target device registration unit, and a voice response control unit controls the operation target device identified by the operation target device identification unit in accordance with a voice picked up by a microphone.
  • The remote control device and remote control method according to one or more embodiments can appropriately identify the device to be operated even when multiple users are present.
  • FIG. 1 is a diagram showing an example of a situation in which there is a remote control device according to the first or second embodiment and two televisions as devices to be operated.
  • FIG. 2 is a block diagram showing the remote control device according to the first embodiment.
  • FIG. 3 is an external perspective view showing the remote control device according to the first or second embodiment.
  • FIG. 4 is a diagram showing an example of map information registered in the operation target device registration unit in the remote control device according to the first or second embodiment.
  • FIG. 5 is a conceptual diagram showing an example of map information registered in the operation target device registration unit in the remote control device according to the first or second embodiment.
  • FIG. 6 is a conceptual diagram showing a face image of a user registered in a face image registration unit in the remote control device according to the first or second embodiment.
  • FIG. 7 is a block diagram showing a remote control device according to the second embodiment.
  • FIG. 8 is a conceptual diagram showing a user's voice registered in a voice registration unit in a remote control device according to the second embodiment.
  • FIG. 9 is a block diagram showing a remote control device according to the third embodiment.
  • In FIG. 1, it is assumed that television Tv1 is placed in the living room and television Tv2 is placed in the dining room.
  • Televisions Tv1 and Tv2 are examples of devices to be operated.
  • A user 60 operates television Tv1 or Tv2 using a remote control device 100 according to the first embodiment, a remote control device 200 according to the second embodiment, or a remote control device 300 according to the third embodiment.
  • The configurations and operations of the remote control device 100 according to the first embodiment, the remote control device 200 according to the second embodiment, and the remote control device 300 according to the third embodiment, and the remote control methods executed by the remote control devices 100, 200, and 300 will be described below with reference to the attached drawings.
  • The remote control device 100 includes a camera 1, a registered face image recognition unit 2, a face image registration unit 3, a gaze detection unit 4, an operation target device position calculation unit 5, an operation target device registration unit 6, an operation target device identification unit 7, a voice response control unit 8, and a remote control signal generation unit 9.
  • The remote control device 100 according to the first embodiment also includes a microphone 10, a voice recognition unit 11, a voice synthesis control unit 12, a voice synthesis unit 13, a speaker 14, an operation unit 15, and a registration control unit 16.
  • The remote control device 100 has a configuration in which the camera 1 is attached to the top of the housing 50.
  • The components shown in FIG. 2 other than the camera 1 are housed inside the housing 50. The camera 1 is preferably a wide-angle camera with a wide imaging range, for example a 360-degree camera.
  • The registered face image recognition unit 2, the operation target device position calculation unit 5, the operation target device identification unit 7, the voice recognition unit 11, the voice response control unit 8, the voice synthesis control unit 12, the voice synthesis unit 13, and the registration control unit 16 can be configured by a central processing unit (CPU) of a microcomputer.
  • The face image registration unit 3 and the operation target device registration unit 6 can be configured by a memory of a microcomputer.
  • The gaze detection unit 4 can be configured by applying the technology described in Patent Document 2.
  • The remote control signal generation unit 9 can be configured with a generating circuit including a light-emitting diode that generates an infrared code.
  • The voice recognition unit 11 and the voice synthesis unit 13 can be configured with an integrated circuit separate from the CPU.
  • The operation unit 15 can be an operation button provided on the housing 50, or it can be a touch pad.
  • The registration control unit 16 registers the names of the televisions Tv1 and Tv2 in the operation target device registration unit 6. As shown in FIG. 4, it is assumed that the television Tv1 is registered as the living room television and the television Tv2 is registered as the dining room television.
  • The registration control unit 16 controls the voice synthesis unit 13 to generate a voice from the speaker 14 instructing the user 60 to look at the televisions Tv1 and Tv2 in that order.
  • The method of giving instructions to the user 60 is not limited to voice.
  • The gaze detection unit 4 detects the gaze direction of the user 60.
  • The operation target device position calculation unit 5 calculates at least the direction of the television Tv1 as seen from the remote control device 100, based on the gaze direction detected by the gaze detection unit 4.
  • The operation target device position calculation unit 5 may calculate the approximate distance from the remote control device 100 to the television Tv1 by calculating the intersection of the left and right lines of sight detected by the gaze detection unit 4.
  • The position of a target device such as the television Tv1 is at least the direction of the target device as seen from the remote control device 100, and preferably both that direction and the distance from the remote control device 100 to the target device.
  • The registration control unit 16 registers the position information of the television Tv1 calculated by the operation target device position calculation unit 5 in the operation target device registration unit 6. Next, when the user 60 looks at the television Tv2, the operation target device position calculation unit 5 similarly calculates the position of Tv2, and the registration control unit 16 registers the position information of Tv2 in the operation target device registration unit 6.
  • Television Tv1 is located at a distance Ds1 in a direction Dr1 as viewed from the remote control device 100, and television Tv2 is located at a distance Ds2 in a direction Dr2 as viewed from the remote control device 100.
  • The direction Dr1 and distance Ds1 are registered as position information for television Tv1, which is the television in the living room, and the direction Dr2 and distance Ds2 are registered as position information for television Tv2, which is the television in the dining room.
  • The registration control unit 16 registers each operation target device as map information in the operation target device registration unit 6.
  • The method of registering target devices is not limited to the above; the user may manually input the position information of each target device.
  • The registration control unit 16 registers the face image of the user 60 captured by the camera 1 in the face image registration unit 3.
  • Face images of users who operate the remote control device 100 are registered in the face image registration unit 3 in association with user IDs.
  • The face images registered in the face image registration unit 3 may be face images of each user captured by the camera 1, or may be face images of each user captured by a different camera.
  • The user 60 shown in FIG. 1 is a user with a user ID of 0001.
  • The registered face image recognition unit 2 recognizes whether or not a person photographed by the camera 1 is a user registered in the face image registration unit 3, based on a face image registered in the face image registration unit 3.
  • The gaze detection unit 4 detects the gaze direction of a person whom the registered face image recognition unit 2 recognizes as a user registered in the face image registration unit 3. Because the gaze detection unit 4 detects the gaze direction only of a person recognized as a registered user, erroneous detection of the gaze of a person other than the registered user can be avoided.
  • The registered face image recognition unit 2 and the face image registration unit 3 function as a user identification unit that identifies a user who operates the remote control device 100.
  • The operation target device identification unit 7 identifies, among the multiple operation target devices registered in the operation target device registration unit 6, an operation target device located in the gaze direction detected by the gaze detection unit 4 as an operation target device to be operated by voice. As shown in FIG. 1, when the user 60 is looking at the television Tv1, the operation target device identification unit 7 identifies the television Tv1 as the operation target device to be operated by voice.
  • Suppose that the operation target device identification unit 7 has identified the television Tv1 as the device to be controlled and the user 60 utters, for example, "Please turn on the power."
  • The voice recognition unit 11 recognizes the voice and instructs the voice response control unit 8 to turn on the power.
  • The voice response control unit 8 obtains, from the operation target device identification unit 7, information that the television Tv1 has been identified as the device to be controlled.
  • The voice response control unit 8 instructs the remote control signal generation unit 9 to generate a remote control signal to turn on the power of the television Tv1.
  • The remote control signal generation unit 9 receives the instruction from the voice response control unit 8 and generates a remote control signal to turn on the power of the television Tv1. This turns on the power of the television Tv1.
  • Similarly, while the operation target device identification unit 7 has identified the television Tv1 as the device to be controlled, if the user 60 utters, for example, "Turn up the volume," the volume of the television Tv1 can be turned up.
  • The user 60 does not need to instruct the remote control device 100 using long sentences such as "Please turn on the television in the living room" or "Please turn up the volume of the television in the living room."
  • The remote control device 100 can thus simplify voice instructions.
  • If the target device has not been identified, the voice synthesis control unit 12 controls the voice synthesis unit 13 to synthesize a question such as "Would you like to turn on the TV in the living room or the TV in the dining room?" and output it from the speaker 14.
  • The voice recognition unit 11 recognizes the voice uttered by the user 60 in reply and instructs the voice response control unit 8 to turn on the power of TV Tv1, which is the TV in the living room.
  • The voice response control unit 8 instructs the remote control signal generation unit 9 to generate a remote control signal to turn on the power of TV Tv1.
  • In other words, if the operation target device identification unit 7 does not identify the target device, it may be necessary to emit voice from the speaker 14, pick up the reply with the microphone 10, and recognize it with the voice recognition unit 11 before the voice response control unit 8 can identify the target device.
  • By having the operation target device identification unit 7 identify the target device, it becomes possible to operate the device by speaking extremely simple sentences.
  • The device to be operated can be appropriately identified by detecting the line of sight of the registered user and identifying the device to be operated.
  • The remote control device 100 identifies the device to be operated based on the line of sight of a specific user registered in the face image registration unit 3, so even if there are multiple users, it can accurately identify the device that a specific user wants to operate.
  • Second Embodiment: FIG. 7 shows a remote control device 200 according to the second embodiment.
  • The remote control device 200 includes a voice registration unit 17 and a registered voice recognition unit 18, which are components not included in the remote control device 100. Registration of a device to be operated is the same as in the first embodiment, and the description thereof will be omitted.
  • The registration control unit 16 registers the voice of the user 60 in the voice registration unit 17.
  • The microphone 10 picks up a predetermined sentence uttered by the user 60.
  • The registration control unit 16 registers the voice picked up by the microphone 10 in the voice registration unit 17.
  • The registration control unit 16 registers the user's voice in the voice registration unit 17 in association with the user ID.
  • The user IDs registered in the voice registration unit 17 correspond one-to-one with the user IDs registered in the face image registration unit 3. In other words, if a user registered in the face image registration unit 3 and a user registered in the voice registration unit 17 have the same user ID, they are the same user. Therefore, the voice registration unit 17 registers the user's voice in association with the user's face image registered in the face image registration unit 3.
  • The registered voice recognition unit 18 recognizes whether the voice picked up by the microphone 10 is the voice of a user registered in the voice registration unit 17.
  • The registered voice recognition unit 18 can recognize whether the voice is that of a registered user based on the voiceprint of the voice.
  • When the registered voice recognition unit 18 recognizes that the voice picked up by the microphone 10 is the voice of the user 60 registered in the voice registration unit 17, it identifies the user ID and identifies the user face image corresponding to the identified user ID based on the user face image data registered in the face image registration unit 3.
  • The registered face image recognition unit 2 recognizes the face image of the identified user.
  • The gaze detection unit 4 detects the gaze direction of the user recognized by the registered face image recognition unit 2.
  • The operation target device identification unit 7 identifies, among the multiple operation target devices registered in the operation target device registration unit 6, the operation target device located in the gaze direction detected by the gaze detection unit 4 as the operation target device to be controlled.
  • When the registered voice recognition unit 18 recognizes that the voice picked up by the microphone 10 is the voice of the user 60 registered in the voice registration unit 17, it notifies the voice response control unit 8 that it has recognized the voice of a user registered in the voice registration unit 17.
  • When the voice response control unit 8 is notified that the voice of the registered user has been recognized, it instructs the remote control signal generation unit 9 to generate a remote control signal to turn on the power of the operation target device identified by the operation target device identification unit 7, following the instruction to turn on the power from the voice recognition unit 11.
  • The face image registration unit 3 and the voice registration unit 17 are shown separately, but they may be configured as a single user registration unit.
  • The registered face image recognition unit 2, the face image registration unit 3, the voice registration unit 17, and the registered voice recognition unit 18 function as a user identification unit that identifies the user operating the remote control device 200.
  • With the remote control device 200, even if there are multiple registered users and each registered user issues a voice command, the remote control device 200 identifies the user for each detected voice and identifies the device to be operated based on the gaze direction of the identified user, so that it can identify the device to be operated more appropriately than the remote control device 100.
  • The remote control device 200 is equipped with a registered voice recognition unit 18 that identifies a user based on an operation instruction voice, in addition to the configuration of the remote control device 100. Therefore, even if there are multiple registered users, it is possible to identify the user who issued the operation instruction voice and to identify the device to be operated based on the line of sight of the identified user. As a result, even if there are multiple users, the remote control device 200 can more accurately identify the device that a specific user wants to operate.
  • Third Embodiment: FIG. 9 shows a remote control device 300 according to the third embodiment.
  • The same parts as those in the remote control device 100 are given the same reference numerals, and their description will be omitted.
  • The registered face image recognition unit 2 recognizes a face registered in the face image registration unit 3, thereby identifying the user who operates the remote control device 100.
  • In contrast, the remote control device 300 includes a voice detection unit 30 that estimates the direction of voice generation, and identifies the user who operates the remote control device 300 based on the direction of voice generation estimated by the voice detection unit 30.
  • The voice detection unit 30 detects voice and estimates the direction of the detected voice.
  • The voice detection unit 30 transmits the detected voice to the voice recognition unit 11, and transmits information indicating the estimated voice generation direction to the user identification unit 31.
  • The voice detection unit 30 can estimate the voice generation direction using existing technology.
  • For example, the voice detection unit 30 includes multiple microphones, and estimates the voice generation direction based on the phase difference of the voice detected by the multiple microphones.
  • The voice detection unit 30 does not need to be configured with multiple microphones integrated into one unit, and multiple microphones may be installed in different locations in a room.
  • The voice detection unit 30 may estimate the direction of voice generation based on the volume or phase difference of the voice detected by each of the multiple microphones installed in different locations in a room.
  • The method of estimating the direction of voice generation is not limited to the method using multiple microphones, and other methods may be used.
  • The user identification unit 31 identifies the user operating the remote control device 300 based on the information indicating the voice generation direction supplied by the voice detection unit 30 and the imaging results of the camera 1. Specifically, the user identification unit 31 identifies a person present in the voice generation direction estimated by the voice detection unit 30 as the user operating the remote control device 300.
  • The user identification unit 31 can be configured with a CPU.
  • The gaze detection unit 4 detects the gaze of the user identified by the user identification unit 31.
  • The operation target device identification unit 7 identifies, from among multiple operation target devices registered in the operation target device registration unit 6, an operation target device located in the gaze direction detected by the gaze detection unit 4 as an operation target device to be operated by voice.
  • The operation after the operation target device identification unit 7 identifies the operation target device is the same as that of the remote control device 100.
  • A modified version of the third embodiment may be configured as follows.
  • The user identification unit 31 identifies, from among the people captured by the camera 1, a person whose mouth moved at the time the voice detection unit 30 detected voice, as the user operating the remote control device 300. In this case, the voice detection unit 30 does not need to have the function of estimating the direction of voice generation.
  • The present invention is not limited to the first to third embodiments described above, and various modifications are possible without departing from the gist of the present invention.
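
As an illustration of the phase-difference approach mentioned for the voice detection unit 30, the following is a minimal sketch and not the disclosed implementation. It assumes a far-field source, exactly two microphones, a single dominant frequency, and a microphone spacing below half a wavelength so that the phase difference does not wrap. The function name arrival_angle_deg, its parameters, and the 343 m/s speed of sound are illustrative assumptions that do not appear in the publication.

```python
import math

# Assumed speed of sound in air at room temperature (illustrative value).
SPEED_OF_SOUND_M_S = 343.0


def arrival_angle_deg(phase_diff_rad, frequency_hz, mic_spacing_m):
    """Estimate the voice generation direction from a two-microphone phase difference.

    The angle is measured from the broadside direction (perpendicular to the
    line joining the two microphones), in degrees.
    """
    # Convert the phase difference at this frequency into a time delay.
    delay_s = phase_diff_rad / (2.0 * math.pi * frequency_hz)
    # Far-field geometry: delay = spacing * sin(angle) / speed of sound.
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Clamp so that measurement noise cannot push asin() out of its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A user identification unit along the lines of unit 31 could then compare this angle with the bearing of each person seen by the camera; that matching step is omitted here.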

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • General Engineering & Computer Science
  • Human Computer Interaction
  • Physics & Mathematics
  • General Physics & Mathematics
  • Health & Medical Sciences
  • Audiology, Speech & Language Pathology
  • General Health & Medical Sciences
  • Multimedia
  • Signal Processing
  • Selective Calling Equipment

Abstract

A user identification unit (registered face image recognition unit (2) and face image registration unit (3)) identifies a user. Positional information of a plurality of operation-target devices is registered in an operation-target device registration unit (6). A line-of-sight detection unit (4) detects the line of sight of the user identified by the user identification unit. An operation-target device identification unit (7) identifies, as an operation-target device to be controlled, an operation-target device located in the direction of the line of sight detected by the line-of-sight detection unit (4), from among the plurality of operation-target devices registered in the operation-target device registration unit (6). A voice response control unit (8) controls the identified operation-target device according to voice collected by a microphone (10).

Description

Remote control device and remote control method
This disclosure relates to a remote control device and a remote control method.
Remote control devices known as smart speakers, which can recognize voice and control various devices by voice, are becoming widespread.
Patent documents: International Publication No. 2019/142295; JP 2017-102687 A
For example, a remote control device may be used to operate two televisions as target devices. In this case, each of the two televisions must be given a unique name and registered in the remote control device, and the user must identify the television to be operated by speaking its name and then give voice instructions on the control content. Such voice instructions are cumbersome, and there is a need to simplify them. Patent Document 1 describes a technology for identifying a target device by detecting a user's line of sight from image information captured by an imaging device. However, when multiple users are present within the imaging range of the imaging device, the imaging device may detect the line of sight of a user other than the user who wishes to operate the target device, and the target device may not be identified properly. Therefore, a remote control device and a remote control method that can more reliably identify a target device are desirable.
One or more embodiments aim to provide a remote control device and a remote control method that can properly identify the device to be operated even when multiple users are present.
A first aspect of one or more embodiments provides a remote control device including a user identification unit that identifies a user, an operation target device registration unit in which position information of multiple operation target devices is registered, a gaze detection unit that detects the gaze of the user identified by the user identification unit, an operation target device identification unit that identifies an operation target device located in the gaze direction detected by the gaze detection unit as an operation target device to be controlled, among the multiple operation target devices registered in the operation target device registration unit, and a voice response control unit that controls the operation target device identified by the operation target device identification unit in accordance with a voice picked up by a microphone.
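
The identification step in the first aspect can be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: it works in a 2D floor plan, stores each device as a direction and distance seen from the remote control device (as in the registered map information), and assumes the user's position relative to the remote control device is already known. The names RegisteredDevice, device_xy, identify_device, and the 15-degree tolerance are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RegisteredDevice:
    name: str
    direction_deg: float  # direction as seen from the remote control device
    distance_m: float     # distance from the remote control device


def device_xy(dev):
    """Convert a registered direction and distance into x/y coordinates."""
    a = math.radians(dev.direction_deg)
    return dev.distance_m * math.cos(a), dev.distance_m * math.sin(a)


def identify_device(devices, user_xy, gaze_deg, tolerance_deg=15.0):
    """Return the registered device closest to the gaze direction, or None.

    user_xy is the user's position relative to the remote control device and
    gaze_deg is the gaze direction in the same coordinate frame (assumptions).
    """
    best, best_err = None, tolerance_deg
    for dev in devices:
        dx, dy = device_xy(dev)
        # Bearing from the user to the device.
        bearing = math.degrees(math.atan2(dy - user_xy[1], dx - user_xy[0]))
        # Smallest absolute angular difference, wrapped into [0, 180].
        err = abs((bearing - gaze_deg + 180.0) % 360.0 - 180.0)
        if err <= best_err:
            best, best_err = dev, err
    return best
```

Returning None when no device falls within the tolerance leaves room for a fallback such as asking the user which device is meant.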
A second aspect of one or more embodiments provides a remote control method in which a user identification unit identifies a user, a gaze detection unit detects the gaze direction of the user identified by the user identification unit, an operation target device identification unit identifies an operation target device located in the gaze direction detected by the gaze detection unit as the operation target device to be controlled among a plurality of operation target devices registered in an operation target device registration unit, and a voice response control unit controls the operation target device identified by the operation target device identification unit in accordance with a voice picked up by a microphone.
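
The order of steps in the method of the second aspect can be sketched as a small pipeline. This is a hypothetical outline rather than the disclosed implementation: each unit is modeled as a plain callable supplied by the caller, and the class name RemoteControlPipeline, its field names, and the handle method are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RemoteControlPipeline:
    identify_user: Callable[[object], Optional[str]]       # frame -> user ID or None
    detect_gaze: Callable[[object, str], Optional[float]]  # frame, user ID -> gaze direction
    identify_device: Callable[[float], Optional[str]]      # gaze direction -> device name
    send_command: Callable[[str, str], None]               # device name, recognized command

    def handle(self, frame, command):
        """Run the four steps in order; return the controlled device or None."""
        user_id = self.identify_user(frame)
        if user_id is None:
            return None  # no identified user, so no gaze is evaluated
        gaze = self.detect_gaze(frame, user_id)
        if gaze is None:
            return None
        device = self.identify_device(gaze)
        if device is None:
            return None  # a real device might ask the user which device is meant
        self.send_command(device, command)
        return device
```

Because gaze detection only runs for an identified user, the gaze of other people in the camera image never reaches the device identification step, which is the point of the method.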
 1またはそれ以上の実施形態に係る遠隔制御装置及び遠隔制御方法によれば、複数のユーザが存在しても適切に操作対象機器を特定することができる。 The remote control device and remote control method according to one or more embodiments can appropriately identify the device to be operated even when multiple users are present.
図1は、第1または第2実施形態に係る遠隔制御装置と、操作対象機器としての2台のテレビが存在する状況の例を示す図である。FIG. 1 is a diagram showing an example of a situation in which there is a remote control device according to the first or second embodiment and two televisions as devices to be operated. 図2は、第1実施形態に係る遠隔制御装置を示すブロック図である。FIG. 2 is a block diagram showing the remote control device according to the first embodiment. 図3は、第1または第2実施形態に係る遠隔制御装置を示す外観斜視図である。FIG. 3 is an external perspective view showing the remote control device according to the first or second embodiment. 図4は、第1または第2実施形態に係る遠隔制御装置における操作対象機器登録部に登録されているマップ情報の例を示す図である。FIG. 4 is a diagram showing an example of map information registered in the operation target device registration unit in the remote control device according to the first or second embodiment. 図5は、第1または第2実施形態に係る遠隔制御装置における操作対象機器登録部に登録されているマップ情報の例を示す概念図である。FIG. 5 is a conceptual diagram showing an example of map information registered in the operation target device registration section in the remote control device according to the first or second embodiment. 図6は、第1または第2実施形態に係る遠隔制御装置における顔画像登録部に登録されているユーザの顔画像を示す概念図である。FIG. 6 is a conceptual diagram showing a face image of a user registered in a face image registration section in the remote control device according to the first or second embodiment. 図7は、第2実施形態に係る遠隔制御装置を示すブロック図である。FIG. 7 is a block diagram showing a remote control device according to the second embodiment. 図8は、第2実施形態に係る遠隔制御装置における音声登録部に登録されているユーザの音声を示す概念図である。FIG. 8 is a conceptual diagram showing a user's voice registered in a voice registration unit in a remote control device according to the second embodiment. 図9は、第3実施形態に係る遠隔制御装置を示すブロック図である。FIG. 9 is a block diagram showing a remote control device according to the third embodiment.
In FIG. 1, it is assumed that the television Tv1 is placed in the living room and the television Tv2 is placed in the dining room. The televisions Tv1 and Tv2 are examples of operation target devices. A user 60 operates the television Tv1 or Tv2 using the remote control device 100 according to the first embodiment, the remote control device 200 according to the second embodiment, or the remote control device 300 according to the third embodiment. The configurations and operations of the remote control devices 100, 200, and 300, and the remote control methods executed by them, are described below with reference to the attached drawings.
First Embodiment
In FIG. 2, the remote control device 100 according to the first embodiment includes a camera 1, a registered face image recognition unit 2, a face image registration unit 3, a gaze detection unit 4, an operation target device position calculation unit 5, an operation target device registration unit 6, an operation target device identification unit 7, a voice response control unit 8, and a remote control signal generation unit 9. The remote control device 100 also includes a microphone 10, a voice recognition unit 11, a voice synthesis control unit 12, a voice synthesis unit 13, a speaker 14, an operation unit 15, and a registration control unit 16.
As shown in FIG. 3, the remote control device 100 has a configuration in which the camera 1 is attached to the top of a housing 50. The components shown in FIG. 2 other than the camera 1 are housed inside the housing 50. The camera 1 is preferably a wide-angle camera with a wide imaging range, for example a 360-degree camera.
The registered face image recognition unit 2, the operation target device position calculation unit 5, the operation target device identification unit 7, the voice recognition unit 11, the voice response control unit 8, the voice synthesis control unit 12, the voice synthesis unit 13, and the registration control unit 16 can be implemented by the central processing unit (CPU) of a microcomputer. The face image registration unit 3 and the operation target device registration unit 6 can be implemented by the memory of the microcomputer. The gaze detection unit 4 can be implemented by applying the technology described in Patent Document 2.
The remote control signal generation unit 9 can be implemented by a generation circuit including a light-emitting diode that emits infrared codes. The voice recognition unit 11 and the voice synthesis unit 13 may be implemented by an integrated circuit separate from the CPU. The operation unit 15 may be an operation button provided on the housing 50 or a touch pad.
(Registration of operation target devices)
When the user 60 (or another user) operates the operation unit 15, the registration control unit 16 registers the names of the televisions Tv1 and Tv2 in the operation target device registration unit 6. As shown in FIG. 4, it is assumed that the television Tv1 is registered under the name "living room television" and the television Tv2 under the name "dining room television".
When the user 60 operates the operation unit 15 to set a registration mode for registering the position information of the televisions Tv1 and Tv2, the registration control unit 16 controls the voice synthesis unit 13 so that the speaker 14 outputs a voice instructing the user to look at the televisions Tv1 and Tv2 in turn. The way of instructing the user 60 is not limited to voice. When the user 60 looks at the television Tv1, the gaze detection unit 4 detects the gaze direction of the user 60. Based on the gaze direction detected by the gaze detection unit 4, the operation target device position calculation unit 5 calculates at least the direction of the television Tv1 as seen from the remote control device 100.
The operation target device position calculation unit 5 may calculate the approximate distance from the remote control device 100 to the television Tv1 by calculating the intersection of the left and right lines of sight detected by the gaze detection unit 4. The position of an operation target device such as the television Tv1 means at least the direction of the device as seen from the remote control device 100, and preferably both that direction and the distance from the remote control device 100 to the device.
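The intersection of the left and right lines of sight can be illustrated with a small geometric sketch. The following is not the method of the embodiment, only one way to compute it under stated assumptions: the problem is reduced to a 2D plane, both eye positions and gaze directions are already expressed in a coordinate frame whose origin is the remote control device, and the function names `gaze_intersection` and `direction_and_distance` are hypothetical.

```python
import math


def gaze_intersection(left_eye, left_dir, right_eye, right_dir, eps=1e-9):
    """Return the 2D point where the two gaze rays cross.

    Returns None if the rays are (nearly) parallel or cross behind the eyes.
    """
    (x1, y1), (dx1, dy1) = left_eye, left_dir
    (x2, y2), (dx2, dy2) = right_eye, right_dir
    denom = dx1 * dy2 - dy1 * dx2  # 2D cross product of the two directions
    if abs(denom) < eps:
        return None
    # Solve left_eye + t * left_dir == right_eye + s * right_dir for t.
    t = ((x2 - x1) * dy2 - (y2 - y1) * dx2) / denom
    if t <= 0:
        return None  # the lines cross behind the user
    return (x1 + t * dx1, y1 + t * dy1)


def direction_and_distance(point, origin=(0.0, 0.0)):
    """Direction (degrees) and distance of a point as seen from the origin."""
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    return math.degrees(math.atan2(dy, dx)), math.hypot(dx, dy)
```

With eyes 6 cm apart and both looking at a point 2 m straight ahead, the sketch returns that point, which then converts to a direction and a distance of the kind the text calls Dr1 and Ds1. In practice small angular errors in the detected gaze produce large distance errors, which is consistent with the text describing the distance as approximate.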
The registration control unit 16 registers the position information of the television Tv1 calculated by the operation target device position calculation unit 5 in the operation target device registration unit 6. Next, when the user 60 looks at the television Tv2, the operation target device position calculation unit 5 calculates the position of the television Tv2 in the same way, and the registration control unit 16 registers the position information of the television Tv2 in the operation target device registration unit 6.
As shown in FIG. 5, it is assumed that the television Tv1 is located at a distance Ds1 in a direction Dr1 as seen from the remote control device 100, and the television Tv2 is located at a distance Ds2 in a direction Dr2 as seen from the remote control device 100. As shown in FIG. 4, the operation target device registration unit 6 stores the direction Dr1 and the distance Ds1 as the position information of the television Tv1, which is the living room television, and the direction Dr2 and the distance Ds2 as the position information of the television Tv2, which is the dining room television. In this way, the registration control unit 16 registers each operation target device in the operation target device registration unit 6 as map information.
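The map information of FIG. 4 can be pictured as a small table keyed by device. The sketch below is only an illustration of that idea: the class names `DeviceEntry` and `DeviceRegistry`, the units (degrees, metres), and the numeric values in the usage note are assumptions, since the text gives only the symbols Dr1, Ds1, Dr2, and Ds2. The distance is optional because the text requires at least the direction.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DeviceEntry:
    """One row of the map information: a name plus a position."""
    name: str
    direction_deg: float                 # direction seen from the remote control device
    distance_m: Optional[float] = None   # distance is preferred but not required


class DeviceRegistry:
    """Hypothetical stand-in for the operation target device registration unit."""

    def __init__(self) -> None:
        self._entries: Dict[str, DeviceEntry] = {}

    def register(self, device_id: str, name: str, direction_deg: float,
                 distance_m: Optional[float] = None) -> None:
        self._entries[device_id] = DeviceEntry(name, direction_deg, distance_m)

    def get(self, device_id: str) -> Optional[DeviceEntry]:
        return self._entries.get(device_id)

    def all_entries(self) -> Dict[str, DeviceEntry]:
        return dict(self._entries)
```

For example, `registry.register("Tv1", "living room television", 30.0, 2.5)` would store one entry with made-up values, and the same call also covers the manual-entry case in which the user types in the position instead of looking at the device.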
The method of registering operation target devices is not limited to the above; the user may instead enter the position information of each operation target device manually.
(Registration of user face images)
In FIG. 2, when the user 60 (or another user) operates the operation unit 15 to set a registration mode for registering a face image, the registration control unit 16 registers the face image of the user 60 captured by the camera 1 in the face image registration unit 3. As shown in FIG. 6, the face images of the users who operate the remote control device 100 are registered in the face image registration unit 3 in association with user IDs. The face images registered in the face image registration unit 3 may be images of each user captured by the camera 1 or images captured by a different camera.
(Operation of devices)
The user 60 shown in FIG. 1 is the user whose user ID is 0001. Based on the face images registered in the face image registration unit 3, the registered face image recognition unit 2 recognizes whether or not a person captured by the camera 1 is a user registered in the face image registration unit 3. The gaze detection unit 4 detects the gaze direction of a person whom the registered face image recognition unit 2 has recognized as a registered user. Because the gaze detection unit 4 detects the gaze direction only of a person recognized as a registered user, erroneous detection of the gaze of a person other than a registered user can be avoided. The registered face image recognition unit 2 and the face image registration unit 3 function as a user identification unit that identifies the user who operates the remote control device 100.
Among the plurality of operation target devices registered in the operation target device registration unit 6, the operation target device identification unit 7 identifies the device located in the gaze direction detected by the gaze detection unit 4 as the operation target device to be operated by voice. As shown in FIG. 1, when the user 60 is looking at the television Tv1, the operation target device identification unit 7 identifies the television Tv1 as the operation target device to be operated by voice.
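One way to read "the device located in the gaze direction" is as a nearest-bearing search with a tolerance. The sketch below assumes a 2D layout, a known user position in the remote control device's coordinate frame, registered positions given as (direction, distance) pairs, and a tolerance of 15 degrees; none of these details comes from the text, and the function names are hypothetical.

```python
import math


def angle_diff_deg(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def identify_target(devices, user_xy, gaze_deg, tolerance_deg=15.0):
    """Return the ID of the device closest to the gaze ray, or None.

    devices maps a device ID to (direction_deg, distance_m) as seen from
    the remote control device, which sits at the origin.
    """
    best_id, best_err = None, tolerance_deg
    for device_id, (direction_deg, distance_m) in devices.items():
        # Convert the registered polar position to x/y coordinates.
        dev_x = distance_m * math.cos(math.radians(direction_deg))
        dev_y = distance_m * math.sin(math.radians(direction_deg))
        # Bearing of the device as seen from the user, not from the remote.
        bearing = math.degrees(math.atan2(dev_y - user_xy[1], dev_x - user_xy[0]))
        err = angle_diff_deg(bearing, gaze_deg)
        if err <= best_err:
            best_id, best_err = device_id, err
    return best_id
```

Returning None when no device falls within the tolerance corresponds to the state in which no operation target device has been identified.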
Suppose that, while the operation target device identification unit 7 has identified the television Tv1 as the target device, the user 60 says, for example, "Please turn on the power." The voice recognition unit 11 recognizes the voice and instructs the voice response control unit 8 to turn on the power. The voice response control unit 8 obtains from the operation target device identification unit 7 the information that the television Tv1 has been identified as the target device. The voice response control unit 8 then instructs the remote control signal generation unit 9 to generate a remote control signal that turns on the power of the television Tv1.
On receiving the instruction from the voice response control unit 8, the remote control signal generation unit 9 generates a remote control signal that turns on the power of the television Tv1. The television Tv1 is thereby powered on.
Similarly, if the user 60 says, for example, "Please turn up the volume" while the operation target device identification unit 7 has identified the television Tv1 as the target device, the volume of the television Tv1 can be turned up.
In this way, even when the televisions Tv1 and Tv2 are both present as operation target devices, the user 60 does not need to instruct the remote control device 100 with long sentences such as "Please turn on the living room television" or "Please turn up the volume of the living room television." The remote control device 100 thus makes it possible to simplify voice instructions.
Now suppose that the user 60 says "Please turn on the power" while the operation target device identification unit 7 has not identified an operation target device to be operated by voice. In this case, the voice synthesis control unit 12 controls the voice synthesis unit 13 so that a question such as "Do you want to turn on the living room television or the dining room television?" is synthesized and output from the speaker 14.
On hearing the question, the user 60 says "Please turn on the living room television." The voice recognition unit 11 recognizes the voice uttered by the user 60 and instructs the voice response control unit 8 to turn on the power of the television Tv1, which is the living room television. The voice response control unit 8 instructs the remote control signal generation unit 9 to generate a remote control signal that turns on the power of the television Tv1.
Thus, when the operation target device identification unit 7 has not identified the target device, output of a voice from the speaker 14, pickup of a voice by the microphone 10, and recognition of that voice by the voice recognition unit 11 may be needed before the voice response control unit 8 can determine the target device. When the operation target device identification unit 7 identifies the target device, the device can be operated with a very short spoken sentence.
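The two paths described above (a target already identified by gaze versus no target identified) can be summarized as a small decision function. This is a minimal sketch under assumptions: the function name `handle_command`, the tuple-based return value, and the wording of the question are all hypothetical and stand in for the interplay of the voice response control unit, the voice synthesis control unit, and the remote control signal generation unit.

```python
def handle_command(command, target_id, device_names):
    """Decide what to do with a recognized voice command.

    Returns ("send", device_id, command) when gaze has already selected a
    device, or ("ask", question) when the user must be asked which device
    is meant.
    """
    if target_id is not None:
        # Gaze identified the device, so the short command is enough.
        return ("send", target_id, command)
    # No device identified: build a question listing every registered name.
    options = " or ".join(f"the {name}" for name in device_names.values())
    return ("ask", f"Which device should receive '{command}': {options}?")
```

Called with `target_id="Tv1"`, the sketch goes straight to sending the command; called with `target_id=None`, it produces a question that names both televisions, which is the extra dialogue round the gaze-based identification is meant to avoid.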
Furthermore, even when multiple people, including people other than registered users, are present within the imaging range of the camera 1, the operation target device can be appropriately identified by detecting the gaze of a registered user.
The above description covers an example in which a voice is detected while the operation target device has already been identified from the gaze of a registered user, and the device is then operated based on the detected voice. However, the gaze of the registered user does not need to be detected at all times; the gaze may instead be detected, and the operation target device identified, when a voice is detected.
Because the remote control device 100 identifies the operation target device based on the gaze of a specific user registered in the face image registration unit 3, it can accurately identify the device that the specific user wants to operate even when multiple people are present.
Second Embodiment
FIG. 7 shows a remote control device 200 according to the second embodiment. In the remote control device 200, parts identical to those of the remote control device 100 are given the same reference numerals, and their description is omitted. The remote control device 200 includes a voice registration unit 17 and a registered voice recognition unit 18, which the remote control device 100 does not have. Registration of operation target devices is the same as in the first embodiment, and its description is omitted.
(Registration of user voices)
In FIG. 7, when the user 60 (or another user) operates the operation unit 15 to set a user registration mode for registering the user 60, the registration control unit 16 registers the voice of the user 60 in the voice registration unit 17. While the user registration mode is set, the microphone 10 picks up a predetermined sentence uttered by the user 60. The registration control unit 16 registers the voice picked up by the microphone 10 in the voice registration unit 17.
As shown in FIG. 8, the registration control unit 16 registers each user's voice in the voice registration unit 17 in association with a user ID. The user IDs registered in the voice registration unit 17 correspond one-to-one with the user IDs registered in the face image registration unit 3. In other words, a user registered in the face image registration unit 3 and a user registered in the voice registration unit 17 are the same user if their user IDs are the same. The voice registration unit 17 therefore registers each user's voice in association with that user's face image registered in the face image registration unit 3.
Suppose that, during normal operation of the remote control device 200, the user 60 says, for example, "Please turn on the power." The registered voice recognition unit 18 recognizes whether or not the voice picked up by the microphone 10 is the voice of a user registered in the voice registration unit 17. The registered voice recognition unit 18 can make this determination based on the voiceprint of the voice.
When the registered voice recognition unit 18 recognizes that the voice picked up by the microphone 10 is the voice of the user 60 registered in the voice registration unit 17, it identifies the user ID and, based on the user face image data registered in the face image registration unit 3, identifies the face image corresponding to that user ID. The registered face image recognition unit 2 recognizes the face of the identified user. The gaze detection unit 4 detects the gaze direction of the user recognized by the registered face image recognition unit 2. Among the plurality of operation target devices registered in the operation target device registration unit 6, the operation target device identification unit 7 identifies the device located in the gaze direction detected by the gaze detection unit 4 as the operation target device to be controlled.
When the registered voice recognition unit 18 recognizes that the voice picked up by the microphone 10 is the voice of the user 60 registered in the voice registration unit 17, it also notifies the voice response control unit 8 that the voice of a registered user has been recognized. On receiving this notification, the voice response control unit 8 follows the power-on instruction from the voice recognition unit 11 and instructs the remote control signal generation unit 9 to generate a remote control signal that turns on the power of the operation target device identified by the operation target device identification unit 7.
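The chain from voice to user ID to that user's gaze can be sketched as follows. The text says only that the speaker is recognized from a voiceprint; representing voiceprints as numeric feature vectors compared by cosine similarity with a 0.8 threshold is an assumption made for illustration, as are the function names and the `visible_gazes` mapping (user IDs of faces currently recognized by the camera, each with a detected gaze direction).

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify_speaker(voiceprint, voice_registry, threshold=0.8):
    """Return the user ID whose registered voiceprint matches best, or None."""
    best_id, best_score = None, threshold
    for user_id, reference in voice_registry.items():
        score = cosine(voiceprint, reference)
        if score >= best_score:
            best_id, best_score = user_id, score
    return best_id


def gaze_of_speaker(voiceprint, voice_registry, visible_gazes):
    """Gaze direction of the registered user who spoke, or None."""
    user_id = identify_speaker(voiceprint, voice_registry)
    if user_id is None:
        return None  # not a registered voice, so the command is ignored
    # The shared user ID links the voice entry to the face entry.
    return visible_gazes.get(user_id)
```

The shared user ID is what lets the gaze of the person who actually spoke be used, even when several registered users are in view at once.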
In FIG. 7, the face image registration unit 3 and the voice registration unit 17 are shown separately, but they may be configured as a single user registration unit. The registered face image recognition unit 2, the face image registration unit 3, the voice registration unit 17, and the registered voice recognition unit 18 function as a user identification unit that identifies the user who operates the remote control device 200.
With the remote control device 200, even when, for example, there are multiple registered users and each of them utters a voice instruction, the user is identified for each detected voice and the operation target device is identified based on the gaze direction of that user. The remote control device 200 can therefore identify the operation target device more appropriately than the remote control device 100.
In addition to the configuration of the remote control device 100, the remote control device 200 includes the registered voice recognition unit 18, which identifies a user based on the operation instruction voice. Even when there are multiple registered users, it can therefore identify the user who uttered the operation instruction voice and identify the operation target device based on the gaze of that user. Accordingly, even when multiple users are present, the remote control device 200 can more accurately identify the device that a specific user wants to operate.
Third Embodiment
FIG. 9 shows a remote control device 300 according to the third embodiment. In the remote control device 300, parts identical to those of the remote control device 100 are given the same reference numerals, and their description is omitted. In the remote control device 100, the user who operates the device is identified by the registered face image recognition unit 2 recognizing a face registered in the face image registration unit 3. In contrast, the remote control device 300 includes a voice detection unit 30 that estimates the direction from which a voice originates, and identifies the user who operates the remote control device 300 based on the direction estimated by the voice detection unit 30.
In FIG. 9, the voice detection unit 30 detects a voice and estimates the direction from which the detected voice originates. The voice detection unit 30 sends the detected voice to the voice recognition unit 11 and sends information indicating the estimated direction to a user identification unit 31. The voice detection unit 30 can estimate the direction using existing technology. For example, the voice detection unit 30 includes multiple microphones and estimates the direction based on the phase difference between the voices detected by those microphones.
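As an illustration of direction estimation from a phase difference, the sketch below uses the standard far-field relation for two microphones, $\sin\theta = c\,\tau / d$, where $\tau$ is the arrival-time difference, $d$ the microphone spacing, and $c$ the speed of sound. The text does not specify which existing technique is used, so this is only one possible instance; the function names and the value of 343 m/s (roughly air at 20 °C) are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at about 20 degrees C


def delay_from_phase(phase_diff_rad, freq_hz):
    """Arrival-time difference implied by a phase difference at one frequency."""
    return phase_diff_rad / (2.0 * math.pi * freq_hz)


def arrival_angle_deg(delay_s, mic_spacing_m, speed=SPEED_OF_SOUND):
    """Far-field angle of arrival for a two-microphone pair.

    0 degrees means the source is straight ahead (broadside); positive and
    negative angles lie toward one microphone or the other.
    """
    ratio = speed * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing |ratio| > 1
    return math.degrees(math.asin(ratio))
```

A single pair cannot distinguish front from back, and the phase becomes ambiguous once the spacing exceeds half a wavelength, which is one reason practical arrays use more than two microphones.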
The voice detection unit 30 does not need to include the multiple microphones as a single integrated unit; the microphones may be installed at different locations in the room. The voice detection unit 30 may estimate the direction from which the voice originates based on the loudness or the phase difference of the voice detected by each of the microphones installed at different locations in the room. The estimation method is not limited to one using multiple microphones, and other methods may be used.
The user identification unit 31 identifies the user who operates the remote control device 300 based on the direction information supplied from the voice detection unit 30 and the image captured by the camera 1. Specifically, the user identification unit 31 identifies the person present in the direction estimated by the voice detection unit 30 as the user who operates the remote control device 300. The user identification unit 31 can be implemented by a CPU.
The gaze detection unit 4 detects the gaze of the user identified by the user identification unit 31. Among the plurality of operation target devices registered in the operation target device registration unit 6, the operation target device identification unit 7 identifies the device located in the gaze direction detected by the gaze detection unit 4 as the operation target device to be operated by voice. The operation after the operation target device identification unit 7 identifies the operation target device is the same as in the remote control device 100.
As a modification of the third embodiment, the following configuration may be used instead of identifying the person present in the direction estimated by the voice detection unit 30 as the user who operates the remote control device 300. The user identification unit 31 identifies, from among the people captured by the camera 1, the person whose mouth moved at the time the voice detection unit 30 detected the voice as the user who operates the remote control device 300. In this case, the voice detection unit 30 does not need a function for estimating the direction from which the voice originates.
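The mouth-movement modification can be sketched as a timing match. The sketch assumes that some upstream image processing, not described in the text, already yields for each visible person a list of time intervals during which their mouth was moving; the function name and the overlap-based scoring are hypothetical choices for illustration.

```python
def speaker_by_mouth_motion(voice_interval, mouth_intervals):
    """Pick the person whose mouth movement overlaps the voice the most.

    voice_interval is (start_s, end_s) of the detected voice.
    mouth_intervals maps a person ID to a list of (start_s, end_s)
    intervals during which that person's mouth was seen moving.
    Returns None if nobody's mouth moved while the voice was detected.
    """
    voice_start, voice_end = voice_interval
    best_person, best_overlap = None, 0.0
    for person, intervals in mouth_intervals.items():
        # Total time this person's mouth moved while the voice was present.
        overlap = sum(
            max(0.0, min(voice_end, end) - max(voice_start, start))
            for start, end in intervals
        )
        if overlap > best_overlap:
            best_person, best_overlap = person, overlap
    return best_person
```

Because the decision relies only on timing, no direction estimate is needed, which matches the remark that the voice detection unit can omit that function in this modification.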
The present invention is not limited to the first to third embodiments described above, and various modifications are possible without departing from the gist of the present invention.
This application claims priority based on Japanese Patent Application No. 2022-205542, filed with the Japan Patent Office on December 22, 2022, and Japanese Patent Application No. 2023-116692, filed with the Japan Patent Office on July 18, 2023, the entire disclosures of which are incorporated herein by reference.

Claims (6)

  1.  A remote control device comprising:
     a user identification unit that identifies a user;
     an operation target device registration unit in which position information of a plurality of operation target devices is registered;
     a gaze detection unit that detects a gaze of the user identified by the user identification unit;
     an operation target device identification unit that identifies, among the plurality of operation target devices registered in the operation target device registration unit, an operation target device located in the direction of the gaze detected by the gaze detection unit as the operation target device to be controlled; and
     a voice response control unit that controls the operation target device identified by the operation target device identification unit in accordance with a voice picked up by a microphone.
  2.  The remote control device according to claim 1, wherein the user identification unit
     includes a face image registration unit in which a face image of a user who operates an operation target device is registered, and
     a registered face image recognition unit that recognizes, based on the face image registered in the face image registration unit, whether or not a person captured by a camera is a user registered in the face image registration unit, and
     identifies, as the user, a person whom the registered face image recognition unit has recognized as a user registered in the face image registration unit.
  3.  The remote control device according to claim 1, further comprising a voice detection unit that estimates a direction from which a voice is uttered,
     wherein the user identification unit identifies the user based on the direction estimated by the voice detection unit.
  4.  The remote control device according to claim 2, further comprising:
     a voice registration unit that registers a voice of the user in association with the face image of the user registered in the face image registration unit; and
     a registered voice recognition unit that identifies, based on the voice registered in the voice registration unit, the user who uttered the voice picked up by the microphone,
     wherein the registered face image recognition unit identifies the face image of the user identified by the registered voice recognition unit, and
     the gaze detection unit detects the gaze of the identified user.
  5.  The remote control device according to any one of claims 1 to 3, further comprising a target device position calculation unit that, in a registration mode for registering the direction in which each target device is located, calculates the position of a target device based on the gaze detected by the gaze detection unit while the user is looking at any one of the plurality of target devices,
     wherein the target device registration unit registers the position information of the target device calculated by the target device position calculation unit.
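The registration mode of claim 5 can be pictured with a small sketch. The claims in this section do not say how the position is calculated, so the code below makes a simplifying assumption: it records only a direction, obtained by averaging several gaze samples taken while the user looks at the device. The yaw/pitch convention and the names `gaze_to_unit_vector` and `TargetDeviceRegistry` are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple

Vector = Tuple[float, float, float]


def gaze_to_unit_vector(yaw_deg: float, pitch_deg: float) -> Vector:
    """Convert a gaze given as yaw/pitch angles into a unit direction vector.

    Assumed convention: yaw 0 / pitch 0 looks along +z, positive yaw turns
    toward +x, positive pitch turns toward +y.
    """
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )


@dataclass
class TargetDeviceRegistry:
    """Maps a device name to the registered direction in which it is located."""

    directions: Dict[str, Vector] = field(default_factory=dict)

    def register(
        self, device_name: str, gaze_samples: Iterable[Tuple[float, float]]
    ) -> Vector:
        """Average the (yaw, pitch) samples and store the normalized direction."""
        sx = sy = sz = 0.0
        for yaw_deg, pitch_deg in gaze_samples:
            x, y, z = gaze_to_unit_vector(yaw_deg, pitch_deg)
            sx, sy, sz = sx + x, sy + y, sz + z
        length = math.sqrt(sx * sx + sy * sy + sz * sz)
        if length == 0.0:
            raise ValueError("no usable gaze samples")
        direction = (sx / length, sy / length, sz / length)
        self.directions[device_name] = direction
        return direction
```

Averaging several samples is one way to smooth out small eye movements during registration; a real device could equally store a single sample or a full three-dimensional position.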
  6.  A remote control method comprising:
     identifying a user by a user identification unit;
     detecting, by a gaze detection unit, the gaze direction of the user identified by the user identification unit;
     identifying, by a target device identification unit, from among a plurality of target devices registered in a target device registration unit, the target device located in the gaze direction detected by the gaze detection unit as the target device to be controlled; and
     controlling, by a voice response control unit, the target device identified by the target device identification unit in accordance with a voice picked up by a microphone.
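The last two steps of the method in claim 6 can be sketched as follows, under stated assumptions: each registered target device is represented by a direction vector, the device "located in the gaze direction" is taken to be the one with the smallest angle to the gaze within a threshold, and the recognized voice command is passed to a caller-supplied `send` callback. The function names, the 10-degree threshold, and the callback interface are hypothetical and are not specified in the application.

```python
import math
from typing import Callable, Dict, Optional, Tuple

Vector = Tuple[float, float, float]


def angle_between_deg(a: Vector, b: Vector) -> float:
    """Angle between two non-zero vectors, in degrees."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def identify_target_device(
    registered: Dict[str, Vector],
    gaze: Vector,
    max_angle_deg: float = 10.0,  # assumed threshold, not from the application
) -> Optional[str]:
    """Return the registered device nearest to the gaze direction, if close enough."""
    best_name: Optional[str] = None
    best_angle = max_angle_deg
    for name, direction in registered.items():
        angle = angle_between_deg(direction, gaze)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name


def control_by_voice(
    registered: Dict[str, Vector],
    gaze: Vector,
    command: str,
    send: Callable[[str, str], None],
) -> Optional[str]:
    """Send the voice command to the gazed-at device; do nothing if none is identified."""
    target = identify_target_device(registered, gaze)
    if target is not None:
        send(target, command)
    return target
```

For example, with a television registered at `(0.0, 0.0, 1.0)` and a light at `(1.0, 0.0, 0.0)`, a gaze of `(0.05, 0.0, 1.0)` selects the television, so a command such as "power on" is sent only to it.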
PCT/JP2023/031933 2022-12-22 2023-08-31 Remote control equipment and remote control method WO2024135001A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2022205542 2022-12-22
JP2022-205542 2022-12-22
JP2023116692A JP2024091246A (en) 2022-12-22 2023-07-18 Remote control device and remote control method
JP2023-116692 2023-07-18

Publications (1)

Publication Number Publication Date
WO2024135001A1 (en) 2024-06-27

Family

ID=91588067

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/031933 WO2024135001A1 (en) 2022-12-22 2023-08-31 Remote control equipment and remote control method

Country Status (1)

Country Link
WO (1) WO2024135001A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019003228A (en) * 2017-06-09 2019-01-10 富士通株式会社 Equipment cooperation system, equipment cooperation device, equipment cooperation method, and equipment cooperation program
WO2019093123A1 (en) * 2017-11-07 2019-05-16 ソニー株式会社 Information processing device and electronic apparatus
WO2019142295A1 (en) * 2018-01-18 2019-07-25 三菱電機株式会社 Device operation apparatus, device operation system and device operation method

Similar Documents

Publication Publication Date Title
US11043231B2 (en) Speech enhancement method and apparatus for same
JP6520878B2 (en) Voice acquisition system and voice acquisition method
EP3410921B1 (en) Personalized, real-time audio processing
JP4356663B2 (en) Camera control device and electronic conference system
US6975991B2 (en) Wearable display system with indicators of speakers
US9008320B2 (en) Apparatus, system, and method of image processing, and recording medium storing image processing control program
US10303929B2 (en) Facial recognition system
JP2000347692A (en) Person detecting method, person detecting device, and control system using it
JP2012040655A (en) Method for controlling robot, program, and robot
JP2005250397A (en) Robot
US9853758B1 (en) Systems and methods for signal mixing
JP2009166184A (en) Guide robot
JP2010154260A (en) Voice recognition device
JP5206151B2 (en) Voice input robot, remote conference support system, and remote conference support method
WO2024135001A1 (en) Remote control equipment and remote control method
WO2019147034A1 (en) Electronic device for controlling sound and operation method therefor
US10225670B2 (en) Method for operating a hearing system as well as a hearing system
JP2024091246A (en) Remote control device and remote control method
US11227423B2 (en) Image and sound pickup device, sound pickup control system, method of controlling image and sound pickup device, and method of controlling sound pickup control system
WO2020021861A1 (en) Information processing device, information processing system, information processing method, and information processing program
JP6842489B2 (en) Electronics, control methods and programs
US20210208550A1 (en) Information processing apparatus and information processing method
CN111667822B (en) Voice processing device, conference system, and voice processing method
JP7111202B2 (en) SOUND COLLECTION CONTROL SYSTEM AND CONTROL METHOD OF SOUND COLLECTION CONTROL SYSTEM
TWI756966B (en) Video device and operation method thereof