CN111741225A - Human-computer interaction device, method and computer-readable storage medium - Google Patents

Human-computer interaction device, method and computer-readable storage medium Download PDF

Info

Publication number
CN111741225A
CN111741225A
Authority
CN
China
Prior art keywords
human
image
computer interaction
camera
assembly
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010786060.6A
Other languages
Chinese (zh)
Inventor
张哲 (Zhang Zhe)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Jimi Technology Co Ltd
Chengdu XGIMI Technology Co Ltd
Original Assignee
Chengdu Jimi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Jimi Technology Co Ltd filed Critical Chengdu Jimi Technology Co Ltd
Priority to CN202010786060.6A priority Critical patent/CN111741225A/en
Publication of CN111741225A publication Critical patent/CN111741225A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • H04N23/611Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/695Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present application relates to the field of human-computer interaction technologies, and in particular, to a human-computer interaction device, a human-computer interaction method, and a computer-readable storage medium. The human-computer interaction device provided by the embodiments of the application comprises a processing assembly and a camera assembly, the processing assembly being connected with the camera assembly. The processing assembly is used for collecting voice information from the environment where the human-computer interaction device is located, determining the voice direction corresponding to the voice information, generating a rotation control instruction according to the voice direction, and sending the rotation control instruction to the camera assembly. The camera assembly is used for executing a rotation action according to the rotation control instruction, so as to rotate its shooting surface to a first target position corresponding to the voice direction and collect a scene image at the first target position. The human-computer interaction device, human-computer interaction method, and computer-readable storage medium can improve the degree of automation of human-computer interaction equipment.

Description

Human-computer interaction device, method and computer-readable storage medium
Technical Field
The present application relates to the field of human-computer interaction technologies, and in particular, to a human-computer interaction device, a human-computer interaction method, and a computer-readable storage medium.
Background
Human-computer interaction studies the interaction between a system and its target users; the system may be any of various machines (human-computer interaction devices) or computerized systems and software. In the prior art, a human-computer interaction device is generally a fixedly installed machine, for example, conference audio-visual equipment, a banking service terminal, or a station ticket-booking machine. When using such a device, the target user must stand at a fixed position relative to it, for example, directly facing its camera assembly, before normal interaction between the user and the device becomes possible. The degree of automation of prior-art human-computer interaction devices is therefore relatively low.
Disclosure of Invention
An object of the present application is to provide a human-computer interaction device, a human-computer interaction method, and a computer-readable storage medium, so as to solve the above problems.
In a first aspect, the human-computer interaction device provided by the application comprises a processing assembly and a camera assembly, wherein the processing assembly is connected with the camera assembly;
the processing assembly is used for collecting voice information of the environment where the human-computer interaction device is located, determining a voice direction corresponding to the voice information, generating a rotation control instruction according to the voice direction, and sending the rotation control instruction to the camera assembly;
the camera assembly is used for executing a rotation action according to the rotation control instruction, so as to rotate a shooting surface of the camera assembly to a first target position corresponding to the voice direction and collect a scene image at the first target position.
With reference to the first aspect, an embodiment of the present application further provides a first optional implementation manner of the first aspect, where the processing component includes a voice acquisition device and a processing device, the voice acquisition device is connected to the processing device, and the processing device is further connected to the camera assembly;
the voice acquisition device is used for collecting voice information of the environment where the human-computer interaction device is located, determining a voice direction corresponding to the voice information, and sending the voice direction to the processing device;
the processing device is used for generating a rotation control instruction according to the voice direction and sending the rotation control instruction to the camera assembly.
With reference to the first aspect, an embodiment of the present application further provides a second optional implementation manner of the first aspect, where the human-computer interaction device further includes a first display device, and the first display device is connected to the processing component;
the camera assembly is also used for sending the scene image to the processing component;
the processing component is further used for determining a first person image from the scene image, generating a target display image according to the first person image, and sending the target display image to the first display device;
the first display device is used for displaying the target display image.
In the above embodiment, the human-computer interaction device further includes a first display device connected to the processing component. The camera assembly also sends the scene image to the processing component; the processing component determines a first person image from the scene image, generates a target display image according to the first person image, and sends the target display image to the first display device, which displays it. Interactivity between the target user and the human-computer interaction device is thereby enhanced.
With reference to the second optional implementation manner of the first aspect, an embodiment of the present application further provides a third optional implementation manner of the first aspect, where the processing component is configured to determine a first person image and first expression information of the first person image from the scene image, simulate a second person image according to the first expression information to serve as the target display image, and send the target display image to the first display device.
With reference to the first aspect, an embodiment of the present application further provides a fourth optional implementation manner of the first aspect, and the processing component is further connected to a second display device;
the processing component is further used for determining a first person image from the scene image and sending the first person image to the second display device;
the second display device is used for displaying the first human image.
In the above embodiment, the processing component is further connected to a second display device and is further configured to determine the first person image from the scene image and send it to the second display device for display. The human-computer interaction device can thus be applied to a teleconference system, which widens its range of application.
With reference to the fourth optional implementation manner of the first aspect, an embodiment of the present application further provides a fifth optional implementation manner of the first aspect, where the processing component is configured to, after determining the first person image from the scene image, obtain person identity information corresponding to the first person image, and send the first person image and the person identity information to the second display device;
the second display device is used for displaying the first person image and the person identity information.
With reference to the second, third, fourth, or fifth optional implementation manner of the first aspect, an embodiment of the present application further provides a sixth optional implementation manner of the first aspect, where after determining the first person image from the scene image, the processing component is further configured to determine a target face from the first person image, determine position information of the target face in the scene image, generate a fine adjustment instruction according to the position information, and send the fine adjustment instruction to the camera assembly;
the camera assembly is used for executing a fine adjustment action according to the fine adjustment instruction, so as to adjust the shooting surface of the camera assembly to a second target position corresponding to the first person image.
With reference to the sixth optional implementation manner of the first aspect, an embodiment of the present application further provides a seventh optional implementation manner of the first aspect, where the camera assembly includes a first rotation control component and a camera, and the first rotation control component and the camera are respectively connected to the processing component;
the first rotation control component is used for receiving the fine adjustment instruction and executing a fine adjustment action according to the fine adjustment instruction, so as to drive the camera to rotate in a first direction and adjust the shooting surface of the camera to a second target position corresponding to the first person image, the first direction being a vertical direction.
With reference to the first aspect, an embodiment of the present application further provides an eighth optional implementation manner of the first aspect, where the camera assembly includes a second rotation control component and a camera, and the second rotation control component and the camera are respectively connected to the processing component;
the second rotation control component is used for receiving the rotation control instruction and executing a rotation action according to the rotation control instruction, so as to drive the camera to rotate in a second direction and rotate the shooting surface of the camera to a first target position corresponding to the voice direction, the second direction being a horizontal direction.
With reference to the first aspect, an embodiment of the present application further provides a ninth optional implementation manner of the first aspect, where the human-computer interaction device further includes a storage case and a lifting control assembly, the lifting control assembly being connected to the processing component and the camera assembly respectively;
the processing component is also used for forwarding a power-on control instruction or a power-off control instruction to the lifting control assembly when such an instruction is received;
the lifting control assembly is used for controlling the camera assembly to lift out of the storage case and be exposed outside it when the power-on control instruction is received, and for controlling the camera assembly to retract from the exposed position into the storage case when the power-off control instruction is received.
With reference to the first aspect, an embodiment of the present application further provides a tenth optional implementation manner of the first aspect, where the human-computer interaction device further includes a projection host, and the projection host is connected to the processing component;
the camera assembly is also used for sending the scene image to the processing component;
the processing component is also used for determining a projection region feature image from the scene image and performing color segmentation on the projection region feature image through a linear filtering difference method to obtain a feature region;
the processing component is also used for performing feature point matching according to the feature region and performing trapezoidal correction on the projection picture of the projection host according to the feature point matching result.
With reference to the tenth optional implementation manner of the first aspect, an embodiment of the present application further provides an eleventh optional implementation manner of the first aspect, where the linear filtering difference method may be represented by the following calculation formula:
[Formula image not reproduced in the source text.]
where f is the linear filtering difference value, c is the difference increment, o is the filtering range, L is the light intensity value, and i is the index of the pixel value in the line scan.
In a second aspect, the human-computer interaction method provided in the embodiment of the present application is applied to the human-computer interaction device provided in the first aspect or any optional implementation manner of the first aspect, and comprises:
collecting, through the processing component, voice information of the environment where the human-computer interaction device is located, determining a voice direction corresponding to the voice information, generating a rotation control instruction according to the voice direction, and sending the rotation control instruction to the camera assembly;
executing, by the camera assembly, a rotation action according to the rotation control instruction, so as to rotate the shooting surface of the camera assembly to a first target position corresponding to the voice direction and collect a scene image at the first target position.
In a third aspect, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed, the human-computer interaction method provided in the second aspect is implemented.
The human-computer interaction device provided by the embodiments of the application comprises a processing component and a camera assembly. The processing component collects voice information from the environment where the device is located, determines the voice direction corresponding to that information, generates a rotation control instruction according to the voice direction, and sends it to the camera assembly. The camera assembly executes a rotation action according to the instruction, rotating its shooting surface to the first target position corresponding to the voice direction and collecting a scene image there. Because the voice information may be uttered by the target user, the camera assembly can automatically rotate toward the position where the target user is located, that is, the first target position. Normal interaction between the target user and the human-computer interaction device is thus achieved, and the degree of automation of the device is improved.
The human-computer interaction method and the computer-readable storage medium provided by the present application have the same beneficial effects as the human-computer interaction device provided in the first aspect, which are not repeated here.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered limiting of its scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a schematic structural block diagram of a human-computer interaction device according to an embodiment of the present disclosure.
Fig. 2 is another schematic structural block diagram of a human-computer interaction device according to an embodiment of the present disclosure.
Fig. 3 is a schematic structural diagram of a human-computer interaction device according to an embodiment of the present application.
Fig. 4 is a block diagram of another schematic structure of a human-computer interaction device according to an embodiment of the present disclosure.
Fig. 5 is a block diagram of another schematic structure of a human-computer interaction device according to an embodiment of the present disclosure.
Fig. 6 is another schematic structural diagram of a human-computer interaction device according to an embodiment of the present application.
Fig. 7 is a flowchart illustrating steps of a human-computer interaction method according to an embodiment of the present disclosure.
Reference numerals: 100-a human-computer interaction device; 110-a processing component; 111-a voice acquisition device; 112-a processing device; 120-a camera assembly; 121-a first rotation control member; 122-a camera; 123-a second rotation control member; 130-a base; 140-a first display device; 150-projection host.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Further, it is noted that in the description of the present application, like reference numerals and letters refer to like items in the following drawings, and thus, once an item is defined in one drawing, it is not necessary to further define and explain it in the following drawings.
Referring to fig. 1, which is a schematic structural block diagram of a human-computer interaction device 100 according to an embodiment of the present disclosure, the human-computer interaction device 100 includes a processing component 110 and a camera assembly 120, and the processing component 110 is connected to the camera assembly 120. The processing component 110 is configured to collect voice information of the environment where the human-computer interaction device 100 is located, determine a voice direction corresponding to the voice information, generate a rotation control instruction according to the voice direction, and send the rotation control instruction to the camera assembly 120. The camera assembly 120 is configured to execute a rotation action according to the rotation control instruction, so as to rotate the shooting surface of the camera assembly 120 to a first target position corresponding to the voice direction and collect a scene image at the first target position.
It can be understood that, in the embodiment of the present application, the voice information may be uttered by the target user, so the camera assembly 120 can automatically rotate to the position where the target user is located, that is, the first target position. The scene image captured at the first target position is then an image of the target user, which enables normal interaction between the target user and the human-computer interaction device 100 and improves its degree of automation.
Referring to fig. 2, as an optional implementation in this embodiment, the processing component 110 may include a voice acquisition device 111 and a processing device 112, where the voice acquisition device 111 is connected to the processing device 112, and the processing device 112 is further connected to the camera assembly 120. The voice acquisition device 111 is configured to collect voice information of the environment where the human-computer interaction device 100 is located and determine the voice direction corresponding to the voice information, so as to send the voice direction to the processing device 112; the processing device 112 is configured to generate a rotation control instruction according to the voice direction and send it to the camera assembly 120.
Referring to fig. 3, in terms of mechanical structure, the human-computer interaction device provided in the embodiment of the present application may include a base 130. The camera assembly 120 may be disposed above the base 130 and able to rotate relative to it, the voice acquisition device 111 may be disposed on the base 130, and the processing device 112 may be disposed inside the base 130. Further, the voice acquisition device 111 may include a plurality of sound collectors and a microprocessor; the sound collectors are arranged in an array around the periphery of the base 130 and are each connected to the microprocessor. The sound collectors collect voice information from the environment where the human-computer interaction device 100 is located and send their respective signals to the microprocessor, which determines the voice direction from the relative strength of the voice information and sends that direction to the processing device 112.
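The patent does not specify how the microprocessor turns the per-collector signal strengths into a voice direction. Purely as an illustrative sketch, assuming the direction is approximated by the azimuth of the strongest microphone, the logic might look as follows in Python; the microphone layout, function names, and instruction format are all assumptions, not taken from the patent:

    import numpy as np

    # Assumed layout: six sound collectors evenly spaced around the base,
    # each tagged with the azimuth (degrees) it faces.
    MIC_AZIMUTHS = np.array([0.0, 60.0, 120.0, 180.0, 240.0, 300.0])

    def estimate_voice_direction(frames: np.ndarray) -> float:
        """frames: (num_mics, num_samples) time-domain samples, one row per collector.
        Returns the azimuth of the loudest collector as the estimated voice direction."""
        rms = np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))
        return float(MIC_AZIMUTHS[int(np.argmax(rms))])

    def rotation_instruction(current_azimuth: float, voice_azimuth: float) -> dict:
        """Build a rotation control instruction: the shortest signed angle from the
        current shooting-surface azimuth to the estimated voice direction."""
        delta = (voice_azimuth - current_azimuth + 180.0) % 360.0 - 180.0
        return {"action": "rotate", "degrees": delta}

A real implementation would more likely use time-difference-of-arrival or beamforming for finer angular resolution; picking the loudest collector merely matches the "strength of the voice information" criterion described above.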
In the embodiments of the present application, the microprocessor may be an integrated circuit chip with signal processing capability or a general-purpose processor, for example a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the logic blocks disclosed in the embodiments of the present application. Likewise, the processing device 112 may be an integrated circuit chip with signal processing capability or a general-purpose processor, for example a DSP, an ASIC, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the logic blocks disclosed in the embodiments of the present application.
It should be noted that, in the embodiments of the present application, the microprocessor and the processing device 112 may be two independent integrated circuit chips or two independent general-purpose processors, or they may be the same integrated circuit chip or general-purpose processor; the embodiments of the present application are not limited in this respect.
Further, referring to fig. 4, in an embodiment of the present application, the human-computer interaction device 100 may further include a first display device 140 connected to the processing component 110. On this basis, the camera assembly 120 is further configured to send the scene image to the processing component 110; the processing component 110 is further configured to determine a first person image from the scene image, generate a target display image according to the first person image, and send the target display image to the first display device 140, which displays it.
As an optional implementation, in this embodiment the processing component 110 is specifically configured to determine the first person image and the first expression information of the first person image from the scene image, simulate a second person image according to the first expression information to serve as the target display image, and send the target display image to the first display device 140. In actual implementation, after the first expression information is obtained, second expression information corresponding to it may be obtained from a preset expression information repository, and a second person image containing the second expression information is simulated as the target display image and sent to the first display device 140. It should be noted that the correspondence policy between the first and second expression information may be determined by user settings and specifically includes a first correspondence policy and a second correspondence policy. The first correspondence policy instructs the device to acquire, from the preset expression information repository, an expression that is the same as or similar to the first expression information as the second expression information; the second correspondence policy instructs it to acquire an expression opposite to the first expression information. After receiving a selection instruction triggered by the user, the processing component 110 takes the correspondence policy matching the selection instruction as the target policy and acquires the second expression information corresponding to the first expression information from the preset expression information repository according to the target policy.
For example, if the target policy is the first correspondence policy and the first expression information of the first person image is determined to be "surprise", the second expression information obtained from the expression information repository may also be "surprise"; if the target policy is the second correspondence policy and the first expression information is determined to be "sadness", the second expression information obtained may be "happy". This enhances interactivity between the target user and the human-computer interaction device 100.
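The expression repository and the two correspondence policies can be pictured with a small lookup table. The sketch below is illustrative only; the expression labels, pairings, and function names are assumptions rather than part of the patent:

    # Assumed expression repository and opposite-expression pairings.
    EXPRESSION_REPOSITORY = {"surprise", "sadness", "happy", "calm", "angry"}
    OPPOSITE_EXPRESSION = {"sadness": "happy", "angry": "calm", "surprise": "surprise",
                           "happy": "happy", "calm": "calm"}

    def second_expression(first_expression: str, policy: str) -> str:
        """Map first expression information to second expression information.
        policy "first": same or similar expression; policy "second": opposite."""
        if policy == "first":
            return first_expression if first_expression in EXPRESSION_REPOSITORY else "calm"
        if policy == "second":
            return OPPOSITE_EXPRESSION.get(first_expression, "calm")
        raise ValueError(f"unknown correspondence policy: {policy}")

    # Example: the second policy maps "sadness" to "happy", as in the text above.
    assert second_expression("sadness", "second") == "happy"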
To widen the applicable range of the human-computer interaction device 100 provided in the embodiment of the present application, the processing component 110 is further connected to a second display device (not shown in the figures). The processing component 110 is further configured to determine a first person image from the scene image and send it to the second display device, which displays it. The human-computer interaction device 100 can therefore be applied to a teleconference system: the device 100 can be placed at a first conference site where the target user is located, and the second display can be placed at a second conference site where the other participants are located. In addition, in the embodiment of the present application, the processing component 110 and the second display device may be interconnected through remote wireless communication.
To let the other parties learn the identity of the target user, in this embodiment the processing component 110 is specifically configured to determine a first person image from the scene image, obtain the person identity information corresponding to that image, and send both the first person image and the person identity information to the second display device, which displays them. In actual implementation, after determining the first person image from the scene image, the processing component 110 extracts the corresponding person identity information from a preset information database; the person identity information may include a name, an employer, a job title, work experience, and the like. It should be noted that if the person identity information corresponding to the first person image is not stored in the preset information database, a missing-information prompt is generated and displayed to remind the target user to enter his or her identity information into the database.
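A minimal sketch of the identity lookup and missing-information prompt follows; the database schema, keys, and field names are hypothetical:

    from typing import Optional

    # Hypothetical preset information database keyed by a recognized face identifier.
    INFO_DATABASE = {
        "face_001": {"name": "Zhang San", "employer": "Example Co.",
                     "job_title": "Engineer", "work_experience": "5 years"},
    }

    def lookup_identity(face_id: str) -> Optional[dict]:
        """Return the person identity record for a face, or None after showing
        a missing-information prompt when the database has no matching entry."""
        record = INFO_DATABASE.get(face_id)
        if record is None:
            print("Identity information missing; please enter your details.")
        return record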
It should be noted that, in the embodiment of the present application, after the human-computer interaction device 100 receives a power-on control instruction and completes the corresponding boot-up operation, the processing component 110 monitors whether a mode adjustment instruction is received. If a first mode adjustment instruction setting the working mode to the interactive mode is received, the processing component 110 performs the steps of determining a first person image from the scene image, generating a target display image from the first person image, and sending the target display image to the first display device 140 so that it is displayed there. If a second mode adjustment instruction setting the working mode to the conference mode is received, the processing component 110 instead determines the first person image from the scene image and sends it to the second display device for display.
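The mode-dependent behavior described above amounts to a simple branch on the current working mode. The following sketch stubs out the detection and display steps; all names are illustrative, not taken from the patent:

    from enum import Enum, auto

    class Mode(Enum):
        INTERACTIVE = auto()  # set by the first mode adjustment instruction
        CONFERENCE = auto()   # set by the second mode adjustment instruction

    def determine_first_person_image(scene_image):
        """Stub: a real system would run a person detector on the scene image."""
        return scene_image

    def generate_target_display_image(person_image):
        """Stub: a real system would simulate a second person image here."""
        return person_image

    def dispatch(mode: Mode, scene_image, first_display: list, second_display: list):
        """Route the scene image to the proper display according to the mode."""
        person = determine_first_person_image(scene_image)
        if mode is Mode.INTERACTIVE:
            first_display.append(generate_target_display_image(person))
        else:
            second_display.append(person)

    # Usage: plain lists stand in for the two display devices.
    first_display, second_display = [], []
    dispatch(Mode.CONFERENCE, "scene.png", first_display, second_display)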
Further, to ensure that the shooting surface of the camera assembly 120 faces the target user, in this embodiment, after determining the first person image from the scene image, the processing component 110 is further configured to determine the target face from the first person image and determine the position information of the target face within the scene image, so as to generate a fine adjustment instruction according to that position information and send it to the camera assembly 120. The camera assembly 120 performs a fine adjustment action according to the instruction, adjusting its shooting surface to a second target position corresponding to the first person image. For example, if the target face lies in the left part of the scene image, a fine adjustment instruction is generated that rotates the shooting surface of the camera assembly 120 toward the left; if the target face lies in the right part, the instruction rotates it toward the right. Similarly, if the target face lies in the upper part of the scene image, the generated instruction rotates the shooting surface upward, and if it lies in the lower part, downward. In each case the amplitude of the fine adjustment action can be determined from the specific position of the target face in the scene image.
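One way to derive the fine adjustment instruction from the target face position is to convert the offset of the face center from the image center into signed pan and tilt nudges. The sketch below is an assumption about one plausible implementation; the gain would have to be calibrated against the camera's field of view:

    def fine_adjust_instruction(face_box, image_size, gain=0.1):
        """face_box: (x, y, w, h) of the target face in the scene image, in pixels.
        image_size: (width, height) of the scene image.
        Returns pan/tilt nudges in degrees: positive pan rotates the shooting
        surface right, positive tilt rotates it up."""
        x, y, w, h = face_box
        img_w, img_h = image_size
        dx = (x + w / 2.0) - img_w / 2.0   # >0: face right of center, pan right
        dy = img_h / 2.0 - (y + h / 2.0)   # >0: face above center, tilt up
        return {"pan_degrees": gain * dx, "tilt_degrees": gain * dy}

    # Example: a face in the upper-left of a 1280x720 image yields a left/up nudge.
    print(fine_adjust_instruction((100, 80, 160, 160), (1280, 720)))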
Based on the above description, and referring to fig. 5, in terms of mechanical structure the camera assembly 120 may include a first rotation control component 121 and a camera 122, each connected to the processing component 110. The first rotation control component 121 receives the fine adjustment instruction and performs the fine adjustment action accordingly, driving the camera 122 to rotate in a first direction so that its shooting surface is adjusted to the second target position corresponding to the first person image; the first direction is the vertical direction. The camera assembly 120 further includes a second rotation control component 123, also connected to the processing component 110, which receives the rotation control instruction and performs the rotation action accordingly, driving the camera 122 to rotate in a second direction so that its shooting surface is rotated to the first target position corresponding to the voice direction; the second direction is the horizontal direction. It can be understood that the first (vertical) direction is the up-down direction and the second (horizontal) direction is the left-right direction.
Further, in the embodiment of the present application, the human-computer interaction device 100 further includes a storage case (not shown in the figures) and a lifting control assembly (not shown in the figures), the lifting control assembly being connected to the processing component 110 and the camera assembly 120 respectively. On this basis, when the processing component 110 receives a power-on control instruction or a power-off control instruction, it forwards that instruction to the lifting control assembly. Upon receiving the power-on control instruction, the lifting control assembly controls the camera assembly 120 to rise out of the storage case and be exposed outside it; upon receiving the power-off control instruction, it controls the camera assembly 120 to retract from the exposed position into the storage case. The camera assembly 120 is thereby protected, prolonging the service life of the human-computer interaction device 100.
Referring to fig. 6, to widen the applicable range of the human-computer interaction device 100, in the embodiment of the present application the device further includes a projection host 150 connected to the processing component 110. The camera assembly 120 may be disposed on the projection host 150, for example directly above or directly below it, and fixedly connected to it, specifically through the base 130. On this basis, the camera assembly 120 is further configured to send the scene image to the processing component 110, and the processing component 110 is configured to determine a projection region feature image from the scene image and decode it using a De Bruijn pseudo-random-sequence structured-light image decoding method. Decoding the projection region feature image can be understood as stripe division, stripe center extraction, and color segmentation. Taking color segmentation as an example, the projection region feature image may be color-segmented by a linear filtering difference method to obtain a feature region; feature point matching is then performed according to the feature region, and trapezoidal correction is applied to the projection picture of the projection host 150 according to the feature point matching result. Specifically, the feature point matching may include matching the feature region against a standard square image feature region to obtain the torsion degree of the feature region; the torsion degree serves as the feature point matching result, and the trapezoidal correction of the projection picture of the projection host 150 is performed according to it.
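The patent leaves the correction step abstract. As a sketch of one common way to realize it, the detected corners of the feature region can be compared against the ideal rectangle and the resulting homography inverted to pre-distort the projected frame. This assumes the camera view has already been normalized to the projector's coordinate frame, and all names below are illustrative:

    import numpy as np
    import cv2

    def keystone_prewarp(observed_corners, projector_size):
        """observed_corners: 4x2 array, detected corners of the projected feature
        region (TL, TR, BR, BL), assumed to be expressed in projector coordinates.
        Returns a function that pre-warps a frame so the projected picture
        appears rectangular on the wall."""
        w, h = projector_size
        ideal = np.float32([[0, 0], [w, 0], [w, h], [0, h]])  # the standard rectangle
        # Homography capturing the torsion of the feature region relative to the ideal shape.
        H = cv2.getPerspectiveTransform(ideal, np.float32(observed_corners))
        H_inv = np.linalg.inv(H)
        def prewarp(frame):
            return cv2.warpPerspective(frame, H_inv, (w, h))
        return prewarp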
In the embodiment of the present application, the linear filtering difference method may be represented by the following calculation formula:
[Formula image not reproduced in the source text.]
where f is the linear filtering difference value, c is the difference increment, o is the filtering range, L is the light intensity value, and i is the index of the pixel value in the line scan.
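Because the published formula is only available as an image, its exact form cannot be reproduced here. Purely as an assumed reading consistent with the variable definitions above, a linear filtering difference over the line-scanned intensities might be a scaled difference across the filtering range, with color-segment boundaries taken where its magnitude peaks:

    import numpy as np

    def linear_filter_difference(L, o=2, c=1.0):
        """Assumed form: f[i] = c * (L[i+o] - L[i-o]), i.e. a scaled difference of
        intensities o pixels apart along one scanned line. L is a 1-D array of
        light intensity values; the true patent formula may differ."""
        L = np.asarray(L, dtype=np.float64)
        f = np.zeros_like(L)
        for i in range(o, len(L) - o):
            f[i] = c * (L[i + o] - L[i - o])
        return f

    def stripe_boundaries(L, o=2, c=1.0, threshold=20.0):
        """Indices where |f| exceeds a threshold, i.e. candidate color-stripe edges."""
        return np.flatnonzero(np.abs(linear_filter_difference(L, o, c)) > threshold)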
Further, in this embodiment of the application, the processing component 110 may also obtain the optimal projection focal length of the projection host 150 through a voting algorithm. In actual implementation, the processing component 110 controls the camera assembly 120 to acquire a plurality of scene images at different projection focal lengths and determines a plurality of projection region feature images from them; it then obtains the sharpness characterization coefficient of each projection region feature image and selects the projection focal length corresponding to the maximum coefficient as the target focal length.
In the embodiment of the present application, the voting algorithm can be represented by the following calculation formula:
[Formula image not reproduced in the source text.]
where p(x, j) indicates that pixel point j reaches its sharpest at focal length x, and pixel point j is a feature pixel point in any of the projection region feature images.
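The voting procedure itself reduces to scoring each candidate focal length by the sharpness of its projection region feature image and keeping the best one. The patent does not name a specific sharpness measure, so the common variance-of-Laplacian proxy is assumed in this sketch:

    import cv2

    def sharpness_coefficient(image) -> float:
        """Sharpness characterization coefficient, assumed here to be the
        variance of the Laplacian of the grayscale image."""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        return float(cv2.Laplacian(gray, cv2.CV_64F).var())

    def pick_target_focal_length(captures):
        """captures: iterable of (focal_length, projection_region_feature_image)
        pairs, one per candidate focal length. Returns the focal length whose
        feature image maximizes the sharpness characterization coefficient."""
        return max(captures, key=lambda fc: sharpness_coefficient(fc[1]))[0]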
Based on the same inventive concept as the human-computer interaction device, an embodiment of the present application further provides a human-computer interaction method applied to that device; fig. 7 is a schematic flow diagram of the method. It should be noted that the human-computer interaction method provided in the embodiment of the present application is not limited to the sequence shown in fig. 7 and described below. The specific flow and steps of the human-computer interaction method are described below with reference to fig. 7.
Step S100: collect, through the processing component, voice information of the environment where the human-computer interaction device is located, determine the voice direction corresponding to the voice information, generate a rotation control instruction according to the voice direction, and send the rotation control instruction to the camera assembly.
Step S200: execute, by the camera assembly, a rotation action according to the rotation control instruction, so as to rotate the shooting surface of the camera assembly to a first target position corresponding to the voice direction, and collect a scene image at the first target position.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed, the human-computer interaction method provided in the foregoing method embodiment is implemented.
In summary, the human-computer interaction device provided by the embodiments of the present application includes a processing component and a camera assembly. The processing component collects voice information from the environment where the device is located, determines the voice direction corresponding to that information, generates a rotation control instruction according to the voice direction, and sends it to the camera assembly. The camera assembly executes a rotation action according to the instruction, rotating its shooting surface to the first target position corresponding to the voice direction and collecting a scene image there. Because the voice information may be uttered by the target user, the camera assembly can automatically rotate to the position where the target user is located, that is, the first target position. Normal interaction between the target user and the human-computer interaction device is thus achieved, and the degree of automation of the device is improved.
In addition, the human-computer interaction method and the computer-readable storage medium provided in the embodiments of the present application have the same beneficial effects as the human-computer interaction device, which are not repeated here.
In the description of the present application, it should be noted that, unless otherwise explicitly specified or limited, the terms "connected" and "disposed" are to be interpreted broadly: a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or communicative, where a communication connection may be wired or wireless; and it may be direct, indirect through an intermediate medium, or internal between two elements.
Furthermore, in the description of the present application, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
The above description is only a few examples of the present application and is not intended to limit the present application, and those skilled in the art will appreciate that various modifications and variations can be made in the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (14)

1. A human-computer interaction device, characterized by comprising a processing component and a camera assembly, wherein the processing component is connected with the camera assembly;
the processing component is used for collecting voice information of the environment where the human-computer interaction device is located, determining a voice direction corresponding to the voice information, generating a rotation control instruction according to the voice direction, and sending the rotation control instruction to the camera assembly;
the camera assembly is used for executing a rotation action according to the rotation control instruction, so as to rotate a shooting surface of the camera assembly to a first target position corresponding to the voice direction and collect a scene image at the first target position.
2. The human-computer interaction device of claim 1, wherein the processing component comprises a voice acquisition device and a processing device, the voice acquisition device is connected with the processing device, and the processing device is further connected with the camera assembly;
the voice acquisition device is used for collecting voice information of the environment where the human-computer interaction device is located, determining a voice direction corresponding to the voice information, and sending the voice direction to the processing device;
the processing device is used for generating a rotation control instruction according to the voice direction and sending the rotation control instruction to the camera assembly.
3. The human-computer interaction device of claim 1, further comprising a first display device, the first display device being connected to the processing component;
the camera assembly is also used for sending the scene image to the processing component;
the processing component is further used for determining a first person image from the scene image, generating a target display image according to the first person image, and sending the target display image to the first display device;
the first display device is used for displaying the target display image.
4. The human-computer interaction device of claim 3, wherein the processing component is configured to determine the first person image and first expression information of the first person image from the scene image, simulate a second person image according to the first expression information to serve as the target display image, and send the target display image to the first display device.
5. A human-computer interaction device according to claim 1, wherein the processing component is further connected to a second display device;
the processing component is further used for determining a first person image from the scene image and sending the first person image to the second display device;
the second display device is used for displaying the first human image.
6. The human-computer interaction device of claim 5, wherein the processing component is configured to obtain person identity information corresponding to a first person image after the first person image is determined from the scene image, and send the first person image and the person identity information to the second display device;
the second display device is used for displaying the first person image and the person identity information.
7. The human-computer interaction device according to any one of claims 3 to 6, wherein after determining the first person image from the scene image, the processing component is further configured to determine a target face from the first person image, determine position information of the target face in the scene image, generate a fine adjustment instruction according to the position information, and send the fine adjustment instruction to the camera assembly;
the camera assembly is used for executing a fine adjustment action according to the fine adjustment instruction, so as to adjust the shooting surface of the camera assembly to a second target position corresponding to the first person image.
8. The human-computer interaction device of claim 7, wherein the camera assembly comprises a first rotation control component and a camera, and the first rotation control component and the camera are respectively connected with the processing assembly;
the first rotation control component is used for receiving the fine adjustment instruction and executing a fine adjustment action according to the fine adjustment instruction, so as to drive the camera to rotate in a first direction and adjust the shooting surface of the camera to a second target position corresponding to the first person image, wherein the first direction is a vertical direction.
9. The human-computer interaction device of claim 1, wherein the camera assembly comprises a second rotation control component and a camera, and the second rotation control component and the camera are respectively connected with the processing assembly;
the second rotation control component is used for receiving the rotation control instruction and executing a rotation action according to the rotation control instruction, so as to drive the camera to rotate in a second direction and rotate the shooting surface of the camera to a first target position corresponding to the voice direction, wherein the second direction is a horizontal direction.
10. The human-computer interaction device of claim 1, further comprising a storage case and a lifting control assembly, wherein the lifting control assembly is connected with the processing component and the camera assembly respectively;
the processing component is also used for sending a power-on control instruction or a power-off control instruction to the lifting control assembly when receiving the power-on control instruction or the power-off control instruction;
the lifting control assembly is used for controlling the camera assembly to lift out of the storage case and be exposed outside the storage case when receiving the power-on control instruction, and for controlling the camera assembly to retract from the position exposed outside the storage case into the storage case when receiving the power-off control instruction.
11. The human-computer interaction device of claim 1, further comprising a projection host, the projection host being connected to the processing component;
the camera assembly is also used for sending the scene image to the processing component;
the processing component is further used for determining a projection region feature image from the scene image and performing color segmentation on the projection region feature image through a linear filtering difference method to obtain a feature region;
the processing component is further used for performing feature point matching according to the feature region and performing trapezoidal correction on the projection picture of the projection host according to the feature point matching result.
12. A human-computer interaction device according to claim 11, wherein the linear filtering difference method is represented by the following calculation formula:
[Formula image not reproduced in the source text.]
where f is the linear filtering difference value, c is the difference increment, o is the filtering range, L is the light intensity value, and i is the index of the pixel value in the line scan.
13. A human-computer interaction method, characterized in that the method is applied to the human-computer interaction device of any one of claims 1 to 12, and comprises:
collecting, through the processing component, voice information of the environment where the human-computer interaction device is located, determining a voice direction corresponding to the voice information, generating a rotation control instruction according to the voice direction, and sending the rotation control instruction to the camera assembly;
executing, by the camera assembly, a rotation action according to the rotation control instruction, so as to rotate the shooting surface of the camera assembly to a first target position corresponding to the voice direction, and collecting a scene image at the first target position.
14. A computer-readable storage medium, having stored thereon a computer program which, when executed, implements the human-computer interaction method of claim 13.
CN202010786060.6A 2020-08-07 2020-08-07 Human-computer interaction device, method and computer-readable storage medium Pending CN111741225A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010786060.6A CN111741225A (en) 2020-08-07 2020-08-07 Human-computer interaction device, method and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010786060.6A CN111741225A (en) 2020-08-07 2020-08-07 Human-computer interaction device, method and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN111741225A (en) 2020-10-02

Family

ID=72658114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010786060.6A Pending CN111741225A (en) 2020-08-07 2020-08-07 Human-computer interaction device, method and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN111741225A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113043999A (en) * 2021-04-19 2021-06-29 遥相科技发展(北京)有限公司 Wiper control method based on automobile data recorder, electronic equipment and computer storage medium
CN114845056A (en) * 2022-04-29 2022-08-02 清华大学 Auxiliary photographing robot

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130016124A1 (en) * 2011-07-14 2013-01-17 Samsung Electronics Co., Ltd. Method, apparatus, and system for processing virtual world
CN103292741A (en) * 2013-05-29 2013-09-11 哈尔滨工程大学 Structured light vision measurement method for 3D surface profiles of objects on the basis of K-means color clustering
CN105093986A (en) * 2015-07-23 2015-11-25 百度在线网络技术(北京)有限公司 Humanoid robot control method based on artificial intelligence, system and the humanoid robot
CN105676572A (en) * 2016-04-19 2016-06-15 深圳市神州云海智能科技有限公司 Projection correction method and device for projector equipped on mobile robot
CN106547884A (en) * 2016-11-03 2017-03-29 深圳量旌科技有限公司 Behavior pattern learning system of substitute robot
CN109308466A (en) * 2018-09-18 2019-02-05 宁波众鑫网络科技股份有限公司 The method that a kind of pair of interactive language carries out Emotion identification
CN110519580A (en) * 2019-10-11 2019-11-29 成都极米科技股份有限公司 A kind of projector automatic focusing method, device, equipment and readable storage medium storing program for executing
CN110738273A (en) * 2019-10-23 2020-01-31 成都极米科技股份有限公司 Image feature point matching method, device, equipment and storage medium
CN110855892A (en) * 2019-11-27 2020-02-28 华人运通(江苏)技术有限公司 Photographing method, photographing system and computer-readable storage medium


Similar Documents

Publication Publication Date Title
JP4529837B2 (en) Imaging apparatus, image correction method, and program
CN107026973B (en) Image processing device, image processing method and photographic auxiliary equipment
US8345106B2 (en) Camera-based scanning
US9547791B2 (en) Image processing system, image processing apparatus, image processing method, and program
JP4352980B2 (en) Enlarged display device and enlarged image control device
KR102124617B1 (en) Method for composing image and an electronic device thereof
JP2007201948A (en) Imaging apparatus, image processing method and program
WO2008012905A1 (en) Authentication device and method of displaying image for authentication
CN109670427A (en) A kind of processing method of image information, device and storage medium
JP6448674B2 (en) A portable information processing apparatus having a camera function for performing guide display for capturing an image capable of character recognition, a display control method thereof, and a program
CN111741225A (en) Human-computer interaction device, method and computer-readable storage medium
CN109691080A (en) Shoot image method, device and terminal
CN103034042A (en) Panoramic shooting method and device
JP4348028B2 (en) Image processing method, image processing apparatus, imaging apparatus, and computer program
US10819894B2 (en) Human machine interface system and method of providing guidance and instruction for iris recognition on mobile terminal
CN114007053B (en) Image generation method, image generation system, and recording medium
JP6283329B2 (en) Augmented Reality Object Recognition Device
WO2018196854A1 (en) Photographing method, photographing apparatus and mobile terminal
KR20230017774A (en) Information processing device, information processing method, and program
JP2018125658A (en) Portable information processing device having camera function, display control method thereof, and program
JP2010217962A (en) Program for camera-equipped mobile terminal device, and camera-equipped mobile terminal device
CN112529770B (en) Image processing method, device, electronic equipment and readable storage medium
CN114092323A (en) Image processing method, image processing device, storage medium and electronic equipment
JP2006190106A (en) Pattern detection program and pattern detection apparatus
JP2018191094A (en) Document reader, method of controlling document reader, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201002)