WO2020024692A1 - Man-machine interaction method and apparatus - Google Patents

Man-machine interaction method and apparatus

Info

Publication number
WO2020024692A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
action instruction
action
terminal device
sender
Prior art date
Application number
PCT/CN2019/089209
Other languages
French (fr)
Chinese (zh)
Inventor
荣涛
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020024692A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00: Manipulating 3D models or images for computer graphics
    • G06T19/006: Mixed reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00: Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01: Indexing scheme relating to G06F3/01
    • G06F2203/012: Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Definitions

  • This specification relates to the field of computer technology, and in particular, to a method and device for human-computer interaction.
  • Augmented reality (AR) technology enhances the user's perception of the real world with information provided by a computer system: virtual information is applied to the real world, and computer-generated virtual objects, scenes, or system prompts are superimposed onto the real scene, thereby augmenting reality and delivering a sensory experience beyond it.
  • Virtual reality (VR) uses simulation to generate a three-dimensional virtual world that is identical or similar to a real scene. Users can play games, carry out activities, or perform specific operations in this virtual world as if acting in the real one, with a full range of simulated visual, auditory, and tactile experiences.
  • Mixed reality (MR) technology encompasses augmented reality and augmented virtuality, and refers to a new visual environment created by merging the real and virtual worlds, in which physical and virtual (i.e., digital) objects coexist and interact in real time.
  • At present, AR, VR, and MR technologies are still under development, and the human-computer interaction techniques associated with them are not yet mature; a human-computer interaction solution is therefore needed.
  • the embodiments of the present specification provide a human-machine interaction method and device, which are used to implement human-machine interaction.
  • In a first aspect, a human-machine interaction method is provided, including: acquiring an image for instructing a terminal device to perform an action; determining a matching action instruction based on an image feature of the image; and, in response to the action instruction, performing an operation matching the action instruction.
  • In a second aspect, a human-computer interaction method applied at a receiver is provided, including: receiving an action instruction from a sender; and, in response to the action instruction, displaying an effect corresponding to the action instruction, where the effect includes at least one of the following: a processing effect on the sender's avatar on the terminal device and/or a processing effect on the receiver's avatar; a processing effect on the border color of messages exchanged with the sender; screen vibration and inversion; or video or animation playback.
  • In a third aspect, a human-machine interaction device is provided, including: an image acquisition module that acquires an image for instructing a terminal device to perform an action; an action instruction determination module that determines a matching action instruction based on an image feature of the image; and an execution module that, in response to the action instruction, performs an operation matching the action instruction.
  • In a fourth aspect, a human-machine interaction device is provided, including: a receiving module that receives an action instruction from a sender; and an effect display module that, in response to the action instruction, displays an effect corresponding to the action instruction, where the effect includes at least one of the following: a processing effect on the sender's avatar on the terminal device and/or a processing effect on the receiver's avatar; a processing effect on the border color of messages exchanged with the sender; screen vibration and inversion; or video or animation playback.
  • In a fifth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executed by the processor, the computer program implements the following operations: acquiring an image for instructing a terminal device to perform an action; determining a matching action instruction based on an image feature of the image; and, in response to the action instruction, performing an operation matching the action instruction.
  • In a sixth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executed by the processor, the computer program implements the following operations: receiving an action instruction from a sender; and, in response to the action instruction, displaying an effect corresponding to the action instruction, where the effect includes at least one of the following: a processing effect on the sender's avatar on the terminal device and/or a processing effect on the receiver's avatar; a processing effect on the border color of messages exchanged with the sender; screen vibration and inversion; or video or animation playback.
  • In a seventh aspect, a computer-readable storage medium storing a computer program is provided; when executed by a processor, the computer program implements the following operations: acquiring an image for instructing a terminal device to perform an action; determining a matching action instruction based on an image feature of the image; and, in response to the action instruction, performing an operation matching the action instruction.
  • In an eighth aspect, a computer-readable storage medium storing a computer program is provided; when executed by a processor, the computer program implements the following operations: receiving an action instruction from a sender; and, in response to the action instruction, displaying an effect corresponding to the action instruction, where the effect includes at least one of the following: a processing effect on the sender's avatar on the terminal device and/or a processing effect on the receiver's avatar; a processing effect on the border color of messages exchanged with the sender; screen vibration and inversion; or video or animation playback.
  • The at least one technical solution adopted in the embodiments of this specification can achieve the following beneficial effect: a matching action instruction is determined based on the image features of the acquired image, and an operation matching the action instruction is performed in response to it, thereby realizing human-computer interaction based on the acquired image.
  • FIG. 1 is a schematic flowchart of a human-computer interaction method according to an embodiment of this specification;
  • FIG. 2 is a schematic flowchart of a human-computer interaction method according to another embodiment of this specification;
  • FIG. 3 is a schematic diagram of a display interface in the embodiment shown in FIG. 2;
  • FIG. 4 is a schematic flowchart of a human-computer interaction method according to another embodiment of this specification;
  • FIG. 5 is a schematic diagram of a display interface in the embodiment shown in FIG. 4;
  • FIG. 6 is a schematic flowchart of a human-computer interaction method according to another embodiment of this specification;
  • FIG. 7 is a schematic diagram of a display interface in the embodiment shown in FIG. 6;
  • FIG. 8 is a schematic diagram of an initial interface of a human-computer interaction method according to an embodiment of this specification;
  • FIG. 9 is another schematic diagram of an initial interface of a human-computer interaction method according to an embodiment of this specification;
  • FIG. 10 is a schematic flowchart of a human-computer interaction method according to yet another embodiment of this specification;
  • FIG. 11 is a schematic diagram of a display interface in the embodiment shown in FIG. 10;
  • FIG. 12 is a schematic structural diagram of a human-computer interaction device according to an embodiment of this specification;
  • FIG. 13 is a schematic structural diagram of a human-computer interaction device according to another embodiment of this specification;
  • FIG. 14 is a schematic diagram of effects that can be achieved by various embodiments of this specification;
  • FIG. 15 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of this specification.
  • As shown in FIG. 1, an embodiment of this specification provides a human-computer interaction method 100, which includes the following steps:
  • S102 Acquire an image used to instruct the terminal device to perform an action.
  • The image used to instruct the terminal device to perform an action in the embodiments of this specification may be a gesture image, a face image, an image of the user's entire body, a partial image of the user's body, or the like, which is not specifically limited in this specification.
  • the image acquired in the embodiment of the present specification may be a single image or a multi-frame image in a captured video stream.
  • the acquired image in this step may be an image of a single user or an image of multiple users.
  • This step may obtain the image from multiple images stored in advance, or acquire it in real time. If images are stored in advance, step S102 can select an image from them, for example the image the user selects; if images are acquired in real time, step S102 can capture them through the image sensor of the terminal device.
  • S104 Determine a matching action instruction based on the image characteristics of the image.
  • The image feature in this step corresponds to the acquired image and may specifically be extracted from it. For example, if a gesture image is acquired, the image feature may be a gesture feature; if a face image is acquired, the image feature may be a facial feature; if a human body image is acquired, the image feature may be a posture or action feature of the human body; and so on.
  • A mapping table between image features and action instructions may be established in advance, so that a matching action instruction can be determined directly by table lookup.
  • Considering that the same image feature may correspond to different action instructions in different scenarios, mapping tables between image features and action instructions may also be established per scenario before this embodiment is executed, and this embodiment may then be executed in a determined scenario: for example, a scenario selected by the user, a scenario obtained from an AR scan, a preset VR environment, or a preset MR environment, and so on.
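  • As a minimal sketch of such a scene-scoped lookup table (the feature labels, scenario names, and instruction strings are illustrative assumptions, not values from this specification):

```python
# Hypothetical per-scenario mapping tables from image features to
# action instructions, established in advance and consulted at run time.
ACTION_TABLE = {
    "fighting_game": {
        "one_hand_fist": "punch",
        "palm_open": "block",
    },
    "chat": {
        "one_hand_fist": "send_fist_bump_effect",
        "palm_open": "send_high_five_effect",
    },
}

def match_action_instruction(scenario, image_feature):
    """Look up the action instruction matching an image feature in the
    current scenario; returns None when nothing matches."""
    return ACTION_TABLE.get(scenario, {}).get(image_feature)

# match_action_instruction("fighting_game", "one_hand_fist") -> "punch"
```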
  • S106 In response to the action instruction, perform an operation matching the action instruction.
  • Specifically, a rendering instruction may be generated based on the action instruction, and the target object related to the action instruction is then rendered.
  • Alternatively, the action instruction may be sent to the receiver, so that the receiver generates a rendering instruction based on the action instruction and renders a target object related to the action instruction.
  • The target object displayed in augmented reality as described above may also be displayed on the sender side.
  • the aforementioned target objects may specifically be augmented reality scenes, virtual reality scenes, mixed reality scenes, etc.
  • the display effects and related display technologies mentioned in the embodiments of the present specification may be implemented based on the OpenCV vision library.
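  • As one hedged illustration of an OpenCV-based display effect, the sketch below alpha-blends a virtual RGBA sprite onto a frame captured from the terminal device's camera; the asset file name and placement coordinates are assumptions for demonstration:

```python
import cv2
import numpy as np

def overlay_sprite(frame, sprite, x, y):
    """Alpha-blend an RGBA sprite onto a BGR frame at (x, y)."""
    h, w = sprite.shape[:2]
    roi = frame[y:y + h, x:x + w].astype(np.float32)
    rgb = sprite[:, :, :3].astype(np.float32)
    alpha = sprite[:, :, 3:4].astype(np.float32) / 255.0
    frame[y:y + h, x:x + w] = (alpha * rgb + (1.0 - alpha) * roi).astype(np.uint8)
    return frame

cap = cv2.VideoCapture(0)                                  # terminal device camera
effect = cv2.imread("fireball.png", cv2.IMREAD_UNCHANGED)  # assumed RGBA effect asset
ok, frame = cap.read()
if ok and effect is not None:
    frame = overlay_sprite(frame, effect, x=50, y=50)
    cv2.imshow("rendered effect", frame)
    cv2.waitKey(0)
cap.release()
```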
  • Sending the action instruction to the receiver, as mentioned above, may specifically mean sending the action instruction to a server, which then forwards it to the receiver; or, in a serverless, direct client-to-client scenario, the sender may send the action instruction to the receiver directly.
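  • A minimal sketch of this routing choice, assuming a plain TCP transport with a JSON payload (the helper, addresses, and payload format are illustrative, not part of this specification):

```python
import json
import socket

def send_action_instruction(instruction, receiver_addr, server_addr=None):
    """Relay the action instruction through a server when one is
    configured; otherwise send it to the receiving client directly."""
    payload = json.dumps(instruction).encode("utf-8")
    target = server_addr if server_addr is not None else receiver_addr
    with socket.create_connection(target, timeout=5.0) as conn:
        conn.sendall(payload)

# Relayed: send_action_instruction({"action": "punch"}, ("10.0.0.2", 9000), ("relay.example", 9000))
# Direct:  send_action_instruction({"action": "punch"}, ("10.0.0.2", 9000))
```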
  • In summary, the human-computer interaction method determines a matching action instruction based on the image features of the acquired image and performs an operation matching the action instruction in response to it, thereby realizing human-computer interaction based on the acquired image.
  • the embodiments of the present specification can also be applied in scenarios such as AR, VR, and MR.
  • As shown in FIG. 2 and FIG. 3, another embodiment of this specification provides a human-computer interaction method 200, which includes the following steps:
  • S202 Obtain a selected gesture image, a face image, or a human body image in response to a user's selection operation on the displayed preset image.
  • Specifically, multiple gesture images can be displayed on the display interface in advance (see the boxes under the text "Gesture selection" on the right side of FIG. 3). When the user clicks to select one of the gesture images, this step obtains that gesture image.
  • Similarly, multiple facial expression images, human action posture images, and the like can be displayed in advance, and the selected facial expression image or human action image can be obtained in the same way.
  • The gesture images displayed in advance may include left-hand and right-hand gesture images; gesture images of a one-handed fist or closed fingers; gesture images of an open hand or extended fingers; the "love" gesture image with the middle and ring fingers closed and the other fingers spread; and so on.
  • the aforementioned facial expression image displayed in advance may be a smiling expression image, a sad expression image, a crying expression image, and the like.
  • The human action posture images shown in advance may be a posture image of a body bent at 90 degrees, a standing posture image, or the like.
  • S204 Determine an action instruction based on the image characteristics of the selected image in a preset scene.
  • The correspondence between images and image features can be stored before this embodiment is executed, so that the image feature can be determined directly from the image the user selects. For example, if the gesture image the user selects shows a one-handed fist, the gesture feature may represent the characteristics of a one-handed fist.
  • As in the previous embodiment, a mapping table between image features and action instructions may be established in advance, so that a matching action instruction can be determined directly by table lookup.
  • Likewise, since the same image feature may correspond to different action instructions in different scenarios, per-scenario mapping tables between image features and action instructions may be established before this embodiment is executed, and this embodiment may then be executed in a determined scenario, such as a scenario selected by the user, a scenario obtained from an AR scan, a preset VR scene, or a preset MR scene. In this way, a scene image can be obtained in advance, and this embodiment is executed in the obtained scene.
  • When determining an action instruction based on the image feature, the current application scenario may be determined first, and then the action instruction corresponding to the image feature in that scenario is determined. For example, in a stand-alone fighting game scenario, the gesture feature of a one-handed fist can determine a punch action instruction.
  • In step S206, performing an operation matching the action instruction may specifically be generating a rendering instruction based on the action instruction and rendering a target object related to the action instruction. For example, in FIG. 3, the box to the left of the pre-displayed gesture images shows the target object, which can be an augmented reality, virtual reality, or mixed reality image.
  • Alternatively, the action instruction may be sent to the receiver, so that the receiver generates a rendering instruction based on the action instruction and renders the target object related to the action instruction.
  • As before, sending the action instruction to the receiver may mean sending it to a server that forwards it to the receiver, or, in a serverless, direct client-to-client scenario, sending it to the receiver directly.
  • The interaction method provided in this embodiment determines a matching action instruction based on the image features of the acquired image and performs an operation matching the action instruction in response to it, thereby realizing human-computer interaction based on the acquired image.
  • In this embodiment, a plurality of gesture images, face images, or human body images is stored in advance, which makes it easy for users to select quickly and improves the user experience.
  • Optionally, the order of the gesture images displayed in the interface shown in FIG. 3 (or of the face or human body images in other embodiments) may be sorted by the user's historical usage frequency. For example, if the user selects the one-handed fist gesture image most frequently, that image is displayed first, which further eases selection and improves the user experience.
  • In step S202, the gesture images selected by users such as A, B, and C from the displayed gesture images are acquired; in steps S204 and S206, in a preset scenario where A, B, and C interact with one another, the image features of the gesture images each user selected are sent to users A, B, and C.
  • On this basis, each terminal device can collect its user's gesture image in real time; if the image matches the pre-selected image features to a sufficient degree, subsequent logic is executed (a matching sketch follows the example below).
  • For example, the scene selected by terminal devices A, B, and C is an ancient temple with a stone door ahead; when multiple devices recognize the motion of a hand pushing forward, the stone door slowly opens, and so on.
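  • A minimal sketch of the matching check mentioned above, assuming feature vectors are compared against the pre-selected feature by cosine similarity (the threshold and vectors are illustrative assumptions):

```python
import numpy as np

MATCH_THRESHOLD = 0.9  # assumed similarity threshold

def matches_preselected(live_feature, preselected):
    """Compare a live feature vector with the pre-selected one by cosine
    similarity; subsequent logic runs only above the threshold."""
    denom = np.linalg.norm(live_feature) * np.linalg.norm(preselected)
    if denom == 0.0:
        return False
    return float(live_feature @ preselected / denom) >= MATCH_THRESHOLD

# e.g. once every device reports a match for the "push forward" feature,
# the shared scene plays the stone-door-opening animation.
```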
  • The preceding embodiment displays gesture, face, or human body images in advance; since the number of pre-displayed images is limited and their content is not rich enough, the following embodiment further increases the number and richness of images, enhancing user interaction and making it more fun.
  • As shown in FIG. 4 and FIG. 5, another embodiment of this specification provides a human-computer interaction method 400, which includes the following steps:
  • S402 Acquire an image feature, where the image feature includes at least one of the following: a gesture image feature, a face image feature, a human image feature, and an action feature.
  • Specifically, the terminal device includes components that can collect images, for example an infrared camera and the like; after the image is collected, image features are acquired from the collected image.
  • The action features mentioned above include, for example, the features of punching, waving, opening the palm, running, standing, shaking the head, and nodding.
  • In this embodiment, an application scenario may be identified in advance. The application scenario may specifically include a scenario in which a sender and a receiver chat with each other, an online fighting game, a scenario in which multiple terminal devices chat and interact with one another, and the like.
  • a gesture feature classification model may be used to acquire gesture features.
  • the input parameters of the gesture feature classification model may be collected gesture images (or pre-processed gesture images, which will be described in the next paragraph), and the output parameters may be gesture features.
  • The gesture feature classification model can be generated through machine learning based on algorithms such as Support Vector Machine (SVM), Convolutional Neural Network (CNN), or deep learning (DL).
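  • As a hedged illustration of such a model, the sketch below trains an SVM gesture-feature classifier on flattened grayscale gesture images with scikit-learn; the random placeholder data, image size, and label set are assumptions for demonstration, not details from this specification:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder data: N preprocessed 64x64 grayscale gesture images,
# flattened, with integer labels (e.g. 0 = one-handed fist, 1 = open palm).
X = np.random.rand(200, 64 * 64).astype(np.float32)
y = np.random.randint(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel="rbf", C=1.0)  # the SVM-based gesture feature classifier
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

# At run time, a collected (and preprocessed) gesture image is flattened
# and passed to clf.predict to obtain its gesture feature class.
```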
  • this step may further preprocess the collected gesture image in order to remove noise.
  • The preprocessing operations on the gesture image may include, but are not limited to, image enhancement, image binarization, grayscale conversion, and denoising of the collected gesture image.
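  • A minimal OpenCV sketch of one such preprocessing chain, assuming a BGR input frame (the kernel size and thresholding method are illustrative choices):

```python
import cv2

def preprocess_gesture_image(frame_bgr):
    """Grayscale conversion -> denoising -> enhancement -> binarization."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)  # image graying
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)        # denoising
    enhanced = cv2.equalizeHist(denoised)               # image enhancement
    _, binary = cv2.threshold(enhanced, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # binarization
    return binary
```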
  • In specific implementations, gesture images, face images, human body images, and action images may be collected in advance, and gesture image features, face image features, human body image features, and action features are then extracted from the collected images.
  • Optionally, this embodiment may also decide whether to preprocess the image, and which preprocessing method to use, according to the required image feature accuracy and performance requirements (such as response speed). For example, in a network fighting game scenario with high response speed requirements, the gesture image may be left unpreprocessed; in a scenario with high gesture accuracy requirements, the collected image may be preprocessed.
  • S404 Determine a matching action instruction based on the image feature and the additional dynamic feature selected by the user in a preset scene.
  • As before, a scene image may be obtained in advance, and this embodiment is executed in the obtained scene.
  • In this step, when determining a matching action instruction based on the image feature and the additional dynamic feature selected by the user, the current application scenario may be determined first, and then the action instruction corresponding to the image feature and the selected additional dynamic feature in that scenario is determined. For example, in a stand-alone fighting game scenario, the gesture feature of a one-handed fist combined with the user-selected additional fireball effect can determine a punch-plus-fireball action instruction. As shown in the application interface diagram of FIG. 5, multiple additional dynamic effects can be displayed on the display interface in advance (see the circles under the text "Additional dynamic effects" on the right side of FIG. 5); when the user clicks to select one of the additional dynamic effects, this step can determine the action instruction based on the gesture feature and the additional dynamic effect feature.
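  • A small sketch of composing an action instruction from a gesture feature plus an optional user-selected additional dynamic effect (the feature names, effect names, and mapping are hypothetical):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActionInstruction:
    base_action: str              # from the gesture feature, e.g. "punch"
    effect: Optional[str] = None  # the additional dynamic effect, e.g. "fireball"

def build_action_instruction(gesture_feature, selected_effect=None):
    """Map the gesture feature to its base action and attach the
    user-selected additional dynamic effect, if any."""
    base = {"one_hand_fist": "punch", "palm_open": "push"}.get(gesture_feature, "none")
    return ActionInstruction(base_action=base, effect=selected_effect)

# build_action_instruction("one_hand_fist", "fireball")
#   -> ActionInstruction(base_action='punch', effect='fireball')
```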
  • the selected additional dynamic feature corresponds to the acquired image.
  • Similarly, multiple face-related additional dynamic effects may be displayed on the display interface in advance for the user to select, generating the additional dynamic feature upon selection for enhanced display effects and the like; a plurality of human-body- or action-related additional dynamic effects may likewise be displayed in advance, generating the additional dynamic features when the user selects them.
  • For example, if the gesture feature representing a one-handed fist is obtained in step S402 and no additional dynamic effect is selected, the action instruction determined in this step represents only a punch; if the user selects the additional dynamic effect "Snowball", the action instruction determined in this step may be a punch that fires a snowball, with a cool special effect.
  • In step S406, performing an operation matching the action instruction may specifically be generating a rendering instruction based on the action instruction and rendering a target object related to the action instruction. For example, the box on the left of FIG. 5 shows the target object, which can be an image of an augmented reality, virtual reality, or mixed reality scene.
  • This embodiment may also send the action instruction to the receiver, so that the receiver generates a rendering instruction based on the action instruction to render a target object related to the action instruction.
  • The sender can also display the augmented reality target object.
  • the interaction method provided in the embodiment of the present specification acquires image features, determines an action instruction based on the image features and additional dynamic features selected by the user, and implements human-computer interaction based on the acquired image features in response to the action instructions.
  • In addition, this embodiment acquires gesture image features, face image features, human body image features, and action features from images collected in real time; compared with a limited number of pre-stored images, the image features that can be acquired are more abundant and diverse.
  • Furthermore, additional dynamic effects are stored in advance so that the user can select them quickly, generating cooler special effects and improving the user experience.
  • Optionally, the order of the additional dynamic effects displayed in the interface shown in FIG. 5 (or of the face- or human-body-related additional dynamic effects in other embodiments) may be sorted by the user's historical usage frequency. For example, if the user selects "Fireball" most frequently, the "Fireball" additional dynamic effect is displayed first (see FIG. 5), which further eases selection and improves the user experience.
  • As shown in FIG. 6 and FIG. 7, another embodiment of this specification provides a human-computer interaction method 600, which includes the following steps:
  • S602 Acquire a scene feature selected by a user.
  • The scene features in this embodiment are illustrated by the application interface diagram in FIG. 7: multiple preset scenes can be displayed on the display interface in advance, such as the "avatar" scene shown in FIG. 7, with additional scenes shown schematically as "***". The interface in FIG. 7 also includes a "more" button, which displays more preset scenes when the user clicks it.
  • S604 Determine an action instruction based on the scene feature and the acquired image feature.
  • the image feature includes at least one of the following: a gesture image feature, a face image feature, a human image feature, and an action feature.
  • Specifically, the terminal device includes components that can collect images, such as an infrared camera and the like, and image features are acquired from the collected image. For the specific acquisition process, refer to the embodiment shown in FIG. 4. The following takes facial feature acquisition as an example.
  • a facial feature classification model can be used to obtain facial features.
  • the input parameters of the facial feature classification model may be collected facial images (or pre-processed facial images, which will be described in the next paragraph), and the output parameters may be facial features.
  • The facial feature classification model can be generated through machine learning based on algorithms such as Support Vector Machine (SVM), Convolutional Neural Network (CNN), or deep learning (DL).
  • this step may also preprocess the collected face images in order to remove noise.
  • The preprocessing operations on the face image may include, but are not limited to, image enhancement, image binarization, grayscale conversion, and denoising of the collected face image.
  • In this step, the image feature and the scene feature may be fused; for example, a face feature is fused with the scene feature to generate an action instruction for fusing the face features with the scene.
  • Specifically, a face area is reserved in the scene the user selects, and the user's facial features are fused into and displayed in that reserved area, seamlessly joining the user's face with the selected scene and producing the effect that the user is actually in the scene; for example, the user appears in the middle of the picture, and the face of a character in the scene becomes the user's face.
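  • A rough OpenCV sketch of this face-to-scene fusion, assuming the scene image ships with a known reserved face rectangle and using the stock Haar cascade face detector (file names and coordinates are placeholders):

```python
import cv2

RESERVED = (120, 40, 96, 96)  # assumed reserved face area in the scene: x, y, w, h

def fuse_face_into_scene(user_bgr, scene_bgr):
    """Detect the user's face and paste it into the scene's reserved face area."""
    gray = cv2.cvtColor(user_bgr, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return scene_bgr  # no face found; leave the scene unchanged
    fx, fy, fw, fh = faces[0]
    face = user_bgr[fy:fy + fh, fx:fx + fw]
    x, y, w, h = RESERVED
    scene_bgr[y:y + h, x:x + w] = cv2.resize(face, (w, h))
    return scene_bgr

fused = fuse_face_into_scene(cv2.imread("user.jpg"), cv2.imread("scene.jpg"))
```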
  • This embodiment is particularly applicable to application scenarios such as group photos, artistic photo stickers, artistic modeling, and cosplay.
  • In this step, performing an operation matching the action instruction may specifically be generating a rendering instruction based on the action instruction to render a target object related to the action instruction; or sending the action instruction to the receiver, so that the receiver generates a rendering instruction based on the action instruction and renders the target object, finally displaying the augmented reality, virtual reality, or mixed reality target object.
  • In this step, a message carrying the face features and the scene features may also be sent to the receiver, and the receiver may obtain the receiver's own facial features, thereby fusing the sender's facial features, the receiver's facial features, and the scene selected by the sender, which helps improve the user experience.
  • The interaction method provided in this embodiment acquires image features and scene features, determines an action instruction based on both, and responds to the action instruction, achieving the fusion of image features with various preset scenes and improving the user experience.
  • In addition, different preset scenes are stored in advance for the user to choose, so the acquired image takes on different appearances in different scenes, which adds interest and improves the user experience.
  • this embodiment may also save the target objects of the augmented reality, virtual reality, or mixed reality shown above, which is convenient for users to use later.
  • Specifically, a third-party camera device may be asked to film, from the outside, the augmented reality, virtual reality, or mixed reality view displayed on the screen of the current terminal device, thereby indirectly storing the view; this flexibly captures the views the user wants to keep.
  • the augmented reality, virtual reality, or mixed reality view that the user sees on the display screen can also be captured and saved in a screenshot manner.
  • This implementation can either capture and store all the augmented reality, virtual reality, or mixed reality content displayed on the screen, or selectively store views according to the user's needs.
  • For the initial display interface of the foregoing embodiments, refer to FIG. 8 and FIG. 9.
  • A ** Card function is embedded in the chat interface, as shown in FIG. 8, where the ** Card can be an AR Card, an MR Card, a VR Card, or the like.
  • A ** Card option may pop up in the message interface for users to select and use, thereby improving the user experience.
  • FIG. 8 and FIG. 9 only schematically show a trigger execution mode.
  • The methods described in the foregoing embodiments may also be triggered in other ways, such as shaking the terminal device for automatic execution or issuing a specific voice command; this specification does not specifically limit the trigger method.
  • As shown in FIG. 10, another embodiment of this specification provides a human-computer interaction method 1000, which is applied to a receiver and includes the following steps: receiving an action instruction from a sender; and, in response to the action instruction, displaying an effect corresponding to the action instruction.
  • The action instruction in this embodiment may be an action instruction mentioned in the embodiments shown in FIG. 1 to FIG. 7 above; that is, this embodiment is applied to the receiver, and the operation performed by the sender may be any of the operations of the embodiments shown in FIG. 1 to FIG. 7.
  • The action instruction in this embodiment may also be some other action instruction, i.e., one independent of the embodiments shown in FIG. 1 to FIG. 7.
  • Screen vibration and inversion means that the entire terminal device screen vibrates and flips.
  • The above animations include GIF images. The above video may specifically be a video file in an encoding format such as H.264 or H.265, which the receiver plays automatically upon receipt; the above animation may specifically be one that enhances a character's expression, artistic voice-over text, background animation effects, and the like, which the receiver likewise plays automatically upon receipt.
  • Optionally, the sender's display interface can also show the status of the receiver's three-dimensional model changing; specifically, it can show augmented reality, virtual reality, or mixed reality three-dimensional display effects, such as the receiver being hit or snow falling on the receiver.
  • The processing effect on the avatar can also be displayed on the sender's display interface; for example, the receiver's avatar can be turned into a turtle or given another augmented reality, virtual reality, or mixed reality 3D display style, improving fun and enhancing the user experience.
  • Optionally, the appearance and disappearance of both parties' actions, together with the receiver's final state (such as its status and avatar), can be displayed on the sender's display interface; the appearance and disappearance of both parties' actions can likewise be displayed on the receiver's display interface.
  • this embodiment can also receive a drag instruction, and move the displayed object on the display interface.
  • The human-computer interaction method receives an action instruction from a sender and, in response, displays the effect corresponding to that action instruction, thereby realizing human-computer interaction based on the action instruction.
  • the effects corresponding to the action instructions may be displayed in a three-dimensional state, and specifically may be a three-dimensional augmented reality, virtual reality, or mixed reality display.
  • For example, the following effect can be generated on the sender's display interface: A (the sender) sends a snowball and B (the receiver) sends a fireball; after the fireball and snowball collide, the weakened fireball flies toward A, and A's image catches fire. As another example, A and B send fireballs or water balls at the same time; after colliding, they scatter into sparks or snowflakes, forming a dreamlike artistic effect, improving fun and enhancing the user experience.
  • As shown in FIG. 12, this specification also provides a human-machine interaction device 1200.
  • the device 1200 includes:
  • the image acquisition module 1202 may be configured to acquire an image used to instruct the terminal device to perform an action
  • the action instruction determining module 1204 may be configured to determine a matching action instruction based on the image characteristics of the image;
  • the execution module 1206 may be configured to perform an operation matching the action instruction in response to the action instruction.
  • the interaction device determines an action instruction based on the image characteristics of the acquired image and executes an operation matching the action instruction in response to the action instruction, thereby realizing human-computer interaction based on the acquired image.
  • the image acquisition module 1202 may be configured to acquire a selected image in response to a user's selection operation of the preset image displayed.
  • the image acquisition module 1202 may be configured to acquire an image of a user through a camera acquisition device.
  • the image for instructing the terminal device to perform an action includes a gesture image, a face image, or a human body image.
  • the action instruction determining module 1204 may be configured to determine a matching action instruction based on the gesture feature and the acquired additional dynamic feature.
  • the action instruction determination module 1204 may be configured to determine a matched action instruction based on an image feature of the image and the additional dynamic feature in a preset scene.
  • the action instruction determining module 1204 may be configured to determine a matching action instruction based on an image feature of the image and an acquired scene feature.
  • the apparatus 1200 further includes a saving module, which may be used to save the image feature and the scene feature.
  • the execution module 1206 may be configured to generate a rendering instruction based on the action instruction to render a target object related to the action instruction.
  • the apparatus 1200 further includes a sending module, which may be configured to send the action instruction to a receiver.
  • It should be noted that the human-machine interaction device 1200 corresponds to the flows of the human-machine interaction methods shown in FIG. 1 to FIG. 9 described above; the operations and/or functions of each unit/module in the device 1200 serve to realize the corresponding processes of those methods and, for brevity, are not repeated here.
  • As shown in FIG. 13, this specification also provides a human-computer interaction device 1300.
  • the device 1300 includes:
  • the receiving module 1302 may be used to receive an action instruction from a sender
  • the effect display module 1304 may be configured to display an effect corresponding to the action instruction in response to the action instruction, and the effect corresponding to the action instruction includes at least one of the following:
  • Video or animation playback
  • The above video can be a video file in H.264, H.265, or another encoding format, or can be rendered in real time from a three-dimensional model; the receiver plays the video file automatically upon receipt. The above animation can be one that enhances a character's expression, artistic voice-over text, background animation effects, and the like, which the receiver plays automatically upon receipt.
  • The sender's display interface can also show that the status of the receiver's 3D model has changed; specifically, it can show augmented reality, virtual reality, or mixed reality 3D display effects, such as the receiver being hit or snow falling on the receiver.
  • The processing effect on the receiver's avatar can also be displayed on the sender's display interface; for example, the receiver's avatar turns into a turtle or takes on another augmented reality, virtual reality, or mixed reality 3D display style, improving fun and enhancing the user experience.
  • Optionally, the appearance and disappearance of both parties' actions, together with the receiver's final state (such as its status and avatar), can be displayed on the sender's display interface; the appearance and disappearance of both parties' actions can likewise be displayed on the receiver's display interface.
  • The human-computer interaction device receives an action instruction from a sender and, in response, displays the effect corresponding to that action instruction, thereby realizing human-computer interaction based on the received action instruction.
  • It should be noted that the human-machine interaction device 1300 corresponds to the flows of the human-machine interaction method shown in FIG. 10 to FIG. 11 described above; the operations and/or functions of each unit/module in the device 1300 serve to realize the corresponding processes of that method and, for brevity, are not repeated here.
  • The effects that can be achieved by the foregoing embodiments of this specification are illustrated in FIG. 14.
  • On the input side, the user can use not only text, voice, picture, and short-video input, but also face recognition, action recognition, scene recognition, and the like, sending different effects based on the recognized faces, actions, and scenes.
  • On the receiving side, the user gets not only ordinary text display, voice playback, animated picture playback, and short-video playback, but also effects such as status changes, animation and sound playback, and screen vibration feedback; status changes include, for example, the sender's body being hit by a bomb, the sender's avatar turning into a turtle, and the background changing dynamically.
  • the electronic device includes a processor, and optionally, includes an internal bus, a network interface, and a memory.
  • The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk storage, and the like.
  • the electronic device may also include hardware required to implement other services.
  • The processor, network interface, and memory can be connected to one another through an internal bus, which can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only a two-way arrow is used in FIG. 15, but it does not mean that there is only one bus or one type of bus.
  • the program may include program code, where the program code includes a computer operation instruction.
  • the memory may include memory and non-volatile memory, and provide instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to form a device for forwarding chat information on a logical level.
  • the processor executes a program stored in the memory, and is specifically configured to perform operations of the method embodiment described earlier in this specification.
  • The methods executed by the devices disclosed in the embodiments shown in FIG. 1 to FIG. 11 may be applied to a processor or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in combination with the embodiments of the present specification may be directly embodied as being executed by a hardware decoding processor, or may be executed and completed by using a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, or an electrically erasable programmable memory, a register, and the like.
  • the storage medium is located in a memory, and the processor reads the information in the memory and completes the steps of the foregoing method in combination with its hardware.
  • The electronic device shown in FIG. 15 can also execute the methods of FIG. 1 to FIG. 11 and implement the functions of the corresponding human-computer interaction method embodiments, which are not described again here.
  • the electronic device in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware, etc.
  • The execution body of the processing flows above is not limited to individual logical units; it can also be a hardware or logic device.
  • the embodiments of the present specification also provide a computer-readable storage medium.
  • A computer program is stored on the computer-readable storage medium; when executed by a processor, it implements the processes of the method embodiments shown in FIG. 1 to FIG. 11 and achieves the same technical effects, which are not repeated here to avoid repetition.
  • the computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input / output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory, random access memory (RAM), and / or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology; the information may be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic tape/disk storage or other magnetic storage devices, and any other non-transmission media that can be used to store information accessible to computing devices.
  • As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Electrotherapy Devices (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

Disclosed are a man-machine interaction method and apparatus. The method comprises: acquiring an image for instructing a terminal device to execute an action; determining a matching action instruction based on image features of the image; and in response to the action instruction, executing an operation matching the action instruction. Also disclosed are another man-machine interaction method and apparatus.

Description

一种人机交互方法和装置Human-machine interaction method and device 技术领域Technical field
本说明书涉及计算机技术领域,尤其涉及一种人机交互方法和装置。This specification relates to the field of computer technology, and in particular, to a method and device for human-computer interaction.
背景技术Background technique
增强现实(Augmented reality,AR)技术是通过计算机***提供的信息增加用户对现实世界感知,其将虚拟的信息应用到真实世界,并将计算机生成的虚拟物体、场景或***提示信息叠加到真实场景中,从而实现对现实的增强,达到超越现实的感官体验。Augmented reality (AR) technology is to increase the user's perception of the real world through the information provided by the computer system. It applies virtual information to the real world and superimposes computer-generated virtual objects, scenes, or system prompts to the real scene. In order to achieve the enhancement of reality and achieve a sensory experience beyond reality.
虚拟现实(Virtual Reality,VR)通过模拟计算产生出一个与现实场景相同或相似的三维虚拟世界,用户可以在这个虚拟现实世界中进行游戏、活动或执行某些特定的操作,整个过程如同在真实世界中进行一般,给用户提供了视觉、听觉、触觉等全方位的模拟体验。Virtual reality (VR) generates a three-dimensional virtual world that is the same or similar to the real scene through simulation calculations. Users can play games, activities or perform certain specific operations in this virtual reality world. The whole process is as if it were real. It is general in the world, providing users with a full range of simulation experiences such as sight, hearing, and touch.
混合现实(Mix reality,MR)技术包括增强现实和增强虚拟,指的是合并现实和虚拟世界而产生的新的可视化环境。在新的可视化环境中,物理和虚拟对象(也即数字对象)共存,并实时互动。Mixed reality (MR) technology includes augmented reality and augmented virtual reality, which refers to a new visual environment created by combining real and virtual worlds. In the new visualization environment, physical and virtual objects (ie digital objects) coexist and interact in real time.
目前,AR、VR和MR技术还处于开发阶段,与上述技术相关的人机交互技术尚不成熟,因此有必要提供一种人机交互方案。At present, AR, VR, and MR technologies are still in the development stage, and the human-computer interaction technologies related to the above technologies are not yet mature, so it is necessary to provide a human-computer interaction solution.
发明内容Summary of the invention
本说明书实施例提供一种人机交互方法和装置,用于实现人机交互。The embodiments of the present specification provide a human-machine interaction method and device, which are used to implement human-machine interaction.
本说明书实施例采用下述技术方案:The embodiments of this specification adopt the following technical solutions:
第一方面,提供了一种人机交互方法,包括:获取用于指示终端设备执行动作的图像;基于所述图像的图像特征确定匹配的动作指令;响应于所述动作指令,执行与所述动作指令相匹配的操作。According to a first aspect, a human-machine interaction method is provided, including: acquiring an image for instructing a terminal device to perform an action; determining a matching action instruction based on an image characteristic of the image; Action instructions match the operation.
第二方面,提供了一种人机交互方法,应用在接收方,包括:接收来自于发送方的动作指令;响应于所述动作指令,显示与所述动作指令对应的效果,所述与所述动作指令对应的效果包括下述至少一种:对终端设备的发送方头像的处理效果和/或对终端设备的接收方头像的处理效果;对与发送方进行通讯的消息边框颜色的处理效果;屏幕振动 反转;或视频或动画播放。In a second aspect, a human-computer interaction method is provided, which is applied to a receiver and includes: receiving an action instruction from a sender; and responding to the action instruction, displaying an effect corresponding to the action instruction, and the effect The effect corresponding to the action instruction includes at least one of the following: a processing effect on the sender's avatar of the terminal device and / or a processing effect on the receiver's avatar; and a processing effect on the color of the message frame that communicates with the sender. ; Screen vibration is reversed; or video or animation playback.
第三方面,提供了一种人机交互装置,包括:图像获取模块,获取用于指示终端设备执行动作的图像;动作指令确定模块,基于所述图像的图像特征确定匹配的动作指令;执行模块,响应于所述动作指令,执行与所述动作指令相匹配的操作。According to a third aspect, a human-machine interaction device is provided, including: an image acquisition module that acquires an image for instructing a terminal device to perform an action; an action instruction determination module that determines a matching action instruction based on image characteristics of the image; an execution module In response to the action instruction, an operation matching the action instruction is performed.
第四方面,提供了一种人机交互装置,包括:接收模块,接收来自于发送方的动作指令;效果显示模块,响应于所述动作指令,显示与所述动作指令对应的效果,所述与所述动作指令对应的效果包括下述至少一种:对终端设备的发送方头像的处理效果和/或对终端设备的接收方头像的处理效果;对与发送方进行通讯的消息边框颜色的处理效果;屏幕振动反转;或视频或动画播放。According to a fourth aspect, a human-machine interaction device is provided, including: a receiving module that receives an action instruction from a sender; and an effect display module that displays an effect corresponding to the action instruction in response to the action instruction. The effect corresponding to the action instruction includes at least one of the following: a processing effect on the sender's avatar of the terminal device and / or a processing effect on the receiver's avatar; and the color of the frame of the message communicating with the sender. Processing effects; screen vibration inversion; or video or animation playback.
In a fifth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the following operations: acquiring an image for instructing a terminal device to perform an action; determining a matching action instruction based on an image feature of the image; and, in response to the action instruction, performing an operation matching the action instruction.
In a sixth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the following operations: receiving an action instruction from a sender; and, in response to the action instruction, displaying an effect corresponding to the action instruction, where the effect corresponding to the action instruction includes at least one of the following: a processing effect on the sender's avatar of the terminal device and/or a processing effect on the receiver's avatar of the terminal device; a processing effect on the border color of messages exchanged with the sender; screen vibration and inversion; or video or animation playback.
In a seventh aspect, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, implements the following operations: acquiring an image for instructing a terminal device to perform an action; determining a matching action instruction based on an image feature of the image; and, in response to the action instruction, performing an operation matching the action instruction.
In an eighth aspect, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, implements the following operations: receiving an action instruction from a sender; and, in response to the action instruction, displaying an effect corresponding to the action instruction, where the effect corresponding to the action instruction includes at least one of the following: a processing effect on the sender's avatar of the terminal device and/or a processing effect on the receiver's avatar of the terminal device; a processing effect on the border color of messages exchanged with the sender; screen vibration and inversion; or video or animation playback.
The at least one technical solution adopted in the embodiments of this specification can achieve the following beneficial effect: a matching action instruction is determined based on the image features of an acquired image, and an operation matching the action instruction is performed in response to it, thereby implementing human-computer interaction based on acquired images.
Brief Description of the Drawings
The drawings described here are provided for a further understanding of this specification and constitute a part of it; the illustrative embodiments of this specification and their descriptions are used to explain this specification and do not constitute an improper limitation on it. In the drawings:
FIG. 1 is a schematic flowchart of a human-computer interaction method according to an embodiment of this specification;
FIG. 2 is a schematic flowchart of a human-computer interaction method according to another embodiment of this specification;
FIG. 3 is a schematic diagram of the display interface in the embodiment shown in FIG. 2;
FIG. 4 is a schematic flowchart of a human-computer interaction method according to yet another embodiment of this specification;
FIG. 5 is a schematic diagram of the display interface in the embodiment shown in FIG. 4;
FIG. 6 is a schematic flowchart of a human-computer interaction method according to a further embodiment of this specification;
FIG. 7 is a schematic diagram of the display interface in the embodiment shown in FIG. 6;
FIG. 8 is a schematic diagram of an initial interface of a human-computer interaction method according to an embodiment of this specification;
FIG. 9 is another schematic diagram of an initial interface of a human-computer interaction method according to an embodiment of this specification;
FIG. 10 is a schematic flowchart of a human-computer interaction method according to another embodiment of this specification;
FIG. 11 is a schematic diagram of the display interface in the embodiment shown in FIG. 10;
FIG. 12 is a schematic structural diagram of a human-computer interaction apparatus according to an embodiment of this specification;
FIG. 13 is a schematic structural diagram of a human-computer interaction apparatus according to another embodiment of this specification;
FIG. 14 is a schematic diagram of the effects that can be achieved by the embodiments of this specification; and
FIG. 15 is a schematic diagram of the hardware structure of an electronic device implementing the embodiments of this specification.
Detailed Description
To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort shall fall within the protection scope of this specification.
As shown in FIG. 1, an embodiment of this specification provides a human-computer interaction method 100, including the following steps:
S102: Acquire an image for instructing a terminal device to perform an action.
The image acquired in the embodiments of this specification for instructing the terminal device to perform an action may be a gesture image, a face image, a full-body image of the user, a partial image of the user's body, and so on; this specification imposes no specific limitation.
The image acquired in the embodiments of this specification may be a single image, or multiple frames from a captured video stream.
In addition, the image acquired in this step may be an image of a single user or an image of multiple users.
In this step, the image may be obtained from multiple pre-stored images, or captured in real time. If the images are pre-stored, step S102 may obtain one image from the stored images, for example an image selected by the user. If the image is instead captured in real time, step S102 may capture it in real time based on an image sensor or the like of the terminal device.
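By way of illustration only, the two acquisition paths described in this step can be sketched in Python as follows; the OpenCV calls are standard, while the function names and the notion of a stored-image path are assumptions made for the example:

```python
import cv2  # OpenCV, used here only as one possible capture backend


def acquire_stored_image(path):
    """Load a user-selected image from the pre-stored set (path is assumed)."""
    image = cv2.imread(path)  # returns None if the file cannot be read
    if image is None:
        raise IOError("could not read stored image: " + path)
    return image


def acquire_realtime_image(camera_index=0):
    """Capture a single frame from the terminal device's image sensor."""
    capture = cv2.VideoCapture(camera_index)
    try:
        ok, frame = capture.read()  # one frame; loop here for a video stream
        if not ok:
            raise IOError("camera frame capture failed")
        return frame
    finally:
        capture.release()
```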
S104: Determine a matching action instruction based on an image feature of the image.
The image feature in this step corresponds to the acquired image and may specifically be extracted from it. For example, if a gesture image is acquired, the image feature here may be a gesture feature; if the acquired image is a face image, the image feature may be a facial feature; if the acquired image is a human body image, the image feature may be a body posture or motion feature; and so on.
Before this embodiment is executed, a mapping table between image features and action instructions may be established in advance; step S104 can then determine the matching action instruction directly by looking up the table.
Optionally, in different application scenarios, the same image feature may also correspond to different action instructions. Therefore, before this embodiment is executed, separate mapping tables between image features and action instructions may be established for different scenarios, and this embodiment may then be executed in a determined scenario; for example, it may be executed in a scenario selected by the user, in a scenario obtained by AR scanning, in a preset VR environment, in a preset MR environment, and so on.
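A minimal sketch of such a pre-built, per-scenario lookup table follows; the scenario, feature, and instruction labels are hypothetical examples rather than part of the described method:

```python
# Mapping table built in advance: (scenario, image feature) -> action instruction.
# The concrete labels below are illustrative only.
ACTION_TABLE = {
    ("fighting_game", "fist_one_hand"): "punch",
    ("fighting_game", "open_palm"): "palm_strike",
    ("temple_scene", "push_forward"): "open_stone_door",
}


def match_action_instruction(scenario, image_feature):
    """Step S104 as a direct table lookup; returns None when nothing matches."""
    return ACTION_TABLE.get((scenario, image_feature))


# The same feature can map to different instructions in different scenarios.
assert match_action_instruction("fighting_game", "fist_one_hand") == "punch"
```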
S106: In response to the action instruction, perform an operation matching the action instruction.
In this step, performing an operation matching the action instruction in response to it may, for example in a single-device human-computer interaction augmented reality scenario, specifically involve generating a rendering instruction based on the action instruction and then rendering the target object related to the action instruction.
In addition, in a chat scenario between a sender and a receiver, while the target object related to the action instruction is being rendered, the action instruction may also be sent to the receiver, so that the receiver generates a rendering instruction based on it and renders the related target object; at the same time, the augmented-reality target object is likewise displayed on the sender's side. The target objects mentioned above may specifically be augmented reality scenes, virtual reality scenes, mixed reality scenes, and so on; in addition, the display effects and related display technologies mentioned in the embodiments of this specification may be implemented based on the OpenCV vision library.
Sending the action instruction to the receiver, as mentioned above, may specifically mean sending the action instruction to a server, which then forwards it to the receiver; alternatively, in a client-to-client scenario where no server is involved, the sender may send the action instruction directly to the receiver.
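For illustration, both delivery paths might be sketched as follows, assuming the action instruction is serialized as JSON; the transport details (an HTTP endpoint for the server relay, an already-connected socket for the client-to-client case) are assumptions of this sketch:

```python
import json
import urllib.request


def send_via_server(action_instruction, server_url):
    """Server-relayed path: post the instruction; the server forwards it."""
    data = json.dumps({"action": action_instruction}).encode("utf-8")
    request = urllib.request.Request(
        server_url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status == 200


def send_direct(action_instruction, connected_socket):
    """Client-to-client path: write the instruction on an open socket."""
    payload = json.dumps({"action": action_instruction}).encode("utf-8")
    connected_socket.sendall(payload)
```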
The human-computer interaction method provided in this embodiment of the specification determines a matching action instruction based on the image features of the acquired image and, in response to it, performs an operation matching the instruction, implementing human-computer interaction based on acquired images.
Optionally, the embodiments of this specification may also be applied in scenarios such as AR, VR, and MR.
To describe the human-computer interaction method provided by the embodiments of this specification in detail, as shown in FIG. 2 and FIG. 3, another embodiment of this specification provides a human-computer interaction method 200, including the following steps:
S202: In response to the user's selection of a displayed preset image, acquire the selected gesture image, face image, or human body image.
As shown in the application interface diagram of FIG. 3, this embodiment may display multiple gesture images on the display interface in advance (see the boxes below the text 'gesture selection' on the right side of FIG. 3); when the user taps one of the gesture images, this step acquires that gesture image.
Optionally, this embodiment may also display multiple facial expression images, human body posture images, and the like in advance; when the user makes a selection, this step acquires the selected facial expression image or body motion image.
Optionally, the pre-displayed gesture images may include left-hand gesture images and right-hand gesture images, and may further include a gesture image of a one-handed fist or closed fingers, a gesture image of an open hand or extended fingers, a 'love' gesture image with the middle and ring fingers folded and the other fingers extended, and so on.
The pre-displayed facial expression images may be a laughing expression image, a sad expression image, a crying expression image, and the like.
The pre-displayed human body posture images may be an image of a body bowing at 90 degrees, an image of a body standing at military attention, and so on.
S204: In a preset scenario, determine an action instruction based on the image features of the selected image.
The correspondence between the above images and image features may be stored before this embodiment is executed, so that the image features can be determined directly from the image the user selects. For example, if the gesture image selected by the user is an image of a one-handed fist, the gesture feature may be a feature representing a one-handed fist.
Before this embodiment is executed, a mapping table between image features and action instructions may be established in advance, so that step S204 can determine the matching action instruction directly by looking up the table.
Optionally, in different application scenarios, the same image feature may correspond to different action instructions. Therefore, before this embodiment is executed, separate mapping tables between image features and action instructions may be established for different scenarios, and this embodiment may then be executed in a determined scenario, for example a scenario selected by the user, a scenario obtained by AR scanning, a preset VR scenario, or a preset MR scenario; in this way, a scene image may also be acquired in advance, and this embodiment executed in the acquired scenario.
When this step determines the action instruction based on the image features, the current application scenario may be determined first, and then the action instruction corresponding to the image features in that scenario. For example, in a stand-alone fighting game scenario, a punch action instruction can be determined from the gesture feature of a one-handed fist.
S206: In response to the action instruction, perform an operation matching the action instruction.
In this step, performing an operation matching the action instruction in response to it may specifically involve generating a rendering instruction based on the action instruction and rendering the target object related to it; for example, an augmented reality, virtual reality, or mixed reality target object is displayed in the box to the left of the pre-displayed gesture images in FIG. 3, and the displayed target object may be an augmented reality, virtual reality, or mixed reality scene image.
After the operation matching the action instruction mentioned in this step is performed, the action instruction may also be sent to the receiver, so that the receiver generates a rendering instruction based on it and renders the related target object.
Sending the action instruction to the receiver, as mentioned above, may specifically mean sending the action instruction to a server, which then forwards it to the receiver; alternatively, in a client-to-client scenario where no server is involved, the sender may send the action instruction directly to the receiver.
The interaction method provided in this embodiment of the specification determines a matching action instruction based on the image features of the acquired image and, in response to it, performs an operation matching the instruction, implementing human-computer interaction based on acquired images.
In addition, the embodiments of this specification pre-store multiple gesture images, face images, or human body images, making it convenient for the user to select quickly and improving the user experience.
Optionally, the order of the gesture images pre-displayed in the display interface shown in FIG. 3, or the display order of face images or human body images in other embodiments, may be sorted by the user's historical usage frequency. For example, if the user selects the one-handed fist gesture image most frequently, that image is ranked first for display, further facilitating selection and improving the user experience.
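A minimal sketch of such frequency-based ordering, with hypothetical preset identifiers and usage history:

```python
from collections import Counter

# Hypothetical usage history: each entry is the id of a preset the user chose.
usage_history = ["fist_one_hand", "open_palm", "fist_one_hand", "love_sign"]
presets = ["love_sign", "open_palm", "fist_one_hand"]

frequency = Counter(usage_history)
# Most frequently used presets first; ties keep their original relative order
# because Python's sort is stable.
presets_sorted = sorted(presets, key=lambda p: frequency[p], reverse=True)
print(presets_sorted)  # ['fist_one_hand', 'love_sign', 'open_palm']
```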
It should be noted that the above embodiment can also be applied in scenarios where multiple users interact on multiple devices. For example, step S202 acquires the gesture images that users A, B, and C select from multiple displayed gesture images; through steps S204 and S206, in a preset scenario where A, B, and C interact with one another, the image features of the respectively selected gesture images are sent to users A, B, and C. Meanwhile, each terminal device can capture each user's gesture image in real time, and if it matches the pre-selected image features to a sufficient degree of fit, subsequent logical operations are performed. For example, the scenario selected by the terminal devices of A, B, and C is an ancient temple with a stone door in front; when the devices recognize the motion of hands pushing forward, the stone door slowly opens, and so on.
In the embodiments shown in FIG. 2 and FIG. 3, gesture images, face images, human body images, and the like are displayed in advance. Considering that the number of displayed images is limited and their content is not rich enough, in order to further increase the number and richness of the images, enhance user interaction, and make interaction more fun, as shown in FIG. 4 and FIG. 5, another embodiment of this specification provides a human-computer interaction method 400, including the following steps:
S402: Acquire image features, where the image features include at least one of the following: gesture image features, face image features, human body image features, and motion features.
This embodiment can be applied to a terminal device that includes components usable for capturing images. Taking a terminal device running an augmented reality application as an example, the image-capturing components on the device may include an infrared camera and the like; after an image is captured, image features are extracted from it.
The above motion features include, for example: punching motion features, waving motion features, palm-strike motion features, running motion features, standing-still motion features, head-shaking motion features, and nodding motion features.
Optionally, the application scenario may also be identified in advance before this embodiment is executed. For example, the application scenario may specifically include a scenario where a sender and a receiver chat with each other, an online fighting game application scenario, a scenario where multiple terminal devices chat and interact with each other, and so on.
When this step acquires image features, for example gesture features, a gesture feature classification model may be used. The input parameters of the model may be the captured gesture image (or a preprocessed gesture image, introduced in the next paragraph), and the output parameters may be the gesture features. The gesture feature classification model may be generated through machine learning based on algorithms such as a support vector machine (SVM), a convolutional neural network (CNN), or deep learning (DL).
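As one possible realization, and only as a sketch, such a classification model could be trained with an SVM as follows; the image size, class labels, and training data are placeholders, and a CNN- or DL-based model could be substituted as the paragraph notes:

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder training set: each row is a flattened, preprocessed 64x64
# gesture image; each label is a gesture class such as "fist" or "open_palm".
rng = np.random.default_rng(0)
X_train = rng.random((200, 64 * 64))
y_train = rng.choice(["fist", "open_palm"], size=200)

model = SVC(kernel="rbf")  # an SVM classifier; CNN/DL models are alternatives
model.fit(X_train, y_train)


def classify_gesture(image_64x64):
    """Input: a preprocessed 64x64 gesture image; output: a gesture label."""
    return model.predict(image_64x64.reshape(1, -1))[0]
```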
To improve the recognition accuracy of gesture features, optionally, this step may also preprocess the captured gesture image to remove noise. Specifically, the preprocessing of the gesture image may include, but is not limited to: image enhancement of the captured gesture image, image binarization, grayscale conversion, and denoising.
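A minimal sketch of this preprocessing chain using standard OpenCV operations; the kernel size and Otsu thresholding are choices made for the example, not requirements of the method:

```python
import cv2


def preprocess_gesture_image(image_bgr):
    """Grayscale, denoise, enhance, and binarize a captured gesture image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)  # grayscale conversion
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)        # noise removal
    enhanced = cv2.equalizeHist(denoised)               # image enhancement
    _, binary = cv2.threshold(                          # image binarization
        enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )
    return binary
```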
Face image features, human body image features, and motion features are acquired in a manner similar to the gesture features described above, and the details are not repeated here.
Before this embodiment is executed, gesture images, face images, human body images, motion images, and the like may be captured in advance, and gesture image features, face image features, human body image features, and motion features extracted from the captured images.
Optionally, this embodiment may also decide whether to perform image preprocessing, or which preprocessing method to use, according to image feature accuracy requirements and performance requirements (such as response speed). For example, in an online fighting game application scenario with high response speed requirements, the gesture image may be left unpreprocessed; in a scenario with high gesture accuracy requirements, the captured image may be preprocessed.
S404: In a preset scenario, determine a matching action instruction based on the image features and the additional dynamic features selected by the user.
Before this embodiment is executed, a scene image may also be acquired in advance, and this embodiment executed in the acquired scenario.
When this step determines the matching action instruction based on the image features and the additional dynamic features selected by the user, the current application scenario may be determined first, and then the action instruction corresponding, in that scenario, to the image features and the selected additional dynamic features. For example, in a stand-alone fighting game scenario, a punch-plus-fireball action instruction can be determined from the one-handed-fist gesture feature and the additional fireball dynamic feature selected by the user. As shown in the application interface diagram of FIG. 5, this embodiment may display multiple additional dynamic effects on the display interface in advance (see the circles below the text 'additional dynamic effects' on the right side of FIG. 5); when the user taps one of them, this step can determine the action instruction based on the gesture feature and the additional dynamic effect feature.
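Extending the earlier lookup-table sketch, the user-selected additional dynamic feature can simply become part of the lookup key; all labels below are hypothetical:

```python
# Hypothetical extension of the mapping table: the additional dynamic
# feature is a third component of the key.
COMBINED_TABLE = {
    ("fighting_game", "fist_one_hand", None): "punch",
    ("fighting_game", "fist_one_hand", "fireball"): "punch_with_fireball",
    ("fighting_game", "fist_one_hand", "snowball"): "punch_with_snowball",
}


def match_combined_instruction(scenario, gesture_feature, extra_effect=None):
    """Step S404: the additional dynamic feature refines the scenario lookup."""
    return COMBINED_TABLE.get((scenario, gesture_feature, extra_effect))
```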
In this embodiment, the selected additional dynamic feature corresponds to the acquired image. In other embodiments, if facial features are acquired, multiple additional face-related dynamic effects may also be displayed in advance on the display interface for the user to choose from; when the user makes a selection, an additional dynamic feature is generated to enhance the display of the face, and so on.
In other embodiments, if human body image features or motion features are acquired, multiple additional body- or motion-related dynamic effects may likewise be displayed in advance on the display interface for the user to choose from, with an additional dynamic feature generated upon selection.
For example, suppose step S402 acquires a gesture feature representing a one-handed fist. If no additional dynamic effect (or feature) is selected, the action instruction determined in this step merely represents a punch; if the additional 'snowball' dynamic effect is selected, the determined action instruction may be a punch-plus-snowball instruction with a flashy visual effect.
S406: In response to the action instruction, perform an operation matching the action instruction.
In this step, performing an operation matching the action instruction in response to it may specifically involve generating a rendering instruction based on the action instruction and rendering the related target object; for example, an augmented reality, virtual reality, or mixed reality target object is displayed in the box on the left of FIG. 5, and the displayed target object may be an augmented reality, virtual reality, or mixed reality scene image.
This embodiment may also send the action instruction to the receiver, so that the receiver generates a rendering instruction based on it and renders the related target object; of course, the augmented-reality target object may likewise be displayed on the sender's side.
The interaction method provided in this embodiment of the specification acquires image features, determines an action instruction based on the image features and the additional dynamic features selected by the user, and responds to it, implementing human-computer interaction based on acquired image features.
In addition, this embodiment obtains gesture image features, face image features, human body image features, motion features, and the like from images captured in real time; compared with acquiring a limited number of pre-stored images, the obtainable image features are richer and more diverse.
At the same time, capturing user images and extracting image features in real time increases user interaction and, in some game scenarios in particular, improves the user's sense of immersion and interactivity, improving the user experience.
In addition, the embodiments of this specification pre-store additional dynamic effects for the user to choose from, making quick selection convenient, so that flashier special effects can be generated and the user experience improved.
Optionally, the order of the additional dynamic effects pre-displayed in the display interface shown in FIG. 5, or the display order of additional dynamic effects for facial features, body features, and the like in other embodiments, may be sorted by the user's historical usage frequency. For example, if the user selects 'fireball' most frequently (see FIG. 5), the additional 'fireball' dynamic effect is ranked first for display, further facilitating selection and improving the user experience.
It should be noted that the above embodiment can be applied not only in single-terminal-device scenarios but also in scenarios where multiple devices interact.
As shown in FIG. 6 and FIG. 7, another embodiment of this specification provides a human-computer interaction method 600, including the following steps:
S602: Acquire the scene features selected by the user.
As for the scene features in this embodiment, as shown in the application interface diagram of FIG. 7, multiple preset scenes may be displayed in advance on the display interface, for example the 'Avatar' scene shown in FIG. 7, with the subsequent scenes shown schematically as '***'; when the user taps one of the scenes, this step in effect acquires the corresponding scene features.
In addition, the application interface of FIG. 7 also includes a 'more' button, which displays more preset scenes when the user taps it.
S604: Determine an action instruction based on the scene features and the acquired image features, where the image features include at least one of the following: gesture image features, face image features, human body image features, and motion features.
This embodiment can be applied to a terminal device that includes components usable for capturing images. Taking a terminal device running an augmented reality application as an example, the image-capturing components on the device may include an infrared camera and the like, and image features are extracted from the captured image; for the specific acquisition process, see the embodiment shown in FIG. 4. The following description takes the acquisition of facial features as an example.
When acquiring facial features, a facial feature classification model may be used. The input parameters of the model may be the captured face image (or a preprocessed face image, introduced in the next paragraph), and the output parameters may be the facial features. The facial feature classification model may be generated through machine learning based on algorithms such as a support vector machine (SVM), a convolutional neural network (CNN), or deep learning (DL).
To improve the recognition accuracy of facial features, optionally, this step may also preprocess the captured face image to remove noise. Specifically, the preprocessing of the face image may include, but is not limited to: image enhancement of the captured face image, image binarization, grayscale conversion, and denoising.
When this step determines the matching action instruction based on the image features and the scene features, for example in a network chat application scenario with a sender and a receiver, the image features and the scene features may be fused. For instance, the facial features and the scene features are fused to generate an action instruction representing their fusion: specifically, a face region is reserved in the scene selected by the user, and the user's facial features are fused into and displayed in that reserved region, achieving a seamless fit between the user's face and the selected scene and producing the effect that the user is really inside the scene, for example the user appearing inside a painting, or the face of a character in the scene becoming the user's face.
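As an illustrative sketch of such fusion, OpenCV's Poisson blending can paste a face crop into the reserved region of the scene; the region coordinates, and the assumption that the region lies fully inside the scene image, are inputs of the example rather than details given by the method:

```python
import cv2
import numpy as np


def fuse_face_into_scene(face_bgr, scene_bgr, region):
    """Blend a face crop into the scene's reserved face region.

    region is (x, y, w, h), the face area reserved in the chosen scene;
    these coordinates are an assumption of this sketch.
    """
    x, y, w, h = region
    face_resized = cv2.resize(face_bgr, (w, h))
    mask = 255 * np.ones(face_resized.shape[:2], dtype=np.uint8)
    center = (x + w // 2, y + h // 2)
    # Poisson blending gives a seamless fit between face and scene.
    return cv2.seamlessClone(face_resized, scene_bgr, mask, center,
                             cv2.NORMAL_CLONE)
```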
This embodiment is particularly applicable in application scenarios such as group photos, artistic photo stickers, artistic styling, and cosplay.
S606: In response to the action instruction, perform an operation matching the action instruction.
In this step, performing an operation matching the action instruction in response to it may specifically involve generating a rendering instruction based on the action instruction so as to render the related target object; it may also involve sending the action instruction to the receiver, so that the receiver generates a rendering instruction based on it, renders the related target object, and finally displays the augmented reality, virtual reality, or mixed reality target object.
In the group photo application scenario above, after the operation of step S606, a message carrying the facial features and the scene features may also be sent to the receiver, where the receiver's facial features are then acquired, thereby fusing the sender's facial features, the receiver's facial features, and the scene selected by the sender, which helps improve the user experience.
The interaction method provided in this embodiment of the specification acquires image features and scene features, determines an action instruction based on them, and responds to it, achieving the fusion of image features with various preset scenes, which helps improve the user experience.
It should be noted that the above embodiment can be applied not only on a single terminal device but also in scenarios where multiple devices interact.
In addition, this embodiment pre-stores different preset scenes for the user to choose from, so that the acquired image can morph into different looks in different scenes, adding interest and improving the user experience.
Optionally, this embodiment may also save the displayed augmented reality, virtual reality, or mixed reality target object for the user's later use. In one embodiment, third-party camera equipment may be requested to photograph and record, from the outside, the augmented reality, virtual reality, or mixed reality view displayed on the current terminal device's screen, thereby indirectly storing the view; this flexibly obtains the views the user needs to store.
In another embodiment, the augmented reality, virtual reality, or mixed reality view the user sees on the display screen may also be captured and saved by taking screenshots. This implementation can not only capture and store all the augmented reality, virtual reality, or mixed reality content displayed on the screen, but also selectively store views according to the user's needs.
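A minimal sketch of selective saving, assuming the rendered AR/VR/MR view is already available as an image array (how the frame is grabbed from the display is platform-specific and outside this sketch):

```python
import cv2


def save_displayed_view(view_bgr, path, region=None):
    """Save the whole displayed view, or only a selected region of it.

    view_bgr is assumed to be the rendered frame as a BGR array;
    region, if given, is (x, y, w, h) of the part the user wants to keep.
    """
    if region is not None:
        x, y, w, h = region
        view_bgr = view_bgr[y:y + h, x:x + w]  # selective storage
    cv2.imwrite(path, view_bgr)
```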
For the specific application of the embodiments shown in FIG. 1 to FIG. 7 above, the initial display interface may be as shown in FIG. 8 and FIG. 9: when the user taps the add button on the far right, a **Card option appears, and the **Card function is saved in the chat interface, as shown in FIG. 8, where **Card may be an AR Card, an MR Card, a VR Card, and so on.
For subsequent use, the user may first tap the **Card button shown in FIG. 8 and then perform the operation steps of the embodiments shown in FIG. 1 to FIG. 7; alternatively, when it is detected that the user's current scenario can execute the method steps of the embodiments shown in FIG. 1 to FIG. 7, the **Card option may pop up in the message interface for the user to choose, improving the user experience.
It should be noted that FIG. 8 and FIG. 9 only schematically show one trigger mode; in practice, the methods described in the preceding embodiments may also be triggered in other ways, for example executed automatically upon shaking the terminal device, or by recognizing a specific voice command from the user; the embodiments of this specification impose no specific limitation.
As shown in FIG. 10 and FIG. 11, another embodiment of this specification provides a human-computer interaction method 1000, applied at a receiver, including the following steps:
S1002: Receive an action instruction from a sender.
The action instruction in this embodiment may be the action instruction mentioned in the embodiments shown in FIG. 1 to FIG. 7; that is, this embodiment is applied at the receiver, and the operations performed by its sender may be those of the embodiments shown in FIG. 1 to FIG. 7.
Of course, the action instruction in this embodiment may also be some other action instruction, that is, one independent of the embodiments shown in FIG. 1 to FIG. 7.
S1004: In response to the action instruction, display an effect corresponding to the action instruction,
where the effect corresponding to the action instruction includes at least one of the following:
a processing effect on the sender's avatar of the terminal device and/or a processing effect on the receiver's avatar of the terminal device;
a processing effect on the border color of messages exchanged with the sender (for the message border mentioned here, see FIG. 11: in the display interface, a friend with the screen name *** has sent three messages, each with a message border);
screen vibration and inversion, that is, the entire terminal device screen vibrates and flips; or
automatic playback of video, animation, voice, and the like, where the above animations include GIF images.
The above video may specifically be a video file in an encoding format such as H.264 or H.265, which the receiver plays automatically upon receipt; the above animation may specifically be an animation that intensifies a character's expression, artistic voice-over text, background animation effects, and the like, which the receiver likewise plays automatically upon receipt.
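For illustration, the receiver-side handling of these effects might be organized as a simple dispatch from the received instruction to effect handlers; the instruction format, the ui object, and its method names are all hypothetical:

```python
# Hypothetical receiver-side dispatch for the effects enumerated above.
def apply_avatar_effect(ui):
    ui.transform_avatar()         # processing effect on an avatar

def apply_border_effect(ui):
    ui.recolor_message_border()   # message border color effect

def apply_shake_effect(ui):
    ui.vibrate_and_flip_screen()  # screen vibration and inversion

def apply_playback_effect(ui):
    ui.autoplay_media()           # video/animation/voice playback

EFFECT_HANDLERS = {
    "avatar": apply_avatar_effect,
    "border": apply_border_effect,
    "shake": apply_shake_effect,
    "playback": apply_playback_effect,
}


def display_effects(action_instruction, ui):
    """Step S1004: apply every effect named in the received instruction."""
    for effect_name in action_instruction.get("effects", []):
        handler = EFFECT_HANDLERS.get(effect_name)
        if handler is not None:
            handler(ui)
```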
In addition, in this embodiment the sender's display interface may also show a change in the state of the receiver's three-dimensional model, specifically three-dimensional augmented reality, virtual reality, or mixed reality display effects such as the receiver being hit by a projectile or covered in snowflakes.
Furthermore, in this embodiment the sender's display interface may also display avatar processing effects, for example the receiver's avatar turning into a turtle or another three-dimensional augmented reality, virtual reality, or mixed reality transformation of the receiver's avatar, adding fun and enhancing the user experience.
Among the above display effects, the sender's display interface may show both parties' actions from creation to disappearance, as well as final states such as the receiver's status and avatar; the receiver's display interface may show both parties' actions from creation to disappearance but usually does not show those final states, adding fun and enhancing the user experience.
In addition, this embodiment may also receive a drag instruction and move the displayed object around the display interface, and so on.
The human-computer interaction method provided in this embodiment of the specification receives an action instruction from a sender and, in response, displays the effect corresponding to it, implementing human-computer interaction based on action instructions.
In the human-computer interaction method provided in the embodiments of this specification, the effects corresponding to the action instruction may all be displayed in a three-dimensional state, specifically as three-dimensional augmented reality, virtual reality, or mixed reality displays.
In a specific embodiment, the following effects may further be generated in the sender's display interface: A (the sender) sends a snowball and B (the receiver) sends a fireball; after the fireball and the snowball collide, the fireball weakens and flies toward A, whose image then catches fire. As another example, when A and B send fireballs, or water balls, at the same time, the collision scatters them into sparks or splashing snowflakes, creating a fantastical artistic effect, adding fun and enhancing the user experience.
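Such pairwise interactions could be driven by a small outcome table; the effect labels and outcomes below merely restate the examples above and are not part of the described method:

```python
# Hypothetical outcome table for effects sent by the two parties.
COLLISION_OUTCOMES = {
    frozenset(["fireball", "snowball"]): "fireball_weakens_and_rebounds",
    frozenset(["fireball"]): "scatter_into_sparks",       # fireball vs fireball
    frozenset(["waterball"]): "scatter_into_snowflakes",  # waterball vs waterball
}


def resolve_collision(effect_a, effect_b):
    """A frozenset key makes the lookup order-independent; identical effects
    collapse to a single-element set, covering the same-effect cases."""
    return COLLISION_OUTCOMES.get(frozenset([effect_a, effect_b]))
```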
The above part of the specification has described the human-computer interaction method embodiments in detail. As shown in FIG. 12, this specification further provides a human-computer interaction apparatus 1200, which includes:
an image acquisition module 1202, which may be used to acquire an image for instructing a terminal device to perform an action;
an action instruction determination module 1204, which may be used to determine a matching action instruction based on an image feature of the image; and
an execution module 1206, which may be used to perform, in response to the action instruction, an operation matching the action instruction.
The interaction apparatus provided in this embodiment of the specification determines an action instruction based on the image features of the acquired image and, in response to it, performs an operation matching the instruction, implementing human-computer interaction based on acquired images.
Optionally, as an embodiment, the image acquisition module 1202 may be used to acquire the selected image in response to a user's selection of a displayed preset image.
Optionally, as an embodiment, the image acquisition module 1202 may be used to capture an image of the user through a camera capture device.
Optionally, as an embodiment, the image for instructing the terminal device to perform an action includes a gesture image, a face image, or a human body image.
Optionally, as an embodiment, the action instruction determination module 1204 may be used to determine a matching action instruction based on the gesture feature and an acquired additional dynamic feature.
Optionally, as an embodiment, the action instruction determination module 1204 may be used to determine, in a preset scenario, a matching action instruction based on the image feature of the image and the additional dynamic feature.
Optionally, as an embodiment, the action instruction determination module 1204 may be used to determine a matching action instruction based on the image feature of the image and an acquired scene feature.
Optionally, as an embodiment, the apparatus 1200 further includes a saving module, which may be used to save the image feature and the scene feature.
Optionally, as an embodiment, the execution module 1206 may be used to generate a rendering instruction based on the action instruction so as to render the target object related to the action instruction.
Optionally, as an embodiment, the apparatus 1200 further includes a sending module, which may be used to send the action instruction to a receiver.
For the above human-computer interaction apparatus 1200 according to the embodiments of this specification, reference may be made to the flows of the human-computer interaction methods shown in FIG. 1 to FIG. 9 of the preceding embodiments; the units/modules in the apparatus 1200 and the other operations and/or functions described above serve respectively to implement the corresponding flows of those methods, and for brevity are not described again here.
As shown in FIG. 13, this specification further provides a human-computer interaction apparatus 1300. As shown in FIG. 13, the apparatus 1300 includes:
a receiving module 1302, which may be used to receive an action instruction from a sender; and
an effect display module 1304, which may be used to display, in response to the action instruction, an effect corresponding to the action instruction, where the effect corresponding to the action instruction includes at least one of the following:
a processing effect on the sender's avatar of the terminal device and/or a processing effect on the receiver's avatar of the terminal device;
a processing effect on the border color of messages exchanged with the sender;
screen vibration and inversion; or
video or animation playback.
The above video may specifically be a video file in an encoding format such as H.264 or H.265, or a real-time rendered animation of a three-dimensional model, which the receiver plays automatically upon receipt; the above animation may specifically be an animation that intensifies a character's expression, artistic voice-over text, background animation effects, and the like, which the receiver likewise plays automatically upon receipt.
In addition, in this embodiment the sender's display interface may also show a change in the state of the receiver's three-dimensional model, specifically three-dimensional augmented reality, virtual reality, or mixed reality display effects such as the receiver being hit by a projectile or covered in snowflakes.
Furthermore, in this embodiment the sender's display interface may also display processing effects on the receiver's avatar, for example the receiver's avatar turning into a turtle or another three-dimensional augmented reality, virtual reality, or mixed reality transformation of the receiver's avatar, adding fun and enhancing the user experience.
Among the above display effects, the sender's display interface may show both parties' actions from creation to disappearance, as well as final states such as the receiver's status and avatar; the receiver's display interface may show both parties' actions from creation to disappearance but usually does not show those final states such as the receiver's status and avatar, adding fun and enhancing the user experience.
The human-computer interaction apparatus provided in this embodiment of the specification receives an action instruction from a sender and, in response, displays the effect corresponding to it, implementing human-computer interaction based on received action instructions.
For the above human-computer interaction apparatus 1300 according to the embodiments of this specification, reference may be made to the flows of the human-computer interaction methods shown in FIG. 10 and FIG. 11 of the preceding embodiments; the units/modules in the apparatus 1300 and the other operations and/or functions described above serve respectively to implement the corresponding flows of those methods, and for brevity are not described again here.
The effects achievable by the above embodiments of this specification can be seen in FIG. 14. On the input side, not only text input, voice input, picture input, and short-video input are implemented, but also face recognition, motion recognition, scene recognition, and the like, with different effects generated and sent according to the recognized faces, motions, and scenes. On the receiving side, not only ordinary text display, voice playback, animated picture display, and short-video playback are implemented, but also effects such as state changes, animation and sound playback, and screen vibration feedback; the above state changes include, for example, the sender being hit by a projectile, the sender's avatar turning into a turtle, and a dynamically changed background.
An electronic device according to an embodiment of this specification is described in detail below with reference to FIG. 15. Referring to FIG. 15, at the hardware level the electronic device includes a processor and, optionally, an internal bus, a network interface, and a memory. As shown in FIG. 15, the memory may include volatile memory, such as high-speed random access memory (RAM), and may also include non-volatile memory, such as at least one disk storage. Of course, the electronic device may also include the hardware required for other services.
The processor, the network interface, and the memory may be interconnected by an internal bus, which may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one double-headed arrow is shown in FIG. 15, but this does not mean that there is only one bus or only one type of bus.
The memory is configured to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include volatile and non-volatile memory and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the volatile memory and runs it, forming at the logical level an apparatus for forwarding chat messages. The processor executes the program stored in the memory and is specifically configured to perform the operations of the method embodiments described earlier in this specification.
The methods performed by the apparatuses disclosed in the embodiments shown in FIG. 1 to FIG. 11 above may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above methods may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in the embodiments of this specification may be embodied as being executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above methods in combination with its hardware.
The electronic device shown in FIG. 15 can also perform the methods of FIG. 1 to FIG. 11 and implement the functions of the human-computer interaction method in the embodiments shown in FIG. 1 to FIG. 11; details are not repeated here.
Of course, besides software implementations, the electronic device of this specification does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution body of the processing flows described above is not limited to individual logical units and may also be hardware or a logic device.
The embodiments of this specification further provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the processes of the method embodiments shown in FIG. 1 to FIG. 11 and achieves the same technical effects; to avoid repetition, details are not repeated here. Examples of the computer-readable storage medium include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, and an optical disc.
Those skilled in the art should understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Therefore, this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
This specification is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of this specification. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent storage, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above descriptions are merely embodiments of this specification and are not intended to limit it. Those skilled in the art may make various modifications and changes to this specification. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this specification shall fall within the scope of the claims of this specification.

Claims (17)

  1. A human-computer interaction method, comprising:
    acquiring an image used to instruct a terminal device to perform an action;
    determining a matching action instruction based on image features of the image; and
    performing, in response to the action instruction, an operation matching the action instruction.
  2. The method according to claim 1, wherein acquiring the image used to instruct the terminal device to perform an action comprises:
    acquiring a selected image in response to a user's selection operation on a displayed preset image.
  3. The method according to claim 1, wherein acquiring the image used to instruct the terminal device to perform an action comprises:
    capturing an image of a user by a camera capture device.
  4. The method according to any one of claims 1 to 3, wherein the image used to instruct the terminal device to perform an action comprises a gesture image, a face image, or a human body image.
  5. The method according to claim 4, wherein before determining the matching action instruction based on the image features of the image, the method further comprises:
    acquiring an additional dynamic feature related to the image;
    wherein determining the matching action instruction based on the image features of the image comprises: determining the matching action instruction based on the image features of the image and the additional dynamic feature.
  6. The method according to claim 5, wherein
    determining the matching action instruction based on the image features of the image and the additional dynamic feature comprises: determining, in a preset scene, the matching action instruction based on the image features of the image and the additional dynamic feature.
  7. The method according to claim 1, wherein
    the method further comprises: acquiring a scene feature to which the image is applied;
    and determining the matching action instruction based on the image features of the image comprises: determining the matching action instruction based on the image features of the image and the scene feature.
  8. The method according to claim 7, wherein
    the method further comprises: saving the image features and the scene feature.
  9. The method according to claim 1, wherein
    performing, in response to the action instruction, the operation matching the action instruction comprises:
    generating a rendering instruction based on the action instruction to render a target object related to the action instruction.
  10. The method according to claim 9, wherein
    the method further comprises: sending the action instruction to a receiver.
  11. A human-computer interaction method applied to a receiver, comprising:
    receiving an action instruction from a sender; and
    displaying, in response to the action instruction, an effect corresponding to the action instruction;
    wherein the effect corresponding to the action instruction comprises at least one of the following:
    a processing effect on a sender avatar on a terminal device and/or a processing effect on a receiver avatar on the terminal device;
    a processing effect on a border color of messages communicated with the sender;
    screen vibration feedback; or
    video or animation playback.
  12. A human-computer interaction apparatus, comprising:
    an image acquisition module configured to acquire an image used to instruct a terminal device to perform an action;
    an action instruction determination module configured to determine a matching action instruction based on image features of the image; and
    an execution module configured to perform, in response to the action instruction, an operation matching the action instruction.
  13. A human-computer interaction apparatus, comprising:
    a receiving module configured to receive an action instruction from a sender; and
    an effect display module configured to display, in response to the action instruction, an effect corresponding to the action instruction;
    wherein the effect corresponding to the action instruction comprises at least one of the following:
    a processing effect on a sender avatar on a terminal device and/or a processing effect on a receiver avatar on the terminal device;
    a processing effect on a border color of messages communicated with the sender;
    screen vibration feedback; or
    video or animation playback.
  14. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the following operations:
    acquiring an image used to instruct a terminal device to perform an action;
    determining a matching action instruction based on image features of the image; and
    performing, in response to the action instruction, an operation matching the action instruction.
  15. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the following operations:
    receiving an action instruction from a sender; and
    displaying, in response to the action instruction, an effect corresponding to the action instruction;
    wherein the effect corresponding to the action instruction comprises at least one of the following:
    a processing effect on a sender avatar on a terminal device and/or a processing effect on a receiver avatar on the terminal device;
    a processing effect on a border color of messages communicated with the sender;
    screen vibration feedback; or
    video or animation playback.
  16. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the following operations:
    acquiring an image used to instruct a terminal device to perform an action;
    determining a matching action instruction based on image features of the image; and
    performing, in response to the action instruction, an operation matching the action instruction.
  17. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the following operations:
    receiving an action instruction from a sender; and
    displaying, in response to the action instruction, an effect corresponding to the action instruction;
    wherein the effect corresponding to the action instruction comprises at least one of the following:
    a processing effect on a sender avatar on a terminal device and/or a processing effect on a receiver avatar on the terminal device;
    a processing effect on a border color of messages communicated with the sender;
    screen vibration feedback; or
    video or animation playback.
PCT/CN2019/089209 2018-08-02 2019-05-30 Man-machine interaction method and apparatus WO2020024692A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810871070.2A CN109254650B (en) 2018-08-02 2018-08-02 Man-machine interaction method and device
CN201810871070.2 2018-08-02

Publications (1)

Publication Number Publication Date
WO2020024692A1 true WO2020024692A1 (en) 2020-02-06

Family

ID=65049153

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089209 WO2020024692A1 (en) 2018-08-02 2019-05-30 Man-machine interaction method and apparatus

Country Status (3)

Country Link
CN (2) CN109254650B (en)
TW (1) TWI782211B (en)
WO (1) WO2020024692A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022017184A1 (en) * 2020-07-23 2022-01-27 北京字节跳动网络技术有限公司 Interaction method and apparatus, and electronic device and computer-readable storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109254650B (en) * 2018-08-02 2021-02-09 创新先进技术有限公司 Man-machine interaction method and device
CN110083238A (en) * 2019-04-18 2019-08-02 深圳市博乐信息技术有限公司 Man-machine interaction method and system based on augmented reality
CN110609921B (en) * 2019-08-30 2022-08-19 联想(北京)有限公司 Information processing method and electronic equipment
CN110807395A (en) * 2019-10-28 2020-02-18 支付宝(杭州)信息技术有限公司 Information interaction method, device and equipment based on user behaviors
CN111338808B (en) * 2020-05-22 2020-08-14 支付宝(杭州)信息技术有限公司 Collaborative computing method and system
CN111627097B (en) * 2020-06-01 2023-12-01 上海商汤智能科技有限公司 Virtual scene display method and device
CN114035684A (en) * 2021-11-08 2022-02-11 百度在线网络技术(北京)有限公司 Method and apparatus for outputting information

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045398A (en) * 2015-09-07 2015-11-11 哈尔滨市一舍科技有限公司 Virtual reality interaction device based on gesture recognition
CN105487673A (en) * 2016-01-04 2016-04-13 京东方科技集团股份有限公司 Man-machine interactive system, method and device
CN106095068A (en) * 2016-04-26 2016-11-09 乐视控股(北京)有限公司 The control method of virtual image and device
US20180088663A1 (en) * 2016-09-29 2018-03-29 Alibaba Group Holding Limited Method and system for gesture-based interactions
CN109254650A (en) * 2018-08-02 2019-01-22 阿里巴巴集团控股有限公司 A kind of man-machine interaction method and device

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7159008B1 (en) * 2000-06-30 2007-01-02 Immersion Corporation Chat interface with haptic feedback functionality
US9041775B2 (en) * 2011-03-23 2015-05-26 Mgestyk Technologies Inc. Apparatus and system for interfacing with computers and other electronic devices through gestures by using depth sensing and methods of use
CN103916621A (en) * 2013-01-06 2014-07-09 腾讯科技(深圳)有限公司 Method and device for video communication
JP5503782B1 (en) * 2013-06-20 2014-05-28 株式会社 ディー・エヌ・エー Electronic game machine, electronic game processing method, and electronic game program
CN105468142A (en) * 2015-11-16 2016-04-06 上海璟世数字科技有限公司 Interaction method and system based on augmented reality technique, and terminal
CN105988583A (en) * 2015-11-18 2016-10-05 乐视致新电子科技(天津)有限公司 Gesture control method and virtual reality display output device
CN106125903B (en) * 2016-04-24 2021-11-16 林云帆 Multi-person interaction system and method
CN106155311A (en) * 2016-06-28 2016-11-23 努比亚技术有限公司 AR helmet, AR interactive system and the exchange method of AR scene
US10471353B2 (en) * 2016-06-30 2019-11-12 Sony Interactive Entertainment America Llc Using HMD camera touch button to render images of a user captured during game play
CN106293461B (en) * 2016-08-04 2018-02-27 腾讯科技(深圳)有限公司 Button processing method and terminal and server in a kind of interactive application
CN107885317A (en) * 2016-09-29 2018-04-06 阿里巴巴集团控股有限公司 A kind of exchange method and device based on gesture
US20180126268A1 (en) * 2016-11-09 2018-05-10 Zynga Inc. Interactions between one or more mobile devices and a vr/ar headset
US10168788B2 (en) * 2016-12-20 2019-01-01 Getgo, Inc. Augmented reality user interface
CN106657060A (en) * 2016-12-21 2017-05-10 惠州Tcl移动通信有限公司 VR communication method and system based on reality scene
CN107705278B (en) * 2017-09-11 2021-03-02 Oppo广东移动通信有限公司 Dynamic effect adding method and terminal equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045398A (en) * 2015-09-07 2015-11-11 哈尔滨市一舍科技有限公司 Virtual reality interaction device based on gesture recognition
CN105487673A (en) * 2016-01-04 2016-04-13 京东方科技集团股份有限公司 Man-machine interactive system, method and device
CN106095068A (en) * 2016-04-26 2016-11-09 乐视控股(北京)有限公司 The control method of virtual image and device
US20180088663A1 (en) * 2016-09-29 2018-03-29 Alibaba Group Holding Limited Method and system for gesture-based interactions
CN109254650A (en) * 2018-08-02 2019-01-22 阿里巴巴集团控股有限公司 A kind of man-machine interaction method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022017184A1 (en) * 2020-07-23 2022-01-27 北京字节跳动网络技术有限公司 Interaction method and apparatus, and electronic device and computer-readable storage medium
US11842425B2 (en) 2020-07-23 2023-12-12 Beijing Bytedance Network Technology Co., Ltd. Interaction method and apparatus, and electronic device and computer-readable storage medium

Also Published As

Publication number Publication date
TW202008143A (en) 2020-02-16
TWI782211B (en) 2022-11-01
CN112925418A (en) 2021-06-08
CN109254650A (en) 2019-01-22
CN109254650B (en) 2021-02-09

Similar Documents

Publication Publication Date Title
WO2020024692A1 (en) Man-machine interaction method and apparatus
US11182615B2 (en) Method and apparatus, and storage medium for image data processing on real object and virtual object
US11595617B2 (en) Communication using interactive avatars
US10699461B2 (en) Telepresence of multiple users in interactive virtual space
US20180088663A1 (en) Method and system for gesture-based interactions
JP7268071B2 (en) Virtual avatar generation method and generation device
WO2018033137A1 (en) Method, apparatus, and electronic device for displaying service object in video image
WO2019173108A1 (en) Electronic messaging utilizing animatable 3d models
CN113228625A (en) Video conference supporting composite video streams
CN110555507B (en) Interaction method and device for virtual robot, electronic equipment and storage medium
WO2023070021A1 (en) Mirror-based augmented reality experience
WO2022252866A1 (en) Interaction processing method and apparatus, terminal and medium
CN108876878B (en) Head portrait generation method and device
CN111880664B (en) AR interaction method, electronic equipment and readable storage medium
WO2020042442A1 (en) Expression package generating method and device
CN113411537A (en) Video call method, device, terminal and storage medium
KR20160010810A (en) Realistic character creation method and creating system capable of providing real voice
CN114779948B (en) Method, device and equipment for controlling instant interaction of animation characters based on facial recognition
US11960653B2 (en) Controlling augmented reality effects through multi-modal human interaction
CN113176827B (en) AR interaction method and system based on expressions, electronic device and storage medium
US20240193838A1 (en) Computer-implemented method for controlling a virtual avatar
US20230154126A1 (en) Creating a virtual object response to a user input
CN113908553A (en) Game character expression generation method and device, electronic equipment and storage medium
TWI583198B (en) Communication using interactive avatars

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19844048

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19844048

Country of ref document: EP

Kind code of ref document: A1