WO2023160072A1 - Human-computer interaction method and apparatus in augmented reality (AR) scene, and electronic device


Info

Publication number
WO2023160072A1
Authority
WO
WIPO (PCT)
Prior art keywords
pose
terminal
virtual object
three-dimensional coordinates
real scene
Application number
PCT/CN2022/134830
Other languages
French (fr)
Chinese (zh)
Inventor
王润之
冯思淇
李江伟
时天欣
徐其超
林涛
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023160072A1 publication Critical patent/WO2023160072A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality

Definitions

  • the embodiments of the present application relate to the technical field of augmented reality (AR), and in particular to a human-computer interaction method, apparatus, and electronic device for an AR scene.
  • Embodiments of the present application provide a human-computer interaction method, device, and electronic device in an augmented reality AR scene, which can enrich styles of virtual objects displayed in the AR scene.
  • the embodiments of the present application provide a method for human-computer interaction in an AR scene.
  • the execution subject for executing the method may be a terminal or a chip in the terminal.
  • the terminal is used as an example for illustration.
  • the terminal can shoot a real scene, and display the captured real scene on an interface of the terminal, and the interface also displays an identifier of a virtual object to be selected.
  • the terminal may display the first virtual object in the real scene captured by the terminal in response to the user's operation on the identification of the first virtual object, and, in response to the user's operation on the identification of the second virtual object, display the second virtual object in the real scene in which the first virtual object is already displayed.
  • the user can continuously operate the virtual object identification to display multiple virtual objects in the real scene captured by the terminal, and the multiple virtual objects can be the same or different.
  • the user is not limited to one-time use of AR props, but can display virtual objects at corresponding positions in the real scene, which enriches the styles of virtual objects displayed in the AR scene and the human-computer interaction modes in the AR scene, improving user experience.
  • the terminal may shoot a real scene, and display the shot real scene on an interface of the terminal.
  • when the terminal displays the captured real scene, the virtual object may be displayed at the position of the real scene corresponding to the preset position.
  • the first virtual object is displayed at a third position of the real scene captured by the terminal
  • the second virtual object is displayed at a fourth position of the real scene captured by the terminal.
  • the second virtual object may be the same as or different from the first virtual object.
  • the third position of the real scene may be a position in the real scene corresponding to the first position
  • the fourth position of the real scene may be a position in the real scene corresponding to the second position.
  • the terminal when the terminal displays the captured real scene, the identifier of the virtual object to be selected can be displayed on the interface of the terminal.
  • the terminal in response to the user's operation on the identification of the first virtual object, the terminal may display the first virtual object at a corresponding position of the real scene corresponding to the preset position, and in response to the user's operation on the identification of the second virtual object, The second virtual object is displayed at a corresponding position of the real scene corresponding to the preset position.
  • the terminal may display the first virtual object at a corresponding position of the real scene corresponding to a preset position, and respond to the user's operation on the identification of the second virtual object , displaying the second virtual object at a corresponding position of the real scene corresponding to another preset position.
  • the terminal may shoot a real scene, and display the shot real scene on an interface of the terminal.
  • the user can also use voice interaction to control the terminal to display the first virtual object at a position of the captured real scene, and the terminal to display the second virtual object at another position of the captured real scene.
  • the following describes how the terminal displays the first virtual object in the captured real scene, and how the terminal displays the second virtual object in the real scene where the first virtual object has been displayed:
  • the terminal acquires the first pose of the terminal in response to the user's operation on the identification of the first virtual object, acquires a first mapping point of the first position on the first virtual plane, and obtains the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • the terminal may display the first virtual object in the real scene captured by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • the first position may be preset or determined by a user operation.
  • the terminal may send the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system for display, so that the terminal displays the first virtual object in the real scene captured by the terminal.
  • a set of preset virtual planes is preset, and the set may include, for example, a first virtual plane, a second virtual plane, and a third virtual plane.
  • the first virtual plane may be perpendicular to the second virtual plane and the third virtual plane respectively, and the second virtual plane and the third virtual plane are perpendicular to each other.
  • any one of the first virtual plane, the second virtual plane, and the third virtual plane may lie in the same plane as the ground (or the horizontal plane).
  • the terminal can acquire the two-dimensional coordinates of the first position in the image coordinate system, and use the intersection point of the ray from the first pose toward the two-dimensional coordinates with the first virtual plane as the first mapping point.
  • if the ray has no intersection with the first virtual plane, the intersection point with another virtual plane in the set of preset virtual planes is used as the first mapping point.
  • the virtual plane can be set in advance, which can avoid the operation of pre-scanning the plane when the user uses the AR function, simplify the user's operation, and improve the efficiency.
  • setting a plurality of different virtual planes in the embodiment of the present application can ensure that the ray from the first pose toward the two-dimensional coordinates has an intersection point with one of the virtual planes, which ensures that the virtual object is generated smoothly and improves the accuracy of virtual object generation. A minimal sketch of this mapping-point computation is given below.
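  • as a concrete illustration of the mapping-point computation described above, the following Python sketch intersects the ray from the first pose toward the two-dimensional coordinates with a prioritized set of preset virtual planes, each represented by a point and a normal vector (as in the detailed description below); all function names and numeric values are illustrative, not taken from the patent.

```python
import numpy as np

def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Intersect a ray with a plane given by a point on it and its normal.

    Returns the 3D intersection point, or None when the ray is parallel
    to the plane or the hit lies behind the ray origin.
    """
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < eps:                      # ray parallel to the plane
        return None
    s = float(np.dot(plane_normal, plane_point - origin)) / denom
    if s <= 0:                                # intersection behind the camera
        return None
    return origin + s * direction

def first_mapping_point(origin, direction, planes):
    """Return the intersection with the first virtual plane the ray hits,
    trying the preset planes in priority order (first, second, third)."""
    for plane_point, plane_normal in planes:
        hit = ray_plane_intersection(origin, direction, plane_point, plane_normal)
        if hit is not None:
            return hit
    return None                               # no preset plane is hit

# Example: three mutually perpendicular virtual planes, the first parallel
# to the ground, as described in the embodiments (values are arbitrary).
planes = [
    (np.array([0.0, -1.5, 0.0]), np.array([0.0, 1.0, 0.0])),  # first (ground-like)
    (np.array([0.0, 0.0, 5.0]), np.array([0.0, 0.0, 1.0])),   # second
    (np.array([3.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])),   # third
]
mapping_point = first_mapping_point(np.zeros(3), np.array([0.1, -0.3, 1.0]), planes)
```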
  • in response to the user's operation on the identification of the first virtual object, the terminal can quickly generate the first virtual object.
  • the terminal in order to make the user truly feel that the first virtual object exists in the real scene, and the size of the first virtual object does not change suddenly with the change of the terminal's pose, the terminal can track the first virtual object.
  • the second pose may satisfy one or more of the following: the distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or the number of frames of images captured while the terminal moves from the first pose to the second pose is greater than or equal to a preset number of frames; and/or the duration of the terminal moving from the first pose to the second pose is longer than a preset duration; and/or the second pose is a pose at which the terminal successfully triangulates the first mapping point.
  • the terminal obtains the three-dimensional coordinates of the first position in the world coordinate system according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose, and obtains the scaling ratio corresponding to the first position according to a first distance and a second distance.
  • the first distance is: the distance from the first pose to the first mapping point.
  • the second distance is: the distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system.
  • the terminal may respectively scale the second pose and the three-dimensional coordinates of the first position in the world coordinate system according to the scaling ratio corresponding to the first position, to obtain the third pose and the scaled three-dimensional coordinates corresponding to the first position, and then display the first virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.
  • the terminal may scale the three-dimensional coordinates of the first position in the world coordinate system onto the three-dimensional coordinates of the first mapping point in the camera coordinate system, so that the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system. Furthermore, the terminal may correspondingly scale the second pose according to the same scaling ratio to obtain the third pose. A sketch of this scaling is given below.
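  • the following sketch shows one plausible reading of this scaling step. It assumes the scaling is performed about the first pose, so that the scaled world point lands exactly on the first mapping point; the patent does not spell out the scaling center, so this is an assumption, and all names are illustrative.

```python
import numpy as np

def scale_for_display(pose1_t, pose2_t, p_world, mapping_point):
    """Scale the triangulated point and the second pose so the virtual
    object keeps the apparent size it had at the first mapping point.

    pose1_t, pose2_t: translation parts of the first and second poses;
    p_world: triangulated 3D coordinates of the first position;
    mapping_point: 3D coordinates of the first mapping point.
    """
    first_distance = np.linalg.norm(mapping_point - pose1_t)
    second_distance = np.linalg.norm(p_world - pose1_t)
    ratio = first_distance / second_distance          # scaling ratio for this position

    # Scaling about the first pose (assumption): the scaled point coincides
    # with the mapping point, and the second pose scales into the third pose.
    p_scaled = pose1_t + ratio * (p_world - pose1_t)
    pose3_t = pose1_t + ratio * (pose2_t - pose1_t)
    return ratio, pose3_t, p_scaled
```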
  • the above briefly describes the process in which the terminal displays the first virtual object in the real scene captured by the terminal.
  • for displaying the second virtual object in a real scene, reference may be made to the related descriptions of displaying the first virtual object by the terminal.
  • otherwise, the terminal needs to send, for each position, the scaled pose of the terminal and the scaled three-dimensional coordinates corresponding to that position for display, and the terminal then has a large amount of calculation and a slow speed when rendering and displaying the first virtual object and the second virtual object.
  • therefore, the scaled three-dimensional coordinates corresponding to each position can be unified under the same pose (coordinate system).
  • the terminal when the user performs an operation on the identification of the second virtual object, the terminal is in the first pose.
  • the terminal may acquire a second mapping point of the second position on the second virtual plane in response to the user's operation on the identification of the second virtual object, acquire the three-dimensional coordinates of the second mapping point in the camera coordinate system, and, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system, display the second virtual object in the real scene captured by the terminal (the real scene displayed on the terminal interface, in which the first virtual object is already displayed).
  • for the acquisition process of the second mapping point of the second position on the second virtual plane and of its three-dimensional coordinates, reference may be made to the related description of the first mapping point.
  • the second pose is a pose after the user performs the operation on the identification of the second virtual object.
  • when the terminal is in the second pose, the terminal may respectively scale, according to the scaling ratio corresponding to the second position, the second pose and the three-dimensional coordinates of the second position in the world coordinate system, to obtain the fifth pose and the scaled three-dimensional coordinates corresponding to the second position. The three-dimensional coordinates of the second position in the world coordinate system are obtained based on: the first pose, the two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the second position in the image coordinate system at the second pose.
  • for the process in which the terminal obtains the three-dimensional coordinates of the second position in the world coordinate system and the scaling ratio corresponding to the second position, and for the scaling process, reference may be made to the relevant description for the first position.
  • the terminal may display the second virtual object in the real scene captured by the terminal according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.
  • the terminal may translate the fifth pose to the third pose and, according to the distance and direction from the fifth pose to the third pose, translate the scaled three-dimensional coordinates corresponding to the second position, to obtain the scaled three-dimensional coordinates corresponding to the second position at the third pose. Furthermore, the terminal may display the second virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the second position at the third pose.
  • in this way, the scaled three-dimensional coordinates corresponding to each position can be sent for display under one and the same terminal pose, instead of sending, for each position, both the scaled three-dimensional coordinates and the corresponding terminal pose. This reduces the amount of calculation when the terminal renders virtual objects and increases the speed of displaying virtual objects, thereby improving user experience; a sketch follows below.
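  • a minimal sketch of this unification step, again using only the translation parts of the poses and assuming a pure translation from the fifth pose to the third pose, as the text describes; names are illustrative.

```python
import numpy as np

def unify_to_third_pose(pose5_t, pose3_t, p_scaled_at_pose5):
    """Translate the scaled 3D coordinates obtained under the fifth pose by
    the same distance and direction that take the fifth pose to the third
    pose, so every object can be sent for display under the third pose."""
    offset = pose3_t - pose5_t              # distance and direction of the translation
    return p_scaled_at_pose5 + offset       # scaled coordinates under the third pose
```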
  • the user can also interact with the virtual objects displayed on the terminal.
  • the user can perform operations such as selecting, deleting, moving, and scaling the virtual object, and the user can also interact with the virtual object in a voice manner.
  • the embodiment of the present application provides a human-computer interaction device in an AR scene, and the device may be the terminal in the first aspect above, or a chip in the terminal.
  • the device can include:
  • the shooting module is used to shoot real scenes.
  • the display module is configured to display the captured real scene on the interface of the terminal, where the interface also displays the identification of the virtual object to be selected; display the first virtual object in the real scene in response to the user's operation on the identification of the first virtual object; and display the second virtual object in the real scene where the first virtual object has been displayed in response to the user's operation on the identification of the second virtual object.
  • the processing module is configured to: acquire the first pose of the terminal in response to the user's operation on the identification of the first virtual object; acquire a first mapping point of the first position on the first virtual plane, where the first position is a preset position on the interface or a position determined by the user on the interface; and obtain the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • the display module is specifically configured to display the first virtual object in the real scene captured by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • the processing module is specifically configured to acquire the two-dimensional coordinates of the first position in the image coordinate system, and use an intersection point of the ray from the first pose toward the two-dimensional coordinates with the first virtual plane as the first mapping point.
  • the first virtual plane is included in a set of preset virtual planes, and the processing module is further configured to: if the ray has no intersection point with the first virtual plane, obtain the intersection point of the ray from the first pose toward the two-dimensional coordinates with another virtual plane in the set of preset virtual planes, and use that intersection point as the first mapping point.
  • the processing module is further configured to track the image block corresponding to the first position and, when the terminal is in the second pose, obtain the two-dimensional coordinates of the first position in the image coordinate system.
  • the display module is further configured to display the first virtual object in the real scene captured by the terminal according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose.
  • the distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or the number of frames of images captured while the terminal moves from the first pose to the second pose is greater than or equal to the preset number of frames; and/or the duration of the terminal moving from the first pose to the second pose is longer than the preset duration; and/or the second pose is a pose at which the terminal successfully triangulates the first mapping point.
  • the processing module is specifically configured to: obtain the three-dimensional coordinates of the first position in the world coordinate system according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose; obtain the scaling ratio corresponding to the first position according to a first distance and a second distance, where the first distance is the distance from the first pose to the three-dimensional coordinates of the first mapping point in the camera coordinate system, and the second distance is the distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system; and scale, according to the scaling ratio corresponding to the first position, the second pose and the three-dimensional coordinates of the first position in the world coordinate system, to obtain the third pose and the scaled three-dimensional coordinates corresponding to the first position.
  • the display module is further configured to display the first virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.
  • the processing module is specifically configured to scale the three-dimensional coordinates of the first position in the world coordinate system onto the three-dimensional coordinates of the first mapping point in the camera coordinate system, so that the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • when the user performs the operation on the identification of the second virtual object, the terminal is in the first pose, and the processing module is specifically configured to: in response to the user's operation on the identification of the second virtual object, acquire a second mapping point of the second position on the second virtual plane, where the second position is a preset position on the interface or a position determined by the user on the interface; and acquire the three-dimensional coordinates of the second mapping point in the camera coordinate system.
  • the display module is further configured to display the second virtual object in the real scene where the first virtual object has been displayed, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system.
  • the processing module is specifically configured to: when the terminal is in the second pose, respectively scale, according to the scaling ratio corresponding to the second position, the second pose and the three-dimensional coordinates of the second position in the world coordinate system, to obtain the fifth pose and the scaled three-dimensional coordinates corresponding to the second position, where the three-dimensional coordinates of the second position in the world coordinate system are obtained based on the first pose, the two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the second position in the image coordinate system at the second pose.
  • the display module is specifically configured to display the second virtual object in the real scene where the first virtual object has been displayed according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.
  • the processing module is specifically configured to translate the fifth pose to the third pose and, according to the distance and direction from the fifth pose to the third pose, translate the scaled three-dimensional coordinates corresponding to the second position, to obtain the scaled three-dimensional coordinates corresponding to the second position at the third pose.
  • the display module is specifically configured to display the second virtual object in the real scene where the first virtual object has been displayed, according to the third pose and the scaled three-dimensional coordinates corresponding to the second position at the third pose.
  • an embodiment of the present application provides an electronic device, and the electronic device may include: a processor and a memory.
  • the memory is used to store computer-executable program codes, and the program codes include instructions; when the processor executes the instructions, the instructions cause the electronic device to execute the method in the first aspect.
  • the embodiment of the present application provides an electronic device, and the electronic device may be a human-computer interaction device in an AR scene.
  • the electronic device may include a unit, module or circuit for performing the method provided in the above first aspect.
  • the embodiments of the present application provide a computer program product including instructions, which, when run on a computer, cause the computer to execute the method in the first aspect above.
  • an embodiment of the present application provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, they cause the computer to execute the method in the above-mentioned first aspect.
  • Embodiments of the present application provide a human-computer interaction method, device, and electronic device in an augmented reality AR scene.
  • the terminal can shoot the real scene and display the captured real scene on the interface of the terminal, where the interface also displays the identification of the virtual object to be selected; in response to the user's operation on the identification of the first virtual object, the first virtual object is displayed in the real scene captured by the terminal; and in response to the user's operation on the identification of the second virtual object, the second virtual object is displayed in the real scene in which the first virtual object is displayed.
  • the human-computer interaction method provided by the embodiment of the present application can not only enrich the human-computer interaction mode in the augmented reality AR scene, but also enrich the styles of displaying virtual objects in the AR scene.
  • FIG. 1 is a schematic diagram of an interface of an AR application program in the prior art.
  • FIG. 2 is a schematic diagram of an interface of another AR application program in the prior art.
  • FIG. 3 is a schematic flowchart of an embodiment of the human-computer interaction method in an AR scene provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an interface provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another interface provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another interface provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a virtual plane provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an image block corresponding to a first position provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of triangulation provided by an embodiment of the present application.
  • FIG. 11A is a schematic diagram of scaling provided by an embodiment of the present application.
  • FIG. 11B is another schematic diagram of scaling provided by an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of unifying the scaled three-dimensional coordinates to one pose of the terminal provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of another interface provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of another interface provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • FIG. 17 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of the present application.
  • FIG. 18 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of the present application.
  • FIG. 19 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of the present application.
  • FIG. 20 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • FIG. 21 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a schematic diagram of an interface of an AR application program in the prior art.
  • an augmented reality (AR) application is illustrated by taking an AR measurement application as an example.
  • the user opens the AR measurement application program, and the interface of the terminal displays a real scene captured by the terminal, and prompt information is displayed on the interface.
  • the prompt information such as "slowly move the device to find the plane where the object is located" is used to prompt the user to use the terminal to scan the real scene to determine the plane in the real scene, and then measure the distance, area, volume, height, etc.
  • AR applications developed based on ARKit and ARCore need to scan the real scene for plane detection before the functions in the AR application can be used.
  • the prior map information may include but not limited to: a three-dimensional point cloud map of a real scene, or a map containing a plane, and the like.
  • if the terminal stores the prior map information of the real scene, then when the AR application is used, the terminal can use the functions in the AR application based on the terminal's pose and the plane in the prior map information.
  • although this method does not require scanning the real scene when the AR application is used, it requires the prior map information of the real scene to be obtained in advance by professionals, which is difficult and inefficient.
  • FIG. 2 is a schematic diagram of an interface of another AR application program in the prior art.
  • a short video application is taken as an example of the AR application.
  • the short video application provides a "virtual object" prop.
  • the virtual object is "cat" as an example for illustration. Referring to a and b in Figure 2, in this short video application, when the user is shooting a real scene, the user can click on the "cat" prop, and the user clicks on any position of the real scene, and the terminal can display the real scene on the interface. A virtual cat is displayed at the corresponding position.
  • in FIG. 2, the terminal does not need to pre-scan the real scene or obtain prior map information of the real scene, and can display a virtual cat at any position in the real scene; however, the virtual objects displayed by the terminal are limited by the prop, and a virtual cat can only be generated based on the style of the prop.
  • if the style of the prop is one virtual cat, the user can use the prop to trigger the terminal to display one virtual cat; if the style of the prop is two virtual cats, the user can use the prop to trigger the terminal to display two virtual cats.
  • in addition, the way in which the virtual cat interacts with the user is single: the terminal does not support the user clicking multiple times to generate virtual cats multiple times, and can only support using the prop once. Accordingly, the human-computer interaction mode in the AR scene shown in FIG. 2 is single.
  • the embodiment of the present application provides a human-computer interaction method in the AR scene.
  • in this method, the terminal does not need to pre-scan the plane in the real scene or obtain the prior map information of the real scene, and can, in response to the user's multiple triggers, continuously generate and display multiple virtual objects, which can enrich human-computer interaction modes and improve user experience.
  • the embodiment of the present application does not limit the virtual object.
  • the virtual object may be: an animal, a character, an object, and the like.
  • the human-computer interaction method in the AR scene provided by the embodiment of the present application is applied to a terminal, and the terminal may be called user equipment (user equipment, UE), etc.
  • the terminal may be a mobile phone, a tablet computer (portable android device, PAD), a personal digital assistant (PDA), a handheld device with a wireless communication function, a computing device, a wearable device, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a terminal in industrial control, a terminal in a smart home, etc.; the form of the terminal is not specifically limited in the embodiments of the present application.
  • FIG. 3 is a schematic flowchart of an embodiment of a method for human-computer interaction in an AR scene provided by an embodiment of the present application.
  • the human-computer interaction method in the AR scene provided by the embodiment of the present application may include:
  • the terminal can shoot a real scene in response to a user's instruction.
  • the real scene is the opposite of a virtual scene: it is a scene that actually exists.
  • the terminal may be triggered to control the terminal's photographing device (such as a camera) to start photographing the real scene.
  • the terminal can shoot the real scene before recording the video, or shoot the real scene during the recording of the video.
  • when the terminal shoots the real scene, it can display the captured real scene on the screen of the terminal; in other words, the terminal may display the captured picture of the real scene on the interface of the terminal.
  • the following takes shooting a real scene before recording a video as an example.
  • the terminal can shoot the real scene, and display the captured picture of the real scene in the picture preview box 41 .
  • a table is included in a real scene.
  • the user can click the shooting control 42 , and the terminal can also shoot a real scene, and display the captured real scene in the picture preview box 41 .
  • the first operation may include but is not limited to: the user operates the interface of the terminal, or the user speaks a piece of voice, etc.
  • the embodiment of the present application does not limit the first operation.
  • the user's operation on the interface of the terminal may include but is not limited to: a click, a slide, or a gesture performed without touching the screen of the terminal.
  • the first virtual object is preset.
  • for example, if the first virtual object is a stickman, the terminal can, in response to the user's click operation, display the stickman at the corresponding position in the real scene, as shown in b in FIG. 4.
  • taking the first operation being the user uttering a piece of voice as an example, if the user says "display at the center of the screen", the terminal may, in response to the user's operation of uttering the voice, display the stickman at the corresponding position in the real scene.
  • the first virtual object is set by the user.
  • as shown in a in FIG. 5, the user opens an AR application program.
  • the user may determine the first virtual object from the icons 43 of at least one virtual object to be selected displayed on the interface.
  • in a in FIG. 5, the icons of the virtual objects are represented by words; for example, the icons 43 of at least one virtual object to be selected may include: a stickman, a princess, a cuboid, and the like.
  • the first operation may be: the user clicks the icon 43 of the virtual object to be selected and clicks on the interface of the terminal, or the user presses and holds the icon 43 of the virtual object to be selected and slides to a position, Alternatively, the user speaks a piece of voice, etc.
  • for example, the user can first click the "stickman" icon and then click position A on the interface of the terminal; the terminal then responds to the user's first operation and displays the stickman at the corresponding position in the real scene corresponding to position A.
  • the hand of the user who clicks first is represented by a dotted line
  • the hand of the user who clicks later is represented by a solid line.
  • the terminal can respond to the user's first operation and display the stickman at the corresponding position in the real scene corresponding to position A.
  • the terminal may display the stickman at the corresponding position in the real scene corresponding to the "center of the screen” in response to the user's first operation of uttering the voice.
  • the terminal can recognize the user's voice to determine whether the voice mentions one of the virtual objects to be selected displayed on the terminal interface, and then determine the first virtual object. The terminal may use the virtual object to be selected that is contained in the user's voice as the first virtual object.
  • the terminal in response to the user's first operation, may also perform edge detection on the captured real-world scene, and then display the first virtual object at the target position of the real-world scene captured by the terminal.
  • for example, if the real scene includes a table, the terminal may perform edge detection on the table, so as to display the first virtual object on the table (that is, at the target position) instead of at other positions, so that the user truly feels that the first virtual object is in the real scene and the first virtual object does not clash with the real scene (see the sketch after the next item).
  • the target position may be the ground, an object plane, a head, a shoulder of a character, and the like.
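  • the patent does not name a specific edge detector; as a hedged sketch, a common choice such as Canny edge detection could be used to locate the supporting surface (e.g., the table top) in the captured frame before placing the virtual object. Thresholds and names below are illustrative.

```python
import cv2

def detect_surface_edges(frame_bgr, low_threshold=50, high_threshold=150):
    """Detect edges in a captured frame; the edge map can then be used to
    find the plane of an object such as a table top (thresholds are
    illustrative, not values from the patent)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # suppress noise before edge detection
    return cv2.Canny(blurred, low_threshold, high_threshold)
```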
  • the second operation may be the same as or different from the first operation, and the second operation may refer to related descriptions of the first operation.
  • the second virtual object may be the same as or different from the first virtual object.
  • the second virtual object may be preset, or the second virtual object may be set by the user; for details, refer to the related description of the first virtual object in S302.
  • the terminal may display a cuboid at a corresponding position in the real scene corresponding to position B.
  • the real scene in which the second virtual object is displayed and the real scene in which the first virtual object is displayed are the same real scene, and the interface of the terminal can display the second virtual object and the first virtual object at the same time. It should be understood that in d of FIG. 5, the second virtual object and the first virtual object being displayed simultaneously on the same screen of the terminal is taken as an example.
  • the corresponding position in the real scene corresponding to position B may be referred to as the second position of the real scene captured by the terminal.
  • S303 may be replaced by: displaying the second virtual object at the second position of the real scene captured by the terminal in response to the second operation of the user.
  • in the above, the user sets the positions for displaying the virtual objects (such as the first virtual object and the second virtual object); alternatively, the positions for displaying virtual objects may be preset.
  • for example, if the preset positions for displaying the virtual object on the interface of the terminal are position A and position B, then after the user opens the AR application, the stickman can be displayed at the corresponding positions in the real scene corresponding to position A and position B.
  • in this process, the pictures of the real scene captured by the terminal can be the same or different, and both can be referred to as the real scene captured by the terminal, or the real scene displayed on the interface of the terminal.
  • the preset positions of the virtual object displayed on the interface of the terminal are position A and position B.
  • for the scenario in which the virtual object is set by the user, refer to a-c in FIG.
  • the user can click the "stickman" icon, and the terminal can display the stickman at the corresponding position in the real scene corresponding to position A and position B in response to the user's operation.
  • the user can also sequentially select the first virtual object displayed at the corresponding position in the real scene corresponding to position A, and the second virtual object displayed at the corresponding position in the real scene corresponding to position B.
  • virtual objects displayed at different preset positions may be the same or different.
  • S302 and S303 may be replaced by: the terminal displays the first virtual object and the second virtual object at corresponding positions in the real scene corresponding to the preset position on the interface.
  • in this way, the terminal does not need to scan the real scene or obtain the prior map information of the real scene in advance, and can generate and display multiple virtual objects based on the user's multiple operations, which can not only improve efficiency but also enrich human-computer interaction modes and improve user experience.
  • S302 in the above embodiment may include:
  • the terminal may acquire the first pose of the terminal in response to the first operation of the user.
  • a sensor for detecting the pose of the terminal may be provided in the terminal, and the sensor may include but not limited to: an acceleration sensor, an angular velocity sensor, a gravity detection sensor, and the like.
  • the terminal may acquire inertial measurement unit (IMU) data of the terminal based on the acceleration sensor and the angular velocity sensor, acquire the gravity axis data of the terminal based on the gravity detection sensor, and then obtain the first pose of the terminal based on the IMU data and the gravity axis data.
  • the position in the real scene corresponding to the first position is used to display the first virtual object.
  • the first position may be preset or set by the user; it should be understood that the first position is a position on the interface of the terminal. The picture captured by the terminal can be regarded as a plane, so the two-dimensional coordinates of the first position in the image coordinate system can be obtained. For example, the lower left corner of the terminal interface may be used as the origin of the two-dimensional coordinate system, and the terminal can then obtain the two-dimensional coordinates of the first position in the image coordinate system; the embodiment of the present application does not limit the choice of origin. In an embodiment, the two-dimensional coordinates of the first position in the image coordinate system may also be referred to as: the two-dimensional coordinates of the first position in the picture of the real scene captured by the terminal.
  • the method for the terminal to track the first location may include but not limited to: a feature point method, an optical flow method, an image block tracking method based on deep learning, and the like.
  • the tracking of the first position by the terminal may be understood as: the terminal tracks the image block corresponding to the first position.
  • the feature point method refers to that the terminal tracks the image block corresponding to the first position according to the feature of the image block corresponding to the first position.
  • the terminal can acquire the features of the image block by, for example, the scale-invariant feature transform (SIFT) algorithm, the speeded up robust features (SURF) algorithm, or the oriented FAST and rotated BRIEF (ORB) algorithm.
  • optical flow is the apparent motion of a target between two consecutive frames of images, caused by movement of the target, the scene, or the camera; it is a two-dimensional vector field over the image, a velocity field that represents the three-dimensional motion of object points through the two-dimensional image.
  • the optical flow method may include but is not limited to: the Lucas-Kanade algorithm, the KLT (Kanade-Lucas-Tomasi) tracking algorithm, and the like.
  • the image block tracking method based on deep learning can be understood as: the terminal inputs the image block corresponding to the first position into a pre-trained deep learning model, so as to track the image block corresponding to the first position. A sketch of patch tracking with optical flow is given below.
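  • as an illustration of the optical-flow option, the following OpenCV sketch tracks the image block at the first position from one frame to the next with pyramidal Lucas-Kanade (KLT) flow; the window size and pyramid depth are illustrative choices, not values from the patent.

```python
import cv2
import numpy as np

def track_first_position(prev_gray, next_gray, first_pos_uv):
    """Track the image block at the first position across two frames using
    pyramidal Lucas-Kanade optical flow; returns the new 2D coordinates
    of the first position, or None if tracking is lost."""
    prev_pts = np.array([[first_pos_uv]], dtype=np.float32)       # shape (1, 1, 2)
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)                             # 21x21 patch, 3 levels
    if status[0][0] == 1:
        return tuple(next_pts[0][0])
    return None
```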
  • the first location may be a location set by a user.
  • the terminal can track the first location in response to the first operation of the user.
  • the first position is preset.
  • S3021 may be correspondingly replaced with: tracking the first position.
  • the picture of the real scene captured by the terminal is shown in FIG. 9 .
  • the terminal can track the image block displaying the position of the virtual object.
  • the image block corresponding to the first position is represented by the image block (pixel block) contained in the box.
  • Virtual planes are preset. In the embodiment of the present application, there is no need to pre-scan the real scene to obtain the plane in the real scene, and it is not necessary to obtain the prior map information of the real scene, because the virtual plane is preset in the embodiment of the present application, and virtual objects can be displayed on the virtual plane.
  • a virtual plane can be characterized by the three-dimensional coordinates of a point and the normal vector of the virtual plane. Exemplarily, the three-dimensional coordinates of the point corresponding to the virtual plane are (X0, Y0, Z1), and the normal vector of the virtual plane is a vector n.
  • the Z value of the virtual plane is the Z value in the camera coordinate system, and the camera coordinate system refers to a coordinate system with the optical center of the terminal as the origin.
  • in the following, the position of the terminal is taken as an example; the position of the terminal may be understood as the pose of the terminal in the embodiments of the present application.
  • the mapping point of the first position on the virtual plane may be referred to as a first mapping point.
  • the three-dimensional coordinates of the mapping point of the first position on the virtual plane may be understood as: the three-dimensional coordinates of the mapping point of the first position on the virtual plane in the camera coordinate system.
  • FIG. 10 is a schematic diagram of triangulation provided by the embodiment of the present application.
  • the virtual plane is parallel to the screen displayed on the terminal as an example for illustration.
  • the first pose of the terminal is C1.
  • C1 can be understood as the position of the optical center of the terminal when the terminal pose is the first pose.
  • the position of the first position in the image coordinate system is a1.
  • the mapping point of the first position on the virtual plane can be understood as: the intersection point A' of the ray from C1 toward a1 with the virtual plane.
  • when the virtual plane is parallel to the screen, the abscissa and ordinate of the mapping point of the first position on the virtual plane are respectively the same as the abscissa and ordinate of the two-dimensional coordinates of the first position, and the Z value of the mapping point of the first position on the virtual plane is the Z value of the virtual plane (a preset value, such as the distance between the virtual plane and the plane where the picture displayed on the terminal is located).
  • the two-dimensional coordinates of the first position in the image coordinate system are (X1, Y1)
  • the Z value of the virtual plane is Z1
  • the three-dimensional coordinates of the mapping point of the first position on the virtual plane are (X1, Y1, Z1 ).
  • the preset at least one virtual plane may be referred to as a set of preset virtual planes.
  • the embodiment of the present application may be described by setting three virtual planes in advance as an example.
  • the three virtual planes preset may include: a first virtual plane, a second virtual plane, and a third virtual plane.
  • the first virtual plane is parallel to the ground (or horizontal plane)
  • the second virtual plane and the third virtual plane are both perpendicular to the first virtual plane
  • the second virtual plane is perpendicular to the third virtual plane, as shown in FIG. 8 .
  • the terminal may use A' as the mapping point of the first position on the first virtual plane, so as to obtain the three-dimensional coordinates of A'. If the ray has no intersection with the first virtual plane, the terminal may detect whether the ray has an intersection with the second virtual plane, and then obtain the three-dimensional coordinates of that intersection. In an embodiment, the terminal may check, according to the priority of the at least one virtual plane, whether the ray from the first pose C1 toward a1 has an intersection point with each virtual plane in turn, so as to obtain the three-dimensional coordinates of the intersection point.
  • multiple virtual planes can be preset in the embodiment of the present application, so that the virtual object is created on a virtual plane that the ray intersects.
  • the terminal may display the first virtual object at the three-dimensional coordinates of the mapping point on the virtual plane, for example, the three-dimensional coordinates are the three-dimensional coordinates of the mapping point at position A on the virtual plane.
  • for example, the center of the first virtual object may be located at the mapping point; this embodiment of the present application does not limit the relative position of the first virtual object and the mapping point.
  • the terminal may send the first pose and the three-dimensional coordinates of the mapping point of the first position on the virtual plane for display, and then display the first virtual object at the three-dimensional coordinates of the mapping point on the virtual plane.
  • sending for display can be understood as: the processor in the terminal sends the first pose and the three-dimensional coordinates of the mapping point of the first position on the virtual plane to the display, so that the display shows the first virtual object at the three-dimensional coordinates of the mapping point on the virtual plane.
  • the embodiment of the present application does not repeat the process of sending for display.
  • the mapping point of the first position on the virtual plane is the position where the first virtual object is displayed, that is, the first position.
  • the terminal can obtain the three-dimensional coordinates of the first position in the real scene, and then display the virtual object at the three-dimensional coordinates of the first position in the real scene in the picture captured by the terminal.
  • the three-dimensional coordinates of the first position in the real scene may be referred to as: the three-dimensional coordinates of the first position in the world coordinate system.
  • the terminal may perform triangulation processing on the mapping points, so as to obtain the three-dimensional coordinates of the first position in the real scene.
  • the process of triangulating the mapping points by the terminal is described below:
  • the terminal may, according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system in the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system in the second pose, Acquire the 3D coordinates of the first position in the real scene.
  • the second pose is a pose after the first pose, because the terminal can track the first position, and therefore can obtain the two-dimensional coordinates of the first position in the image coordinate system at the second pose.
  • the second pose of the terminal is C2, and the two-dimensional coordinate of the first position in the image coordinate system at the second pose is a2.
  • the intersection point of the ray from C1 toward a1 and the ray from C2 toward a2 is used as the position of the first position in the real scene, and the three-dimensional coordinates of this intersection point A are used as the three-dimensional coordinates of the first position in the real scene.
  • the internal parameter matrix of the terminal can be used to triangulate (triangulation) and obtain the three-dimensional coordinates of the intersection point A.
  • the specific method of triangulation is as follows. Assume that the coordinates of the intersection point A in the camera coordinate system at the first pose C1 are (x1, y1, z1), and in the camera coordinate system at the second pose C2 are (x2, y2, z2); that the two-dimensional coordinate a1 of the first position in the image coordinate system at the first pose is (u1, v1), and the two-dimensional coordinate a2 at the second pose is (u2, v2); and that K is the internal parameter matrix of the terminal. According to the imaging process of the camera, Formula 1 can be obtained:

$$d_1 \begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix} = K \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}, \qquad d_2 \begin{bmatrix} u_2 \\ v_2 \\ 1 \end{bmatrix} = K \begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} \tag{1}$$

  • here d1 and d2 are the depths of the intersection point A in the camera coordinate system at the first pose C1 and the second pose C2 respectively. Then, according to the first pose C1 and the second pose C2 of the terminal, the rotation matrix R and the translation vector t that transform the camera coordinate system of the first pose C1 into the camera coordinate system of the second pose C2 can be calculated, giving Formula 2:

$$\begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = R \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} + t \tag{2}$$

  • substituting Formula 1 into Formula 2, with the normalized image coordinates $f_i = K^{-1}\,[u_i, v_i, 1]^{\mathsf{T}}$, gives Formula 3:

$$d_2 f_2 = d_1 R f_1 + t \tag{3}$$

  • which can be rearranged into the linear system of Formula 4, solved for the unknown depth vector $D = (d_1, d_2)^{\mathsf{T}}$ in the least-squares sense by Formula 5:

$$\begin{bmatrix} R f_1 & -f_2 \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} = -t \tag{4}$$

$$D = (A^{\mathsf{T}} A)^{-1} A^{\mathsf{T}} (-t), \qquad A = \begin{bmatrix} R f_1 & -f_2 \end{bmatrix} \tag{5}$$

  • the unknown quantity D solved from Formulas 4 and 5 gives the depths of the intersection point A in the camera coordinate system at the first pose C1 and the second pose C2. The three-dimensional coordinates of A in the camera coordinate systems at C1 and C2 can then be calculated according to Formula 1, and the coordinates of A in the camera coordinate system at the first pose C1 can be converted into coordinates in the real scene coordinate system, obtaining the three-dimensional coordinates of the intersection point A.
  • the real scene coordinate system can be understood as a world coordinate system.
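  • a minimal NumPy sketch of Formulas 1-5 as reconstructed above: it solves the depths d1 and d2 in the least-squares sense and recovers A in the camera coordinate system at C1; converting to world coordinates then only requires the first pose. Function and variable names are ours, not from the patent.

```python
import numpy as np

def triangulate_point(K, R, t, a1_uv, a2_uv):
    """Triangulate the intersection point A from two observations.

    K: 3x3 intrinsic matrix; R, t: rotation and translation taking the
    C1 camera frame to the C2 camera frame; a1_uv, a2_uv: pixel
    coordinates (u, v) of the tracked first position at the two poses.
    """
    K_inv = np.linalg.inv(K)
    f1 = K_inv @ np.array([a1_uv[0], a1_uv[1], 1.0])   # normalized ray at C1
    f2 = K_inv @ np.array([a2_uv[0], a2_uv[1], 1.0])   # normalized ray at C2

    # Formula 4: [R f1, -f2] [d1, d2]^T = -t, solved in least squares (Formula 5).
    A = np.stack([R @ f1, -f2], axis=1)                # 3x2 coefficient matrix
    depths, *_ = np.linalg.lstsq(A, -t, rcond=None)
    d1, d2 = depths

    p_c1 = d1 * f1                                     # A in the C1 camera frame (Formula 1)
    return d1, d2, p_c1

# With the first pose (R_wc1, t_wc1) mapping C1 camera coordinates to world
# coordinates, the world-coordinate point would be: R_wc1 @ p_c1 + t_wc1.
```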
  • the above describes the specific process of triangulating the mapping point; the following describes the conditions under which the terminal triangulates the mapping point:
  • because the pose of the terminal changes in real time, the terminal can obtain the distance between the current pose of the terminal and the first pose, and triangulate the mapping point in response to this distance being greater than or equal to the preset distance. That is, the terminal performs triangulation processing on the mapping point in response to the distance between the second pose and the first pose being greater than or equal to the preset distance, i.e., the distance between C2 and C1 being greater than or equal to the preset distance.
  • the terminal may obtain multiple consecutive frames of images, and the terminal may perform triangulation processing on the mapping points in response to the number of frames of the captured images reaching a preset number of frames.
  • the terminal may also perform triangulation processing on the mapping points in response to the duration of the capture reaching a preset duration.
  • the mapping point may be triangulated in combination with the pose of the terminal, and the terminal may obtain the three-dimensional coordinates of the first position in the real scene in response to successful triangulation.
  • the conditions for successful triangulation may include but not limited to:
  • the image block corresponding to the first position can be tracked in at least two images.
  • X, Y, and Z in the three-dimensional coordinates of the first position in the real scene obtained by triangulation are not null values.
  • the three-dimensional coordinates of the first position obtained by triangulation in the real scene are in front of the terminal (or the shooting device in the terminal), that is, the Z value of the three-dimensional coordinates of the first position in the camera coordinate system is greater than 0.
  • The three-dimensional coordinates of the first position in the real scene obtained by triangulation are back-projected into the two-dimensional coordinate system of each image in which the image block corresponding to the first position can be tracked, to obtain the two-dimensional coordinates of the back-projected point; the distance between the back-projected point and the actually observed position on the display screen of the terminal is then calculated, and this distance must be less than or equal to a distance threshold.
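  • The success conditions listed above can be checked mechanically; the sketch below is illustrative only (the function and parameter names are assumptions), with poses given as world-to-camera transforms:

```python
import numpy as np

def triangulation_succeeded(p_world, cam_poses, observations, K, dist_threshold=3.0):
    """Check the success conditions for a triangulated first position.

    p_world      : candidate 3D coordinates of the first position (world frame).
    cam_poses    : list of (R_wc, t_wc) world-to-camera transforms, one for each
                   image in which the image block of the first position is tracked.
    observations : matching list of observed 2D pixel coordinates.
    """
    if len(cam_poses) < 2:                # tracked in fewer than two images
        return False
    if np.any(np.isnan(p_world)):         # X, Y or Z is a null value
        return False
    for (R_wc, t_wc), uv in zip(cam_poses, observations):
        p_cam = R_wc @ p_world + t_wc
        if p_cam[2] <= 0:                 # must lie in front of the camera
            return False
        proj = K @ p_cam
        proj = proj[:2] / proj[2]         # back-projected 2D coordinates
        if np.linalg.norm(proj - np.asarray(uv)) > dist_threshold:
            return False                  # back-projection error too large
    return True
```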
  • It should be noted that when the shooting device in the terminal is a binocular shooting device, the terminal at the first pose can obtain the three-dimensional coordinates of the first position in the real scene directly, according to the poses of the two cameras and the two-dimensional coordinates of the first position in the pictures taken by the two cameras, without the terminal needing to obtain the three-dimensional coordinates of the first position in the real scene at the second pose.
  • According to the first pose, the three-dimensional coordinates of the mapping point of the first position on the virtual plane, and the three-dimensional coordinates of the first position in the real scene, the terminal obtains the scaling ratio corresponding to the first position.
  • After the terminal obtains the three-dimensional coordinates of the first position in the real scene, if it directly displays the virtual object at those three-dimensional coordinates, for example by sending the second pose and the three-dimensional coordinates of the first position in the real scene for display, the size of the virtual object will change instantly according to the perspective principle that near objects appear large and far objects appear small, resulting in a poor user experience.
  • Exemplarily, if the three-dimensional coordinates A of the first position in the real scene obtained by the terminal are farther from the first pose than the three-dimensional coordinates of the mapping point of a1 on the virtual plane, then once a virtual object is displayed at A, the user sees the virtual object on the screen become smaller instantly, which is a poor experience.
  • On the one hand, in order to make the user feel that the virtual object actually exists in the real scene, the position of the virtual object in the real scene should be restored; on the other hand, the size of the virtual object should not change suddenly and confuse the user. Therefore, the terminal can obtain the scaling ratio corresponding to the first position according to the first pose, the three-dimensional coordinates of the mapping point of the first position on the virtual plane, and the three-dimensional coordinates of the first position in the real scene, and then scale the second pose and the three-dimensional coordinates of the first position in the real scene according to that ratio.
  • The terminal may determine the scaling ratio as the quotient of the first distance and the second distance, where the first distance is the distance from the first pose to A' and the second distance is the distance from the first pose to A.
  • the terminal may scale A to A' based on the scaling ratio, where A' is the 3D coordinate of the scaled first position in the real scene, that is, the scaled 3D coordinate corresponding to the first position.
  • Correspondingly, the second pose C2 can be scaled to C2'A, where C2'A is the scaled second pose, that is, the third pose.
  • the terminal may send C2'A, A' for display to display the first virtual object.
  • Because the relative positional relationship between A' and C2'A represents the relative positional relationship between A and C2, the relative position between the virtual object and the terminal is the same as in the real scene, and the user can feel that the virtual object really exists in the real scene; in other words, sending C2'A and A' for display restores the position of the virtual object in the real scene.
  • In addition, since the three-dimensional coordinate sent for display for the first position is still A', there is no sudden change in the size of the virtual object, which improves the user experience.
  • That is, when the terminal displays according to the third pose (C2'A) and the scaled three-dimensional coordinates (A') corresponding to the first position, the size of the displayed first virtual object does not change suddenly, and the user is unaware of the terminal's processing. As shown in b in Figure 5, the user still sees the first virtual object in the real scene at position A.
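  • As an illustration of this scaling step, the following minimal Python/NumPy sketch (illustrative only; it assumes a pose is represented by its camera center in the world frame, and the function name is hypothetical) scales A back to A' about the first pose C1 and applies the same ratio to the second pose C2:

```python
import numpy as np

def rescale_for_display(C1, C2, A, A_prime):
    """Scale A to A' about the first pose C1 and scale C2 by the same ratio.

    C1, C2  : camera centers of the first and second pose in the world frame.
    A       : triangulated 3D coordinates of the first position in the real scene.
    A_prime : 3D coordinates of the mapping point of the first position on the
              virtual plane (A lies on the ray from C1 through A').
    Returns (C2_scaled, A_scaled): the third pose C2'A and the scaled
    coordinates corresponding to the first position (equal to A').
    """
    ratio = np.linalg.norm(A_prime - C1) / np.linalg.norm(A - C1)  # first/second distance
    A_scaled = C1 + ratio * (A - C1)    # coincides with A', since A is on ray C1 -> A'
    C2_scaled = C1 + ratio * (C2 - C1)  # third pose C2'A
    return C2_scaled, A_scaled
```

  • Because the pose and the anchor coordinates are scaled about C1 by the same ratio, their relative geometry, and hence the on-screen size of the virtual object, is preserved.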
  • S303 may include S3031-S3038; for these steps, reference may be made to the related descriptions of S3021-S3028.
  • For the second position, refer to the related description of the first position; the second position may be preset or set by the user.
  • If the position where the virtual object is displayed is preset, the fourth pose is the same as the first pose. In one embodiment, if the position where the virtual object is displayed is set by the user, the fourth pose is the pose of the terminal when the user sets the second position, and the fourth pose may be the same as or different from the first pose. In the following embodiments, the case where the poses of the terminal when the user sets the first position and the second position are the same, that is, the fourth pose is the same as the first pose, is used as an example for illustration.
  • The mapping point of the second position on the virtual plane is the intersection point B' of the ray from C1 towards b1 and the virtual plane, and the three-dimensional coordinates of the mapping point of the second position on the virtual plane are the three-dimensional coordinates of B'.
  • the mapping point of the second position on the virtual plane may be referred to as a second mapping point.
  • the three-dimensional coordinates of the mapping point of the second position on the virtual plane may be referred to as: the three-dimensional coordinates of the second mapping point in the camera coordinate system.
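  • The mapping point itself is a simple ray-plane intersection. The sketch below is illustrative only (names are assumptions); it casts the ray from the pose C1 through the pixel coordinates, intersects it with one virtual plane, and returns None when the terminal should try another plane in the preset set:

```python
import numpy as np

def map_point_on_virtual_plane(K, R_cam_to_world, C1, uv, plane_n, plane_d):
    """Intersect the ray from pose C1 through pixel uv with a virtual plane.

    K              : intrinsic parameter matrix of the terminal.
    R_cam_to_world : rotation from the camera frame at C1 to the world frame.
    C1             : camera center of the first pose in the world frame.
    uv             : 2D coordinates of the position in the image coordinate system.
    plane_n, plane_d : the virtual plane, given as the set {x : n.x + d = 0}.
    """
    ray_cam = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    ray_world = R_cam_to_world @ ray_cam
    denom = plane_n @ ray_world
    if abs(denom) < 1e-9:
        return None                    # ray parallel to the plane: no intersection
    s = -(plane_n @ C1 + plane_d) / denom
    if s <= 0:
        return None                    # intersection behind the camera
    return C1 + s * ray_world          # the mapping point (e.g. A' or B')
```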
  • the second virtual object may be displayed on the screen where the first virtual object has already been displayed.
  • The user can see the first virtual object and/or the second virtual object in the picture displaying the second virtual object.
  • The picture displaying the first virtual object and the picture displaying the second virtual object may be different, but together they belong to the same real scene captured by the terminal; the union of the picture of the first virtual object and the picture of the second virtual object can be called a panoramic picture.
  • When the terminal is at the second pose, the position of the second position in the picture is b2.
  • The intersection B of the ray from C1 towards b1 and the ray from C2 towards b2 can be used as the second position in the real scene; that is, the three-dimensional coordinates of the intersection point B are used as the three-dimensional coordinates of the second position in the real scene.
  • the three-dimensional coordinates of the second position in the real scene may be referred to as: the three-dimensional coordinates of the second position in the world coordinate system.
  • According to the first pose, the three-dimensional coordinates of the mapping point of the second position on the virtual plane, and the three-dimensional coordinates of the second position in the real scene, the terminal obtains the scaling ratio corresponding to the second position.
  • Specifically, the terminal can determine the scaling ratio corresponding to the second position as the quotient of the third distance and the fourth distance, where the third distance is the distance between the first pose C1 and B', and the fourth distance is the distance between the first pose C1 and B.
  • The terminal can scale B to B' based on the scaling ratio corresponding to the second position, where B' is the scaled three-dimensional coordinate of the second position in the real scene, that is, the scaled three-dimensional coordinate corresponding to the second position. Correspondingly, the second pose C2 can be scaled to C2'B, where C2'B is the scaled second pose, that is, the fifth pose.
  • the terminal may send C2'B, B' for display to display the second virtual object.
  • the terminal sends C2'B and B' to display, which can restore the position of the virtual object in the real scene.
  • Because the pose and the three-dimensional coordinates corresponding to the second position that the terminal sends for display are scaled at the same time, rather than only the three-dimensional coordinates changing, there is no sudden change in the size of the virtual object, which improves the user experience.
  • In one embodiment, the virtual plane corresponding to the first position is the same as the virtual plane corresponding to the second position. In another embodiment, the virtual plane corresponding to the second position is different from the virtual plane corresponding to the first position.
  • In the latter case, when the terminal obtains the three-dimensional coordinates of the mapping point of the second position on the virtual plane and then obtains the three-dimensional coordinates of the second position in the real scene, the difference from Figure 11A is that the virtual plane used by the terminal is not the virtual plane corresponding to the first position but the virtual plane corresponding to the second position, as shown in FIG. 11B.
  • In FIG. 11B, the virtual plane corresponding to the first position is the first virtual plane, and the virtual plane corresponding to the second position is the second virtual plane; the positional relationship between the first virtual plane and the second virtual plane shown in FIG. 11B is merely an example.
  • the terminal may send and display the fifth pose (C2'B) and the scaled three-dimensional coordinates (B') corresponding to the second position, so as to display the second virtual object.
  • In the embodiments of the present application, the virtual plane is preset and virtual objects can be displayed on it, so there is no need to pre-scan the real scene to obtain its planes, nor to obtain prior map information of the real scene; the response is therefore fast and efficient.
  • In addition, the virtual object can be displayed at the corresponding position in the real scene, so that the user feels that the virtual object actually exists in the real scene, and scaling each position by its corresponding ratio avoids sudden changes in the size of virtual objects and improves the user experience.
  • When displaying the first virtual object, the terminal can send the third pose (C2'A) and the scaled three-dimensional coordinates (A') corresponding to the first position for display, and when displaying the second virtual object, it can send the fifth pose (C2'B) and the scaled three-dimensional coordinates (B') corresponding to the second position for display. In that case, the scaled three-dimensional coordinates corresponding to each position are tied to their own terminal pose, so when the terminal displays the first virtual object and the second virtual object under two poses, the amount of calculation is large and the speed is slow.
  • Therefore, the scaled three-dimensional coordinates (A') corresponding to the first position and the scaled three-dimensional coordinates (B') corresponding to the second position can be unified under one terminal pose, so that when the terminal displays the first virtual object and the second virtual object, it only needs to send one terminal pose and the scaled three-dimensional coordinates of the multiple displayed virtual objects for display, which can increase the speed at which the terminal renders and displays virtual objects.
  • S3028 and S3038 can be replaced by S3028A-S3029A:
  • the terminal may use the three-dimensional coordinate system where the third pose is located as a unified coordinate system, that is, the scaled three-dimensional coordinates corresponding to the first position and the scaled three-dimensional coordinates corresponding to the second position are unified under the third pose .
  • the terminal can translate the fifth pose to the third pose, for example, the terminal can translate the fifth pose C2'B along the line between the first pose C1 and the second pose C2 to the third pose C2'A.
  • Because the scaled three-dimensional coordinates corresponding to the first position are obtained with the same scaling ratio as the third pose, the scaled three-dimensional coordinates corresponding to the first position remain unchanged under the third pose.
  • For the second position, the terminal can translate B' to B" on the virtual plane along the direction parallel to the line from C2'B towards C2'A, with a translation distance equal to the distance between C2'B and C2'A; that is, C2'B, C2'A, B" and B' form a parallelogram.
  • the scaled three-dimensional coordinate corresponding to the second position in the third pose is the three-dimensional coordinate of B".
  • the terminal may acquire the three-dimensional coordinates of B" according to the relative position between the fifth pose C2'B and the third pose C2'A, and the three-dimensional coordinates of B'.
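  • The parallelogram translation reduces to a single vector addition, as the illustrative sketch below shows (names are assumptions):

```python
import numpy as np

def unify_under_third_pose(C2A, C2B, B_prime):
    """Translate B' by the vector from the fifth pose C2'B to the third pose C2'A.

    Returns B" such that C2'B, C2'A, B" and B' form a parallelogram, i.e. the
    scaled coordinates corresponding to the second position under the third pose.
    """
    return B_prime + (C2A - C2B)
```

  • In this way, only one pose (C2'A) plus the unified anchor coordinates need to be sent for display.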
  • Alternatively, the terminal may use the fifth pose as the unified pose and translate the third pose to the fifth pose, obtaining the scaled three-dimensional coordinates corresponding to the first position under the fifth pose, which are then sent for display together with the scaled three-dimensional coordinates corresponding to the second position; the choice of unified pose is not limited in the embodiments of the present application.
  • the terminal may obtain the scaled three-dimensional coordinates corresponding to each position and the pose of the terminal.
  • In order to reduce the amount of calculation when the terminal renders virtual objects and to increase the speed of displaying them, the terminal can convert the anchor poses and their corresponding terminal poses into multiple anchor poses under one terminal pose.
  • It should be understood that in FIG. 13, each position (such as position A and position B) is represented by an anchor point, and the scaled three-dimensional coordinates corresponding to each position are represented by an anchor pose.
  • The third pose (C2'A), the scaled three-dimensional coordinates (A') corresponding to the first position, and the scaled three-dimensional coordinates (B") corresponding to the second position under the third pose are sent for display, so as to display the first virtual object and the second virtual object.
  • In this way, the scaled three-dimensional coordinates corresponding to each position can be sent for display under the same terminal pose, instead of sending each position's scaled three-dimensional coordinates together with its own terminal pose, which reduces the amount of calculation when the terminal renders virtual objects, increases the speed of displaying virtual objects, and thereby improves the user experience.
  • Because the terminal displays the virtual objects at the scaled three-dimensional coordinates corresponding to the first position and the second position, when the terminal switches the captured real scene picture and then returns to the original real scene picture, the user can still see the first virtual object and the second virtual object on the interface of the terminal.
  • As shown in a in FIG. 14, the real scene shot by the terminal includes a table, with a laptop to the right of the table. It should be understood that when the terminal shoots the table, the laptop is not captured, so the laptop is not shown in a in FIG. 14.
  • the terminal displays the first virtual object "stickman" at the zoomed three-dimensional coordinates corresponding to position A, and displays the second virtual object "cuboid” at the zoomed three-dimensional coordinates corresponding to position B.
  • the user holds the terminal and moves to the right to shoot.
  • The real scene picture captured by the terminal then includes the laptop but not the table, as shown in b in Figure 14. Because the anchor points of the first virtual object and the second virtual object are fixed in the real world, and the picture shown in b in FIG. 14 does not include the table, the user cannot see the first virtual object and the second virtual object in that picture.
  • The user can also perform the first operation and/or the second operation described above to trigger the terminal to display other virtual objects in the picture; as shown in c in Figure 14, in response to the user's operation, the terminal can display a third virtual object "stickman" in the picture.
  • When the user holds the terminal, moves to the left, and again captures the picture containing the table, the user can see the first virtual object at the scaled three-dimensional coordinates corresponding to position A and the second virtual object at the scaled three-dimensional coordinates corresponding to position B, as shown in d in FIG. 14.
  • In other words, once the terminal displays a virtual object at the scaled three-dimensional coordinates corresponding to position A (or any other position), when the terminal switches the captured real scene picture, the virtual object remains at the scaled three-dimensional coordinates corresponding to position A (or the other position) and does not move with the change of the shooting picture. When the terminal switches back to the original real scene picture, the user can still see the virtual object displayed at the corresponding scaled three-dimensional coordinates.
  • the user can also interact with virtual objects displayed on the terminal.
  • the user can perform operations such as selecting, deleting, moving, and scaling the virtual object, and the user can also interact with the virtual object in a voice manner.
  • a virtual object displayed on the terminal may be selected.
  • a stick figure and a cuboid are displayed on the screen displayed by the terminal.
  • the user clicks on the stick figure and the terminal may display a box 151 in response to the user's click operation to circle the stick figure, indicating that the stick figure is selected, as shown in b in FIG. 15 .
  • the terminal may display a delete control 152 on the upper right corner of the stick figure, as shown in c in FIG. 15 .
  • Users can move virtual objects. Exemplarily, if the user long presses and drags the virtual object displayed on the terminal, the virtual object can be moved. If the user presses and holds the stickman displayed on the terminal and drags the stickman to another position C, the terminal can display the stickman at position C.
  • Users can scale virtual objects. Exemplarily, the user can scale a virtual object with a two-finger gesture: if the user places two fingers on the stick figure displayed on the interface of the terminal, moving the fingers together zooms the stick figure out, and moving them apart enlarges it.
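  • A common way to turn such a two-finger gesture into a scale factor is the ratio of the current finger spread to the initial spread; the sketch below is illustrative only and is not taken from the embodiment:

```python
import math

def pinch_scale_factor(start_a, start_b, cur_a, cur_b):
    """Scale factor for a two-finger pinch: current spread / initial spread.

    Values below 1.0 (fingers moving together) zoom the virtual object out;
    values above 1.0 (fingers moving apart) enlarge it.
    """
    d0 = math.dist(start_a, start_b)
    d1 = math.dist(cur_a, cur_b)
    return d1 / d0 if d0 > 0 else 1.0
```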
  • the user can also use voice to interact with the virtual object.
  • For example, the user can say "select all stick figures", and the terminal can select all the stick figures in the current picture in response to the voice; for example, the terminal can display a box 151 in the current picture to circle the stick figures, indicating that they are selected, as shown in b in Figure 15.
  • Alternatively, in response to the voice, the terminal can select the stick figures in all the pictures. As shown in a and b in Figure 14, both pictures contain stick figures; the terminal can respond to the voice by displaying a panoramic picture containing the table and the laptop, with a box 151 circling each stick figure in the picture, as shown in e in FIG. 15.
  • the user can also change the attributes of the virtual object by voice.
  • the attributes of the virtual object may include but not limited to: position, size, shape, color, action, expression and so on.
  • Exemplarily, in response to a corresponding voice instruction from the user, the terminal adjusts the cuboid displayed in the picture to red.
  • The user can also interact with the virtual objects by voice in other ways; the embodiments of the present application do not limit how the user uses voice to interact with the virtual objects.
  • the user can interact with the virtual object displayed on the terminal, which enriches the interaction mode between the user and the virtual object, and can improve user experience.
  • the terminal may include: an input data module, a multi-anchor real-time tracking and management module, an output result module, a rendering generation module, and a user interaction module.
  • the input data module is used to collect the data of interaction between the user and the terminal, the captured picture data, the IMU data, and the gravity axis data, etc.
  • the multi-anchor real-time tracking and management module is used to execute S3021-S3023, S3025-S3027, S3031-S3033, and S3035-S3037 in the above embodiments.
  • the output result module is used to execute S3028A-S3029A in the above embodiment.
  • the rendering generation module is used for rendering and displaying virtual objects.
  • the user interaction module is used to realize the interaction between the user and the virtual objects displayed on the terminal. Wherein, the steps performed by each module can be simplified as shown in FIG. 17 .
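  • The division of labor among these modules could be wired together as in the following illustrative Python skeleton (class and method names are assumptions, not the embodiment's API):

```python
class ARPipeline:
    """Illustrative wiring of the modules described above."""

    def __init__(self, input_data, anchor_tracker, output_result, renderer, interaction):
        self.input_data = input_data          # frames, IMU data, gravity axis
        self.anchor_tracker = anchor_tracker  # multi-anchor real-time tracking
        self.output_result = output_result    # unifies anchors under one pose
        self.renderer = renderer              # renders and displays virtual objects
        self.interaction = interaction        # select / delete / move / scale / voice

    def on_frame(self):
        frame, imu = self.input_data.next()
        pose, anchors = self.anchor_tracker.track(frame, imu)
        pose, anchors = self.output_result.unify(pose, anchors)
        self.renderer.draw(frame, pose, anchors)
        self.interaction.handle_events()
```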
  • Here, an illustration is given by taking the case where the interface of the terminal displays the identifiers of the virtual objects to be selected as an example; for the identifier of a virtual object to be selected, reference may be made to the relevant description of the icon 43 of at least one virtual object to be selected.
  • the human-computer interaction method in the AR scene provided by the embodiment of the present application may include:
  • the interface of the terminal further displays the identification of the virtual object to be selected.
  • the user operates different virtual object identifiers, which may trigger the terminal to display the virtual object at a corresponding position of the real scene captured by the terminal.
  • the terminal may display the virtual object at the corresponding position of the real scene captured by the terminal.
  • When the terminal switches the captured picture, the virtual object may not be displayed in the currently captured real scene; however, when the captured picture switches back to the real scene in which the virtual object was displayed, the terminal can display the virtual object in the captured real scene again. Reference may be made to the relevant description of FIG. 14.
  • S1902 may refer to related descriptions in S302.
  • S1903 may refer to related descriptions in S303.
  • S1902-S1903 can also refer to the accompanying drawing shown in FIG. 5 .
  • FIG. 20 is a schematic structural diagram of a terminal provided in an embodiment of the present application.
  • an electronic device 2000 may include: a camera module 2001 , a display module 2002 and a processing module 2003 .
  • the shooting module 2001 can be included in the input data module
  • the display module 2002 can be included in the rendering generation module
  • the processing module 2003 can include a multi-anchor real-time tracking and management module, an output result module and a user interaction module.
  • the photographing module 2001 is configured to photograph a real scene.
  • The display module 2002 is configured to display the captured real scene on the interface of the terminal, where the interface also displays the identifiers of the virtual objects to be selected; to display the first virtual object in the real scene captured by the terminal in response to the user's operation on the identifier of the first virtual object; and to display the second virtual object in the real scene in which the first virtual object is already displayed, in response to the user's operation on the identifier of the second virtual object.
  • The processing module 2003 is configured to: acquire the first pose of the terminal in response to the user's operation on the identifier of the first virtual object; acquire the first mapping point of the first position on the first virtual plane, where the first position is a preset position on the interface or a position determined by the user on the interface; and obtain the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • the display module 2002 is specifically configured to display the first virtual object in the real scene captured by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system .
  • The processing module 2003 is specifically configured to acquire the two-dimensional coordinates of the first position in the image coordinate system, and to use the intersection of the ray from the first pose towards the two-dimensional coordinates with the first virtual plane as the first mapping point.
  • The first virtual plane is included in a set of preset virtual planes, and the processing module 2003 is further configured to: if the ray from the first pose towards the two-dimensional coordinates has no intersection with the first virtual plane, obtain the intersection of that ray with another virtual plane in the preset virtual plane set, and use that intersection as the first mapping point.
  • the processing module 2003 is further configured to track the image block corresponding to the first position; when the terminal is in the second pose, obtain the first position in the image coordinate system Two-dimensional coordinates in .
  • the display module 2002 is further configured to, according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the The two-dimensional coordinates of the first position in the image coordinate system in the second pose, and display the first virtual object in the real scene captured by the terminal.
  • the distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or, the terminal moves from the first pose to the The number of frames of the image captured during the second pose is greater than or equal to the preset number of frames; and/or, the duration of the terminal moving from the first pose to the second pose is longer than the preset duration; and/or, the second pose is a pose when the terminal successfully triangulates the first mapping point.
  • The processing module 2003 is specifically configured to: according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose, obtain the three-dimensional coordinates of the first position in the world coordinate system; obtain the scaling ratio corresponding to the first position according to a first distance and a second distance, where the first distance is the distance from the first pose to the three-dimensional coordinates of the first mapping point in the camera coordinate system, and the second distance is the distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system; and, according to the scaling ratio corresponding to the first position, scale the second pose and the three-dimensional coordinates of the first position in the world coordinate system to obtain the third pose and the scaled three-dimensional coordinates corresponding to the first position.
  • the display module 2002 is further configured to display the first virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.
  • The processing module 2003 is specifically configured to scale the three-dimensional coordinates of the first position in the world coordinate system to the three-dimensional coordinates of the first mapping point in the camera coordinate system, so that the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  • When the user performs the operation on the identifier of the second virtual object, the terminal is at the first pose, and the processing module 2003 is specifically configured to: in response to the user's operation on the identifier of the second virtual object, obtain the second mapping point of the second position on the second virtual plane, where the second position is a preset position on the interface or a position determined by the user on the interface; and obtain the three-dimensional coordinates of the second mapping point in the camera coordinate system.
  • The display module 2002 is further configured to display the second virtual object in the real scene in which the first virtual object is already displayed, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system.
  • The processing module 2003 is specifically configured to, when the terminal is at the second pose, scale the second pose and the three-dimensional coordinates of the second position in the world coordinate system respectively according to the scaling ratio corresponding to the second position, to obtain the fifth pose and the scaled three-dimensional coordinates corresponding to the second position, where the three-dimensional coordinates of the second position in the world coordinate system are obtained based on the first pose, the two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the second position in the image coordinate system at the second pose.
  • the display module 2002 is specifically configured to display the second virtual object in the real scene where the first virtual object has been displayed according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.
  • The processing module 2003 is specifically configured to translate the fifth pose to the third pose, and to translate the scaled three-dimensional coordinates corresponding to the second position according to the distance and direction of that translation, to obtain the scaled three-dimensional coordinates corresponding to the second position under the third pose.
  • the display module 2002 is specifically configured to display in the real scene where the first virtual object has been displayed according to the third pose and the scaled three-dimensional coordinates corresponding to the second position in the third pose the second virtual object.
  • The terminal provided in the embodiments of the present application can execute the steps in the above embodiments and achieve the technical effects therein; reference may be made to the relevant descriptions in the above embodiments.
  • An embodiment of the present application also provides an electronic device, which may be the terminal described in the above embodiments, and which may include a processor 2101 (such as a CPU) and a memory 2102.
  • the memory 2102 may include a high-speed random-access memory (random-access memory, RAM), and may also include a non-volatile memory (non-volatile memory, NVM), such as at least one disk memory, and various instructions may be stored in the memory 2102 , so as to complete various processing functions and realize the method steps of the present application.
  • the electronic device involved in this application may further include: a power supply 2103 , a communication bus 2104 , a communication port 2105 , and a display 2106 .
  • the above-mentioned communication port 2105 is used to realize connection and communication between the electronic device and other peripheral devices.
  • the memory 2102 is used to store computer-executable program codes, and the program codes include instructions; when the processor 2101 executes the instructions, the instructions cause the processor 2101 of the electronic device to perform the actions in the above-mentioned method embodiments. The principles and technical effects are similar and will not be repeated here.
  • the display 2106 is used to display the real scene captured by the terminal, and display the virtual object in the real scene.
  • The modules or components described in the above embodiments may be one or more integrated circuits configured to implement the above method, for example: one or more application specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA), etc.
  • The processing element can be a general-purpose processor, such as a central processing unit (CPU) or another processor that can invoke program code, such as a controller.
  • these modules can be integrated together and implemented in the form of a system-on-a-chip (SOC).
  • a computer program product includes one or more computer instructions.
  • a computer can be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wire (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (such as by infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server, a data center, etc. integrated with one or more available media. Available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media (eg, Solid State Disk (SSD)).
  • the term "plurality” herein means two or more.
  • the term “and/or” in this article is just an association relationship describing associated objects, which means that there can be three relationships, for example, A and/or B can mean: A exists alone, A and B exist simultaneously, and there exists alone B these three situations.
  • the character "/" in this article generally indicates that the front and back related objects are an “or” relationship; in the formula, the character “/” indicates that the front and back related objects are a “division” relationship.
  • Words such as "first" and "second" are only used for the purpose of distinguishing in description, and cannot be understood as indicating or implying relative importance, nor as indicating or implying order.
  • The sequence numbers of the above processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.

Abstract

Provided in the embodiments of the present application are a human-computer interaction method and apparatus in an augmented reality (AR) scene, and an electronic device. In the method, a real scene can be photographed, the photographed real scene is displayed on an interface of a terminal, and identifiers of virtual objects to be selected are also displayed on the interface; in response to an operation of a user for an identifier of a first virtual object, the first virtual object is displayed in the real scene, which is photographed by the terminal; and in response to an operation of the user for an identifier of a second virtual object, the second virtual object is displayed in the real scene where the first virtual object has been displayed. The embodiments of the present application can enrich human-computer interaction modes in an AR scene, and enrich the style of displaying a virtual object in the AR scene.

Description

Human-computer interaction method, apparatus and electronic device in an augmented reality (AR) scene

This application claims priority to the Chinese patent application No. 202210168435.1, filed with the China Patent Office on February 23, 2022 and entitled "Human-computer interaction method, apparatus and electronic device in augmented reality AR scene", which is incorporated herein by reference in its entirety.
Technical Field

The embodiments of the present application relate to the technical field of augmented reality (AR), and in particular to a human-computer interaction method, apparatus and electronic device in an augmented reality AR scene.
Background

As the processing and rendering capabilities of terminals continue to improve, applications based on augmented reality (AR) technology, such as AR measurement applications and AR props, are becoming increasingly numerous.

At present, the style in which virtual objects are displayed in an augmented reality AR scene is monotonous.
Summary

Embodiments of the present application provide a human-computer interaction method, apparatus and electronic device in an augmented reality AR scene, which can enrich the styles of virtual objects displayed in the AR scene.
In a first aspect, the embodiments of the present application provide a human-computer interaction method in an AR scene. The execution subject of the method may be a terminal or a chip in the terminal; in the following embodiments, the terminal is used as an example for illustration.

In this method, in an embodiment, the terminal can shoot a real scene and display the captured real scene on an interface of the terminal, where the interface also displays identifiers of virtual objects to be selected. The terminal may display the first virtual object in the real scene captured by the terminal in response to the user's operation on the identifier of the first virtual object, and, in response to the user's operation on the identifier of the second virtual object, display the second virtual object in the real scene in which the first virtual object is already displayed.

In the embodiments of the present application, the user can operate the identifiers of virtual objects continuously to display multiple virtual objects, which may be the same or different, in the real scene captured by the terminal. The user is thus not restricted to the one-time use of AR props, and virtual objects can be displayed at corresponding positions in the real scene, which enriches the styles of displaying virtual objects in the AR scene, enriches the human-computer interaction modes in the AR scene, and improves the user experience.
In this method, in an embodiment, the terminal may shoot a real scene and display the captured real scene on an interface of the terminal. There is at least one preset position on the interface of the terminal, and the position of the real scene corresponding to a preset position is used to display a virtual object.

In this embodiment, if the virtual objects are preset, then when the terminal displays the captured real scene, a virtual object may be displayed at the position of the real scene corresponding to a preset position. For example, the first virtual object is displayed at a third position of the real scene captured by the terminal, and the second virtual object is displayed at a fourth position of the real scene captured by the terminal. The second virtual object may be the same as or different from the first virtual object. In an embodiment, the third position of the real scene may be the position in the real scene corresponding to the first position, and the fourth position of the real scene may be the position in the real scene corresponding to the second position.

In this embodiment, if the virtual objects are set by the user, then when the terminal displays the captured real scene, the identifiers of the virtual objects to be selected can be displayed on the interface of the terminal. In response to the user's operation on the identifier of the first virtual object, the terminal may display the first virtual object at the position of the real scene corresponding to the preset position, and in response to the user's operation on the identifier of the second virtual object, display the second virtual object at the position of the real scene corresponding to the preset position.

Alternatively, in response to the user's operation on the identifier of the first virtual object, the terminal may display the first virtual object at the position of the real scene corresponding to one preset position, and in response to the user's operation on the identifier of the second virtual object, display the second virtual object at the position of the real scene corresponding to another preset position.
In this method, in an embodiment, the terminal may shoot a real scene and display the captured real scene on an interface of the terminal. The user may also control the terminal by voice interaction to display the first virtual object at one position of the captured real scene and to display the second virtual object at another position of the captured real scene.

The following embodiments describe the process in which the terminal displays the first virtual object in the captured real scene and displays the second virtual object in the real scene in which the first virtual object is already displayed:
Taking the terminal displaying the first virtual object in the captured real scene as an example: in response to the user's operation on the identifier of the first virtual object, the terminal acquires the first pose of the terminal, acquires the first mapping point of the first position on the first virtual plane, and obtains the three-dimensional coordinates of the first mapping point in the camera coordinate system. In this way, the terminal can display the first virtual object in the real scene captured by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system. In an embodiment, the first position may be preset or determined by a user operation.

Specifically, the terminal may send the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system for display, so that the terminal displays the first virtual object in the real scene captured by the terminal.
In a possible implementation manner, a set of preset virtual planes is provided in the embodiments of the present application; the set may include, for example, a first virtual plane, a second virtual plane, and a third virtual plane. The first virtual plane may be perpendicular to the second virtual plane and to the third virtual plane respectively, and the second virtual plane and the third virtual plane are perpendicular to each other. Any one of the first, second, and third virtual planes may be parallel to the ground (or horizontal) plane.

In one embodiment, it is assumed by default that the ray from the first pose towards the two-dimensional coordinates intersects the first virtual plane; the terminal can then acquire the two-dimensional coordinates of the first position in the image coordinate system and use the intersection of the ray from the first pose towards those two-dimensional coordinates with the first virtual plane as the first mapping point.

In one embodiment, if the ray from the first pose towards the two-dimensional coordinates has no intersection with the first virtual plane, the intersection of that ray with another virtual plane in the preset virtual plane set is obtained and used as the first mapping point.

In the embodiments of the present application, virtual planes can be set in advance, which avoids the user having to pre-scan planes when using the AR function, simplifies the user's operations, and improves efficiency. In addition, setting multiple different virtual planes can ensure that the ray from the first pose towards the two-dimensional coordinates intersects one of the virtual planes, thereby ensuring the smooth generation of virtual objects and improving the accuracy of their generation.
In a possible implementation manner, in response to the user's operation on the identifier of the first virtual object, the terminal can quickly generate the first virtual object according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system. However, in the embodiments of the present application, in order to make the user truly feel that the first virtual object exists in the real scene, and so that the size of the first virtual object does not change suddenly as the pose of the terminal changes, the terminal can track the image block corresponding to the first position and, when the terminal is at the second pose, acquire the two-dimensional coordinates of the first position in the image coordinate system; then, according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose, display the first virtual object in the real scene captured by the terminal.

In a possible implementation manner, the distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or, the number of frames of images captured by the terminal while moving from the first pose to the second pose is greater than or equal to a preset number of frames; and/or, the duration of the terminal moving from the first pose to the second pose is longer than a preset duration; and/or, the second pose is the pose at which the terminal successfully triangulates the first mapping point.
In the embodiments of the present application, the terminal displaying the first virtual object in the real scene captured by the terminal according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose may specifically include: the terminal obtains, from these poses and two-dimensional coordinates, the three-dimensional coordinates of the first position in the world coordinate system, and obtains the scaling ratio corresponding to the first position according to a first distance and a second distance, where the first distance is the distance from the first pose to the first mapping point, and the second distance is the distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system.

After obtaining the scaling ratio corresponding to the first position, the terminal can scale the second pose and the three-dimensional coordinates of the first position in the world coordinate system respectively according to that ratio, obtaining the third pose and the scaled three-dimensional coordinates corresponding to the first position, and then display the first virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.

In a possible implementation manner, the terminal can scale the three-dimensional coordinates of the first position in the world coordinate system to the three-dimensional coordinates of the first mapping point in the camera coordinate system, so that the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system. Furthermore, the terminal can scale the second pose correspondingly, with the same scaling ratio as that applied to the three-dimensional coordinates of the first position in the world coordinate system, to obtain the third pose.
The above briefly describes the process in which the terminal displays the first virtual object in the real scene captured by the terminal. For the terminal displaying the second virtual object in the real scene captured by the terminal in response to the user's operation on the identifier of the second virtual object, reference may be made to the related description of displaying the first virtual object.
In a possible implementation manner, if the terminal displays virtual objects at multiple positions of the captured real scene, then based on the description in the above embodiments, the terminal needs to send, for display, each (scaled) pose of the terminal together with the scaled three-dimensional coordinates corresponding to each position; rendering and displaying the first virtual object and the second virtual object in this way involves a large amount of calculation and is slow. In the embodiments of the present application, in order to reduce the amount of calculation of the terminal and increase the speed at which the terminal displays virtual objects, the scaled three-dimensional coordinates corresponding to each pose can be unified under the same pose (coordinate system).

The terminal is at the first pose when the user performs the operation on the identifier of the second virtual object. In response to that operation, the terminal can acquire the second mapping point of the second position on the second virtual plane, acquire the three-dimensional coordinates of the second mapping point in the camera coordinate system, and display the second virtual object, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system, in the real scene captured by the terminal (that is, the real scene displayed on the terminal interface in which the first virtual object is shown). The second mapping point of the second position on the second virtual plane and its two-dimensional coordinates are acquired in the same way as described for the first mapping point.

In a possible implementation manner, the second pose is a pose of the terminal after the user performs the operation on the identifier of the second virtual object. When the terminal is at the second pose, it can scale the second pose and the three-dimensional coordinates of the second position in the world coordinate system respectively according to the scaling ratio corresponding to the second position, obtaining the fifth pose and the scaled three-dimensional coordinates corresponding to the second position, where the three-dimensional coordinates of the second position in the world coordinate system are obtained based on the first pose, the two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the second position in the image coordinate system at the second pose.

For the terminal obtaining the three-dimensional coordinates of the second position in the world coordinate system, the scaling ratio corresponding to the second position, and the scaling process, reference may be made to the related descriptions for the first position.
In an embodiment, the terminal may display the second virtual object in the real scene captured by the terminal according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.
Specifically, the terminal may translate the fifth pose to the third pose and, following the distance and direction of that translation, translate the scaled three-dimensional coordinates corresponding to the second position, to obtain the scaled three-dimensional coordinates corresponding to the second position under the third pose. The terminal may then display the second virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the second position under the third pose.
In the embodiments of this application, the scaled three-dimensional coordinates corresponding to each position can be unified under the pose of the same terminal and sent for display, instead of sending each position's scaled three-dimensional coordinates together with its own terminal pose. This reduces the amount of computation when the terminal renders virtual objects and increases the speed of displaying them, thereby improving the user experience.
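To make the unification step concrete, the following is a minimal NumPy sketch, not the application's code: poses are reduced to camera centers, and all names such as `third_pose` and `fifth_pose` are illustrative assumptions. The fifth pose is moved onto the third pose, and the scaled coordinates tied to it are shifted by the same distance and direction.

```python
import numpy as np

def unify_under_pose(anchor_center, pose_center, scaled_points):
    """Translate a pose onto the anchor pose and shift its scaled points
    by the same offset, so everything is expressed under one pose.

    anchor_center: (3,) camera center of the pose used for display (third pose)
    pose_center:   (3,) camera center of the pose being merged (fifth pose)
    scaled_points: (N, 3) scaled 3-D coordinates tied to pose_center
    """
    offset = anchor_center - pose_center   # distance and direction of the translation
    return scaled_points + offset          # the points follow the pose rigidly

# Toy example: merge the second position under the third pose.
third_pose = np.array([0.0, 0.0, 0.0])
fifth_pose = np.array([0.2, -0.1, 0.05])
second_pos_scaled = np.array([[1.0, 0.5, 2.0]])
print(unify_under_pose(third_pose, fifth_pose, second_pos_scaled))
```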
In a possible implementation, the user can also interact with the virtual objects displayed on the terminal. For example, the user can select, delete, move, or scale a virtual object, and the user can also interact with a virtual object by voice.
In a second aspect, an embodiment of this application provides a human-computer interaction apparatus in an AR scene. The apparatus may be the terminal in the first aspect above, or a chip in the terminal. The apparatus may include:
In an embodiment, a shooting module, configured to shoot a real scene.
A display module, configured to display the captured real scene on an interface of the terminal, where the interface also displays identifiers of virtual objects to be selected; to display the first virtual object in the real scene captured by the terminal in response to the user's operation on the identifier of the first virtual object; and to display the second virtual object in the real scene in which the first virtual object is already displayed, in response to the user's operation on the identifier of the second virtual object.
In a possible implementation, a processing module, configured to: acquire the first pose of the terminal in response to the user's operation on the identifier of the first virtual object; acquire the first mapping point of the first position on the first virtual plane, where the first position is a preset position on the interface or a position determined by the user on the interface; and acquire the three-dimensional coordinates of the first mapping point in the camera coordinate system.
The display module is specifically configured to display the first virtual object in the real scene captured by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system.
In a possible implementation, the processing module is specifically configured to: acquire the two-dimensional coordinates of the first position in the image coordinate system; and take the intersection of the ray from the first pose to those two-dimensional coordinates with the first virtual plane as the first mapping point.
In a possible implementation, the first virtual plane is included in a set of preset virtual planes, and the processing module is further configured to: if the ray from the first pose to the two-dimensional coordinates has no intersection with the first virtual plane, acquire the intersection of that ray with another virtual plane in the set of preset virtual planes, and take that intersection as the first mapping point.
In a possible implementation, the processing module is further configured to: track the image block corresponding to the first position; and acquire the two-dimensional coordinates of the first position in the image coordinate system when the terminal is in the second pose.
The display module is further configured to display the first virtual object in the real scene captured by the terminal according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose.
In a possible implementation, the distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or the number of image frames captured by the terminal while moving from the first pose to the second pose is greater than or equal to a preset number of frames; and/or the time taken by the terminal to move from the first pose to the second pose is greater than a preset duration; and/or the second pose is the pose at which the terminal successfully triangulates the first mapping point.
In a possible implementation, the processing module is specifically configured to: acquire the three-dimensional coordinates of the first position in the world coordinate system according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose; acquire the scaling ratio corresponding to the first position according to a first distance and a second distance, where the first distance is the distance from the first pose to the three-dimensional coordinates of the first mapping point in the camera coordinate system, and the second distance is the distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system; and scale the second pose and the three-dimensional coordinates of the first position in the world coordinate system, respectively, according to the scaling ratio corresponding to the first position, to obtain a third pose and the scaled three-dimensional coordinates corresponding to the first position.
The display module is further configured to display the first virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.
In a possible implementation, the processing module is specifically configured to scale the three-dimensional coordinates of the first position in the world coordinate system to the three-dimensional coordinates of the first mapping point in the camera coordinate system, so that the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system.
In a possible implementation, the terminal is in the first pose when the user performs the operation on the identifier of the second virtual object, and the processing module is specifically configured to: in response to the user's operation on the identifier of the second virtual object, acquire the second mapping point of the second position on the second virtual plane, where the second position is a preset position on the interface or a position determined by the user on the interface; and acquire the three-dimensional coordinates of the second mapping point in the camera coordinate system.
The display module is further configured to display the second virtual object in the real scene in which the first virtual object is already displayed, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system.
In a possible implementation, the processing module is specifically configured to: when the terminal is in the second pose, scale the second pose and the three-dimensional coordinates of the second position in the world coordinate system, respectively, according to the scaling ratio corresponding to the second position, to obtain the fifth pose and the scaled three-dimensional coordinates corresponding to the second position, where the three-dimensional coordinates of the second position in the world coordinate system are acquired based on the first pose, the two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the second position in the image coordinate system at the second pose.
The display module is specifically configured to display the second virtual object in the real scene in which the first virtual object is already displayed, according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.
In a possible implementation, the processing module is specifically configured to: translate the fifth pose to the third pose; and, following the distance and direction of that translation, translate the scaled three-dimensional coordinates corresponding to the second position, to obtain the scaled three-dimensional coordinates corresponding to the second position under the third pose.
The display module is specifically configured to display the second virtual object in the real scene in which the first virtual object is already displayed, according to the third pose and the scaled three-dimensional coordinates corresponding to the second position under the third pose.
In a third aspect, an embodiment of this application provides an electronic device. The electronic device may include a processor and a memory. The memory is configured to store computer-executable program code, the program code including instructions; when the processor executes the instructions, the instructions cause the electronic device to perform the method in the first aspect.
In a fourth aspect, an embodiment of this application provides an electronic device, which may be the human-computer interaction apparatus in an AR scene. The electronic device may include units, modules, or circuits for performing the method provided in the first aspect above.
In a fifth aspect, an embodiment of this application provides a computer program product including instructions which, when run on a computer, cause the computer to perform the method in the first aspect above.
In a sixth aspect, an embodiment of this application provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method in the first aspect above.
For the beneficial effects of the possible implementations of the second to sixth aspects above, reference may be made to the beneficial effects brought by the first aspect, which are not repeated here.
Embodiments of this application provide a human-computer interaction method and apparatus in an augmented reality (AR) scene, and an electronic device. In the method, the terminal can shoot a real scene and display the captured real scene on an interface of the terminal, on which identifiers of virtual objects to be selected are also displayed; in response to the user's operation on the identifier of the first virtual object, the first virtual object is displayed in the real scene captured by the terminal; and in response to the user's operation on the identifier of the second virtual object, the second virtual object is displayed in the real scene in which the first virtual object is already displayed. The human-computer interaction method provided by the embodiments of this application not only enriches the ways of human-computer interaction in an AR scene, but also enriches the styles of virtual objects displayed in the AR scene.
Description of drawings
Fig. 1 is a schematic diagram of an interface of an AR application in the prior art;
Fig. 2 is a schematic diagram of an interface of another AR application in the prior art;
Fig. 3 is a schematic flowchart of an embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application;
Fig. 4 is a schematic diagram of an interface provided by an embodiment of this application;
Fig. 5 is a schematic diagram of another interface provided by an embodiment of this application;
Fig. 6 is a schematic diagram of another interface provided by an embodiment of this application;
Fig. 7 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application;
Fig. 8 is a schematic diagram of virtual planes provided by an embodiment of this application;
Fig. 9 is a schematic diagram of the image block corresponding to the first position provided by an embodiment of this application;
Fig. 10 is a schematic diagram of triangulation provided by an embodiment of this application;
Fig. 11A is a schematic diagram of scaling provided by an embodiment of this application;
Fig. 11B is another schematic diagram of scaling provided by an embodiment of this application;
Fig. 12 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application;
Fig. 13 is a schematic diagram of unifying scaled three-dimensional coordinates under the pose of one terminal provided by an embodiment of this application;
Fig. 14 is a schematic diagram of another interface provided by an embodiment of this application;
Fig. 15 is a schematic diagram of another interface provided by an embodiment of this application;
Fig. 16 is a schematic structural diagram of a terminal provided by an embodiment of this application;
Fig. 17 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application;
Fig. 18 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application;
Fig. 19 is a schematic flowchart of another embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application;
Fig. 20 is a schematic structural diagram of a terminal provided by an embodiment of this application;
Fig. 21 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed description of embodiments
Fig. 1 is a schematic diagram of an interface of an AR application in the prior art. In Fig. 1, the augmented reality (AR) application is an AR measurement application, taken as an example for illustration. Referring to Fig. 1, when the user opens the AR measurement application, the interface of the terminal displays the real scene captured by the terminal, and prompt information is displayed on the interface. Exemplarily, the prompt information is, for example, "Move the device slowly to find the plane where the object is located", prompting the user to scan the real scene with the terminal to determine a plane in the real scene, and then measure distance, area, volume, height, and so on. In this type of AR application, the real scene must be scanned in advance to obtain its planes before the functions of the AR application can be used, which makes the operation complicated and inefficient. Exemplarily, AR applications developed on ARKit and ARCore both need to scan the real scene for plane detection before their functions can be used.
Alternatively, in the prior art, the real scene does not need to be scanned before the functions of an AR application are used, but prior map information of the real scene needs to be acquired and stored in advance, and the functions of the AR application are then used based on that prior map information. Exemplarily, the prior map information may include, but is not limited to: a three-dimensional point cloud map of the real scene, a map containing planes, and the like. In this example, because the terminal stores the prior map information of the real scene, when the AR application is used the terminal can use its functions based on the terminal's pose and information such as the planes in the prior map information. Although this method does not require scanning the real scene when the AR application is used, the prior map information of the real scene must be acquired in advance by professionals, which is difficult and inefficient.
Fig. 2 is a schematic diagram of an interface of another AR application in the prior art. In Fig. 2, the AR application is a short-video application, taken as an example for illustration; the short-video application provides a "virtual object" prop, and in Fig. 2 the virtual object is a "cat". Referring to a and b in Fig. 2, in this short-video application, when shooting a real scene the user can tap the "cat" prop and then tap any position in the real scene, and the terminal displays a virtual cat at the corresponding position of the real scene shown on the interface.
In this type of short-video application, the terminal does not need to scan the real scene in advance or acquire prior map information of the real scene, and can display a virtual cat at any position in the real scene. However, the virtual objects displayed by the terminal are limited by the prop: a virtual cat can only be generated based on the style of the prop. Exemplarily, if the style of the prop is one virtual cat, using the prop triggers the terminal to display one virtual cat; if the style of the prop is two virtual cats, using the prop triggers the terminal to display two virtual cats. The way of interacting with the user is thus single. Moreover, the terminal does not support the user tapping multiple times to generate virtual cats multiple times; during one use of a prop, the terminal supports using the prop only once. Accordingly, the human-computer interaction method in the AR scene shown in Fig. 2 is limited.
Based on these problems in the prior art, an embodiment of this application provides a human-computer interaction method in an AR scene, in which the terminal does not need to scan the planes of the real scene in advance or acquire prior map information of the real scene, and can continuously generate and display multiple virtual objects in response to multiple triggers by the user, which enriches the ways of human-computer interaction and improves the user experience.
It should be understood that the embodiments of this application do not limit the virtual object; exemplarily, the virtual object may be an animal, a character, an object, and so on.
The human-computer interaction method in the AR scene provided by the embodiments of this application is applied to a terminal. The terminal may be called user equipment (UE) or the like; for example, the terminal may be a mobile phone, a tablet computer (portable android device, PAD), a personal digital assistant (PDA), a handheld device with a wireless communication function, a computing device or a wearable device, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a terminal in industrial control, a terminal in a smart home, or the like. The form of the terminal is not specifically limited in the embodiments of this application.
The human-computer interaction method in the AR scene provided by the embodiments of this application is described below with reference to specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 3 is a schematic flowchart of an embodiment of the human-computer interaction method in an AR scene provided by an embodiment of this application. Referring to Fig. 3, the method may include:
S301: Shoot a real scene, and display the captured real scene.
In an embodiment, the terminal may shoot the real scene in response to a user instruction. In an embodiment, the real scene is opposed to a virtual scene: the real scene is a scene in reality. Exemplarily, the user opening an AR application, or the user performing an operation, serves as the user instruction, and can trigger the terminal to control its shooting apparatus (such as a camera) to start shooting the real scene. In an embodiment, the terminal may shoot the real scene before recording a video or while recording a video; when shooting the real scene, the terminal can display the captured real scene on its screen, or in other words, display the picture of the captured real scene on the interface of the terminal.
Fig. 4 takes shooting a real scene before recording a video as an example. Referring to a in Fig. 4, the user opens an AR application, and a picture preview box 41 and a shooting control 42 are displayed on the interface of the terminal. In the scene where the user has not tapped the shooting control 42, the terminal can shoot the real scene and display the picture of the captured real scene in the picture preview box 41. Exemplarily, the real scene includes a table. In an embodiment, it is conceivable that the user may tap the shooting control 42, and the terminal also shoots the real scene and displays it in the picture preview box 41.
S302: In response to a first operation of the user, display the first virtual object in the real scene captured by the terminal.
The first operation may include, but is not limited to: the user operating the interface of the terminal, the user speaking a piece of voice, and so on; the embodiments of this application do not limit the first operation. The user operating the interface of the terminal may include, but is not limited to: tapping, sliding, or performing a gesture without touching the screen of the terminal.
In an embodiment, the first virtual object is preset. Exemplarily, the first virtual object is a stickman. Taking the first operation being the user tapping the interface of the terminal as an example, when the user taps position A on the interface, the terminal, in response to the tap, can display the stickman at the corresponding position in the real scene corresponding to position A, as shown in b in Fig. 4. Exemplarily, taking the first operation being the user speaking a piece of voice as an example, if the user says "display at the center of the screen", the terminal, in response to that voice operation, can display the stickman at the corresponding position in the real scene corresponding to the center of the screen.
In an embodiment, the first virtual object is set by the user.
Exemplarily, referring to a in Fig. 5, the user opens an AR application, and in addition to the picture preview box 41 and the shooting control 42, icons 43 of at least one virtual object to be selected are displayed on the interface of the terminal; the user operating the icon 43 of any virtual object to be selected displayed on the interface can determine the first virtual object. It should be understood that in a of Fig. 5 the icons of the virtual objects are represented by text; for example, the icons 43 of the at least one virtual object to be selected may include: stickman, princess, cuboid, and so on. In this embodiment, the first operation may be: the user tapping the icon 43 of a virtual object to be selected and then tapping the interface of the terminal; the user pressing and holding the icon 43 of a virtual object to be selected and sliding it to a position; or the user speaking a piece of voice.
Exemplarily, referring to b in Fig. 5, the user may first tap the "stickman" icon and then tap position A on the interface of the terminal, and the terminal, in response to the user's first operation, displays the stickman at the corresponding position in the real scene corresponding to position A. It should be understood that in Fig. 5 the hand of the user for the earlier tap is represented by a dotted line, and the hand for the later tap is represented by a solid line.
Exemplarily, the user may press and hold the "stickman" icon and slide it to position A on the interface of the terminal, and the terminal, in response to the user's first operation, can display the stickman at the corresponding position in the real scene corresponding to position A. Exemplarily, if the user says "display a stickman at the center of the screen", the terminal, in response to the first operation of the user speaking that voice, can display the stickman at the corresponding position in the real scene corresponding to the center of the screen. In the scene where the user uses voice, the terminal can recognize the voice from the user to determine whether the voice contains any of the at least one virtual object to be selected displayed on the terminal interface, and thereby determine the first virtual object: the terminal takes the virtual object to be selected contained in the user's voice as the first virtual object.
In an embodiment, in response to the user's first operation, the terminal may also perform edge detection on the picture of the captured real scene, and then display the first virtual object at a target position of the real scene captured by the terminal. Exemplarily, the real scene includes a table, and the terminal may perform edge detection on the table so as to display the first virtual object on the table (that is, the target position) rather than at other positions, so that the user can truly feel that the first virtual object is in the real scene and does not clash with it. In an embodiment, the target position may be the ground, the plane of an object, the head or shoulders of a person, and so on.
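The application does not name a particular edge detector, so the following is only a hedged OpenCV sketch of how such a step might look: Canny edges are extracted from the captured frame, and the edge point nearest the user's tap is taken as a candidate target position. The file name and tap coordinates are made up for illustration.

```python
import cv2
import numpy as np

# A minimal sketch of the edge-detection step (the application does not name
# an algorithm; Canny is used here purely as an illustration).
frame = cv2.imread("scene.jpg")                      # hypothetical captured frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, threshold1=50, threshold2=150)

# Candidate anchor: the edge point nearest the user's tap, so the virtual
# object can snap onto a surface boundary such as the top of a table.
tap = np.array([240, 320])                           # (row, col), illustrative
ys, xs = np.nonzero(edges)
if len(xs) > 0:
    d = (ys - tap[0]) ** 2 + (xs - tap[1]) ** 2
    anchor = (ys[d.argmin()], xs[d.argmin()])
    print("snap to edge point:", anchor)
```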
S303: In response to a second operation of the user, display the second virtual object in the real scene displayed on the terminal.
The second operation may be the same as or different from the first operation; for the second operation, reference may be made to the related description of the first operation.
The second virtual object may be the same as or different from the first virtual object. In an embodiment, the second virtual object may be preset, or may be set by the user; for details, reference may be made to the related description of the first virtual object in S302.
Exemplarily, taking Fig. 5 as an example and referring to c and d in Fig. 5, after the terminal displays the stickman at the corresponding position of the real scene corresponding to position A, the user may tap the "cuboid" icon and then tap position B, and the terminal, in response to the user's second operation, can display a cuboid at the corresponding position in the real scene corresponding to position B. In an embodiment, the real scene in which the second virtual object is displayed and the real scene in which the first virtual object is displayed are the same real scene; because of the size of the terminal interface or the shooting angle, the second virtual object and the first virtual object may be displayed simultaneously in one picture. It should be understood that d in Fig. 5 takes the second virtual object and the first virtual object being displayed simultaneously on the same picture of the terminal as an example.
In an embodiment, the corresponding position in the real scene corresponding to position B may be called the second position of the real scene captured by the terminal. Correspondingly, S303 may be replaced by: in response to the second operation of the user, display the second virtual object at the second position of the real scene captured by the terminal.
In S302 and S303 above, the positions at which virtual objects (such as the first virtual object and the second virtual object) are displayed are both set by the user. In an embodiment, the positions at which virtual objects are displayed may be preset.
Exemplarily, the preset positions for displaying virtual objects on the interface of the terminal are position A and position B. In the scene where the virtual object is preset, when the user opens the AR application, stickmen can be displayed at the corresponding positions in the real scene corresponding to position A and position B.
In an embodiment, when the user performs the first operation and when the user performs the second operation, the pictures of the real scene captured by the terminal may be the same or different; both may be called the real scene captured by the terminal, or the real scene displayed on the interface of the terminal.
Exemplarily, the preset positions for displaying virtual objects on the interface of the terminal are position A and position B. In the scene where the virtual object is set by the user, referring to a-c in Fig. 6, when the user opens the AR application, the interface of the terminal may display the picture preview box 41, the shooting control 42, the icons 43 of at least one virtual object to be selected, and preset positions 44; in a of Fig. 6 the preset positions are represented by boxes, and include, for example, position A and position B. The user can tap the "stickman" icon, and the terminal, in response to the user's operation, can display stickmen at the corresponding positions in the real scene corresponding to position A and position B. In an embodiment, the user may also select, in turn, the first virtual object to be displayed at the corresponding position in the real scene corresponding to position A and the second virtual object to be displayed at the corresponding position in the real scene corresponding to position B; in this embodiment, the virtual objects displayed at different preset positions may be the same or different.
In the scene where the positions for displaying virtual objects are preset, S302 and S303 may be replaced by: the terminal displays the first virtual object and the second virtual object at the corresponding positions in the real scene corresponding to the preset positions on the interface.
In the embodiments of this application, the terminal does not need to scan the real scene in advance or acquire prior map information of the real scene, and can generate and display multiple virtual objects based on multiple operations of the user, which not only improves efficiency but also enriches the ways of human-computer interaction and improves the user experience.
The embodiment shown in Fig. 3 above describes that the terminal can generate and display multiple virtual objects based on multiple operations of the user. The following describes the specific process by which the terminal generates and displays multiple virtual objects. In an embodiment, referring to Fig. 7, S302 in the above embodiment may include:
S3021: In response to the first operation of the user, acquire the first pose of the terminal and the two-dimensional coordinates of the first position in the image coordinate system.
In response to the user's first operation, the terminal can acquire its first pose. It should be understood that sensors for detecting the pose of the terminal may be provided in the terminal, and may include, but are not limited to: an acceleration sensor, an angular velocity sensor, a gravity sensor, and so on. Exemplarily, the terminal can acquire inertial measurement unit (IMU) data based on the acceleration sensor and the angular velocity sensor, collect gravity-axis data of the terminal based on the gravity sensor, and then acquire the first pose of the terminal based on the IMU data and the gravity-axis data; the principle of how the terminal acquires the first pose is not described in the embodiments of this application.
The corresponding position in the real scene corresponding to the first position is where the first virtual object is displayed; the first position may be preset or set by the user, and it should be understood that the first position is a position on the interface of the terminal. The picture captured by the terminal can be regarded as a plane, from which the two-dimensional coordinates of the first position in the image coordinate system are obtained. Exemplarily, the lower-left corner of the interface of the terminal may be taken as the origin of the two-dimensional coordinate system, and the terminal can then acquire the two-dimensional coordinates of the first position in the image coordinate system; the embodiments of this application do not limit how the image coordinate system is set. In an embodiment, the two-dimensional coordinates of the first position in the image coordinate system may also be called: the two-dimensional coordinates of the first position in the picture of the real scene captured by the terminal.
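As a small illustration of this convention, here is a sketch of converting a screen tap (typically reported with a top-left origin, y growing downwards) into the bottom-left-origin image coordinates described above. It assumes the preview and the image share the same pixel scale, and it is not taken from the application itself.

```python
def tap_to_image_coords(tap_x, tap_y, frame_height):
    """Convert a UI tap (origin at the top-left corner, y growing downwards)
    to image coordinates with the origin at the bottom-left corner.
    A sketch under the assumption that both use the same pixel scale."""
    return tap_x, frame_height - 1 - tap_y

# e.g. a tap at (100, 50) on a 1080-pixel-tall preview
print(tap_to_image_coords(100, 50, 1080))  # -> (100, 1029)
```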
S3022: Track the first position.
In an embodiment, the way the terminal tracks the first position may include, but is not limited to: a feature point method, an optical flow method, an image block tracking method based on deep learning, and so on. In an embodiment, the terminal tracking the first position can be understood as: the terminal tracks the image block corresponding to the first position.
The feature point method means that the terminal tracks the image block corresponding to the first position according to the features of that image block. The terminal may acquire the features of the image block in ways including, but not limited to: the scale-invariant feature transform (SIFT) algorithm, the speeded up robust features (SURF) algorithm, or the oriented FAST and rotated BRIEF (ORB) algorithm.
Optical flow is the motion of a target caused by the target, the scene, or the camera moving between two consecutive image frames. It is the two-dimensional vector field of the image during translation, a velocity field that represents the three-dimensional motion of object points through the two-dimensional image; it reflects the image changes formed by motion within a tiny time interval, so as to determine the direction and rate of motion at image points. Optical flow methods may include, but are not limited to: the Lucas-Kanade algorithm, the KLT (Kanade-Lucas-Tomasi) tracking algorithm, and so on.
The image block tracking method based on deep learning can be understood as: the terminal inputs the image block corresponding to the first position into a model trained in advance based on deep learning, so as to track that image block. How to train an image block tracking model, and the model itself, are not described in the embodiments of this application.
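As one concrete instance of the optical-flow option above, the short OpenCV sketch below follows the first position between two frames with the pyramidal Lucas-Kanade (KLT) tracker; the frame file names and the starting coordinates are illustrative assumptions, not values from the application.

```python
import cv2
import numpy as np

# One concrete instance of the optical-flow option: OpenCV's pyramidal
# Lucas-Kanade (KLT) tracker following the first position across frames.
prev_gray = cv2.cvtColor(cv2.imread("frame0.jpg"), cv2.COLOR_BGR2GRAY)
next_gray = cv2.cvtColor(cv2.imread("frame1.jpg"), cv2.COLOR_BGR2GRAY)

first_position = np.array([[[320.0, 240.0]]], dtype=np.float32)  # shape (1,1,2)

tracked, status, err = cv2.calcOpticalFlowPyrLK(
    prev_gray, next_gray, first_position, None,
    winSize=(21, 21), maxLevel=3)       # window size ~ the tracked image block

if status[0][0] == 1:                   # 1 means the point was found again
    print("first position in the new frame:", tracked[0][0])
```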
In an embodiment, the first position may be a position set by the user, in which case the terminal tracks the first position in response to the user's first operation.
In an embodiment, the first position is preset. When the first position is preset, S3021 may be correspondingly replaced by: track the first position.
Exemplarily, the picture of the real scene captured by the terminal is shown in Fig. 9. Regardless of whether the position for displaying the virtual object is preset or set by the user, the terminal can track the image block at that position. In Fig. 9, the image block (pixel block) contained in the box represents the image block corresponding to the first position.
S3023: According to the two-dimensional coordinates of the first position in the image coordinate system, acquire the three-dimensional coordinates of the mapping point of the first position on a virtual plane.
The virtual plane is preset. In the embodiments of this application there is no need to scan the real scene in advance to obtain its planes, or to acquire prior map information of the real scene, because a virtual plane is preset, and virtual objects can be displayed on it. A virtual plane can be characterized by the three-dimensional coordinates of a point and the normal vector of the plane. Exemplarily, the three-dimensional coordinates of the point corresponding to the virtual plane are (X0, Y0, Z1), and the normal vector of the virtual plane is the vector n. It should be understood that the Z value of the virtual plane is the Z value in the camera coordinate system, which is the coordinate system with the optical center of the terminal as its origin; in the embodiments of this application, the optical center of the terminal is taken as the location of the terminal, and the location of the terminal can be understood as the pose of the terminal. In an embodiment, the mapping point of the first position on the virtual plane may be called the first mapping point.
In an embodiment, the three-dimensional coordinates of the mapping point of the first position on the virtual plane can be understood as: the three-dimensional coordinates, in the camera coordinate system, of the mapping point of the first position on the virtual plane.
Fig. 10 is a schematic diagram of triangulation provided by an embodiment of this application; in Fig. 10, the virtual plane being parallel to the picture displayed by the terminal is taken as an example for illustration. Referring to Fig. 10, the first pose of the terminal is C1; it should be understood that C1 can be understood as the position of the optical center of the terminal when the terminal is in the first pose. The position of the first position in the image coordinate system is a1, and the mapping point of the first position on the virtual plane can be understood as: the intersection A' of the ray from C1 towards a1 with the virtual plane. The abscissa and ordinate of this mapping point are the same as the abscissa and ordinate of the two-dimensional coordinates of the first position, and its Z value is the Z value of the virtual plane (a preset value, such as the distance between the virtual plane and the plane of the picture displayed by the terminal). Exemplarily, if the two-dimensional coordinates of the first position in the image coordinate system are (X1, Y1) and the Z value of the virtual plane is Z1, the three-dimensional coordinates of the mapping point of the first position on the virtual plane are (X1, Y1, Z1).
In an embodiment, there may be at least one virtual plane; it should be understood that the at least one preset virtual plane may be called the set of preset virtual planes. Exemplarily, the embodiments of this application are described taking three preset virtual planes as an example; the three preset virtual planes may include: a first virtual plane, a second virtual plane, and a third virtual plane. Exemplarily, the first virtual plane is parallel to the ground (or the horizontal plane), the second virtual plane and the third virtual plane are both perpendicular to the first virtual plane, and the second virtual plane is perpendicular to the third virtual plane, as shown in Fig. 8.
In this embodiment, after obtaining the two-dimensional coordinates of the first position in the image coordinate system, the terminal can, as shown in Fig. 10, acquire the ray from the first pose C1 towards a1 and detect whether this ray has an intersection A' with the first virtual plane. If the intersection A' exists, the terminal can take A' as the mapping point of the first position on the first virtual plane and acquire the three-dimensional coordinates of A'. If the ray has no intersection with the first virtual plane, the terminal can detect whether the ray intersects the second virtual plane, and then acquire the three-dimensional coordinates of that intersection. In an embodiment, the terminal may check, in order of the priority of the at least one virtual plane, whether the ray from the first pose C1 towards a1 intersects each virtual plane, so as to acquire the three-dimensional coordinates of an intersection. In other words, to ensure that the ray from the first pose C1 towards a1 has an intersection with a preset virtual plane, laying the foundation for subsequently generating virtual objects, multiple virtual planes can be preset in the embodiments of this application, and a virtual object is generated on the virtual plane that the ray intersects.
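A minimal sketch of this ray-plane test, using the point-plus-normal plane representation described above (the plane values, camera center, and ray direction below are illustrative assumptions, not the application's parameters):

```python
import numpy as np

def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-8):
    """Intersect the ray origin + s*direction (s >= 0) with a plane given by a
    point on the plane and its normal; return None when they do not meet."""
    denom = np.dot(plane_normal, direction)
    if abs(denom) < eps:                         # ray parallel to the plane
        return None
    s = np.dot(plane_normal, plane_point - origin) / denom
    return origin + s * direction if s >= 0 else None

def first_mapping_point(origin, direction, planes):
    """Try the preset planes in priority order, as described above."""
    for plane_point, plane_normal in planes:
        hit = ray_plane_intersection(origin, direction, plane_point, plane_normal)
        if hit is not None:
            return hit
    return None

# Illustrative preset set: one plane parallel to the ground, two vertical planes.
planes = [
    (np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0])),
    (np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])),
    (np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])),
]
C1 = np.zeros(3)                      # camera center at the first pose
ray_dir = np.array([0.1, 0.2, 1.0])   # direction towards the pixel a1
print(first_mapping_point(C1, ray_dir, planes))
```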
S3024: Display the first virtual object at the three-dimensional coordinates of the mapping point on the virtual plane.
As shown in b in Fig. 5, the terminal can display the first virtual object at the three-dimensional coordinates of the mapping point on the virtual plane; for example, these three-dimensional coordinates are those of the mapping point of position A on the virtual plane. In an embodiment, the center of the first virtual object is at the mapping point; the embodiments of this application do not limit the relative position of the first virtual object and the mapping point.
In the embodiments of this application, the terminal can send the first pose and the three-dimensional coordinates of the mapping point of the first position on the virtual plane for display, and then display the first virtual object at those three-dimensional coordinates. It should be understood that sending for display can be understood as: the processor in the terminal sends the first pose and the three-dimensional coordinates of the mapping point of the first position on the virtual plane to the display, so that the display shows the first virtual object at those coordinates; the process of sending for display is not described in detail in the embodiments of this application. In this embodiment, the mapping point of the first position on the virtual plane is where the first virtual object is displayed, that is, the first position.
S3025: Acquire the three-dimensional coordinates of the first position in the real scene.
The user moves the phone while shooting with the terminal, so the pose of the terminal changes in real time; accordingly, the two-dimensional coordinates of the first position in the image coordinate system also change in real time. If all virtual objects were displayed at the mapping points on the virtual plane, the user would not perceive any association between the virtual objects and the real scene, and could not truly feel that the virtual objects are in the real scene. In the embodiments of this application, to make the user feel that a virtual object really exists in the real scene, the terminal can acquire the three-dimensional coordinates of the first position in the real scene, and then display the virtual object at the three-dimensional coordinates, in the real scene, of the first position of the picture captured by the terminal. In an embodiment, the three-dimensional coordinates of the first position in the real scene may be called: the three-dimensional coordinates of the first position in the world coordinate system.
In an embodiment, when the shooting apparatus provided in the terminal is a monocular shooting apparatus, the terminal can triangulate the mapping point to acquire the three-dimensional coordinates of the first position in the real scene. The process by which the terminal triangulates the mapping point is described below.
The terminal can acquire the three-dimensional coordinates of the first position in the real scene according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose. It should be understood that the second pose is a pose after the first pose; because the terminal tracks the first position, it can acquire the two-dimensional coordinates of the first position in the image coordinate system at the second pose.
Referring to Fig. 10, the second pose of the terminal is C2, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose are a2. In the embodiments of this application, the intersection of the ray from C1 towards a1 and the ray from C2 towards a2 can be taken as the location of the first position in the real scene, and the three-dimensional coordinates of this intersection A as the three-dimensional coordinates of the first position in the real scene. It can be understood that, given the first pose C1 and the second pose C2 of the terminal and the two-dimensional coordinates a1 and a2 in the two pictures, the three-dimensional coordinates of the intersection A can be acquired by triangulation, combined with the intrinsic parameter matrix of the terminal.
The specific manner of triangulation is as follows. Assume that the coordinates of the intersection point A in the camera coordinate system at the first pose C1 are (x1, y1, z1), and the coordinates in the camera coordinate system at the second pose C2 are (x2, y2, z2); the two-dimensional coordinates a1 of the first position in the image coordinate system at the first pose are (u1, v1); the two-dimensional coordinates a2 of the first position in the image coordinate system at the second pose are (u2, v2); and the intrinsic parameter matrix of the terminal is K. Then, from the imaging process of the camera, Formula 1 can be obtained:
(x1, y1, z1)^T = d1 · K^(-1) · (u1, v1, 1)^T
(x2, y2, z2)^T = d2 · K^(-1) · (u2, v2, 1)^T    Formula 1
In Formula 1 above, d1 and d2 are the depths of the intersection point A in the camera coordinate system at the first pose C1 and at the second pose C2, respectively. Then, from the first pose C1 and the second pose C2 of the terminal, the rotation matrix R and the translation vector t that transform the camera coordinate system of the first pose C1 into the camera coordinate system of the second pose C2 can be computed, giving Formula 2:
(x2, y2, z2)^T = R · (x1, y1, z1)^T + t    Formula 2
Substituting Formula 1 into Formula 2 yields Formula 3:
d2 · K^(-1) · (u2, v2, 1)^T = d1 · R · K^(-1) · (u1, v1, 1)^T + t    Formula 3
Formula 3 can be written as Formula 4:
CD = t    Formula 4
Here, the specific forms of the coefficient matrix C and the unknown quantity D are as shown in Formula 5:
C = [ -R · K^(-1) · (u1, v1, 1)^T    K^(-1) · (u2, v2, 1)^T ],    D = (d1, d2)^T    Formula 5
The unknown quantity D, that is, the depths of the intersection point A in the camera coordinate systems at the first pose C1 and the second pose C2, can be solved from Formulas 4 and 5. With the depths, the three-dimensional coordinates of the intersection point A in the camera coordinate systems at C1 and C2 can be computed according to Formula 1, and the camera-coordinate-system coordinates can then be converted into coordinates in the real scene coordinate system by means of the first pose C1, thereby acquiring the three-dimensional coordinates of the intersection point A. In an embodiment, the real scene coordinate system can be understood as the world coordinate system.
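To make the derivation above concrete, the following is a minimal Python sketch of this triangulation. It is an illustration rather than the patent's implementation; the function and variable names are assumptions. It forms the coefficient matrix C of Formula 5, solves Formula 4 for the depths (in the least-squares sense, since under noise the two rays rarely intersect exactly), and recovers A through Formula 1.

```python
import numpy as np

def triangulate(K, R, t, a1, a2):
    """Triangulate the intersection point A from two observations.

    K      : 3x3 intrinsic parameter matrix of the terminal
    R, t   : rotation matrix (3x3) and translation vector (3,) taking the
             C1 camera frame into the C2 camera frame (Formula 2)
    a1, a2 : pixel coordinates (u1, v1) and (u2, v2) of the first position
             at poses C1 and C2
    Returns the coordinates of A in the C1 camera coordinate system.
    """
    K_inv = np.linalg.inv(K)
    r1 = K_inv @ np.array([a1[0], a1[1], 1.0])  # back-projected ray at C1
    r2 = K_inv @ np.array([a2[0], a2[1], 1.0])  # back-projected ray at C2

    # Formula 5: C has columns (-R r1) and (r2); the unknown is D = (d1, d2)
    C = np.stack([-(R @ r1), r2], axis=1)
    # Formula 4: C D = t, solved in the least-squares sense
    D, *_ = np.linalg.lstsq(C, t, rcond=None)
    d1, _d2 = D

    # Formula 1: coordinates of A in the C1 camera frame
    return d1 * r1
```

The returned camera-frame coordinates are then transformed into the world coordinate system using the first pose C1, as described above.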
The specific process of triangulating the mapping point has been described above; the conditions under which the terminal triangulates the mapping point are described below:

In an embodiment, the pose of the terminal changes in real time. The terminal may acquire the distance between the pose of the terminal and the first pose, and triangulate the mapping point in response to the distance between the pose of the terminal and the first pose being greater than or equal to a preset distance. With reference to FIG. 10 above, the terminal triangulates the mapping point in response to the distance between the second pose and the first pose being greater than or equal to the preset distance, that is, the distance between C2 and C1 being greater than or equal to the preset distance.

In an embodiment, while shooting the real scene, the terminal may obtain multiple consecutive frames of images, and the terminal may triangulate the mapping point in response to the number of frames of the captured images reaching a preset number of frames. Alternatively, in this embodiment, provided that the number of frames captured by the terminal per unit time is known, the terminal may also triangulate the mapping point in response to the shooting duration reaching a preset duration.
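These triggering conditions can be combined into a simple gating check that runs before triangulation is attempted. The sketch below is only an illustration of such a check; the threshold names and default values are assumptions, not values given in this embodiment.

```python
import numpy as np

def should_triangulate(curr_pos, first_pos, frames_captured, elapsed_s,
                       min_baseline_m=0.05, min_frames=30, min_elapsed_s=1.0):
    """Decide whether the terminal should triangulate the mapping point.

    curr_pos, first_pos : 3D positions of the current pose and the first pose
    frames_captured     : number of frames captured since the first pose
    elapsed_s           : seconds elapsed since the first pose
    """
    moved_enough = (np.linalg.norm(np.asarray(curr_pos) - np.asarray(first_pos))
                    >= min_baseline_m)
    enough_frames = frames_captured >= min_frames
    enough_time = elapsed_s >= min_elapsed_s
    # any one condition suffices, matching the "and/or" phrasing above
    return moved_enough or enough_frames or enough_time
```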
In an embodiment, after the terminal acquires the first pose, the mapping point may be triangulated in combination with the pose of the terminal, and the terminal may obtain the three-dimensional coordinates of the first position in the real scene in response to the triangulation succeeding. The conditions for successful triangulation may include, but are not limited to:

1. The image block corresponding to the first position can be tracked in at least two images.

2. None of X, Y, and Z in the three-dimensional coordinates of the first position in the real scene obtained by triangulation is a null value.

3. The three-dimensional coordinates of the first position in the real scene obtained by triangulation are in front of the terminal (or of the photographing apparatus in the terminal), that is, the Z value of the three-dimensional coordinates of the first position in the camera coordinate system is greater than 0.

4. The three-dimensional coordinates of the first position in the real scene, obtained by triangulating the images in which the image block of the first position can be tracked, are back-projected into the two-dimensional coordinate system of the picture to obtain the two-dimensional coordinates of the back-projection point; the distance between the back-projection point and the actually observed position on the display picture of the terminal is computed, and this distance is less than or equal to a distance threshold.
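The four conditions above can be expressed as a validation step run after the depths are solved. The following Python sketch is a hypothetical illustration; `project` applies the camera model of Formula 1, and the error threshold is an assumed value.

```python
import numpy as np

def project(K, X_cam):
    """Project a camera-frame point to pixel coordinates (Formula 1)."""
    uvw = K @ X_cam
    return uvw[:2] / uvw[2]

def triangulation_ok(K, X_cam_list, obs_list, max_reproj_px=3.0):
    """Check the success conditions for one triangulated first position.

    X_cam_list : point A expressed in each camera frame that tracked it
    obs_list   : the actually observed pixel coordinates in those frames
    """
    if len(obs_list) < 2:                   # condition 1: at least two observations
        return False
    for X_cam, obs in zip(X_cam_list, obs_list):
        X_cam = np.asarray(X_cam, dtype=float)
        if np.any(np.isnan(X_cam)):         # condition 2: no null coordinate values
            return False
        if X_cam[2] <= 0:                   # condition 3: in front of the camera
            return False
        err = np.linalg.norm(project(K, X_cam) - np.asarray(obs))
        if err > max_reproj_px:             # condition 4: back-projection error bound
            return False
    return True
```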
In an embodiment, when the photographing apparatus provided in the terminal is a binocular photographing apparatus, the terminal may, at the first pose, acquire the three-dimensional coordinates of the first position in the real scene according to the poses of the two cameras and the two-dimensional coordinates of the first position in the pictures captured by each of the two cameras, without the terminal needing to acquire the three-dimensional coordinates of the first position in the real scene at a second pose.
S3026. Acquire the scaling ratio corresponding to the first position according to the first pose, the three-dimensional coordinates of the mapping point of the first position on the virtual plane, and the three-dimensional coordinates of the first position in the real scene.

After the terminal obtains the three-dimensional coordinates of the first position in the real scene, if it directly displays the virtual object at those three-dimensional coordinates, for example if the terminal sends the second pose and the three-dimensional coordinates of the first position in the real scene for display, then, by the principle that near objects appear large and far objects appear small, the size of the virtual object changes abruptly, and the user experience is poor. Exemplarily, referring to FIG. 10, the three-dimensional coordinates of a1 in the real scene (the point A) are farther from the first pose than the three-dimensional coordinates of the mapping point of a1 on the virtual plane; thus, if the virtual object were displayed at A, the user would see the virtual object on the screen suddenly become smaller, and the user experience would be poor.

In this embodiment of the present application, on the one hand to make the user feel that the virtual object actually exists in the real scene, for example by restoring the position of the virtual object in the real scene, and on the other hand to ensure that the size of the virtual object does not change abruptly and disturb the user, the terminal may acquire the scaling ratio corresponding to the first position according to the first pose, the three-dimensional coordinates of the mapping point of the first position on the virtual plane, and the three-dimensional coordinates of the first position in the real scene, and then scale the second pose and the three-dimensional coordinates of the first position in the real scene according to this scaling ratio.

Referring to FIG. 10, the terminal may determine the scaling ratio according to the first distance between the first pose and A' and the second distance between the first pose and A, namely as the quotient of the first distance and the second distance.
S3027. Scale the second pose and the three-dimensional coordinates of the first position in the real scene according to the scaling ratio, to obtain the third pose and the scaled three-dimensional coordinates corresponding to the first position, respectively.

Referring to FIG. 10, based on this scaling ratio the terminal may scale A to A', where A' is the three-dimensional coordinates of the scaled first position in the real scene, that is, the scaled three-dimensional coordinates corresponding to the first position. Correspondingly, the second pose C2 may be scaled to C2'A, where C2'A is the scaled second pose, that is, the third pose. In this embodiment, the terminal may send C2'A and A' for display, so as to display the first virtual object. Because the second pose and the three-dimensional coordinates of the first position in the real scene are scaled simultaneously, the relative position between the virtual object and the terminal is the same as the relative position in the real scene, and the user can feel that the virtual object actually exists in the real scene; that is, the relative positional relationship between A' and C2'A represents the relative positional relationship between A and C2. Therefore, when the terminal sends C2'A and A' for display, the position of the virtual object in the real scene is restored. In addition, because the three-dimensional coordinates corresponding to the first position that are sent for display are A', no abrupt change in the size of the virtual object occurs, which improves the user experience.
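The scaling in S3026-S3027 amounts to a uniform contraction about the first pose C1: both the triangulated point A and the second pose C2 are pulled toward C1 by the same ratio. A minimal sketch follows, assuming all positions are expressed in one common world frame and that only the position component of the pose is scaled (the rotation is left unchanged); the names are illustrative.

```python
import numpy as np

def scale_about_first_pose(C1, A, A_mapped, C2):
    """Scale the triangulated point and the second pose about C1 (S3026-S3027).

    C1, C2   : positions of the first pose and the second pose
    A        : triangulated 3D coordinates of the first position (world frame)
    A_mapped : 3D coordinates of the mapping point A' on the virtual plane
    Returns (third_pose_position, scaled_point), i.e. C2'A and A'.
    """
    s = np.linalg.norm(A_mapped - C1) / np.linalg.norm(A - C1)  # scaling ratio
    A_scaled = C1 + s * (A - C1)    # lands on A' by construction
    C2_scaled = C1 + s * (C2 - C1)  # third pose C2'A
    return C2_scaled, A_scaled
```

Because the point and the pose are contracted by the same factor, the vector from C2'A to A' is exactly s times the vector from C2 to A, which is why the displayed size of the virtual object does not jump.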
S3028. Display the first virtual object according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.

It should be noted that, because the relative positional relationship between A' and C2'A represents the relative positional relationship between A and C2, when the terminal displays the first virtual object according to the scaled third pose (C2'A) and the scaled three-dimensional coordinates (A') corresponding to the first position, the size of the displayed first virtual object does not change abruptly, and the user is unaware of the terminal's processing; as shown in b in FIG. 5, the user still sees the first virtual object in the real scene corresponding to position A.
Similarly, for the process in which the terminal displays the second virtual object in S303 of the above embodiment, reference may be made to the related description of the terminal displaying the first virtual object. Correspondingly, S303 may include:

S3031. In response to the second operation of the user, acquire the fourth pose of the terminal and the two-dimensional coordinates of the second position in the image coordinate system.

It should be understood that, for S3031-S3038, reference may be made to the related descriptions in S3021-S3028. For the second position, reference may be made to the related description of the first position; the second position may be preset or set by the user.

In an embodiment, if the position at which the virtual object is displayed is preset, the fourth pose is the same as the first pose. In an embodiment, if the position at which the virtual object is displayed is set by the user, the fourth pose is the pose of the terminal when the user sets the second position, and the fourth pose may be the same as or different from the first pose. The following embodiments are described by taking as an example that the pose of the terminal is the same when the user sets the first position and when the user sets the second position, that is, that the fourth pose is the same as the first pose.
S3032. Track the second position.

S3033. Acquire the three-dimensional coordinates of the mapping point of the second position on the virtual plane according to the two-dimensional coordinates of the second position.

Referring to FIG. 11A, when the terminal is at the first pose C1, the location of the second position in the picture is shown as b1. Correspondingly, the mapping point of the second position on the virtual plane is the intersection point B' of the ray from C1 through b1 and the virtual plane, and the three-dimensional coordinates of the mapping point of the second position on the virtual plane are the three-dimensional coordinates of B'. In an embodiment, the mapping point of the second position on the virtual plane may be called the second mapping point, and the three-dimensional coordinates of the mapping point of the second position on the virtual plane may be called: the three-dimensional coordinates of the second mapping point in the camera coordinate system.
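The mapping point is the intersection of a back-projected ray with the virtual plane. The following sketch shows one way to compute it, assuming the virtual plane is represented by a point on the plane and a normal vector; the representation and the names are assumptions for illustration.

```python
import numpy as np

def map_to_virtual_plane(cam_center, K, R_wc, pixel, plane_point, plane_normal):
    """Intersect the ray through a pixel with a virtual plane (e.g. C1 -> b1 -> B').

    cam_center   : camera center in the world frame (e.g. the position of C1)
    K, R_wc      : intrinsics and camera-to-world rotation of the current pose
    pixel        : (u, v) coordinates of the position in the picture (e.g. b1)
    plane_point, plane_normal : a point on the virtual plane and its normal
    Returns the 3D coordinates of the mapping point (e.g. B'), or None when
    the ray does not hit the plane.
    """
    ray = R_wc @ np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    denom = plane_normal @ ray
    if abs(denom) < 1e-9:
        return None                 # ray parallel to the plane: no intersection
    lam = plane_normal @ (plane_point - cam_center) / denom
    if lam <= 0:
        return None                 # intersection behind the camera
    return cam_center + lam * ray
```

When the ray misses one virtual plane, the same computation can be repeated against the other virtual planes in the preset virtual plane set, as described earlier for the first mapping point.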
S3034. Display the second virtual object at the three-dimensional coordinates of the mapping point on the virtual plane.

It should be understood that, in this embodiment of the present application, the second virtual object may be displayed on the picture on which the first virtual object has already been displayed. In an embodiment, the user can see the first virtual object or the second virtual object in the picture displaying the second virtual object. In an embodiment, the picture displaying the first virtual object and the picture displaying the second virtual object may be different, but the combination of the picture displaying the first virtual object and the picture displaying the second virtual object is the same real scene captured by the terminal; the combination of the picture of the first virtual object and the picture displaying the second virtual object may be called a panoramic picture.
S3035. Acquire the three-dimensional coordinates of the second position in the real scene.

Referring to FIG. 11A, when the terminal is at the second pose C2, the location of the second position in the picture is b2. In this embodiment of the present application, the intersection of the ray from C1 through b1 and the ray from C2 through b2 may be taken as the location of the second position in the real scene, and the three-dimensional coordinates of this intersection point B may be taken as the three-dimensional coordinates of the second position in the real scene. In an embodiment, the three-dimensional coordinates of the second position in the real scene may be called: the three-dimensional coordinates of the second position in the world coordinate system.

S3036. Acquire the scaling ratio corresponding to the second position according to the fourth pose, the three-dimensional coordinates of the mapping point of the second position on the virtual plane, and the three-dimensional coordinates of the second position in the real scene.

Referring to FIG. 11A, the terminal may determine the scaling ratio corresponding to the second position according to the third distance between the first pose C1 and B' and the fourth distance between the first pose C1 and B, namely as the quotient of the third distance and the fourth distance.
S3037. Scale the second pose and the three-dimensional coordinates of the second position in the real scene according to the scaling ratio, to obtain the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.

Referring to FIG. 11A, based on the scaling ratio corresponding to the second position, the terminal may scale B to B', where B' is the three-dimensional coordinates of the scaled second position in the real scene, that is, the scaled three-dimensional coordinates corresponding to the second position. Correspondingly, the second pose C2 may be scaled to C2'B, where C2'B is the scaled second pose, that is, the fifth pose. In this embodiment, the terminal may send C2'B and B' for display, so as to display the second virtual object. Because the relative positional relationship between B' and C2'B represents the relative positional relationship between B and C2, when the terminal sends C2'B and B' for display, the position of the virtual object in the real scene is restored. In addition, because the second pose of the terminal and the three-dimensional coordinates corresponding to the second position that are sent for display change simultaneously, rather than only the three-dimensional coordinates corresponding to the second position changing, no abrupt change in the size of the virtual object occurs, which improves the user experience.

FIG. 11A is described by taking as an example that the virtual plane corresponding to the first position and the virtual plane corresponding to the second position are the same. In an embodiment, because multiple virtual planes may be preset in this embodiment of the present application, the virtual plane corresponding to the first position and the virtual plane corresponding to the second position may be different. In that scenario, for the processes in which the terminal acquires the three-dimensional coordinates of the mapping point of the second position on the virtual plane, acquires the three-dimensional coordinates of the second position in the real scene, acquires the scaling ratio corresponding to the second position, and scales the second pose and the three-dimensional coordinates of the second position in the real scene, reference may be made to the manner in FIG. 11A; the difference from FIG. 11A is that the virtual plane used by the terminal is not the virtual plane corresponding to the first position but the virtual plane corresponding to the second position, as shown in FIG. 11B. In an embodiment, the virtual plane corresponding to the first position is a first virtual plane and the virtual plane corresponding to the second position is a second virtual plane; the positional relationship between the first virtual plane and the second virtual plane in FIG. 11B is an example.
S3038. Display the second virtual object according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.

The terminal may send the fifth pose (C2'B) and the scaled three-dimensional coordinates (B') corresponding to the second position for display, so as to display the second virtual object.

In this embodiment of the present application, a virtual plane is preset, and virtual objects can be displayed on the preset virtual plane; therefore there is no need to scan the real scene in advance to obtain the planes in the real scene, nor to obtain prior map information of the real scene, so the response is fast and the efficiency is high. In addition, based on the three-dimensional coordinates of the first position and the second position in the real scene, virtual objects can be displayed at the corresponding positions in the real scene, so that the user feels that the virtual objects actually exist in the real scene; and, based on the scaling ratios corresponding to the different positions, abrupt changes in the size of the virtual objects are avoided, improving the user experience.
As in the above embodiment, when displaying the first virtual object, the terminal may send the third pose (C2'A) and the scaled three-dimensional coordinates (A') corresponding to the first position for display, and when displaying the second virtual object, it may send the fifth pose (C2'B) and the scaled three-dimensional coordinates (B') corresponding to the second position for display. The scaled three-dimensional coordinates corresponding to each position thus correspond to a separate terminal pose, so the amount of computation when the terminal displays the first virtual object and the second virtual object at the second pose is large, and the speed is slow.

To reduce the amount of computation of the terminal and improve the efficiency with which the terminal displays the first virtual object and the second virtual object, in an embodiment, when the terminal is at the second pose, the scaled three-dimensional coordinates (A') corresponding to the first position and the scaled three-dimensional coordinates (B') corresponding to the second position may be unified under one terminal pose. In this way, when displaying the first virtual object and the second virtual object, the terminal only needs to send one terminal pose and the scaled three-dimensional coordinates of the multiple displayed virtual objects for display, which can improve the speed at which the terminal renders and displays the virtual objects.

In an embodiment, referring to FIG. 12, the above S3028 and S3038 may be replaced with S3028A-S3029A:
S3028A. According to the third pose and the fifth pose, unify the scaled three-dimensional coordinates corresponding to the first position and the scaled three-dimensional coordinates corresponding to the second position under the third pose, to obtain the scaled three-dimensional coordinates corresponding to the first position and the scaled three-dimensional coordinates corresponding to the second position under the third pose.

In an embodiment, the terminal may take the three-dimensional coordinate system in which the third pose is located as the unified coordinate system, that is, unify the scaled three-dimensional coordinates corresponding to the first position and the scaled three-dimensional coordinates corresponding to the second position under the third pose. Referring to FIG. 11A, the terminal may translate the fifth pose to the third pose; for example, the terminal may translate the fifth pose C2'B along the line between the first pose C1 and the second pose C2 to the third pose C2'A. Correspondingly, because the scaled three-dimensional coordinates corresponding to the first position were obtained with the same scaling ratio as the third pose, the scaled three-dimensional coordinates corresponding to the first position under the third pose remain unchanged. Since the fifth pose C2'B is translated to the third pose C2'A, B' may be translated on the virtual plane, along the direction parallel to the direction from C2'B towards C2'A, to B'', where the translation distance equals the distance between C2'B and C2'A. That is to say, C2'B, C2'A, B', and B'' can form a parallelogram. Correspondingly, the scaled three-dimensional coordinates corresponding to the second position under the third pose are the three-dimensional coordinates of B''.

In this embodiment of the present application, the terminal may acquire the three-dimensional coordinates of B'' according to the relative position between the fifth pose C2'B and the third pose C2'A and the three-dimensional coordinates of B'.
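This unification is a rigid translation: B' is shifted by the same vector that carries C2'B onto C2'A, which is exactly the parallelogram construction described above. A minimal sketch, assuming positions expressed in one common frame (the names are hypothetical):

```python
import numpy as np

def unify_under_third_pose(C2A_pos, C2B_pos, B_scaled):
    """Translate an anchor obtained under the fifth pose into the third pose (S3028A).

    C2A_pos  : position of the third pose C2'A (the unified pose)
    C2B_pos  : position of the fifth pose C2'B
    B_scaled : scaled 3D coordinates B' obtained under the fifth pose
    Returns B'', the scaled coordinates of the second position under C2'A.
    """
    offset = C2A_pos - C2B_pos   # vector carrying C2'B onto C2'A
    return B_scaled + offset     # parallelogram construction: B' -> B''
```

With n anchors, the terminal then sends one terminal pose plus n translated anchor coordinates for display, instead of n separate pose/coordinate pairs.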
In an embodiment, the terminal may also take the fifth pose as the unified pose, and then translate the third pose to the fifth pose, so as to acquire and send for display the scaled three-dimensional coordinates corresponding to the first position and the scaled three-dimensional coordinates corresponding to the second position under the fifth pose; the unified pose is not limited in this embodiment of the present application.

In an embodiment, there may be multiple preset positions, or the user may set multiple positions. In this embodiment, the terminal may acquire the scaled three-dimensional coordinates corresponding to each position and the corresponding pose of the terminal. Referring to FIG. 13, to reduce the amount of computation when the terminal renders virtual objects and improve the speed of displaying virtual objects, the terminal may convert each anchor pose and the corresponding terminal pose into multiple anchor poses under one terminal pose. It should be understood that, in FIG. 13, each position (such as position A and position B) is represented by an anchor, and the scaled three-dimensional coordinates corresponding to each position are represented by an anchor pose.
S3029A. Display the first virtual object and the second virtual object according to the third pose, the scaled three-dimensional coordinates corresponding to the first position under the third pose, and the scaled three-dimensional coordinates corresponding to the second position under the third pose.

In this embodiment, when the terminal is at the second pose, it may send the third pose (C2'A), the scaled three-dimensional coordinates (A') corresponding to the first position, and the scaled three-dimensional coordinates (B'') corresponding to the second position under the third pose for display, so as to display the first virtual object and the second virtual object.

In this embodiment of the present application, the scaled three-dimensional coordinates corresponding to each position can be unified under the pose of the same terminal and sent for display, instead of the scaled three-dimensional coordinates corresponding to each position being sent for display together with a separate terminal pose. This can reduce the amount of computation when the terminal renders virtual objects and improve the speed of displaying virtual objects, thereby improving the user experience.
In this embodiment of the present application, because the terminal displays the virtual objects at the scaled three-dimensional coordinates corresponding to the first position and the second position, when the terminal switches away from the captured real scene picture and then switches back to the original real scene picture, the user can still see the first virtual object and the second virtual object on the interface of the terminal.

Exemplarily, as shown in a in FIG. 14, the real scene picture captured by the terminal includes a table, and to the right of the table is a laptop computer. It should be understood that when the terminal is shooting the table, the laptop is not captured, so the laptop is not shown in a in FIG. 14. The terminal displays the first virtual object 'stick figure' at the scaled three-dimensional coordinates corresponding to position A, and displays the second virtual object 'cuboid' at the scaled three-dimensional coordinates corresponding to position B. The user holds the terminal and moves it to the right while shooting; correspondingly, the real scene picture captured by the terminal includes the laptop but does not include the table, as shown in b in FIG. 14. Because the first virtual object and the second virtual object are anchored on the table in the real scene, and the picture shown in b in FIG. 14 does not include the table, the user cannot see the first virtual object and the second virtual object in this picture. In the picture shown in b in FIG. 14, the user may also perform the above first operation and/or second operation to trigger the terminal to display other virtual objects in this picture; as shown in c in FIG. 14, in response to the user's operation, the terminal may display a third virtual object 'stick figure' in this picture.

In this example, when the user holds the terminal and moves it to the left while shooting, so that the picture containing the table is captured again, the user can see the first virtual object at the scaled three-dimensional coordinates corresponding to position A, and the second virtual object at the scaled three-dimensional coordinates corresponding to position B, as shown in d in FIG. 14.

It should be understood that, in this embodiment of the present application, because the terminal displays the virtual object at the scaled three-dimensional coordinates corresponding to position A (or another position), when the terminal switches the captured real scene picture, the virtual object remains at the scaled three-dimensional coordinates corresponding to position A (or the other position) and does not change as the captured picture changes; when the terminal switches back to the original real scene picture, the user can still see the virtual object displayed at the corresponding scaled three-dimensional coordinates.
In an embodiment, the user may also interact with the virtual objects displayed on the terminal. For example, the user may select, delete, move, and scale the virtual objects, and the user may also interact with the virtual objects by voice.

The user may select a virtual object. Exemplarily, if the user taps a virtual object displayed on the terminal, the virtual object may be selected. As shown in a in FIG. 15, a stick figure and a cuboid are displayed on the picture displayed by the terminal. The user taps the stick figure, and in response to the user's tap operation, the terminal may display a box 151 circling the stick figure, indicating that the stick figure is selected, as shown in b in FIG. 15.

The user may delete a virtual object. Exemplarily, if the user presses and holds a virtual object displayed on the terminal, the virtual object may be deleted. For example, if the user presses and holds the stick figure displayed on the terminal, the terminal may display a delete control 152 at the upper right corner of the stick figure, as shown in c in FIG. 15. The user taps the delete control 152, and the terminal deletes the stick figure; as shown in d in FIG. 15, only the cuboid is displayed on the terminal, and the stick figure has been deleted.

The user may move a virtual object. Exemplarily, if the user presses, holds, and drags a virtual object displayed on the terminal, the virtual object may be moved. For example, if the user presses and holds the stick figure displayed on the terminal and drags the stick figure to another position C, the terminal may display the stick figure at position C.

The user may scale a virtual object. Exemplarily, the user may scale a virtual object with two fingers. For example, if the user places two fingers on the stick figure displayed on the interface of the terminal, moving the two fingers closer together shrinks the stick figure, and moving the two fingers apart enlarges the stick figure (see the sketch after this paragraph).
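One common way to realize such a two-finger gesture (an assumption for illustration, not a manner mandated by this embodiment) is to scale the selected virtual object by the ratio of the current finger spacing to the spacing when the gesture began:

```python
import math

def pinch_scale(p1, p2, q1, q2, current_scale, min_scale=0.1, max_scale=10.0):
    """Return the new scale of the selected virtual object.

    p1, p2 : (x, y) positions of the two fingers when the gesture started
    q1, q2 : (x, y) positions of the two fingers now
    """
    d_start = math.dist(p1, p2)
    d_now = math.dist(q1, q2)
    if d_start == 0:
        return current_scale
    # fingers moving apart (d_now > d_start) enlarge the object; moving closer shrinks it
    return max(min_scale, min(max_scale, current_scale * d_now / d_start))
```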
It should be understood that the above operations for the user to interact with the virtual objects are examples, and this embodiment of the present application does not limit the specific operations of the user.

In an embodiment, the user may also interact with the virtual objects by voice.

Exemplarily, the user may say 'select all stick figures'. In response to this speech, the terminal may select all the stick figures in the current picture; for example, the terminal may display a box 151 in the current picture circling each stick figure, indicating that the stick figures are selected, as shown in b in FIG. 15. Exemplarily, in response to this speech, the terminal may also select the stick figures in all pictures; for example, the two pictures shown in a and b in FIG. 14 both contain stick figures, and in response to this speech the terminal may display a panoramic picture containing the 'table and laptop' and display a box 151 on this picture circling each stick figure in the picture, as shown in e in FIG. 15.

In an embodiment, the user may also change the attributes of a virtual object by voice. The attributes of a virtual object may include, but are not limited to: position, size, shape, color, action, expression, and the like. Exemplarily, the user may say: 'turn the cuboid red', and in response to this speech, the terminal adjusts the cuboid displayed in the picture to red.

The above describes scenarios in which the user may also interact with the virtual objects by voice; this embodiment of the present application does not limit the manner in which the user interacts with the virtual objects by voice.

In this embodiment of the present application, the user can interact with the virtual objects displayed on the terminal, which enriches the manners in which the user interacts with the virtual objects and can improve the user experience.
In an embodiment, as shown in FIG. 16, the terminal may include: an input data module, a multi-anchor real-time tracking and management module, an output result module, a rendering generation module, and a user interaction module.

The input data module is configured to collect data of the interaction between the user and the terminal, captured picture data, IMU data, gravity axis data, and the like. The multi-anchor real-time tracking and management module is configured to perform S3021-S3023, S3025-S3027, S3031-S3033, and S3035-S3037 in the above embodiments. The output result module is configured to perform S3028A-S3029A in the above embodiment. The rendering generation module is configured to render and display virtual objects. The user interaction module is configured to implement the interaction between the user and the virtual objects displayed on the terminal. The steps performed by the modules can be simplified as shown in FIG. 17.

In summary of the above embodiments, the schematic flowchart of the human-computer interaction method in an AR scene provided by the embodiments of the present application can be simplified as shown in FIG. 18.
In an embodiment, the description takes as an example that identifiers of virtual objects to be selected are displayed on the interface of the terminal, where, for the identifiers of the virtual objects to be selected, reference may be made to the related description of the icons 43 of at least one virtual object to be selected. In this embodiment, referring to FIG. 19, the human-computer interaction method in an AR scene provided by this embodiment of the present application may include:

S1901. Shoot a real scene, and display the captured real scene on the interface of the terminal, where the interface of the terminal also displays the identifiers of the virtual objects to be selected.

For S1901, reference may be made to the related description of S301. In this embodiment of the present application, the interface of the terminal also displays the identifiers of the virtual objects to be selected.
S1902. In response to the user's operation on the identifier of a first virtual object, display the first virtual object in the real scene captured by the terminal.

By operating the identifiers of different virtual objects, the user may trigger the terminal to display the virtual objects at the corresponding positions of the real scene captured by the terminal. It should be noted that, in this embodiment of the present application, the terminal may display a virtual object at the corresponding position of the real scene captured by the terminal; when the picture captured by the terminal changes, the virtual object may not be displayed in the real scene captured by the terminal, but when the picture captured by the terminal switches back to the real scene in which the virtual object was displayed, the terminal may display the virtual object in the captured real scene; reference may be made to the related description of FIG. 14.

For S1902, reference may be made to the related description in S302.

S1903. In response to the user's operation on the identifier of a second virtual object, display the second virtual object in the real scene in which the first virtual object has already been displayed.

For S1903, reference may be made to the related description in S302. For S1902-S1903, reference may also be made to the drawing shown in FIG. 5.

The implementation principles and technical effects of this embodiment of the present application are similar to those of the above embodiments, and are not repeated here.
FIG. 20 is a schematic structural diagram of a terminal provided by an embodiment of the present application. Referring to FIG. 20, an electronic device 2000 may include: a shooting module 2001, a display module 2002, and a processing module 2003. In an embodiment, the shooting module 2001 may be included in the input data module, the display module 2002 may be included in the rendering generation module, and the processing module 2003 may include the multi-anchor real-time tracking and management module, the output result module, and the user interaction module.

In an embodiment, the shooting module 2001 is configured to shoot a real scene.

The display module 2002 is configured to display the captured real scene on the interface of the terminal, where the interface also displays identifiers of virtual objects to be selected; and, in response to the user's operation on the identifier of a first virtual object, display the first virtual object in the real scene captured by the terminal; and, in response to the user's operation on the identifier of a second virtual object, display the second virtual object in the real scene in which the first virtual object has already been displayed.
In a possible implementation, the processing module 2003 is configured to: in response to the user's operation on the identifier of the first virtual object, acquire a first pose of the terminal; acquire a first mapping point of the first position on a first virtual plane, where the first position is a preset position on the interface or a position determined by the user on the interface; and acquire the three-dimensional coordinates of the first mapping point in the camera coordinate system.

The display module 2002 is specifically configured to display the first virtual object in the real scene captured by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system.

In a possible implementation, the processing module 2003 is specifically configured to: acquire the two-dimensional coordinates of the first position in the image coordinate system; and take the intersection of the ray from the first pose to the two-dimensional coordinates and the first virtual plane as the first mapping point.

In a possible implementation, the first virtual plane is included in a preset virtual plane set, and the processing module 2003 is further configured to: if the ray from the first pose to the two-dimensional coordinates has no intersection with the first virtual plane, acquire the intersection of the ray from the first pose to the two-dimensional coordinates and another virtual plane in the preset virtual plane set; and take the intersection with the other virtual plane in the preset virtual plane set as the first mapping point.
In a possible implementation, the processing module 2003 is further configured to: track the image block corresponding to the first position; and, when the terminal is at a second pose, acquire the two-dimensional coordinates of the first position in the image coordinate system.

The display module 2002 is further configured to display the first virtual object in the real scene captured by the terminal according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose.

In a possible implementation, the distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or the number of frames of images captured by the terminal while moving from the first pose to the second pose is greater than or equal to a preset number of frames; and/or the duration of the terminal's movement from the first pose to the second pose is greater than a preset duration; and/or the second pose is the pose at which the terminal successfully triangulates the first mapping point.
In a possible implementation, the processing module 2003 is specifically configured to: acquire the three-dimensional coordinates of the first position in the world coordinate system according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose; acquire the scaling ratio corresponding to the first position according to a first distance and a second distance, where the first distance is the distance from the first pose to the three-dimensional coordinates of the first mapping point in the camera coordinate system, and the second distance is the distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system; and scale the second pose and the three-dimensional coordinates of the first position in the world coordinate system according to the scaling ratio corresponding to the first position, respectively, to obtain a third pose and the scaled three-dimensional coordinates corresponding to the first position.

The display module 2002 is further configured to display the first virtual object in the real scene captured by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.

In a possible implementation, the processing module 2003 is specifically configured to scale the three-dimensional coordinates of the first position in the world coordinate system to the three-dimensional coordinates of the first mapping point in the camera coordinate system, where the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system.
In a possible implementation, the terminal is at the first pose when the user performs the operation on the identifier of the second virtual object, and the processing module 2003 is specifically configured to: in response to the user's operation on the identifier of the second virtual object, acquire a second mapping point of the second position on a second virtual plane, where the second position is a preset position on the interface or a position determined by the user on the interface; and acquire the three-dimensional coordinates of the second mapping point in the camera coordinate system.

The display module 2002 is further configured to display the second virtual object in the real scene in which the first virtual object has already been displayed, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system.

In a possible implementation, the processing module 2003 is specifically configured to: when the terminal is at the second pose, scale the second pose and the three-dimensional coordinates of the second position in the world coordinate system according to the scaling ratio corresponding to the second position, respectively, to obtain a fifth pose and the scaled three-dimensional coordinates corresponding to the second position, where the three-dimensional coordinates of the second position in the world coordinate system are acquired based on the first pose, the two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the second position in the image coordinate system at the second pose.

The display module 2002 is specifically configured to display the second virtual object in the real scene in which the first virtual object has already been displayed, according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.

In a possible implementation, the processing module 2003 is specifically configured to: translate the fifth pose to the third pose; and translate the scaled three-dimensional coordinates corresponding to the second position according to the distance and direction of the translation of the fifth pose to the third pose, to obtain the scaled three-dimensional coordinates corresponding to the second position under the third pose.

The display module 2002 is specifically configured to display the second virtual object in the real scene in which the first virtual object has already been displayed, according to the third pose and the scaled three-dimensional coordinates corresponding to the second position under the third pose.
The terminal provided in this embodiment of the present application can perform the steps in the above embodiments and achieve the technical effects in the above embodiments; reference may be made to the related descriptions in the above embodiments.

In an embodiment, referring to FIG. 21, an embodiment of the present application further provides an electronic device, which may be the terminal described in the above embodiments. The electronic device may include: a processor 2101 (for example, a CPU) and a memory 2102. The memory 2102 may include a high-speed random-access memory (RAM), and may also include a non-volatile memory (NVM), for example at least one disk memory; various instructions may be stored in the memory 2102 for completing various processing functions and implementing the method steps of the present application.

Optionally, the electronic device involved in the present application may further include: a power supply 2103, a communication bus 2104, a communication port 2105, and a display 2106. The communication port 2105 is used to implement connection and communication between the electronic device and other peripherals. In this embodiment of the present application, the memory 2102 is used to store computer-executable program code, where the program code includes instructions; when the processor 2101 executes the instructions, the instructions cause the processor 2101 of the electronic device to perform the actions in the above method embodiments; the implementation principles and technical effects are similar and are not repeated here. The display 2106 is used to display the real scene captured by the terminal and to display the virtual objects in the real scene.
需要说明的是,上述实施例中所述的模块或部件可以是被配置成实施以上方法的一个或多个集成电路,例如:一个或多个专用集成电路(application specific integrated circuit,ASIC),或,一个或多个微处理器(digital signal processor,DSP),或,一个或者多个现场可编程门阵列(field programmable gate array,FPGA)等。再如,当以上某个模块通过处理元件调度程 序代码的形式实现时,该处理元件可以是通用处理器,例如中央处理器(central processing unit,CPU)或其它可以调用程序代码的处理器如控制器。再如,这些模块可以集成在一起,以片上***(system-on-a-chip,SOC)的形式实现。It should be noted that the modules or components described in the above embodiments may be one or more integrated circuits configured to implement the above method, for example: one or more application specific integrated circuits (ASIC), or , one or more microprocessors (digital signal processor, DSP), or, one or more field programmable gate arrays (field programmable gate array, FPGA), etc. For another example, when one of the above modules is implemented in the form of a processing element scheduler code, the processing element can be a general-purpose processor, such as a central processing unit (central processing unit, CPU) or other processors that can call program codes such as control device. For another example, these modules can be integrated together and implemented in the form of a system-on-a-chip (SOC).
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行计算机程序指令时,全部或部分地产生按照本申请实施例的流程或功能。计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘Solid State Disk(SSD))等。In the above embodiments, all or part of them may be implemented by software, hardware, firmware or any combination thereof. When implemented using software, it may be implemented in whole or in part in the form of a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part. A computer can be a general purpose computer, special purpose computer, computer network, or other programmable device. Computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, computer instructions may be transmitted from a website site, computer, server or data center by wire (such as Coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) to another website site, computer, server or data center. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server, a data center, etc. integrated with one or more available media. Available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media (eg, Solid State Disk (SSD)).
本文中的术语“多个”是指两个或两个以上。本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系;在公式中,字符“/”,表示前后关联对象是一种“相除”的关系。另外,需要理解的是,在本申请的描述中,“第一”、“第二”等词汇,仅用于区分描述的目的,而不能理解为指示或暗示相对重要性,也不能理解为指示或暗示顺序。The term "plurality" herein means two or more. The term "and/or" in this article is just an association relationship describing associated objects, which means that there can be three relationships, for example, A and/or B can mean: A exists alone, A and B exist simultaneously, and there exists alone B these three situations. In addition, the character "/" in this article generally indicates that the front and back related objects are an "or" relationship; in the formula, the character "/" indicates that the front and back related objects are a "division" relationship. In addition, it should be understood that in the description of this application, words such as "first" and "second" are only used to distinguish the purpose of description, and cannot be understood as indicating or implying relative importance, nor can they be understood as indicating or imply order.
可以理解的是,在本申请的实施例中涉及的各种数字编号仅为描述方便进行的区分,并不用来限制本申请的实施例的范围。It can be understood that the various numbers involved in the embodiments of the present application are only for convenience of description, and are not used to limit the scope of the embodiments of the present application.
可以理解的是,在本申请的实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请的实施例的实施过程构成任何限定。It can be understood that, in the embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the order of execution, and the order of execution of the processes should be determined by their functions and internal logic, and should not be used in the implementation of this application. The implementation of the examples constitutes no limitation.

Claims (15)

  1. A human-computer interaction method in an augmented reality (AR) scene, comprising:
    shooting a real scene, and displaying the shot real scene on an interface of a terminal, wherein the interface further displays identifiers of virtual objects to be selected;
    in response to a user's operation on an identifier of a first virtual object, displaying the first virtual object in the real scene shot by the terminal; and
    in response to the user's operation on an identifier of a second virtual object, displaying the second virtual object in the real scene in which the first virtual object is already displayed.
  2. The method according to claim 1, wherein the displaying the first virtual object in the real scene shot by the terminal in response to the user's operation on the identifier of the first virtual object comprises:
    in response to the user's operation on the identifier of the first virtual object, acquiring a first pose of the terminal;
    acquiring a first mapping point of a first position on a first virtual plane, wherein the first position is a preset position on the interface or a position determined by the user on the interface;
    acquiring three-dimensional coordinates of the first mapping point in a camera coordinate system; and
    displaying the first virtual object in the real scene shot by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  3. The method according to claim 2, wherein the acquiring a first mapping point of a first position on a first virtual plane comprises:
    acquiring two-dimensional coordinates of the first position in an image coordinate system; and
    using an intersection point of a ray from the first pose to the two-dimensional coordinates and the first virtual plane as the first mapping point.
  4. The method according to claim 3, wherein the first virtual plane is included in a preset virtual plane set, and the method further comprises:
    if the ray from the first pose to the two-dimensional coordinates has no intersection point with the first virtual plane, acquiring an intersection point of the ray from the first pose to the two-dimensional coordinates and another virtual plane in the preset virtual plane set; and
    using the intersection point with the other virtual plane in the preset virtual plane set as the first mapping point.
  5. The method according to claim 3 or 4, wherein after the displaying the first virtual object in the real scene shot by the terminal according to the first pose and the three-dimensional coordinates of the first mapping point in the camera coordinate system, the method further comprises:
    tracking an image block corresponding to the first position;
    when the terminal is in a second pose, acquiring two-dimensional coordinates of the first position in the image coordinate system; and
    displaying the first virtual object in the real scene shot by the terminal according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose.
  6. The method according to claim 5, wherein:
    a distance between the second pose and the first pose is greater than or equal to a distance threshold; and/or
    a number of frames of images shot by the terminal in the process of moving from the first pose to the second pose is greater than or equal to a preset number of frames; and/or
    a duration of the terminal's movement from the first pose to the second pose is greater than a preset duration; and/or
    the second pose is a pose at which the terminal successfully performs triangulation on the first mapping point.
  7. The method according to claim 5 or 6, wherein the displaying the first virtual object in the real scene shot by the terminal according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose comprises:
    acquiring three-dimensional coordinates of the first position in a world coordinate system according to the first pose, the two-dimensional coordinates of the first position in the image coordinate system at the first pose, the second pose, and the two-dimensional coordinates of the first position in the image coordinate system at the second pose;
    acquiring a scaling ratio corresponding to the first position according to a first distance and a second distance, wherein the first distance is a distance from the first pose to the first mapping point, and the second distance is a distance from the first pose to the three-dimensional coordinates of the first position in the world coordinate system;
    scaling, according to the scaling ratio corresponding to the first position, the second pose and the three-dimensional coordinates of the first position in the world coordinate system, respectively, to obtain a third pose and scaled three-dimensional coordinates corresponding to the first position; and
    displaying the first virtual object in the real scene shot by the terminal according to the third pose and the scaled three-dimensional coordinates corresponding to the first position.
  8. The method according to claim 7, wherein the scaling the three-dimensional coordinates of the first position in the world coordinate system according to the scaling ratio corresponding to the first position comprises:
    scaling the three-dimensional coordinates of the first position in the world coordinate system to the three-dimensional coordinates of the first mapping point in the camera coordinate system, wherein the scaled three-dimensional coordinates corresponding to the first position are the same as the three-dimensional coordinates of the first mapping point in the camera coordinate system.
  9. The method according to any one of claims 2 to 8, wherein the terminal is in the first pose when the user performs the operation on the identifier of the second virtual object, and the displaying the second virtual object in the real scene in which the first virtual object is already displayed in response to the user's operation on the identifier of the second virtual object comprises:
    in response to the user's operation on the identifier of the second virtual object, acquiring a second mapping point of a second position on a second virtual plane, wherein the second position is a preset position on the interface or a position determined by the user on the interface;
    acquiring three-dimensional coordinates of the second mapping point in the camera coordinate system; and
    displaying the second virtual object in the real scene in which the first virtual object is already displayed, according to the first pose and the three-dimensional coordinates of the second mapping point in the camera coordinate system.
  10. The method according to claim 7, wherein the second pose is a pose after the user performs the operation on the identifier of the second virtual object, and the method further comprises:
    when the terminal is in the second pose, scaling, according to a scaling ratio corresponding to the second position, the second pose and three-dimensional coordinates of the second position in the world coordinate system, respectively, to obtain a fifth pose and scaled three-dimensional coordinates corresponding to the second position, wherein the three-dimensional coordinates of the second position in the world coordinate system are acquired based on the first pose, two-dimensional coordinates of the second position in the image coordinate system at the first pose, the second pose, and two-dimensional coordinates of the second position in the image coordinate system at the second pose; and
    displaying the second virtual object in the real scene in which the first virtual object is already displayed, according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position.
  11. The method according to claim 10, wherein the displaying the second virtual object in the real scene in which the first virtual object is already displayed according to the fifth pose and the scaled three-dimensional coordinates corresponding to the second position comprises:
    translating the fifth pose to the third pose;
    translating the scaled three-dimensional coordinates corresponding to the second position by the distance and in the direction of the translation from the fifth pose to the third pose, to obtain scaled three-dimensional coordinates corresponding to the second position under the third pose; and
    displaying the second virtual object in the real scene in which the first virtual object is already displayed, according to the third pose and the scaled three-dimensional coordinates corresponding to the second position under the third pose.
  12. A human-computer interaction apparatus in an augmented reality (AR) scene, comprising:
    a shooting module, configured to shoot a real scene; and
    a display module, configured to:
    display the shot real scene on an interface of a terminal, wherein the interface further displays identifiers of virtual objects to be selected;
    display, in response to a user's operation on an identifier of a first virtual object, the first virtual object in the real scene shot by the terminal; and
    display, in response to the user's operation on an identifier of a second virtual object, the second virtual object in the real scene in which the first virtual object is already displayed.
  13. An electronic device, comprising: a processor and a memory, wherein:
    the memory stores computer-executable instructions; and
    the processor executes the computer-executable instructions stored in the memory, so that the processor performs the method according to any one of claims 1 to 11.
  14. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program or instructions, and when the computer program or instructions are run, the method according to any one of claims 1 to 11 is implemented.
  15. A computer program product, comprising a computer program or instructions, wherein when the computer program or instructions are executed by a processor, the method according to any one of claims 1 to 11 is implemented.
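To make the geometry recited in claims 2 to 4 concrete, the following is a minimal sketch of intersecting the ray from the terminal's pose through the selected interface position with a virtual plane, falling back to the other planes in the preset set when there is no intersection. Representing each plane as a (point, normal) pair is an assumption for illustration, and the back-projection of the two-dimensional interface coordinates into a ray direction (via the camera intrinsics) is assumed to have been done already; none of the names below come from the application.

```python
import numpy as np

def intersect_ray_plane(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the intersection of a ray with a plane, or None when the ray
    is parallel to the plane or the hit lies behind the ray origin."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < eps:
        return None
    t = float(np.dot(plane_normal, np.asarray(plane_point, dtype=float) - origin)) / denom
    if t < 0.0:
        return None
    return origin + t * direction

def first_mapping_point(origin, direction, preset_planes):
    """Try the first virtual plane, then the other planes in the preset
    set, and return the first intersection found (cf. claims 3 and 4)."""
    for plane_point, plane_normal in preset_planes:
        hit = intersect_ray_plane(origin, direction, plane_point, plane_normal)
        if hit is not None:
            return hit
    return None
```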
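Claims 5 to 7 recover the first position's world coordinates from its two-dimensional observations at two poses. The application does not spell out the solver; a midpoint triangulation of the two viewing rays is one standard way to realize it, sketched below under the assumption that each observation has already been back-projected into a ray (camera center plus direction).

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2, eps=1e-9):
    """Midpoint triangulation: the point halfway between the closest points
    of the two viewing rays; returns None for (near-)parallel rays."""
    c1, d1 = np.asarray(c1, dtype=float), np.asarray(d1, dtype=float)
    c2, d2 = np.asarray(c2, dtype=float), np.asarray(d2, dtype=float)
    w0 = c1 - c2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # rays are parallel: triangulation fails (cf. claim 6)
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return 0.5 * ((c1 + s * d1) + (c2 + t * d2))
```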
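The scaling ratio in claims 7 and 8 compares the distance from the first pose to the first mapping point with the distance from the first pose to the triangulated point. One self-consistent reading, sketched below, scales about the first pose's camera center with ratio first distance / second distance, so that the triangulated point lands exactly on the mapping point as claim 8 requires; the anchor of the scaling and the orientation of the ratio are assumptions, not statements of the application.

```python
import numpy as np

def scaling_ratio(first_pose_t, mapping_point, triangulated_point):
    """First distance over second distance, per the reading of claim 7."""
    first_pose_t = np.asarray(first_pose_t, dtype=float)
    d1 = np.linalg.norm(np.asarray(mapping_point, dtype=float) - first_pose_t)
    d2 = np.linalg.norm(np.asarray(triangulated_point, dtype=float) - first_pose_t)
    return d1 / d2

def scale_about(anchor, x, ratio):
    """Scale a camera position or a world point about the anchor."""
    anchor = np.asarray(anchor, dtype=float)
    return anchor + ratio * (np.asarray(x, dtype=float) - anchor)

# With both points on the same viewing ray from the first pose, scaling the
# triangulated point by this ratio about the first pose reproduces the
# mapping point's coordinates, consistent with claim 8.
```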
PCT/CN2022/134830 2022-02-23 2022-11-28 Human-computer interaction method and apparatus in augmented reality (ar) scene, and electronic device WO2023160072A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210168435.1A CN116679824A (en) 2022-02-23 2022-02-23 Man-machine interaction method and device in augmented reality AR scene and electronic equipment
CN202210168435.1 2022-02-23

Publications (1)

Publication Number Publication Date
WO2023160072A1 true WO2023160072A1 (en) 2023-08-31

Family

ID=87764577

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134830 WO2023160072A1 (en) 2022-02-23 2022-11-28 Human-computer interaction method and apparatus in augmented reality (ar) scene, and electronic device

Country Status (2)

Country Link
CN (1) CN116679824A (en)
WO (1) WO2023160072A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015192436A (en) * 2014-03-28 2015-11-02 キヤノン株式会社 Transmission terminal, reception terminal, transmission/reception system and program therefor
CN108273265A (en) * 2017-01-25 2018-07-13 网易(杭州)网络有限公司 The display methods and device of virtual objects
CN108520552A (en) * 2018-03-26 2018-09-11 广东欧珀移动通信有限公司 Image processing method, device, storage medium and electronic equipment
JP6548241B1 (en) * 2018-07-14 2019-07-24 株式会社アンジー Augmented reality program and information processing apparatus
WO2020191101A1 (en) * 2019-03-18 2020-09-24 Geomagical Labs, Inc. Virtual interaction with three-dimensional indoor room imagery

Also Published As

Publication number Publication date
CN116679824A (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN111815755B (en) Method and device for determining blocked area of virtual object and terminal equipment
US11270460B2 (en) Method and apparatus for determining pose of image capturing device, and storage medium
JP7453470B2 (en) 3D reconstruction and related interactions, measurement methods and related devices and equipment
KR101453815B1 (en) Device and method for providing user interface which recognizes a user's motion considering the user's viewpoint
WO2019242262A1 (en) Augmented reality-based remote guidance method and device, terminal, and storage medium
EP3713220A1 (en) Video image processing method and apparatus, and terminal
US9256986B2 (en) Automated guidance when taking a photograph, using virtual objects overlaid on an image
WO2019007258A1 (en) Method, apparatus and device for determining camera posture information, and storage medium
WO2021082801A1 (en) Augmented reality processing method and apparatus, system, storage medium and electronic device
US9268410B2 (en) Image processing device, image processing method, and program
WO2019196745A1 (en) Face modelling method and related product
WO2021018214A1 (en) Virtual object processing method and apparatus, and storage medium and electronic device
TW201346640A (en) Image processing device, and computer program product
US11044398B2 (en) Panoramic light field capture, processing, and display
WO2018233623A1 (en) Method and apparatus for displaying image
US20220375258A1 (en) Image processing method and apparatus, device and storage medium
JP2018026064A (en) Image processor, image processing method, system
CN113934297B (en) Interaction method and device based on augmented reality, electronic equipment and medium
JP2013164697A (en) Image processing device, image processing method, program and image processing system
WO2023273499A1 (en) Depth measurement method and apparatus, electronic device, and storage medium
CN115278084A (en) Image processing method, image processing device, electronic equipment and storage medium
WO2019196871A1 (en) Modeling method and related device
WO2015072091A1 (en) Image processing device, image processing method, and program storage medium
JP2012103743A (en) Virtual information imparting device and virtual information imparting program
WO2023142732A1 (en) Image processing method and apparatus, and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22928332

Country of ref document: EP

Kind code of ref document: A1