CN111399654A - Information processing method, information processing device, electronic equipment and storage medium


Info

Publication number
CN111399654A
Authority
CN
China
Prior art keywords
target object
virtual content
real environment
specified
triggering
Prior art date
Legal status
Granted
Application number
CN202010219498.6A
Other languages
Chinese (zh)
Other versions
CN111399654B (en)
Inventor
黄锋华
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010219498.6A priority Critical patent/CN111399654B/en
Publication of CN111399654A publication Critical patent/CN111399654A/en
Application granted granted Critical
Publication of CN111399654B publication Critical patent/CN111399654B/en
Current legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 — Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/048 — Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0484 — Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 — Manipulating 3D models or images for computer graphics
    • G06T 19/006 — Mixed reality

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses an information processing method, an information processing apparatus, an electronic device and a storage medium, relating to the field of display technologies. The method includes: acquiring first position information of a target object in a real environment; acquiring second position information of virtual content in the real environment; obtaining a relative position relationship between the target object and the virtual content in the real environment according to the first position information and the second position information; and, if the relative position relationship satisfies a specified condition, controlling the target object to execute a specified operation, or controlling the target object and the virtual content to execute a specified operation. In this way, the interaction between the target object and the virtual content can be controlled according to their mutual positions in the space of the real environment, which improves the interactivity between the virtual content and real objects in the real environment.

Description

Information processing method, information processing device, electronic equipment and storage medium
Technical Field
The present application relates to the field of display technologies, and in particular, to an information processing method and apparatus, an electronic device, and a storage medium.
Background
In recent years, with advances in science and technology, Augmented Reality (AR) has become a research hotspot both in China and abroad. By wearing a head-mounted display device such as AR glasses, or through the screen of a terminal, a user can observe an augmented-reality or mixed-reality effect in which content such as virtual content is superimposed on the real world in the real-world environment.
However, in current AR scenes, virtual objects are presented and manipulated with little or no interaction with the real environment.
Disclosure of Invention
The application provides an information processing method, an information processing apparatus, an electronic device and a storage medium to overcome the above drawbacks.
In a first aspect, an embodiment of the present application provides an information processing method, including: acquiring first position information of a target object in a real environment; obtaining second position information of virtual content in the real environment; acquiring the relative position relation between the target object and the virtual content in the real environment according to the first position information and the second position information; and if the relative position relation meets a specified condition, triggering the target object to execute a specified operation or triggering the target object and the virtual content to execute a specified operation.
In a second aspect, an embodiment of the present application further provides an information processing apparatus, including a first acquisition unit, a second acquisition unit, a determination unit and a processing unit. The first acquisition unit is configured to acquire first position information of a target object in a real environment. The second acquisition unit is configured to acquire second position information of virtual content in the real environment. The determination unit is configured to obtain a relative position relationship between the target object and the virtual content in the real environment according to the first position information and the second position information. The processing unit is configured to, if the relative position relationship satisfies a specified condition, trigger the target object to execute a specified operation or trigger the target object and the virtual content to execute a specified operation.
In a third aspect, an embodiment of the present application further provides an electronic device, including: one or more processors; a memory; one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the above-described method.
In a fourth aspect, the present application further provides a storage medium storing program code executable by a processor; when executed by the processor, the program code causes the processor to perform the above method.
According to the information processing method, the information processing apparatus, the electronic device and the storage medium provided above, first position information of a target object in a real environment and second position information of virtual content in the real environment are acquired, and the relative position relationship between the target object and the virtual content in the real environment is determined from the first position information and the second position information. This relative position relationship reflects where the real target object and the virtual content are with respect to each other in the space of the real environment; when it satisfies a specified condition, the target object is triggered to execute a specified operation, or the target object and the virtual content are both triggered to execute specified operations. In this way, the interaction between the target object and the virtual content can be controlled according to their mutual positions in the space of the real environment, which improves the interactivity between the virtual content and real objects in the real environment.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an AR device according to an embodiment of the present application;
Fig. 2 is a schematic diagram of an AR device according to another embodiment of the present application;
Fig. 3 is a flow chart of an information processing method according to an embodiment of the present application;
Fig. 4 is a flow chart of an information processing method according to another embodiment of the present application;
Fig. 5 is a schematic diagram of a relative position relationship between virtual content and a real object according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a relative position relationship between virtual content and a real object according to another embodiment of the present application;
Fig. 7 is a flow chart of an information processing method according to a further embodiment of the present application;
Fig. 8 is a schematic diagram of a movement instruction according to an embodiment of the present application;
Fig. 9 is a block diagram of an information processing apparatus according to an embodiment of the present application;
Fig. 10 is a block diagram of an information processing apparatus according to another embodiment of the present application;
Fig. 11 is a block diagram of an electronic device according to an embodiment of the present application;
Fig. 12 is a schematic diagram of a storage unit for storing or carrying program code for implementing a method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
With the continuous development of technology, a number of difficulties in Augmented Reality (AR) have been solved one after another. Augmented reality enhances a user's perception of the real world with information provided by a computer system: content objects such as computer-generated virtual content, scenes or system prompts are superimposed on a real scene to enhance or modify the perception of the real-world environment, or of data representing that environment. By wearing a head-mounted display device such as AR glasses, or through an AR-related application on a mobile terminal, the user can thus observe an augmented-reality or mixed-reality effect in which the virtual content is superimposed on the real world.
For example, some AR glasses collect information about the real environment through their camera and sensors, process it with a processor and specific algorithms, and render a corresponding image onto the glasses, so that the user perceives the virtual image as fused with the real world.
Some AR glasses have a split-type structure: the processing module is not on the glasses themselves but in a separate computing unit connected to the AR glasses by a cable. Data collected by the camera and sensors of the AR glasses are preprocessed and sent to the computing unit, which renders the corresponding virtual image according to the collected information and transmits it to the AR glasses for display.
Fig. 1 shows an AR device, which may be a head-mounted display device, in particular AR glasses. As shown in fig. 1, the head-mounted display device includes a display screen 110, a frame 120 and an imaging device 130.
The frame 120 includes a front surface 121 on which the display screen 110 is mounted, a side surface 122, and a rear surface 123, and the imaging device 130 is capable of displaying an image of virtual content on the display screen 110. For example, the imaging device 130 may be a diffractive light guide capable of projecting an image onto a display screen.
As an embodiment, the display screen 110 may be a lens of the AR glasses and may also transmit light, that is, the display screen 110 may be a transflective (semi-transparent, semi-reflective) lens. When the user wears the head-mounted display device and an image is displayed on the display screen 110, the user can see both the image displayed on the display screen 110 and the real-world objects in the surrounding environment through it. Through the transflective lens, the image displayed on the lens is thus superimposed on the surrounding environment, achieving the visual effect of augmented reality.
When the user wears the head-mounted display device, the display screen 110 is located in front of the eyes of the user, that is, the front surface 121 is located in front of the eyes of the user, the rear surface 123 is located behind the eyes of the user, and the side surface 122 is located at the side of the eyes of the user.
In addition, a front camera is disposed on the front surface 121. The environment in front is sensed through the front camera, so as to realize simultaneous localization and mapping (SLAM) and further achieve the visual effect of augmented reality or mixed reality.
In other AR technologies, a front-facing camera may be used to blend real scenes with virtual content. Specifically, the viewing direction of the front camera on the front surface of the head-mounted display device may coincide with the user's viewing direction when the device is worn. The front camera captures an image of the real scene, the captured image is processed and displayed on the display screen in front of the user's eyes, and an image of the virtual content may be superimposed on the image of the real scene, so that the user observes the visual effect of augmented reality.
As another embodiment, the AR effect may also be implemented by a mobile terminal or another terminal with a screen, such as a tablet or a computer device. Specifically, fig. 2 shows another AR device, which may be a user terminal including a camera and a screen. As shown in fig. 2, the image of the real scene displayed on the screen (an indoor scene with a table lamp, a sofa and so on) may be an image captured by the camera of the terminal, and display content corresponding to virtual content A, for example a picture, is shown on the screen. The picture corresponding to virtual content A is added to the image of the real scene displayed on the screen, so that the user sees virtual content A (the sphere in fig. 2) placed in the real scene, i.e. an augmented-reality display effect in which virtual content A is disposed in the real scene.
Specifically, the steps by which the terminal shown in fig. 2 achieves the augmented-reality effect include tracking, scene understanding and rendering. Tracking provides the relative position of the terminal in the real environment. In particular, an accurate view of where the terminal is located and how the device is oriented can be provided by a visual odometer, which uses camera images and motion data from the device.
Scene understanding refers to determining properties or characteristics of the environment around the device. For example, surfaces or planes in the real environment, such as a floor or a table, can be found with a plane-detection function. In order to place virtual content, the terminal also needs to provide a hit-test function, which obtains intersections with the real-world topology, i.e. the specific location of each object in the real environment, for example the height of a table above the ground or its distance from the terminal, so that the virtual content can be placed in the real environment. Finally, light estimation can be performed as part of scene understanding. Light estimation is used to illuminate the virtual content so that it matches the real world; for example, when virtual content is placed in the real world, its shadow under the real-world illumination is displayed, which increases the realism of the virtual content and its consistency with the real world.
The real environment displayed on the screen may be an image of the current environment acquired by the camera of the terminal, or an image of a real scene received in real time from another terminal. For example, during video interaction between the terminal and another terminal, the screen in fig. 2 may display the image, acquired in real time by the camera of the other terminal, of the real scene in which that terminal is located.
However, existing AR scenes are not sufficiently interactive. In particular, the inventors found in research that, no matter how the AR effect is achieved, the interaction between the virtual content and the real environment is usually insufficient. In existing single-user or multi-user AR scenes, the virtual content is merely placed in the real environment, for example a 3D model placed on a plane of the real environment; or different pieces of virtual content interact with each other, such as two virtual 3D models colliding in a multi-player AR game.
Thus, in existing AR scenes, the display of and interaction with virtual content does not involve the real environment: the user only sees virtual content placed in the real environment through the camera interface or the display, the interactivity of AR is too low, and the characteristics of AR are not well reflected.
In order to overcome the above drawbacks, embodiments of the present application provide an information processing method, which may be applied to the devices shown in fig. 1 and fig. 2. In the embodiments of the present application, the method may be applied to the electronic device shown in fig. 2, that is, the execution subject of the method may be a processor or a client in that electronic device. It should also be noted that the execution subject of the method may be the head-mounted display device of fig. 1; specifically, if a processor is disposed in the head-mounted display device, the execution subject may be that processor. As one implementation, and to better illustrate the effect of the embodiments, the execution subject of the method in the embodiments of the present application is taken to be a mobile terminal, such as the electronic device shown in fig. 2; this does not limit the field of application of the method, i.e. the devices and application environments to which the present application is applicable are not limited.
Referring to fig. 3, fig. 3 shows an information processing method for improving interactivity between virtual content and real objects in an AR scene, specifically, the method includes: s301 to S304.
S301: first position information of a target object within a real environment is acquired.
The real environment may correspond to a world coordinate system of the real world, and the first position information of the target object in the real environment may refer to the physical coordinates of the target object in that world coordinate system. The world coordinate system may be established with the AR terminal as its center, that is, the position of the AR terminal in the real-world scene is used as the origin of the world coordinate system. The terminal here is the device used to display the virtual object in the real world and may be the execution subject of the method, i.e. the mobile terminal described above.
Wherein the target object is a real object within the real environment. As an embodiment, when there are multiple objects within the real environment, the target object may be multiple objects within the real environment.
The mobile terminal can scan the surrounding environment according to a preset positioning algorithm to establish a world coordinate system centred on the terminal, and determine the coordinate position of each real object of the real environment in that world coordinate system; this coordinate position serves as the first position information of the real object in the real environment. In one implementation, a camera and an inertial measurement unit are provided in the mobile terminal, and the world coordinate system corresponding to the real environment, i.e. the terminal-centred world coordinate system, can be established from the images of the surrounding environment acquired by the camera and the pose information of the mobile terminal obtained by the inertial measurement unit, so as to obtain the coordinate position of each real object in the world coordinate system.
Specifically, the SLAM technique mentioned above may be used to understand the surrounding real environment and track the real objects in it. With SLAM, a world coordinate system taking the terminal as its origin is constructed from the images of the surrounding environment collected by the camera and the pose information of the mobile terminal obtained by the inertial measurement unit. A dense 3D point cloud is then obtained using a time-of-flight (TOF) depth camera; from this point cloud, the 3D coordinates in the world coordinate system of each point on the surface of a real object can be obtained, and these 3D coordinates can be used as the first position information of each real object in the real environment.
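For illustration only (this is not part of the claimed method), the following sketch shows how first position information could be derived once the SLAM/TOF pipeline has produced a labelled 3D point cloud in the terminal-centred world coordinate system; the per-point object labels and the use of a centroid as the representative coordinate are assumptions of this sketch.

```python
import numpy as np

def first_position_info(points_xyz, labels):
    """Return, for each labelled real object, a representative world
    coordinate computed as the centroid of its 3D point-cloud points.

    points_xyz: (N, 3) array of points in the terminal-centred world frame.
    labels:     length-N array of per-point object labels (assumed given).
    """
    positions = {}
    for obj in np.unique(labels):
        obj_points = points_xyz[labels == obj]
        positions[obj] = obj_points.mean(axis=0)  # centroid as "first position information"
    return positions

# Tiny usage example with synthetic data.
pts = np.array([[1.0, 0.0, 2.0], [1.2, 0.1, 2.1], [-0.5, 0.0, 3.0]])
lbl = np.array(["lamp", "lamp", "sofa"])
print(first_position_info(pts, lbl))
```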
S302: second location information of virtual content within the real environment is obtained.
As an embodiment, the first position information of the target object in the real environment may refer to coordinate information of the target object in a world coordinate system corresponding to the real environment, where the world coordinate system may be a world coordinate system established with the mobile terminal as a center, and the second position information of the virtual content in the real environment may also be coordinate information of the virtual content in the world coordinate system.
Specifically, the second position information of the virtual content in the real environment has a mapping relationship with the display information of the display content corresponding to the virtual content on the display screen of the mobile terminal, and the display information includes information such as the display size, the shape, and the position of the display content corresponding to the virtual content.
In one embodiment, in the image of the real environment captured by the camera, the real object is expressed in the camera coordinate system of that camera. The Z axis of the camera coordinate system coincides with the optical axis of the camera, i.e. the optical axis direction is the Z-axis direction, and the XOY plane formed by the X and Y axes is perpendicular to the Z axis. The coordinates of the real object in the camera coordinate system can then be determined: for example, from the mapping between the pixel coordinate system of the image acquired by the camera and the camera coordinate system, the coordinates in the camera coordinate system of the pixels belonging to each real object in the image can be determined. These coordinates include the depth information of the real object; for example, the projection of the real object's coordinates onto the Z axis of the camera coordinate system is its depth information. The change in the depth of field of the real object can be determined from the change of its coordinates in the camera coordinate system, and the distance from the real object to the camera can be determined from that change in depth of field.
In an embodiment in which the virtual content is displayed in the real environment, the virtual content is shown on a plane or at the position of an object in the real environment; the distance between that object and the camera can be determined, and from it the depth-of-field information corresponding to the virtual content. Then, according to a preset correspondence between depth-of-field information and the contour information of the display content corresponding to the virtual content, the contour information corresponding to the depth of field of the target object can be determined, where the contour information includes the shape and size of the display content corresponding to the virtual content. For example, following the rule that the larger the distance (i.e. the depth information), the smaller the contour, the correspondence between different positions in the real environment and the contour information of the display content corresponding to the virtual content can be set in advance, and the shape and size of the display content can then be determined.
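As a minimal illustration of the rule just described (larger depth of field, smaller contour), the sketch below scales the on-screen size of the virtual content inversely with depth; the inverse-proportional mapping and the concrete numbers are assumptions, since the embodiment only states that such a correspondence is preset.

```python
def display_size_for_depth(base_size_px, base_depth_m, depth_m):
    """Scale the on-screen size of the virtual content with depth of field.

    Assumed mapping: simple pinhole-style inverse proportionality, so that a
    larger depth of field yields a smaller displayed contour.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return base_size_px * (base_depth_m / depth_m)

# The same virtual content drawn at 1 m and at 2 m from the camera.
print(display_size_for_depth(200.0, 1.0, 1.0))  # 200.0 px
print(display_size_for_depth(200.0, 1.0, 2.0))  # 100.0 px
```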
As another embodiment, the coordinate relationship between each pixel in the screen's pixel coordinate system and each position point in the real environment may also be determined, specifically by means of the intrinsic and extrinsic parameters of the camera, for example obtained with Zhang Zhengyou's calibration method. For example, as shown in fig. 2, virtual content A is displayed on the ground near the desk lamp in the real environment; the position, in the pixel coordinates of the screen, of the ground near the desk lamp in the image of the real environment can be determined, so the display content corresponding to the virtual content is determined to be displayed at that position in the image of the real environment shown on the screen.
Therefore, by means of the predetermined mapping between the pixel coordinates of each pixel of the image displayed on the screen and the world coordinates of each position point in the world coordinate system, the position in the real environment of each piece of display content shown on the screen can be determined, and hence the position of the virtual content in the real environment, i.e. the second position information of the virtual content in the real environment.
For example, suppose the virtual content needs to be displayed at a position A in a specified coordinate system, i.e. when using the mobile terminal the user should see a virtual object displayed at position A in real space. It is then determined that the display position on the display screen corresponding to position A is a position B; when the virtual object is displayed at position B on the display screen of the user equipment, the user sees, through the display screen, a virtual object displayed at position A.
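A minimal sketch of the mapping from a world position A to a display position B, using a standard pinhole camera model; the intrinsic and extrinsic parameter values below are made-up illustrations (in practice they would come from calibration such as the method mentioned above).

```python
import numpy as np

def world_to_screen(p_world, R, t, fx, fy, cx, cy):
    """Map a world-coordinate point (position A) to a pixel on the display
    (position B).  R, t are the camera extrinsics; fx, fy, cx, cy are the
    intrinsics."""
    p_cam = R @ p_world + t              # world frame -> camera frame
    x, y, z = p_cam
    if z <= 0:
        return None                      # behind the camera, not visible
    u = fx * x / z + cx                  # perspective projection to pixels
    v = fy * y / z + cy
    return u, v

R = np.eye(3)                            # camera aligned with the world frame
t = np.zeros(3)
print(world_to_screen(np.array([0.2, -0.1, 2.0]), R, t, 800.0, 800.0, 320.0, 240.0))
```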
Specifically, the second position information of the virtual content in the real environment may be the position information of the virtual content within a virtual space corresponding to the real environment.
Wherein, virtual space refers to a fully or partially artificial environment, which may be three-dimensional; a virtual scene refers to a representation of a virtual space viewed from a particular viewpoint within the virtual space. Real space, as opposed to virtual space, refers to a real environment, which may be three-dimensional; and a real scene refers to a representation of a real space viewed from a particular viewpoint within the real space. The virtual scene may be an AR scene, a VR scene, or an MR scene, and the virtual space refers to a virtual space in the AR scene, the VR scene, or the MR scene.
The coordinate system of the virtual space is matched with the world coordinate system of the real environment; it can be understood that the coordinate system of the virtual space has been aligned to the world coordinate system of the real environment. That is, if the coordinates of the target object in the world coordinate system of the real environment are (x1, y1, z1) and the display position of the virtual content coincides exactly with the target object, then the coordinates of the virtual content in the coordinate system of the virtual space are also (x1, y1, z1). In other words, the coordinate system of the virtual space coincides with the world coordinate system of the real environment, so that, from the user's viewpoint, the virtual content appears to exist in the real environment just like a real object. The second position information of the virtual content and the first position information of the real object may therefore both be coordinate points in the above world coordinate system.
For example, if a table exists in the real world and a piece of virtual content, say a virtual table lamp, needs to be displayed on its tabletop, the position of the tabletop in the spatial coordinate system, i.e. the designated coordinate system, is determined first. Then, according to the mapping established in advance between the pixel coordinate system of the display screen and the designated coordinate system, the pixel coordinates corresponding to the position of the tabletop are found, so the display position on the display screen corresponding to the tabletop can be found, and the image corresponding to the virtual object is displayed at that display position. The scene the user sees through the display screen is then a table lamp standing on the real-world tabletop. Both the virtual table lamp and the table have position information with coordinates in the world coordinate system of the real environment.
S303: and acquiring the relative position relation between the target object and the virtual content in the real environment according to the first position information and the second position information.
The first position information reflects the position of the real object, i.e. the target object, in the real environment, that is, its position in the world coordinate system corresponding to the real environment; the second position information reflects the position of the virtual content in the real environment. Since the first and second position information refer to positions in the same spatial coordinate system, together they reflect the relative position relationship between the target object and the virtual content in the real environment.
Wherein the relative position relationship includes at least one of a relative motion relationship and a relative orientation relationship.
The relative motion relationship includes relatively approaching and relatively receding. Approaching and receding are defined relative to the position of the head-mounted display device, i.e. the user: moving towards or away from the user. Relatively approaching may mean that the depth of field of the target object gradually decreases, or that the distance between the target object and the virtual content in the world coordinate system gradually decreases; similarly, relatively receding may mean that the depth of field of the target object gradually increases, or that the distance between the target object and the virtual content in the world coordinate system gradually increases.
Specifically, the relative motion relationship between the target object and the virtual content may be determined by continuously acquiring the first position information and the second position information. For example, the first and second position information are coordinates in the world coordinate system; by continuously acquiring a sequence of coordinates of the target object and of the virtual content in that coordinate system, the change of position between their coordinate points can be determined. Further, the direction in which they approach or move away from each other may be determined from the changes of the two coordinates along the three coordinate axes of the world coordinate system, for example approaching each other in the vertical direction or moving apart in the horizontal direction.
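For illustration, a small sketch that classifies the relative motion relationship from successive coordinate samples of the target object and the virtual content; the tolerance value is an assumption of this sketch.

```python
import numpy as np

def relative_motion(obj_positions, content_positions):
    """Classify the relative motion relationship from successive (x, y, z)
    samples of the target object and the virtual content in the world frame.
    Returns "approaching", "receding" or "static"."""
    dists = [np.linalg.norm(np.asarray(o) - np.asarray(c))
             for o, c in zip(obj_positions, content_positions)]
    delta = dists[-1] - dists[0]          # overall change in mutual distance
    if delta < -1e-3:
        return "approaching"              # relatively close
    if delta > 1e-3:
        return "receding"                 # relatively far
    return "static"

obj = [(0.0, 0.0, 2.0), (0.0, 0.0, 1.8), (0.0, 0.0, 1.5)]
content = [(0.0, 0.0, 1.0)] * 3           # virtual content held still
print(relative_motion(obj, content))      # approaching
```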
The relative orientation relationship may be an orientation relationship between the target object and the virtual content, for example, the virtual content is located in front of, behind, in front of left or right of the target object, and the like, and may also be an occlusion relationship between the target object and the virtual content, for example, the target object is occluded by the virtual content or the virtual content is occluded by the target object. Specifically, the occlusion determination may be performed in a collision detection manner, and specifically, reference may be made to subsequent embodiments, which are not described herein again.
S304: and if the relative position relation meets a specified condition, triggering the target object to execute a specified operation or triggering the target object and the virtual content to execute a specified operation.
The specified condition may be set in advance according to a requirement, for example, if the relative position relationship is a relative motion relationship, the specified condition is relatively close or relatively far, as an embodiment, when the target object and the virtual content are relatively close in the real environment, it is determined that the relative position relationship satisfies the specified condition.
As another embodiment, if the relative positional relationship is the orientation relationship between the target object and the virtual content, the specified condition is mutual collision, i.e., mutual occlusion. As one embodiment, when a target object and the virtual content relatively collide within the real environment, it is determined that the relative positional relationship satisfies a specified condition.
In this embodiment, the condition that the relative position satisfies the specified condition may be used as a trigger condition for triggering the target object to perform the specified operation or triggering the target object and the virtual content to perform the specified operation. Specifically, a designation operation of the target object or the target object corresponding to the virtual content may be set in advance, and when it is determined that the relative positional relationship satisfies a designation condition, the designation operation is triggered to be executed.
The specifying operation may be preset, and for the target object, the specifying operation may be that the terminal sends a control instruction to the target object, and the control instruction can control the target object to perform the specifying operation, for example, the target object is a table lamp, and when the relative position relationship between the virtual content and the target object meets a specified condition, the table lamp is controlled to turn on or turn off the lamp.
As another embodiment, the target object and the virtual content may both be controlled to perform the specified operation: when the relative position relationship between the virtual content and the target object satisfies the specified condition, the terminal controls both of them, and the specified operation may include a first action corresponding to the target object and a second action corresponding to the virtual content. The terminal sends a first action instruction to the target object to instruct it to execute the first action, acquires rendering data corresponding to the virtual content, and changes the display content of the virtual content according to that rendering data, so that the virtual content executes the second action in the real environment. For example, if the second action is to make the virtual content rotate, the rendering data is a moving image of the virtual content, i.e. a sequence of images which together form the animation of the displayed content; the picture of the virtual content shown on the screen is changed so that the virtual content performs the second action in the real environment.
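A minimal dispatch sketch of this embodiment: when the condition is met, a first action is sent to the real target object and a second action is rendered for the virtual content. The helper names, the "toggle_light" action and the frame file names are hypothetical stand-ins for the terminal's real control channel and rendering pipeline.

```python
def trigger_specified_operation(relation_ok, target_object, virtual_content):
    """Dispatch the specified operation once the relative position relation
    satisfies the specified condition (sketch only)."""
    if not relation_ok:
        return
    # First action: instruct the real target object (e.g. a desk lamp) to act.
    send_control_instruction(target_object, action="toggle_light")
    # Second action: change the display content of the virtual content, e.g.
    # play a sequence of rendered images so the content appears to rotate.
    frames = [f"rotation_frame_{i:02d}.png" for i in range(12)]
    render_frames(virtual_content, frames)

def send_control_instruction(device, action):
    # Placeholder for the terminal's real control channel.
    print(f"control instruction -> {device}: {action}")

def render_frames(content, frames):
    # Placeholder for the terminal's rendering pipeline.
    print(f"render {len(frames)} frames for {content}")

trigger_specified_operation(True, "desk lamp", "virtual sphere")
```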
As an embodiment, the target object may be an electronic device that can be controlled by a terminal, for example, the target object may be an electric device such as a desk lamp, an electric switch, a television, a refrigerator, and the like, the target object is connected to the terminal, and the terminal can send a control instruction to the target object and control the target object to perform a specified operation.
In particular, the target object may be a controlled device within an internet of things. The internet of things is a network concept that, on the basis of the internet, extends the user side of the network to arbitrary articles so that they can exchange information and communicate. With the development of internet-of-things technology, scenes can be configured in an internet-of-things system; a configured scene may involve several controlled devices that have a certain linkage relationship and work cooperatively.
The controlled device may be a projector, a projection screen, a smart lamp, a smart socket, a human-body sensor, a door or window sensor, a wireless switch, an air-conditioner companion, a smoke alarm, a smart curtain motor, an air purifier, a smart speaker, or another mobile terminal. In one embodiment of the internet-of-things system, the controlling electronic device (such as the mobile terminal) can exchange data with the controlled device by establishing a wireless connection directly with the router, or, after the electronic device is connected to the cloud, through the data link between the cloud and the router. Alternatively, the controlled device may connect to the router through a gateway. The data interaction may include the mobile terminal sending a control instruction to the controlled device, and the controlled device returning status information or the result of executing the instruction to the mobile terminal. The data interaction between the mobile terminal and the controlled device can be triggered by a client installed on the mobile terminal.
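Purely as an illustration of sending a control instruction and receiving the device's returned status, the sketch below posts a JSON command to a local gateway; the URL, JSON fields and use of HTTP are hypothetical, since the embodiment only states that the terminal and controlled device exchange data via a router, a gateway or the cloud.

```python
import json
import urllib.request

def send_iot_instruction(gateway_url, device_id, command):
    """Send a control instruction to a controlled device through a LAN
    gateway/router and return its reported status (assumed transport)."""
    payload = json.dumps({"device": device_id, "command": command}).encode()
    req = urllib.request.Request(gateway_url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=2.0) as resp:
        return json.load(resp)           # e.g. {"status": "ok"} from the device

# Example call (the address is a placeholder for a local gateway):
# send_iot_instruction("http://192.168.1.10/api/control", "smart_lamp_01", "turn_on")
```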
As another embodiment, the target object is an electronic device but is not connected to the mobile terminal, or cannot be controlled by the mobile terminal directly. In some embodiments the target object can be controlled through an operation terminal, i.e. the target object is a controlled device of the operation terminal. Specifically, when the mobile terminal determines that the relative position relationship satisfies the specified condition, it sends a control instruction to the operation terminal, and the operation terminal controls the target object to perform the specified operation in response to the control instruction. The operation terminal may, for example, be a control center of the internet of things, or another electronic device able to exchange data with both the mobile terminal and the target object.
In yet another embodiment, the target object cannot communicate with the electronic device. For example, the target object is not an electronic device; or it is an electronic device but is damaged and cannot start normally, so it cannot communicate with the mobile terminal or other electronic devices; or it is an electronic device in normal use but in a connectionless state, in which it blocks data connections, for example with its wireless and wired communication modules turned off, so that the mobile terminal or other electronic devices cannot establish a wireless or wired communication connection with it.
When the target object cannot communicate with the electronic device, one way to trigger it to perform the specified operation is to send a manipulation instruction to an intelligent manipulation device. The intelligent manipulation device executes, in response to the instruction, a predetermined action corresponding to it, and this action acts on the target object so that the target object performs the specified operation. As one embodiment, the predetermined action changes the state of the target object, where the state includes its pose information, i.e. its position, orientation and posture. In this embodiment, the specified operation may be a movement of the target object, for example moving it a certain distance in a specified direction, which may be the same direction in which the virtual content approaches the target object, or moving it to a specified position. The intelligent manipulation device may be any device capable of moving the target object, for example a robot arm or a mobile platform such as a conveyor belt.
Taking a robot arm as the intelligent manipulation device, the specified operation may be moving the target object from its current position to a specified position. The target object can be triggered to execute this operation as follows: the mobile terminal sends a control instruction to the intelligent manipulation device, which in response grasps the target object, moves to the specified position and places the target object there, so that the specified operation is completed, i.e. the target object has been moved from its current position to the specified position.
As another embodiment, the state may further include the working state of the target object, and the target object is provided with a working-state switch key. In this embodiment the specified operation may be to change the working state of the target object; specifically, the intelligent manipulation device is controlled to touch the working-state switch key and change its state, thereby changing the working state of the target object. In some embodiments the intelligent manipulation device may be an intelligent robot that the mobile terminal can control to move to the position of the target object and touch the working-state switch key on it. Triggering the target object to perform the specified operation may then proceed as follows: the mobile terminal sends a control instruction to the intelligent manipulation device, which in response touches the working-state switch key on the target object so that the key enters a specified state; when the key is in the specified state, the target object performs the specified operation.
For example, take an illuminating lamp as the target object and an intelligent robot as the intelligent manipulation device, where the lamp is of the type that cannot communicate with electronic devices. The specified operation triggered by the mobile terminal is to turn the lamp on, and the working-state switch key of the lamp is its power switch. The mobile terminal sends a control instruction to the intelligent robot, which in response controls its arm to press the switch so that it is in the on state, thereby turning the lamp on.
Referring to fig. 4, fig. 4 shows an information processing method for improving interactivity between virtual content and real objects in an AR scene, specifically, the method includes: s401 to S406.
S401: first position information of a target object within a real environment is acquired.
S402: second location information of virtual content within the real environment is obtained.
S403: and acquiring the relative position relation between the target object and the virtual content in the real environment according to the first position information and the second position information.
Specifically, the environment is understood and tracked using images acquired by the terminal and the input of the IMU, i.e. the real environment is scanned according to the SLAM technique described above. Object depths are then measured with the TOF hardware and a dense 3D point cloud is built. From the 3D point cloud, the coordinates of the real objects of the real environment in the world coordinate system, i.e. the first position information, are obtained; from the mapping between the pixel coordinates of the terminal's screen and the world coordinates, the coordinates of the virtual object in the world coordinate system, i.e. the second position information, are obtained; and from these the relative position relationship between the target object and the virtual content in the real environment is obtained.
As one embodiment, the relative position relationship includes the distance difference between the target object and the virtual content in the real environment, and the relative position relationship satisfying the specified condition includes the target object and the virtual content colliding; that is, whether the specified condition is satisfied is determined by collision detection, i.e. by whether a collision occurs.
As one embodiment, whether the target object and the virtual content collide may be detected with a binary space partitioning (BSP) tree.
A binary space partitioning (BSP) tree is a space-partitioning technique used for world-object collision detection. Traversal of the BSP tree is one of the basic techniques for using BSP, and collision detection essentially reduces to tree traversal or search. The method only needs to test collisions against a small number of faces, because it can exclude a large number of polygons at an early stage. In particular, finding a separating plane between two objects is a suitable way to determine whether they intersect: if a separating plane exists, no collision occurs. The world tree is therefore traversed recursively, checking whether each splitting plane intersects a bounding sphere or bounding box; accuracy can be improved further by testing the polygons of each object. One of the simplest forms of this test is to check whether all parts of an object lie on one side of the splitting plane, using the Cartesian plane equation ax + by + cz + d = 0 to determine on which side of the plane a point lies: if the equation is satisfied, the point is on the plane; if ax + by + cz + d > 0, the point is in front of the plane; if ax + by + cz + d < 0, the point is behind the plane.
When no collision occurs, an object (or its bounding box) must lie entirely in front of or behind the splitting plane. If an object has vertices both in front of and behind a plane, the object intersects that plane, i.e. the two objects have collided.
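A minimal sketch of the per-plane test just described (the full BSP-tree traversal is omitted); the unit cube and splitting plane are made-up example data.

```python
def plane_side(point, plane):
    """Evaluate ax + by + cz + d for a point; > 0 front, < 0 back, 0 on plane."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z + d

def straddles_plane(vertices, plane, eps=1e-9):
    """Return True if the object's vertices lie on both sides of the
    splitting plane, i.e. the object intersects the plane (a collision in
    the sense described above)."""
    sides = [plane_side(v, plane) for v in vertices]
    return any(s > eps for s in sides) and any(s < -eps for s in sides)

# Unit cube vertices straddling the plane x = 0.5, i.e. 1*x + 0*y + 0*z - 0.5 = 0.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
print(straddles_plane(cube, (1.0, 0.0, 0.0, -0.5)))   # True -> intersection
```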
As another embodiment, whether the target object and the virtual content collide may be determined from the distance difference between them in the real environment, i.e. by checking whether that distance difference is smaller than a specified threshold; if it is, the relative position relationship between the target object and the virtual content in the real environment is determined to satisfy the specified condition. The specified threshold may be predetermined and is typically small, for example between 0 and 1 centimeter.
S404: and if the relative position relation meets a specified condition, determining the object identification of the target object based on the pre-acquired image of the target object.
The object identifier of the target object may be information describing its type, model, size parameters and so on. For example, if the object identifier is the type of the target object, the object type can be determined from the acquired image of the target object. As one implementation, the mobile terminal is provided with a camera that can acquire an image of the real scene in which the target object is located, such as the image shown in fig. 2; the types of the real objects in the image are then analysed and extracted, and the obtained type of each object is used as its object identifier. For example, fig. 2 contains at least a floor lamp, a sofa, a table, a cup and a potted plant.
S405: and determining the specified operation according to the object identification.
In one embodiment, a correspondence between object identifiers and operations is set in advance, and after the object identifier of the target object is acquired, the specified operation corresponding to that identifier is looked up in the correspondence.
As another embodiment, the relative position relationship may also be taken into account when determining the specified operation for certain objects. Specifically, determining the specified operation according to the object identifier may be implemented as determining the specified operation according to both the object identifier and the relative position relationship.
In some embodiments, it is determined whether the object identifier matches a specified identifier; if it does, motion data corresponding to the relative position relationship is acquired and the specified operation is determined from the motion data. Two classes of object identifiers can be preset, specified and non-specified. The specified operation for an object with a non-specified identifier can be preset and is independent of the motion data: whatever the motion data, as long as the relative position satisfies the specified condition, the object is triggered to execute that operation. An object with a specified identifier corresponds to several operations, and one of them is selected as the specified operation according to the motion data corresponding to the relative position relationship. The motion data includes the relative motion direction and the relative rate of change of motion between the target object and the virtual content. The relative motion direction may be the direction in which the virtual content moves with respect to the target object, e.g. the virtual content approaches the target object from its left side; the relative rate of change may refer to a relative acceleration, e.g. the acceleration with which the virtual content moves towards or away from the target object.
In one embodiment, a correspondence between motion data and operations is preset for the target object; after the motion data is acquired, the operation corresponding to it is taken as the specified operation of the target object. For example, the target object is a robot and the virtual content is a virtual boxing glove: when the virtual content hits the robot's head from the left, the head turns to the right, and when it hits from the right, the head turns to the left.
As another embodiment, the target object may follow the movement of the virtual content, and its category may be that of a freely movable object, for example an electric toy car. The specified identifier is therefore the identifier of freely movable objects collected in advance; if the object identifier of the target object matches the specified identifier, the target object belongs to the freely movable objects and can be triggered to move according to the movement of the virtual content. When the motion data is determined, it includes a relative motion direction, which may specifically be the direction in which the virtual content moves towards or away from the target object. The movement direction of the target object is then determined from this relative motion direction, and the target object is triggered to move in that direction, which is the same as the relative motion direction.
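For illustration, a small sketch of choosing the specified operation from the object identifier and, for freely movable objects, from the motion data; the identifier sets and action names are assumptions drawn loosely from the examples above, not part of the disclosure.

```python
def determine_specified_operation(object_id, motion=None):
    """Pick the specified operation from the object identifier and, for
    objects carrying the specified identifier, from the motion data."""
    # Objects without the specified identifier: a fixed operation per object.
    fixed_ops = {"desk lamp": "toggle_light", "television": "power_on"}
    # Objects with the specified identifier (freely movable): the operation
    # depends on the relative motion direction of the virtual content.
    movable = {"toy car", "robot"}

    if object_id in movable and motion is not None:
        return {"op": "move", "direction": motion["relative_direction"]}
    return {"op": fixed_ops.get(object_id, "none")}

print(determine_specified_operation("desk lamp"))
print(determine_specified_operation("toy car", {"relative_direction": "left_to_right"}))
```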
As an embodiment, it is considered that the target object has a certain volume and shape; that is, the target object as a whole can be divided into a plurality of parts, and each part may carry different semantics, where the semantics are terms expressing the meaning of a part or of the whole of the object, such as the table top and legs of a table, or the head, nose, eyes and hands of a person.
Specifically, in some embodiments, the object identifier of the target object includes at least one part of the target object and a part identifier of each of the parts. The above-mentioned acquiring of the first position information of the target object in the real environment may be implemented as acquiring sub-position information of each part of the target object in the real environment; that is, the first position information includes the sub-position information of each part of the target object in the real environment.
As an embodiment, after the above-mentioned 3D point cloud is acquired, the coordinate points of each part of the target object in the world coordinate system, that is, the sub-position information of each part of the target object in the real environment, can be obtained. Determining the specified operation according to the object identifier may then be implemented as follows: based on the sub-position information of each part of the target object, the part whose relative positional relationship with the virtual content in the real environment satisfies the specified condition is determined as the target part, and the specified operation is determined based on the part identifier of the target part.
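For illustration only, assuming each part's sub-position information is reduced to a representative world-coordinate point (for example, the centroid of that part's points in the 3D point cloud) together with an approximate radius, the target part may be selected roughly as follows; the threshold and radii are hypothetical.

import numpy as np

def find_target_part(part_positions, part_radii, virtual_pos, threshold=0.05):
    """Return the part identifier whose relative positional relationship with the virtual
    content satisfies the specified condition (here: the distance falls below a threshold,
    i.e. a collision is assumed).

    part_positions: {part_id: (x, y, z)} sub-position of each part in world coordinates.
    part_radii:     {part_id: approximate radius of the part}.
    """
    virtual_pos = np.asarray(virtual_pos, dtype=float)
    for part_id, pos in part_positions.items():
        distance = np.linalg.norm(np.asarray(pos, dtype=float) - virtual_pos)
        if distance < part_radii.get(part_id, 0.0) + threshold:
            return part_id  # this part is the target part
    return None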
Specifically, the sub-position information of each part of the target object in the real environment is acquired, and the relative positional relationship between the sub-position information of each part and the second position information of the virtual content in the real environment is obtained, so as to determine the part, named the target part, whose relative positional relationship with the second position information of the virtual content satisfies the specified condition.
Specifically, taking a collision as the case where the relative positional relationship satisfies the specified condition: based on the sub-position information of each part of the target object in the real environment, the part of the target object that collides with the virtual content in the world coordinate system corresponding to the real environment is determined as the target part, the part identifier corresponding to the target part is obtained, and the specified operation is determined based on that part identifier.
As an embodiment, the part identifier refers to an identifier of each part of the target object, for example the screen, keyboard and mouse of a computer, or the head, hands and belly of a robot.
The specified operation is then determined based on the part identifier of the target part. Specifically, a correspondence between the part identifiers of the different parts of the target object and actions is preset, and the action corresponding to the part identifier of the target part, that is, the specified operation, can be determined according to this correspondence.
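One possible shape for that preset correspondence is a simple lookup keyed by object identifier and part identifier; the entries below are invented examples, and the action found for the target part is taken as the specified operation.

# Hypothetical preset correspondence between part identifiers and actions.
PART_ACTIONS = {
    ("floor_lamp", "lamp_socket"): "turn_on",
    ("floor_lamp", "lamp_shade"): "change_brightness",
    ("robot", "head"): "rotate_head",
    ("robot", "hand"): "wave_hand",
}

def specified_operation_for_part(object_id, target_part_id):
    return PART_ACTIONS.get((object_id, target_part_id))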
The parts of the target object within the image and the part identifier of each part may be determined according to a pre-established model. Specifically, determining the object identifier of the target object based on a pre-acquired image of the target object may include: acquiring an image of the target object; determining at least one part of the target object based on feature points within the image; and acquiring the part identifier of each part according to a pre-trained neural network model, where the trained neural network model is trained on sample data, the sample data includes a plurality of object image samples, and each part of the object in each object image sample is labelled with its part identifier.
The feature points in the image may be contour lines in the image. According to the pre-established model, the feature points of the image are input into the model to identify the plurality of parts of the target object, where the model may be a neural network model.
In addition, a plurality of object image samples may be acquired in advance, each object image sample including an image of a real object, and the image of the real object is labelled. Specifically, each part of the object image and the part identifier corresponding to each part may be labelled manually, thereby obtaining the sample data for training the neural network model.
The sample data is then input into the neural network model, and the neural network model analyses and recognises each part in each image sample and the part identifier corresponding to each part. After repeated training with the previously labelled parts and part identifiers as the expected values, the trained neural network model can identify the plurality of parts of the target object and the part identifier of each part based on an image of the target object.
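The disclosure does not name a particular network architecture, so the following is only a toy sketch assuming a small PyTorch classifier that assigns a part identifier to an image crop around each detected part; the label set and the random tensors merely stand in for the manually labelled object image samples.

import torch
import torch.nn as nn

PART_LABELS = ["lamp_socket", "lamp_shade", "head", "hand"]  # hypothetical part identifiers

class PartClassifier(nn.Module):
    def __init__(self, num_parts=len(PART_LABELS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_parts)

    def forward(self, x):  # x: (N, 3, H, W) image crops around the detected feature points
        return self.classifier(self.features(x).flatten(1))

model = PartClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for the labelled object image samples; the labels are the expected values.
crops = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, len(PART_LABELS), (8,))

for _ in range(10):  # repeated training against the previously labelled part identifiers
    optimizer.zero_grad()
    loss = criterion(model(crops), labels)
    loss.backward()
    optimizer.step()

# Inference: predict a part identifier for each part crop of the target object.
with torch.no_grad():
    part_ids = [PART_LABELS[int(i)] for i in model(crops).argmax(dim=1)]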
S406: triggering the target object to execute the specified operation or triggering the target object and the virtual content to execute the specified operation.
Referring to fig. 5 and fig. 6, assume that the target object is a floor lamp including at least two parts, namely a lamp socket P1 and a lamp shade P2. As shown in fig. 5, when the relative positional relationship between the virtual content A and the lamp socket P1 satisfies the specified condition, that is, the virtual content A collides with the lamp socket P1, the specified operation is determined according to the lamp socket P1; for example, the specified operation determined according to the lamp socket P1 is turning the floor lamp on, and the terminal controls the floor lamp to turn on. As shown in fig. 6, when the relative positional relationship between the virtual content A and the lamp shade P2 satisfies the specified condition, that is, the virtual content A collides with the lamp shade P2, the specified operation is determined according to the lamp shade P2; for example, the specified operation determined according to the lamp shade P2 is changing the light emission of the floor lamp, such as changing its color temperature or its luminance.
Therefore, when the virtual content collides with different parts of the same target object, the actions that the target object is triggered to execute also differ.
For example, the target object is a switch on a wall surface, the virtual content collides with the switch on the wall surface, and the terminal can send a command to the switch controller to turn off the light.
In addition to controlling the target object to perform the specified operation, the virtual content may also be controlled to perform a specified operation, and that operation may likewise be related to the target part. Taking fig. 6 as an example, when the light emission of the floor lamp is changed, the virtual content may also be controlled to perform a specified operation, for example changing the brightness with which the virtual content is rendered as being illuminated by the lamp.
As an embodiment, executing the specified operation may change the working state of the target object. For example, the target object may be a desk lamp or a switch as mentioned above, and before the specified operation is executed the target object is in a first working state. When the relative positional relationship between the virtual content and the target object satisfies the specified condition, the target object is triggered to change from the first working state to a second working state, or the target object and the virtual content are triggered to change from the first working state to the second working state.
For example, taking the desk lamp as an example and taking the specified condition as the distance difference between the target object and the virtual content in the real environment being smaller than a specified threshold, the first working state of the desk lamp is the off state and the second working state is the on state. When the distance difference between the virtual content and the desk lamp becomes smaller than the specified threshold, the desk lamp changes from the off state to the on state.
If the target object and the virtual content are triggered to change from the first working state to the second working state, the desk lamp changes from the off state to the on state, and the brightness with which the virtual content is rendered as illuminated by the lamp changes from a first brightness to a second brightness, the first brightness being smaller than the second brightness.
The relative positional relationship then changes as the positions of the virtual content and the target object change. When the relative positional relationship no longer satisfies the specified condition, the target object is triggered to change from the second working state back to the first working state, or the target object and the virtual content are triggered to change from the second working state back to the first working state. For the desk lamp example, when the distance difference becomes greater than the specified threshold, the desk lamp changes from the on state to the off state, and the brightness of the virtual content illuminated by the lamp changes from the second brightness back to the first brightness. In addition, the specified operation may be determined not only from the part identifier of the target part but also from the moving speed of the virtual content while it is moving. In one embodiment, when the virtual content collides with the target object, the moving speed of the virtual content is determined, and the specified operation is determined based on the moving speed and the part identifier of the target part.
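A hedged sketch of the working-state switching described above, taking the distance difference against a specified threshold as the condition; lamp and virtual_content are assumed controller objects with hypothetical setter methods.

def update_working_state(distance, threshold, lamp, virtual_content):
    """Switch between the first and second working states as the relative position changes."""
    if distance < threshold:
        # The specified condition is satisfied: first working state -> second working state.
        lamp.set_state("on")
        virtual_content.set_lit_brightness("second")  # rendered brighter, lit by the lamp
    else:
        # The condition is no longer satisfied: second working state -> first working state.
        lamp.set_state("off")
        virtual_content.set_lit_brightness("first")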
Moreover, in the embodiments of the present application, controlling the virtual content to perform a specified operation is not limited to the collision scenario described above and can also be applied to other scenes. For example, in AR navigation, without collision detection a virtual object would pass through an obstacle it encounters, which gives a poor experience; by adding collision detection against real obstacles, the virtual object can go around the obstacle and then continue forward, which increases the realism of AR navigation and improves the user experience.
Referring to fig. 7, an information processing method for improving the interactivity between virtual content and real objects in an AR scene is shown; specifically, the method includes steps S701 to S705.
S701: controlling the virtual content to move within the real environment.
As an embodiment, in order to increase the interest of AR, after the virtual content is set within the real environment, the user may control the movement of the virtual content so that the relative positional relationship between the virtual content and the target object within the real environment changes.
The input mode of the movement instruction may be an instruction input by the user on the screen of the terminal, or an instruction input by the user through voice.
In this embodiment, the movement instruction may be input by the user on the screen of the terminal. As an implementation, controlling the virtual content to move towards the target object in the real environment based on the movement instruction may be implemented as follows: a touch sliding operation applied by the user to the display content displayed on the screen is acquired, the touch sliding operation being the movement instruction; and the virtual content is controlled to move toward the target object within the real environment based on the touch sliding operation.
In this embodiment, the terminal sets the virtual content in the real environment by displaying the display content corresponding to the virtual content on the screen, and for a specific implementation, reference may be made to the foregoing embodiment, which is not described herein again.
As an embodiment, the terminal acquires a touch operation applied by the user to the screen and determines the position on the screen corresponding to the touch operation. The terminal then judges whether the content currently displayed on the screen includes the display content of the virtual content, that is, whether the display content of the virtual content is displayed on the screen. If so, the terminal judges whether the position at which the user's touch operation acts on the screen falls on the display content of the virtual content; if so, it is determined that a touch operation applied by the user to the display content displayed on the screen has been acquired, and a sliding operation is then detected based on the user's touch operation, thereby acquiring the touch sliding operation.
Then, the virtual content is controlled to move toward the target object in the real environment based on the touch sliding operation; specifically, the virtual content is controlled to move according to the movement track of the touch sliding operation. As shown in fig. 8, the touch sliding operation slides in parallel from a position W1 on the screen to a position W2. According to the correspondence between the pixel coordinates of the screen and the world coordinate system, the position W1 corresponds to a first position in the real world and the position W2 corresponds to a second position in the real world, and the virtual content is moved from the first position to the second position, so that the user can control the virtual content to move within the real environment by touching the screen.
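For illustration, assuming the camera intrinsics and pose are available from the AR tracking pipeline and that the virtual content is kept at a fixed depth in front of the camera, the correspondence between screen pixel coordinates and world coordinates used by the drag may be sketched as follows.

import numpy as np

def screen_to_world(pixel, intrinsics, cam_to_world, depth):
    """Unproject a screen pixel (u, v) to a world-coordinate point at the given depth.

    intrinsics:   3x3 camera matrix (assumed known from calibration).
    cam_to_world: 4x4 camera pose in the world coordinate system (assumed from tracking).
    """
    u, v = pixel
    ray_cam = np.linalg.inv(intrinsics) @ np.array([u, v, 1.0])  # back-project the pixel
    point_cam = ray_cam / ray_cam[2] * depth                     # place it at the chosen depth
    point_world = cam_to_world @ np.append(point_cam, 1.0)       # camera -> world coordinates
    return point_world[:3]

# Dragging from W1 to W2 on the screen then moves the virtual content from
# screen_to_world(W1, ...) to screen_to_world(W2, ...) in the real environment.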
S702: first position information of a target object within a real environment is acquired.
S703: second location information of virtual content within the real environment is obtained.
S704: and acquiring the relative position relation between the target object and the virtual content in the real environment according to the first position information and the second position information.
S705: and if the relative position relation meets a specified condition, triggering the target object to execute a specified operation or triggering the target object and the virtual content to execute a specified operation.
Referring to fig. 9, an information processing apparatus 900 according to an embodiment of the present application is shown, which may include: a first acquisition unit 901, a second acquisition unit 902, a determination unit 903, and a processing unit 904.
A first obtaining unit 901, configured to obtain first position information of a target object within a real environment;
a second obtaining unit 902, configured to obtain second location information of the virtual content within the real environment;
a determining unit 903, configured to obtain, according to the first location information and the second location information, a relative location relationship between the target object and the virtual content in the real environment;
a processing unit 904, configured to trigger the target object to perform a specified operation or trigger the target object and the virtual content to perform a specified operation if the relative position relationship satisfies a specified condition.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Referring to fig. 10, an information processing apparatus 1000 according to an embodiment of the present application is shown, which may include: a moving unit 1001, a first acquisition unit 1002, a second acquisition unit 1003, a determination unit 1004, and a processing unit 1005.
A moving unit 1001 for controlling the virtual content to move within the real environment.
Specifically, the moving unit 1001 is configured to display the display content corresponding to the virtual content on the screen; acquire a touch sliding operation applied by the user to the display content displayed on the screen, the touch sliding operation being the movement instruction; and control the virtual content to move toward the target object within the real environment based on the touch sliding operation.
A first obtaining unit 1002, configured to obtain first position information of a target object within a real environment;
a second obtaining unit 1003 configured to obtain second location information of the virtual content within the real environment;
a determining unit 1004, configured to obtain a relative position relationship between the target object and the virtual content in the real environment according to the first position information and the second position information;
a processing unit 1005, configured to trigger the target object to perform a specified operation or trigger the target object and the virtual content to perform a specified operation if the relative positional relationship satisfies a specified condition.
The processing unit 1005 is further configured to determine an object identifier of the target object based on a pre-acquired image of the target object; determine the specified operation according to the object identifier; and trigger the target object to execute the specified operation or trigger the target object and the virtual content to execute the specified operation. Specifically, the object identifier of the target object includes at least one part of the target object and a part identifier of each part, and the first position information includes sub-position information of each part of the target object within the real environment. The processing unit 1005 is further configured to determine, based on the sub-position information of each part of the target object, the part whose relative positional relationship with the virtual content within the real environment satisfies the specified condition as the target part, and determine the specified operation based on the part identifier of the target part.
The processing unit 1005 is further configured to acquire an image of the target object; determine at least one part of the target object based on feature points within the image; and acquire a part identifier of each part according to a pre-trained neural network model, where the neural network model is trained on sample data, the sample data includes a plurality of object image samples, and each part of the object in each object image sample is labelled with its part identifier.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Referring to fig. 11, a block diagram of an electronic device according to an embodiment of the present application is shown. The electronic device 100 may be a smart phone, a tablet computer, an electronic book, or other electronic devices capable of running an application. In particular, the electronic device may be the head-up display or the mobile terminal described above.
The electronic device 100 in the present application may include one or more of the following components: a processor 110, a memory 120, and one or more applications, wherein the one or more applications may be stored in the memory 120 and configured to be executed by the one or more processors 110, the one or more programs configured to perform a method as described in the aforementioned method embodiments.
The processor 110 may include one or more processing cores. The processor 110 connects various parts of the entire electronic device 100 by using various interfaces and lines, and performs various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and calling data stored in the memory 120. Optionally, the processor 110 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like, where the CPU mainly handles the operating system, user interface, application programs, and the like; the GPU is responsible for rendering and drawing display content; and the modem is used for handling wireless communication. It can be understood that the modem may also not be integrated into the processor 110 and may instead be implemented separately through a communication chip.
The memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, and the like), instructions for implementing the method embodiments described above, and the like. The data storage area may also store data created by the electronic device 100 in use, such as a phonebook, audio and video data, chat log data, and the like.
Referring to fig. 12, a block diagram of a storage medium according to an embodiment of the present disclosure is shown. The storage medium 1200 stores program code that can be called by a processor to execute the methods described in the above method embodiments.
The storage medium 1200 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the storage medium 1200 includes a non-transitory computer-readable storage medium. The storage medium 1200 has storage space for program code 1210 for performing any of the method steps of the methods described above. The program code can be read from, or written into, one or more computer program products. The program code 1210 may be compressed, for example, in a suitable form.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (12)

1. An information processing method characterized by comprising:
acquiring first position information of a target object in a real environment;
obtaining second position information of virtual content in the real environment;
acquiring the relative position relation between the target object and the virtual content in the real environment according to the first position information and the second position information;
and if the relative position relation meets a specified condition, triggering the target object to execute a specified operation or triggering the target object and the virtual content to execute a specified operation.
2. The method of claim 1, wherein the triggering the target object to perform the specified operation or the triggering the target object and the virtual content to perform the specified operation comprises:
determining an object identifier of the target object according to the image of the target object;
determining the designated operation according to the object identification;
triggering the target object to execute the specified operation or triggering the target object and the virtual content to execute the specified operation.
3. The method of claim 2, wherein the object identification of the target object comprises at least one part of the target object and a part identification of each of the parts, and the first position information comprises sub-position information of each part of the target object within the real environment; the determining the specified operation according to the object identification comprises:
determining, based on the sub-position information of each part of the target object, a part whose relative position relationship with the virtual content in the real environment satisfies a specified condition as a target part;
determining the specified operation based on a part identification of the target part.
4. The method of claim 3, wherein determining the object identification of the target object based on the pre-acquired image of the target object comprises:
acquiring an image of the target object;
determining at least one part of the target object based on the feature points within the image;
and acquiring a part identifier of each part according to a pre-trained neural network model, wherein the trained neural network model is trained according to sample data, the sample data comprises a plurality of object image samples, and each part of an object in each object image sample is marked with the part identifier.
5. The method of claim 1, wherein the relative positional relationship comprises a distance difference between the target object and the virtual content within the real environment, and wherein the relative positional relationship satisfying a specified condition comprises the distance difference being less than a specified threshold.
6. The method according to claim 1, wherein the triggering the target object to perform a specified operation or the triggering the target object and the virtual content to perform a specified operation if the relative positional relationship satisfies a specified condition includes:
and if the relative position relation meets a specified condition, triggering the target object to change from the first working state to the second working state or triggering the target object and the virtual content to change from the first working state to the second working state.
7. The method of claim 6, further comprising:
and if the relative position relation does not meet the specified condition, triggering the target object to change from the second working state to the first working state or triggering the target object and the virtual content to change from the second working state to the first working state.
8. The method of claim 1, wherein before the obtaining second position information of the virtual content within the real environment, the method further comprises:
controlling the virtual content to move within the real environment.
9. The method of claim 8, applied to an electronic device comprising a screen, wherein the controlling the virtual content to move towards the target object within the real environment based on the movement instruction comprises:
displaying display content corresponding to the virtual content on the screen;
acquiring touch sliding operation acted on display content displayed on the screen by a user, wherein the touch sliding operation is the moving instruction;
controlling movement of the virtual content within the real environment toward the target object based on the touch swipe operation.
10. An information processing apparatus characterized by comprising:
a first acquisition unit, configured to acquire first position information of a target object within a real environment;
a second acquisition unit configured to acquire second position information of the virtual content within the real environment;
a determining unit, configured to obtain, according to the first location information and the second location information, a relative location relationship between the target object and the virtual content in the real environment;
and the processing unit is used for triggering the target object to execute a specified operation or triggering the target object and the virtual content to execute a specified operation if the relative position relation meets a specified condition.
11. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-9.
12. A storage medium storing program code executable by a processor, the program code causing the processor to perform the method of any one of claims 1 to 9 when executed by the processor.
CN202010219498.6A 2020-03-25 2020-03-25 Information processing method, information processing device, electronic equipment and storage medium Active CN111399654B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010219498.6A CN111399654B (en) 2020-03-25 2020-03-25 Information processing method, information processing device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010219498.6A CN111399654B (en) 2020-03-25 2020-03-25 Information processing method, information processing device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111399654A true CN111399654A (en) 2020-07-10
CN111399654B CN111399654B (en) 2022-08-12

Family

ID=71429186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010219498.6A Active CN111399654B (en) 2020-03-25 2020-03-25 Information processing method, information processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111399654B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140292807A1 (en) * 2013-03-27 2014-10-02 Giuseppe Raffa Environment actuation by one or more augmented reality elements
CN105739879A (en) * 2016-01-29 2016-07-06 广东欧珀移动通信有限公司 Virtual reality application method and terminal
US20190213796A1 (en) * 2016-06-30 2019-07-11 Microsoft Technology Licensing, Llc Reality to virtual reality portal for dual presence of devices
US20180181199A1 (en) * 2016-11-14 2018-06-28 Logitech Europe S.A. Systems and methods for operating an input device in an augmented/virtual reality environment
US20180365898A1 (en) * 2017-06-16 2018-12-20 Microsoft Technology Licensing, Llc Object holographic augmentation
CN109683700A (en) * 2017-10-18 2019-04-26 深圳市掌网科技股份有限公司 The human-computer interaction implementation method and device of view-based access control model perception
US20190172262A1 (en) * 2017-12-05 2019-06-06 Samsung Electronics Co., Ltd. System and method for transition boundaries and distance responsive interfaces in augmented and virtual reality
US20190312747A1 (en) * 2018-04-10 2019-10-10 Baidu Online Network Technology (Beijing) Co., Ltd. Method, apparatus and system for controlling home device
US20190344449A1 (en) * 2018-05-09 2019-11-14 Sony Interactive Entertainment Inc. Apparatus Control Systems and Method
CN108986199A (en) * 2018-06-14 2018-12-11 北京小米移动软件有限公司 Dummy model processing method, device, electronic equipment and storage medium
CN109960404A (en) * 2019-02-15 2019-07-02 联想(北京)有限公司 A kind of data processing method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112051961A (en) * 2020-09-04 2020-12-08 脸萌有限公司 Virtual interaction method and device, electronic equipment and computer readable storage medium
CN112233172A (en) * 2020-09-30 2021-01-15 北京零境科技有限公司 Video penetration type mixed reality method, system, readable storage medium and electronic equipment
CN113797525A (en) * 2020-12-23 2021-12-17 广州富港生活智能科技有限公司 Novel game system
CN113797525B (en) * 2020-12-23 2024-03-22 广州富港生活智能科技有限公司 Novel game system
CN113010140A (en) * 2021-03-15 2021-06-22 深圳市慧鲤科技有限公司 Sound playing method and device, electronic equipment and storage medium
CN113269782A (en) * 2021-04-21 2021-08-17 青岛小鸟看看科技有限公司 Data generation method and device and electronic equipment
US11995741B2 (en) 2021-04-21 2024-05-28 Qingdao Pico Technology Co., Ltd. Data generation method and apparatus, and electronic device
CN113724397A (en) * 2021-08-27 2021-11-30 浙江商汤科技开发有限公司 Virtual object positioning method and device, electronic equipment and storage medium
WO2024012268A1 (en) * 2022-07-14 2024-01-18 维沃移动通信有限公司 Virtual operation method and apparatus, electronic device, and readable storage medium

Also Published As

Publication number Publication date
CN111399654B (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN111399654B (en) Information processing method, information processing device, electronic equipment and storage medium
CN111510701A (en) Virtual content display method and device, electronic equipment and computer readable medium
US9855496B2 (en) Stereo video for gaming
EP3039656B1 (en) Method and apparatus for representing physical scene
JP7181316B2 (en) Eye Tracking with Prediction and Latest Updates to GPU for Fast Foveal Rendering in HMD Environments
US11176748B2 (en) Image processing apparatus, image processing method, and program
EP2371434B1 (en) Image generation system, image generation method, and information storage medium
US9639989B2 (en) Video processing device, video processing method, and video processing system
US11244511B2 (en) Augmented reality method, system and terminal device of displaying and controlling virtual content via interaction device
US20110292036A1 (en) Depth sensor with application interface
JP6679523B2 (en) Image processing program, image processing system, image processing apparatus, and image processing method
US20130335405A1 (en) Virtual object generation within a virtual environment
EP2395454A2 (en) Image generation system, shape recognition method, and information storage medium
US11508141B2 (en) Simple environment solver using planar extraction
US20220283631A1 (en) Data processing method, user equipment and augmented reality system
US20220147138A1 (en) Image generation apparatus and information presentation method
EP4170594A1 (en) System and method of simultaneous localisation and mapping
KR101473234B1 (en) Method and system for displaying an image based on body tracking
US10445940B2 (en) Modeling interactions between simulated characters and real-world objects for more realistic augmented reality
CN114207671A (en) Image processing apparatus, image processing method, and program
KR20230097449A (en) Apparatus and method for automatically creating a virtual environment of an MR system based on markers
WO2024076543A1 (en) Display of three-dimensional scenes with changing perspectives

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant