CN115527011A - Navigation method and device based on three-dimensional model - Google Patents

Navigation method and device based on three-dimensional model

Info

Publication number
CN115527011A
Authority
CN
China
Prior art keywords
image
spherical
target
dimensional model
panoramic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211026584.0A
Other languages
Chinese (zh)
Inventor
李蒙
王森博
盛哲
董子龙
谭平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Damo Institute Hangzhou Technology Co Ltd
Original Assignee
Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority to CN202211026584.0A
Publication of CN115527011A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/003 Navigation within 3D models or images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04815 Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of this specification provide a navigation method and a navigation device based on a three-dimensional model, the method being applied to a client and comprising the following steps: in response to a display request of a user for a target three-dimensional model, displaying the target three-dimensional model to the user through a user interaction interface; receiving a destination in the target three-dimensional model input by the user through the user interaction interface, and determining the current position of a virtual object of the user in the target three-dimensional model; determining, according to the current position and the destination, a navigation map for the virtual object to move from the current position to the destination; and guiding the virtual object to move from the current position to the destination according to the navigation map, and presenting the guide track to the user through the user interaction interface in a three-dimensional visual manner. Navigation is performed by combining a modeling technique with three-dimensional visual guidance, which improves the user's navigation experience.

Description

Navigation method and device based on three-dimensional model
Technical Field
The embodiment of the specification relates to the technical field of computers, in particular to a navigation method based on a three-dimensional model.
Background
With the rise of AR (Augmented Reality) and VR (Virtual Reality) technologies, more and more fields begin to apply AR or VR technologies to assist in production or work.
However, in the prior art, when AR or VR technology is adopted for navigation, a simple map route is typically generated and then presented directly in AR or VR; a more intelligent service that combines modeling technology with user guidance is not provided, which results in a poor user experience.
Disclosure of Invention
In view of this, embodiments of the present specification provide a navigation method based on a three-dimensional model. One or more embodiments of the present specification also relate to a navigation apparatus based on a three-dimensional model, an augmented reality AR device, a virtual reality VR device, a computing device, a computer-readable storage medium, and a computer program, so as to solve technical drawbacks in the related art.
According to a first aspect of embodiments of the present specification, there is provided a navigation method based on a three-dimensional model, applied to a client, including:
responding to a display request of a user for a target three-dimensional model, and displaying the target three-dimensional model to the user through a user interaction interface;
receiving a destination in the target three-dimensional model, which is input by the user through the user interaction interface, and determining the current position of a virtual object of the user in the target three-dimensional model;
determining a navigation map of the virtual object moving from the current position to the destination according to the current position and the destination;
and guiding the virtual object to move from the current position to the destination according to the navigation map, and displaying a guide track to the user through the user interaction interface in a three-dimensional visual mode.
According to a second aspect of the embodiments of the present specification, there is provided a navigation device based on a three-dimensional model, applied to a client, including:
the model display module is configured to respond to a display request of a user for a target three-dimensional model, and display the target three-dimensional model to the user through a user interaction interface;
a position determination module configured to receive a destination in the target three-dimensional model input by the user through the user interaction interface and determine a current position of a virtual object of the user in the target three-dimensional model;
a map determination module configured to determine a navigation map of the virtual object moving from the current location to the destination according to the current location and the destination;
and the navigation module is configured to guide the virtual object to move from the current position to the destination according to the navigation map, and display a guide track to the user through the user interaction interface in a three-dimensional visual mode.
According to a third aspect of embodiments herein, there is provided an Augmented Reality (AR) device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method described above.
According to a fourth aspect of embodiments herein, there is provided a virtual reality VR device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method described above.
According to a fifth aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method described above.
According to a sixth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the three-dimensional model-based navigation method described above.
According to a seventh aspect of embodiments herein, there is provided a computer program, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of the above three-dimensional model based navigation method.
An embodiment of the present specification implements a navigation method and apparatus based on a three-dimensional model, where the method is applied to a client, and includes: responding to a display request of a user for a target three-dimensional model, and displaying the target three-dimensional model to the user through a user interaction interface; receiving a destination in the target three-dimensional model, which is input by the user through the user interaction interface, and determining the current position of a virtual object of the user in the target three-dimensional model; determining a navigation map of the virtual object moving from the current position to the destination according to the current position and the destination; and guiding the virtual object to move from the current position to the destination according to the navigation map, and displaying a guide track to the user through the user interaction interface in a three-dimensional visual mode.
Specifically, the method plans a navigation map for the user according to the current position and the destination of the virtual object of the user in the target three-dimensional model, guides the user to travel in a three-dimensional visual presentation mode in the target three-dimensional model, and guides the user to navigate by combining a modeling technology and a three-dimensional visual guidance mode, so that the navigation experience of the user is improved.
Drawings
FIG. 1 is a schematic diagram of a specific application of a navigation method based on a three-dimensional model according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a method for three-dimensional model-based navigation according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of a specific application scenario of a target three-dimensional model construction of a target object in a navigation method based on a three-dimensional model according to an embodiment of the present specification;
FIG. 4 is a flowchart illustrating a target three-dimensional model construction of a target object in a three-dimensional model based navigation method according to an embodiment of the present disclosure;
fig. 5 is a flowchart illustrating a specific processing procedure of panoramic image processing in a navigation method based on a three-dimensional model according to an embodiment of the present disclosure;
FIG. 6 is a flowchart illustrating a specific process of constructing a three-dimensional model of a target object in a three-dimensional model based navigation method according to an embodiment of the present disclosure;
fig. 7 is a schematic diagram illustrating feature fusion performed by an attention-free mechanism during panoramic image processing in a navigation method based on a three-dimensional model according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a navigation device based on a three-dimensional model according to an embodiment of the present disclosure;
fig. 9 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can be termed a second and, similarly, a second can be termed a first without departing from the scope of one or more embodiments of the present description. The word "if," as used herein, may be interpreted as "when" or "upon" or "in response to determining," depending on the context.
First, the noun terms referred to in one or more embodiments of the present specification are explained.
Three-dimensional model: a polygonal representation of an object, typically displayed by a computer or other video device. The displayed object may be a real-world entity or a fictional object.
Panoramic image: an image with a 360-degree view of the surrounding environment; it may be captured by a panoramic camera, or generated by fusing planar images of an object taken from multiple view angles.
Depth estimation: given an input image, estimating, through a deep neural network model, the distance from the actual scene position of each pixel to the optical center of the camera.
Image processing model: a neural network model based on an encoder-decoder framework, where the encoder is the encoding layer of the neural network model and is used for data compression (dimensionality reduction), and the decoder is the decoding layer of the neural network model and is used for reconstruction (dimensionality increase).
HEALPix algorithm: a method of pixelating a sphere that generates a subdivision of the sphere in which each pixel covers the same surface area as every other pixel.
skip-connection: a skip (jump) connection. A depth estimation task needs a high-resolution image/feature map, but the feature extraction part of a network computes layer by layer, and the resolution of the final feature map becomes very small, which is unfavorable for an accurate depth estimation result. Shallow feature maps can be introduced through skip-connections and added directly to the original feature maps; because shallow feature maps have high resolution and contain rich local information, this benefits the accuracy of the depth estimation result.
ERP: equirectangular Projection image.
backbone: the main network is usually a pre-trained open-source feature extraction network.
In this specification, a three-dimensional model based navigation method is provided. One or more embodiments of the present specification also relate to a navigation apparatus based on a three-dimensional model, an augmented reality AR device, a virtual reality VR device, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating a specific application of a three-dimensional model based navigation method according to an embodiment of the present disclosure.
Fig. 1 includes a terminal 102 (e.g., a mobile phone, a tablet computer, etc.) and a server 104, where the server 104 may be a physical server or a cloud server, and for convenience of understanding, in this embodiment of the present disclosure, the server 104 is taken as an example of a cloud server for detailed description.
Taking the application of the three-dimensional-model-based navigation method provided by the embodiments of this specification to a shopping-mall scenario as an example, the navigation method is described in detail below.
In a specific implementation, a user selects the name of the mall he or she is currently in, for example mall A in part (a) of fig. 1, through the terminal 102. According to the user's selection instruction, the terminal 102 requests the three-dimensional model of mall A from the server 104; the server 104 sends the three-dimensional model of the mall to the terminal 102, and the terminal 102 displays it to the user through a user interaction interface, as in part (b) of fig. 1. At this point the user can browse each store of the mall through the three-dimensional model. When the user inputs a store name in the search box of the user interaction interface, the terminal 102 calculates, based on the three-dimensional model of mall A, a navigation map from the user's current position in the three-dimensional model to the destination (i.e., the store input by the user), as shown in part (c) of fig. 1. As the user travels to the destination, the route is recalculated in real time according to the user's current position in the three-dimensional model and the destination, and the guide track is presented to the user within the three-dimensional model with a three-dimensional visual effect, so that the user feels present in the scene and can quickly and conveniently reach the destination according to the store display in the three-dimensional model.
In practical applications, the construction of the three-dimensional model of the mall a may also be implemented in the terminal 102, which may be specifically set according to practical applications, and this is not limited in this embodiment of the specification.
When the navigation method provided in the embodiments of this specification is applied to a mall shopping navigation scenario, the user can quickly and accurately reach a destination by following the navigation map presented by the terminal within the three-dimensional model, combined with the three-dimensional visual effect of the buildings and the articles inside them; combining navigation with the three-dimensional model improves the user's navigation experience.
Referring to fig. 2, fig. 2 is a flowchart illustrating a three-dimensional model based navigation method according to an embodiment of the present disclosure, where the method is applied to a client, and specifically includes the following steps.
Step 202: and responding to a display request of a user for the target three-dimensional model, and displaying the target three-dimensional model to the user through a user interaction interface.
Specifically, the navigation method based on the three-dimensional model provided by the embodiment of the present specification has different specific application scenarios, and the target three-dimensional model is also different; for example, the navigation method based on the three-dimensional model is applied to a shopping scene of a shopping mall, and the target three-dimensional model can be understood as the three-dimensional model of the shopping mall; the navigation method based on the three-dimensional model is applied to office area scenes, and the target three-dimensional model can be understood as an office building three-dimensional model and the like.
In order to facilitate understanding, in the embodiments of the present specification, the navigation method based on the three-dimensional model is applied to a shopping scene of a shopping mall, and the target three-dimensional model is described in detail as an example.
In practical applications, the client may be a mobile phone, a tablet computer, or the like. Taking a mobile phone as an example, the client responds to a display request of the user for the target three-dimensional model, which the user sends by clicking the name of, or a control for, the target three-dimensional model in the client's user interaction interface, and displays the target three-dimensional model to the user through the user interaction interface. The user can thus clearly view the target three-dimensional model through the user interaction interface and subsequently operate on it through that interface, for example enlarging it, reducing it, or clicking into its interior to browse.
Step 204: and receiving a destination in the target three-dimensional model, which is input by the user through the user interactive interface, and determining the current position of the virtual object of the user in the target three-dimensional model.
Specifically, after the target three-dimensional model is displayed to the user through the user interaction interface, the client receives a destination in the target three-dimensional model, which is input through the user interaction interface by the user, for example, the target three-dimensional model is a shopping mall, and then the destination in the target three-dimensional model can be understood as a certain shop or a bathroom in the shopping mall.
And after receiving a destination in the target three-dimensional model input by the user through the user interactive interface, the client determines the current position of a virtual object of the user in the target three-dimensional model, wherein the virtual object can be a small dot representing the user, or a virtual head portrait of the user, or an avatar of the user, and the like.
Step 206: and determining a navigation map of the virtual object moving from the current position to the destination according to the current position and the destination.
Specifically, after determining the current position and the destination of the user in the target three-dimensional model, the client may determine a navigation map for the virtual object to move from the current position to the destination in the target three-dimensional model.
In practical application, the navigation map can be obtained by calculation of a client according to the current position and the destination of the user in the target three-dimensional model; or the client sends the current position and the destination of the user in the target three-dimensional model to the server, and the server calculates the navigation map and returns the navigation map to the client.
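The embodiments do not prescribe a particular path-planning algorithm for computing the navigation map. As a minimal sketch, assuming purely for illustration that the walkable floor of the target three-dimensional model has been discretized into an occupancy grid, the computation on the client or server could look like the following; the grid representation, the function name, and the Dijkstra search are illustrative assumptions rather than the embodiment's method.

```python
# Illustrative sketch: shortest guide track on a walkable occupancy grid.
import heapq

def plan_navigation_path(grid, start, goal):
    """grid[r][c] == 0 means walkable; start/goal are (row, col) cells."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, float("inf")):
            continue
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1.0
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal != start and goal not in prev:
        return []                       # destination unreachable on this grid
    # Reconstruct the guide track from the destination back to the current position.
    path, cell = [], goal
    while cell != start:
        path.append(cell)
        cell = prev[cell]
    path.append(start)
    return list(reversed(path))
```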
Step 208: and guiding the virtual object to move from the current position to the destination according to the navigation map, and displaying a guide track to the user through the user interaction interface in a three-dimensional visual mode.
Specifically, after the client determines the navigation map, the client may guide the virtual object to move from the current position to the destination in the target three-dimensional model according to the navigation map, and in the process that the virtual object moves from the current position to the destination in the target three-dimensional model, the client may display the guide track to the user through the user interaction interface in a three-dimensional visual manner.
The specific realization effect is that the display effect of the area through which the user's virtual object passes in the target three-dimensional model is the same as what the user sees in the corresponding area of the real mall; the user is guided to quickly reach the destination in this way.
According to the navigation method based on the three-dimensional model, provided by the embodiment of the specification, in the target three-dimensional model, the navigation map is planned for the user according to the current position and the destination of the virtual object of the user in the target three-dimensional model, the user is guided to travel in the target three-dimensional model in a three-dimensional visual presentation mode, and the navigation is performed by combining a modeling technology and a three-dimensional visual guidance mode, so that the navigation experience of the user is improved.
In a specific implementation, because the navigation display effect of the target three-dimensional model needs to match what the user sees in the real mall, every store and commodity in the target three-dimensional model would have to be restored and rendered one by one; this creates a large workload, slows down the overall navigation, and causes stuttering. To solve this problem, the client can render display articles according to the distance between the user's virtual object in the target three-dimensional model and the display article to be rendered, rendering only the articles that are close to the virtual object. This is sufficient to help the user quickly identify directions and the destination, reduces the rendering workload while preserving the navigation effect, and improves navigation efficiency. The specific implementation is as follows:
and in the process of guiding the virtual object to move from the current position to the destination according to the navigation map, rendering the display article according to the distance between the virtual object and the display article in the target three-dimensional model.
For example, in practical applications, when the distance between the virtual object and a display article in the target three-dimensional model is determined to be 5 centimeters, it may be determined that the distance between the user and the real article in the real mall is 3 meters; in this case, taking the virtual object as the center, articles in the target three-dimensional model within 5 centimeters of the virtual object are rendered and displayed, while articles beyond 5 centimeters are not rendered. Alternatively, the visual range of the virtual object in the target three-dimensional model may be simulated according to the user's visual range, and display articles within that visual range are rendered. In this way, the navigation effect experienced by the user is not compromised, while the amount of rendering is greatly reduced.
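As a hedged illustration of the distance-based rendering described above, the following sketch culls display articles by their distance from the user's virtual object; the article object, its position attribute, and the render()/hide() hooks are hypothetical names introduced for illustration and are not part of the embodiment.

```python
# Minimal sketch: render only display articles within a radius of the virtual object.
import math

RENDER_RADIUS_MODEL_UNITS = 0.05  # e.g. 5 cm in model space, per the example above

def update_rendered_articles(virtual_object_pos, display_articles):
    for article in display_articles:
        dx = article.position[0] - virtual_object_pos[0]
        dy = article.position[1] - virtual_object_pos[1]
        dz = article.position[2] - virtual_object_pos[2]
        distance = math.sqrt(dx * dx + dy * dy + dz * dz)
        if distance <= RENDER_RADIUS_MODEL_UNITS:
            article.render()   # close enough to the virtual object: render and show
        else:
            article.hide()     # too far away: skip rendering to save work
```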
Meanwhile, in order to further improve the user's shopping experience, while a display article in the target three-dimensional model is rendered, the web page link corresponding to that article can also be displayed in the target three-dimensional model and shown to the user through the user interaction interface. In this way the user can not only shop in the physical store but also shop online later through the web page link, which avoids the inconvenience of carrying the article while shopping and improves the user's shopping experience.
Then, after the user clicks the webpage link of a certain displayed article displayed on the client through the user interactive interface, the user can jump from the target three-dimensional model to the article display page corresponding to the webpage link to purchase the article online. The specific implementation mode is as follows:
after the step of rendering the displayed article according to the distance between the virtual object and the displayed article in the target three-dimensional model, the method further comprises the following steps:
displaying the display article and the webpage link corresponding to the display article to the user through the user interaction interface;
and responding to a click instruction of the user for the webpage link, and jumping to an article display page corresponding to the webpage link from the target three-dimensional model.
In addition, before responding to a display request of a user for a target three-dimensional model, the target three-dimensional model needs to be constructed, and then a navigation function can be applied to the target three-dimensional model. The specific implementation mode is as follows:
referring to fig. 3, fig. 3 is a schematic diagram illustrating a specific application scenario of target three-dimensional model construction of a target object in a three-dimensional model based navigation method according to an embodiment of the present specification.
Fig. 3 includes a terminal 302 (i.e., client), wherein the terminal 302 includes, but is not limited to, a mobile phone, a tablet computer, a desktop computer, etc.
Taking a room as the target object, the construction of a three-dimensional model of the room is described through the embodiments of this specification.
In specific implementation, the terminal 302 acquires or receives a panoramic image of a room to be subjected to three-dimensional modeling, where the panoramic image of the room may be shot by a panoramic camera embedded in the terminal 302, may also be shot by an external panoramic camera and uploaded to the terminal 302, and certainly may also be uploaded to the terminal 302 by a user through other storage devices (e.g., a hard disk).
The terminal 302 processes the panoramic image of the room through a pre-trained image processing model to obtain a panoramic depth image corresponding to the panoramic image of the room, and then performs three-dimensional modeling on the room according to the panoramic depth image corresponding to the panoramic image of the room to obtain a three-dimensional model of the room; and the constructed three-dimensional model of the room is presented to the user through the display interface of the terminal 302.
There are various ways to perform three-dimensional modeling of the room according to the panoramic depth image corresponding to its panoramic image. For example: point cloud processing: the panoramic depth image is converted into point cloud data, the relative relationship between point clouds is calculated using the normal distribution transform method, and the point clouds are stitched together through the relative relationships between all panoramic images; mesh construction: for the stitched point cloud, a triangular mesh is obtained using the Poisson reconstruction method; texture-mapping calculation: the correspondence between points on the triangular mesh and pixels on the image is calculated; a wrong correspondence increases the energy of the Markov random field between the image and the triangular mesh, so the texture map of the triangular mesh can be obtained by searching for the correspondence between mesh points and image pixels with the lowest Markov random field energy; finally, defects and redundancies in the mesh are repaired manually to obtain the finished three-dimensional model.
The room may also be modeled in three dimensions by another implementation: point cloud processing: the panoramic depth image is converted into point cloud data; stitching-relationship calculation: the matching relationship between pixel points on two panoramic images is searched for, two groups of matched 3D points are found according to this matching relationship, the relative relationship between the two groups of matched points is calculated using the Umeyama algorithm, and the point clouds are stitched together through the relative relationships between all panoramic images; mesh construction: for the stitched point cloud, a triangular mesh is obtained using Delaunay triangulation; mesh processing: the mesh is decimated using the Linderslen method, the triangular mesh is smoothed using the Surazhsky algorithm, and holes in the mesh are detected and filled using the Liepa method; texture-mapping calculation: as in the first implementation, the texture map of the triangular mesh is obtained by searching for the correspondence between mesh points and image pixels with the lowest Markov random field energy; finally, defects and redundancies in the mesh are repaired manually to obtain the finished three-dimensional model.
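The following is a hedged sketch of the "point cloud processing" and "mesh construction" steps described above, using numpy for the spherical unprojection and Open3D's Poisson reconstruction. The registration of multiple panoramas (normal distribution transform or Umeyama), the mesh repair steps, and the Markov-random-field texture mapping are omitted; only a single panoramic depth image is handled, so this is an illustration rather than the full pipeline.

```python
# Minimal sketch: panoramic depth image -> point cloud -> Poisson mesh.
import numpy as np
import open3d as o3d

def panoramic_depth_to_point_cloud(depth):
    """depth: (H, W) distances from the camera optical center, equirectangular layout."""
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    lon = (u / w) * 2.0 * np.pi - np.pi          # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v / h) * np.pi          # latitude in [-pi/2, pi/2]
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)
    points = (depth[..., None] * dirs).reshape(-1, 3)
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    return pcd

def point_cloud_to_mesh(pcd):
    # Normals are required before Poisson surface reconstruction.
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=9)
    return mesh
```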
In addition, the specific processing on the panoramic picture may be performed at the terminal 302 or the server, and may be set according to an actual application, which is not limited in this embodiment of the specification. The processing of the panoramic depth image of the panoramic image of the room and the construction of the three-dimensional model of the room may also be implemented at the terminal 302, if the computing resources of the terminal 302 are sufficient; in the embodiment of the present specification, only the processing of the panoramic depth image of the panoramic image of the room and the building of the three-dimensional model of the room are implemented in the terminal 302 for example.
In the three-dimensional model construction of the target object provided in the embodiment of the specification, an accurate panoramic depth image of a panoramic image of a room to be three-dimensionally modeled can be obtained through the deep learning image processing model, no additional instrument or equipment is needed, and the cost is saved; and the three-dimensional modeling of the room can be rapidly realized according to the panoramic depth image, and the user experience is improved.
Referring to fig. 4, fig. 4 is a flowchart illustrating a target three-dimensional model building of a target object in a three-dimensional model based navigation method according to an embodiment of the present disclosure, which includes the following steps.
Step 402: inputting a target panoramic image into an image processing model, and obtaining panoramic image characteristics of the target panoramic image through an encoding layer of the image processing model.
The target panoramic image can be understood as a target panoramic image shot by an independent panoramic camera or a panoramic camera embedded in a terminal such as a mobile phone, a mobile computer and the like; of course, the target panoramic image may be generated by fusing a plurality of planar images, and when the target panoramic image is generated by fusing a plurality of planar images, the plurality of planar images are directed to the same target object, for example, if the target object is a room, the plurality of planar images are directed to planar images of the room. The specific implementation mode is as follows:
before the inputting the target panoramic image into the image processing model, the method further comprises:
acquiring a target panoramic image of a target object shot by panoramic shooting equipment; or alternatively
And acquiring at least two initial plane images of the target object, and fusing the at least two initial plane images according to a preset image fusion algorithm to obtain a target panoramic image of the target object.
The panoramic shooting equipment comprises an independent panoramic camera or terminal equipment embedded with the panoramic camera and the like; and the target object may be any type of object of any size, for example, the target object is a house building, a sales commodity, etc.
For convenience of understanding, in the embodiments of the present specification, the target object is taken as an example of a house building, and details are described.
Taking a house to be rented as the target object as an example, acquiring a target panoramic image of the target object shot by a panoramic shooting device can be understood as acquiring a target panoramic image of the house to be rented shot by, for example, the panoramic camera of a mobile phone; or
acquiring at least two initial plane images of the target object, and fusing the at least two initial plane images according to a preset image fusion algorithm to obtain a target panoramic image of the target object, can be understood as acquiring a plurality of initial plane images of the house to be rented; the initial plane images may be shot in real time by the camera of a mobile phone, or historical images may be obtained from a database.
After a plurality of initial plane images of the house to be leased are obtained, the plurality of initial plane images can be fused according to a preset image fusion algorithm to generate the target panoramic image of the house to be leased. The preset image fusion algorithm can be any image fusion algorithm; any algorithm capable of fusing a plurality of plane images into one panoramic image can be used, for example shooting images from different angles with a mobile phone or camera (the images overlapping each other to a certain extent), extracting features with a Scale Invariant Feature Transform (SIFT) operator, and stitching the images into a large-scene image through operations such as feature matching, image rotation, and image fusion; the embodiments of this specification do not limit this.
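As an illustration of fusing several overlapping plane images into one panoramic image, the sketch below uses OpenCV's high-level Stitcher, which internally performs feature detection, matching, rotation estimation, and blending; this is a stand-in for, not a reproduction of, the SIFT-based pipeline mentioned above, and the function name is illustrative.

```python
# Minimal sketch: fuse overlapping plane images into a panorama with OpenCV.
import cv2

def fuse_plane_images_to_panorama(image_paths):
    images = [cv2.imread(p) for p in image_paths]
    stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
    status, panorama = stitcher.stitch(images)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return panorama
```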
In the embodiment of the present description, a target panoramic image of a target object may be quickly obtained through a panoramic shooting device or fusion of multiple initial plane images, and then a panoramic depth image corresponding to the target panoramic image of the target object may be obtained through an image processing model.
In addition, the image processing model in the embodiments of the present specification may be understood as an image processing model specifically designed for, and running on, a spherical surface.
Taking a target object as a house to be leased, taking a target panoramic image as a target panoramic image of the target object, namely the target panoramic image of the house to be leased as an example, inputting the target panoramic image into an image processing model, and obtaining the panoramic image characteristics of the target panoramic image through an encoding layer of the image processing model; it may be understood that a target panoramic image of a house to be leased is input to an image processing model, and a panoramic image feature of the target panoramic image of the house to be leased is obtained through an encoding layer (encoder layer) of the image processing model, where the feature in the embodiments of the present specification may be understood as a feature map.
Step 404: and converting the panoramic image characteristics into spherical image coding characteristics according to a panoramic image spherical conversion algorithm.
The panoramic-image spherical conversion algorithm may be understood as the HEALPix algorithm; however, the embodiments of this specification are not limited to this algorithm, and any algorithm that can convert a panoramic image into a spherical image may be used.
In practical application, after the panoramic image features of the target panoramic image are obtained, the panoramic image features are converted into spherical image coding features through a panoramic image spherical conversion algorithm.
Step 406: and inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining a spherical depth image of the target panoramic image through the decoding layer.
Specifically, after obtaining the spherical image coding feature, the spherical image coding feature is input into a decoding layer (decoder layer) of the image processing model, and the decoding layer is used for decoding to obtain the spherical depth image of the target panoramic image.
Step 408: and converting the spherical depth image into a panoramic depth image of the target panoramic image according to the panoramic image spherical conversion algorithm.
After the spherical depth image of the target panoramic image is obtained, an inverse transformation can be performed according to the panoramic-image spherical conversion algorithm, converting the spherical depth image into the panoramic depth image of the target panoramic image.
Referring to fig. 5, fig. 5 is a flowchart illustrating a specific processing procedure of panoramic image processing in a navigation method based on a three-dimensional model according to an embodiment of the present disclosure.
Step 502: the target panoramic image is input into an encoding layer of an image processing model.
Step 504: and acquiring a panoramic image characteristic map of the target panoramic image extracted by the coding layer.
Step 506: and converting the panoramic image characteristic map of the target panoramic image through the HEALPix algorithm.
Step 508: and converting the panoramic image feature map into an undistorted spherical image coding feature map by using the HEALPix algorithm.
Step 510: and inputting the spherical image coding feature map into a decoding layer of the image processing model.
Step 512: and acquiring a spherical depth image of the target panoramic image acquired by the decoding layer.
Step 514: and performing inverse transformation on the spherical depth image of the target panoramic image through the HEALPix algorithm.
Step 516: the spherical depth image is converted into a panoramic depth image of the target panoramic image by the inverse transformation of the HEALPix algorithm.
According to the method provided by the embodiments of this specification, the depth image of the panoramic image is obtained through the deep learning image processing model, no additional instrument or equipment is needed, and the cost is saved; in actual use, the deep learning process applied to planar images is extended to the spherical surface, and the image processing model specially designed for the spherical surface runs on the spherical surface, so that the distortion of the panoramic image produced by the ERP projection is addressed and the accuracy of the obtained panoramic depth image is improved.
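The embodiments name the HEALPix pixelation but do not specify the resampling scheme between the equirectangular (ERP) layout and the sphere. The following sketch, using the healpy package and nearest-neighbour sampling, illustrates one possible forward and inverse conversion; the nside parameter and the sampling strategy are assumptions for illustration only.

```python
# Hedged sketch of ERP <-> HEALPix sphere conversion via nearest-neighbour sampling.
import healpy as hp
import numpy as np

def erp_to_healpix(feature_map, nside):
    """feature_map: (C, H, W) equirectangular features -> (C, npix) spherical features."""
    c, h, w = feature_map.shape
    npix = hp.nside2npix(nside)
    theta, phi = hp.pix2ang(nside, np.arange(npix))      # colatitude, longitude per HEALPix pixel
    v = np.clip((theta / np.pi) * h, 0, h - 1).astype(np.int64)
    u = np.clip((phi / (2.0 * np.pi)) * w, 0, w - 1).astype(np.int64)
    return feature_map[:, v, u]

def healpix_to_erp(sphere_map, nside, h, w):
    """sphere_map: (C, npix) spherical features -> (C, H, W) equirectangular features."""
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    theta = (v + 0.5) / h * np.pi
    phi = (u + 0.5) / w * 2.0 * np.pi
    ipix = hp.ang2pix(nside, theta.ravel(), phi.ravel())  # HEALPix pixel for each ERP pixel
    return sphere_map[:, ipix].reshape(sphere_map.shape[0], h, w)
```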
In practical applications, because the target panoramic image passes through several encoder layers of the image processing model, feature extraction is performed by continuous layer-by-layer computation, and the resolution of the finally extracted panoramic image feature map becomes very small, which is unfavorable for an accurate depth estimation result. Therefore, in the embodiments of this specification, in order to solve this problem, after the spherical image decoding feature of the target panoramic image is obtained through the decoding layer, the spherical image coding feature obtained through the coding layer and the spherical image decoding feature obtained through the decoding layer are superposed to obtain the spherical depth image of the target panoramic image, thereby increasing the accuracy of the depth estimation result of the panoramic depth image subsequently obtained from the spherical depth image. The specific implementation is as follows:
the inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining the spherical depth image of the target panoramic image through the decoding layer comprises:
inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining the spherical image decoding features of the target panoramic image through the decoding layer;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding features and the spherical image decoding features.
Specifically, after obtaining the spherical image coding feature of the target panoramic image, inputting the spherical image coding feature into a decoding layer of an image processing model, and obtaining the spherical image decoding feature of the target panoramic image through the decoding layer; and then, overlapping the spherical image coding features and the spherical image decoding features to obtain a spherical depth image of the target panoramic image.
In practical applications, to ensure the effect of superposing the spherical image coding features and the spherical image decoding features, the feature superposition between the two can be performed in a skip-connection manner; that is, the shallow spherical image coding features of the encoder layers are introduced through skip-connections and superposed with the spherical image decoding features of the decoder layers. Because the shallow feature maps have high resolution and contain rich local information, this further benefits the accuracy of the depth estimation result. The specific implementation is as follows:
the obtaining a spherical depth image of the target panoramic image according to the spherical image coding feature and the spherical image decoding feature includes:
overlapping the coding features of the spherical images and the decoding features of the spherical images in a jumping connection mode;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding characteristics and the spherical image decoding characteristics after superposition.
Specifically, the spherical image coding features and the spherical image decoding features are superposed in a skip-connection manner; by superposing the spherical image coding features of the shallower encoder layers with the spherical image decoding features of the deeper decoder layers, the high resolution and rich local information of the shallow coding feature maps improve the accuracy of the depth estimation of the spherical depth image of the target panoramic image.
In specific implementation, taking the coding layer and the decoding layer as i layers as an example, the feature superposition of the spherical image coding feature and the spherical image decoding feature is specifically introduced:
the performing feature superposition on the spherical image coding features and the spherical image decoding features in a jumping connection mode includes:
S2: superposing, in a skip-connection manner, the spherical image coding feature obtained by the i-th coding layer and the spherical image decoding feature obtained by the j-th decoding layer,
wherein the initial layer of the i-th coding layer is the first layer, and the initial layer of the j-th decoding layer is the last layer;
S4: judging whether the i-th coding layer is the last coding layer and whether the j-th decoding layer is the first decoding layer;
if not, increasing i by 1, decreasing j by 1, and continuing to execute step S2.
Wherein i and j are both positive integers.
Specifically, the spherical image coding feature obtained through the 1st coding layer and the spherical image decoding feature obtained through the 4th decoding layer are superposed in a skip-connection manner; then the spherical image coding feature obtained through the 2nd coding layer and the spherical image decoding feature obtained through the 3rd decoding layer are superposed in a skip-connection manner; and so on, until the whole feature superposition process is completed. Of course, in practical applications, the numbers of layers i and j may be the same or different and may be set according to actual requirements.
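A minimal sketch of the layer pairing described in steps S2 and S4 is given below: the i-th encoder feature is superposed with the j-th decoder feature, starting from the first coding layer and the last decoding layer. Plain addition stands in here for the superposition and the feature shapes are assumed to match; the CAF-based fusion is sketched further below.

```python
# Minimal sketch of the S2/S4 pairing loop over encoder and decoder features.
def superpose_skip_connections(encoder_feats, decoder_feats):
    """encoder_feats: list of spherical coding features, layer 1 first (shallow -> deep).
    decoder_feats: list of spherical decoding features, layer 1 first (shallow -> deep).
    Features are tensors/arrays of matching shape for each paired i/j."""
    fused = list(decoder_feats)
    i, j = 0, len(decoder_feats) - 1            # first coding layer, last decoding layer
    while i < len(encoder_feats) and j >= 0:
        fused[j] = decoder_feats[j] + encoder_feats[i]   # skip-connection superposition
        i, j = i + 1, j - 1                     # increase i by 1, decrease j by 1
        # loop ends once the last coding layer / first decoding layer is reached
    return fused
```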
In the embodiment of the specification, the accuracy of depth estimation of the spherical depth image of the target panoramic image is improved based on the higher resolution of the feature images of the spherical image coding features of the shallower encoder layer and abundant local information by superposing the spherical image coding features of the shallower encoder layer and the spherical image decoding features of the deeper decoder layer.
In addition, since the embodiments of this specification process a panoramic image, a global receptive field, i.e., the ability to acquire context, needs to be considered in order to ensure the integrity and accuracy of the obtained panoramic depth image. Therefore, in the embodiments of this specification, a cross-attention fusion (CAF) module is provided in the decoding layer (decoder), and a better superposition of the spherical image coding features and the spherical image decoding features is achieved through the self-attention mechanism of the CAF module. The specific implementation is as follows:
the obtaining a spherical depth image of the target panoramic image according to the spherical image coding feature and the spherical image decoding feature includes:
performing attention calculation on the spherical image coding features and the spherical image decoding features through a cross attention mechanism fusion module of the decoding layer to obtain correction amounts of the spherical image coding features and the spherical image decoding features;
and obtaining a spherical depth image of the target panoramic image according to the correction quantity of the spherical image coding characteristic, the correction quantity of the spherical image decoding characteristic, the spherical image coding characteristic and the spherical image decoding characteristic.
Specifically, for the global feature (the spherical image decoding feature), the correction amount Att_0 is calculated using the global query information Q; for the local feature, the correction amount Att_1 is calculated using the local query information Q. The specific implementation is given by Equation 1, the attention calculation explained below, which yields the correction amounts Att_0 and Att_1.
In the embodiments of this specification, the spherical image coding feature and the spherical image decoding feature undergo self-attention calculation through the cross-attention fusion module in the decoding layer; the self-attention calculation is explained as follows:
the image is divided into a plurality of image blocks, and the association degree between each image block and other image blocks is calculated through the formula. Q is query (image block), K is key (other image block), and for a certain image block (query), all other image blocks (key) are used to calculate the correlation degree between two image blocks, and the higher the correlation degree is, the larger the calculation result of Q × K is. V can be regarded as a "feature" in the neural network, which is equivalent to calculating the importance of each feature V by Q and K.
After the correction amount of the spherical image coding feature and the correction amount of the spherical image decoding feature are obtained, the spherical depth image of the target panoramic image can be obtained according to the correction amount of the spherical image coding feature, the correction amount of the spherical image decoding feature, the spherical image coding feature and the spherical image decoding feature; the specific implementation formula 2 is as follows:
X_CA = FFN(LN(X_0 + Att_0 + X_1 + Att_1))    (Equation 2)
In addition, as can be seen from the above description, a skip-connection may be adopted as the specific manner of superposing the spherical image coding features and the spherical image decoding features. A reasonably accurate panoramic depth image can already be obtained with an encoder-decoder image processing model that uses skip-connections; however, the skip-connection simply adds the spherical image coding features to the spherical image decoding features, the fusion process is correspondingly simple, and the local features (introduced from the skip-connection) and the global features (of the decoder) cannot be well balanced. Therefore, in the embodiments of this specification, in order to superpose the features more optimally, the features of different dimensions coming from the decoder and from the skip-connection are fused by means of the CAF module (cross-attention fusion module), and a correction amount is learned for each of the two kinds of features, so as to compensate for the deficiency caused by directly summing features of different dimensions. The specific implementation is as follows:
the obtaining a spherical depth image of the target panoramic image according to the correction amount of the spherical image coding feature, the correction amount of the spherical image decoding feature, the spherical image coding feature and the spherical image decoding feature includes:
according to the correction amount of the spherical image coding features and the correction amount of the spherical image decoding features, the spherical image coding features and the spherical image decoding features are subjected to feature superposition in a jumping connection mode;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding characteristics and the spherical image decoding characteristics after superposition.
Specifically, after the correction amount of the spherical image coding feature and the correction amount of the spherical image decoding feature are obtained, the spherical image coding feature and the spherical image decoding feature are superposed in the CAF module, in a skip-connection manner, according to their respective correction amounts; finally, the spherical depth image of the target panoramic image is obtained from the superposed coding and decoding features. That is, the superposition between features inside the CAF module is a superposition performed through the self-attention mechanism.
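The following is a hedged sketch of a cross-attention fusion (CAF) module consistent with Equation 2, X_CA = FFN(LN(X_0 + Att_0 + X_1 + Att_1)), where X_0 is the global (decoder) feature, X_1 the local (skip-connection) feature, and Att_0, Att_1 their learned correction amounts. The head count, the FFN width, and the exact key/value pairing of the two attention branches are illustrative assumptions, not details given by the embodiment.

```python
# Illustrative CAF module: corrections from global/local queries, then Equation 2.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn_global = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_local = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x_global, x_local):
        # x_global, x_local: (batch, tokens, dim) decoder and skip-connection features.
        # Correction for the global feature, queried with the global information.
        att0, _ = self.attn_global(x_global, x_local, x_local)
        # Correction for the local feature, queried with the local information.
        att1, _ = self.attn_local(x_local, x_global, x_global)
        # Equation 2: superpose features and corrections, then LayerNorm and FFN.
        return self.ffn(self.norm(x_global + att0 + x_local + att1))
```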
In practical application, after a panoramic depth image of a target panoramic image is obtained, accurate three-dimensional modeling can be performed according to the panoramic depth image, and the specific implementation mode is as follows:
after the spherical depth image is converted into the panoramic depth image of the target panoramic image according to the panoramic image spherical conversion algorithm, the method further comprises the following steps:
and constructing a three-dimensional model of the target object according to the panoramic depth image of the target panoramic image.
In specific implementation, the manner of constructing the three-dimensional model according to the panoramic depth image may refer to the description of the above embodiments, and is not described herein again.
In the case where the image processing model has been obtained by pre-training, once the target panoramic image is acquired it can be input directly into the image processing model; after the above processing steps are carried out inside the model, the model directly outputs the panoramic depth image of the target panoramic image, so that the whole process is imperceptible to the user and the user experience is improved.
The specific implementation mode is as follows:
after the target panoramic image is input into the image processing model, the method further comprises the following steps:
obtaining a panoramic depth image of the target panoramic image output by the image processing model;
accordingly, the training step of the image processing model is as follows:
determining a sample panoramic image and a sample panoramic depth image corresponding to the sample panoramic image;
inputting a sample panoramic image into an image processing model, and obtaining sample panoramic image characteristics of the sample panoramic image through an encoding layer of the image processing model;
converting the sample panoramic image characteristics into sample spherical image coding characteristics according to the panoramic image spherical conversion algorithm;
inputting the sample spherical image coding features into a decoding layer of the image processing model, and obtaining a sample spherical depth image of the sample panoramic image through the decoding layer;
according to the spherical conversion algorithm of the panoramic image, converting the spherical depth image of the sample into a predicted panoramic depth image of the panoramic image of the sample;
and adjusting a loss function of the image processing model according to the sample panoramic depth image and the predicted panoramic depth image to realize the training of the image processing model.
Specifically, the training procedure of the image processing model follows the same processing flow as that used in the above embodiments to process the target panoramic image and obtain its panoramic depth image, and details not described in the training procedure can be found in the above embodiments. The trained image processing model is then used to process the target panoramic image to obtain the panoramic depth image of the target panoramic image.
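A minimal sketch of these training steps, under the assumption of an L1 loss and the helper names model, dataloader and sphere_to_pano (none of which are prescribed by the embodiment), might look as follows:

```python
# Hypothetical training loop for the image processing model described above.
# `model` is assumed to map a sample panoramic image to a sample spherical depth
# image, and `sphere_to_pano` to apply the inverse panorama/sphere conversion;
# the L1 loss and the optimizer choice are illustrative, not the embodiment's.
import torch
import torch.nn.functional as F

def train_epoch(model, dataloader, optimizer, sphere_to_pano):
    model.train()
    for sample_pano, sample_pano_depth in dataloader:
        sphere_depth = model(sample_pano)               # encoding layer -> sphere -> decoding layer
        pred_pano_depth = sphere_to_pano(sphere_depth)  # predicted panoramic depth image
        loss = F.l1_loss(pred_pano_depth, sample_pano_depth)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```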
In the embodiment of the present specification, the deep learning operations (such as feature superposition) normally performed on planar images are extended to the spherical surface, and a deep learning model specially designed for the sphere (namely the image processing model in the embodiment of the present specification) is run on the spherical surface. This solves the problem that the left and right ends of a panoramic image are spatially continuous yet discontinuous on the image, and the panoramic depth image can be obtained by processing the panoramic image alone, so that the overall number of parameters is reduced and the processing efficiency of the image processing model is greatly improved.
Step 410: and constructing a target three-dimensional model of the target object according to the panoramic depth image of the target panoramic image.
The target three-dimensional model is constructed through the image processing model operating on a spherical surface, and the image processing model is a neural network model; for a specific implementation manner of constructing the target three-dimensional model of the target object according to the panoramic depth image of the target panoramic image, reference may be made to the detailed description of the above embodiments, which is not repeated herein.
According to the navigation method based on the three-dimensional model, provided by the embodiment of the specification, in the target three-dimensional model, the navigation map is planned for the user according to the current position and the destination of the virtual object of the user in the target three-dimensional model, the user is guided to travel in the target three-dimensional model in a three-dimensional visual presentation mode, and the navigation is performed by combining a modeling technology and a three-dimensional visual guidance mode, so that the navigation experience of the user is improved.
The following is described with reference to fig. 6, which shows a specific processing flow of constructing the target three-dimensional model of the target object in a three-dimensional model based navigation method provided in an embodiment of the present specification, specifically including the following steps.
Step 602: the panoramic image of the house is input into an image processing model.
Step 604: feature extraction and down-sampling are carried out in the first coding layer, the second coding layer, the third coding layer and the fourth coding layer of the image processing model, and the panoramic image feature maps output by the first coding layer, the second coding layer, the third coding layer and the fourth coding layer are obtained respectively.
Step 606: the panoramic image feature maps output by the first coding layer, the second coding layer, the third coding layer and the fourth coding layer are respectively converted into undistorted spherical image coding feature maps through the HEALPix (Hierarchical Equal Area isoLatitude Pixelization) algorithm in the panoramic image spherical conversion module.
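As a hedged illustration of what such a conversion could look like in practice (the embodiment does not disclose code; the healpy library, the chosen nside and bilinear sampling via grid_sample are assumptions of this sketch), an equirectangular feature map may be resampled onto HEALPix pixel centres as follows:

```python
# Hypothetical sketch of projecting an equirectangular feature map onto the
# undistorted HEALPix sphere. healpy and the chosen nside are assumptions; the
# embodiment only states that a HEALPix-style conversion is used.
import healpy as hp
import numpy as np
import torch
import torch.nn.functional as F

def equirect_to_healpix(feat: torch.Tensor, nside: int = 32) -> torch.Tensor:
    """feat: (B, C, H, W) feature map; returns (B, C, 12 * nside**2) sphere samples."""
    npix = hp.nside2npix(nside)
    theta, phi = hp.pix2ang(nside, np.arange(npix))  # colatitude / longitude of pixel centres
    # map sphere angles to normalized equirectangular coordinates in [-1, 1]
    u = (phi / np.pi) - 1.0           # phi in [0, 2*pi) -> x in [-1, 1)
    v = (theta / np.pi) * 2.0 - 1.0   # theta in [0, pi]  -> y in [-1, 1]
    grid = torch.tensor(np.stack([u, v], axis=-1), dtype=feat.dtype, device=feat.device)
    grid = grid.view(1, 1, npix, 2).expand(feat.shape[0], -1, -1, -1)
    sampled = F.grid_sample(feat, grid, mode='bilinear', align_corners=True)
    return sampled.squeeze(2)         # (B, C, npix) spherical image coding feature map
```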
Step 608: the spherical image coding feature map output by the fourth coding layer is input into the first decoding layer, and a spherical image decoding feature map is obtained after up-sampling in the first decoding layer; the spherical image coding feature map output after down-sampling of the third coding layer and the spherical image decoding feature map obtained after up-sampling of the first decoding layer are then fused through the cross self-attention mechanism fusion module of the first decoding layer to obtain a first fused feature image.
Referring to fig. 7, fig. 7 is a schematic diagram illustrating feature fusion through a self-attention mechanism during panoramic image processing in a navigation method based on a three-dimensional model according to an embodiment of the present disclosure.
FIG. 7 shows how feature fusion is implemented for each decoding layer by the cross attention mechanism fusion module. As can be seen from fig. 7, the spherical image coding feature map, which is output by each coding layer and converted by the HEALPix algorithm in the panoramic image spherical conversion module, is combined with the spherical position coding after residual convolution processing and then input into the cross attention mechanism module through layer normalization; meanwhile, the spherical image decoding feature map output by the decoding layer corresponding to each coding layer is likewise combined with the spherical position coding after residual convolution processing and input into the cross attention mechanism module through layer normalization. The spherical image coding feature map and the spherical image decoding feature map are fused through the self-attention mechanism in the cross self-attention mechanism fusion module; after the fused feature image is obtained, it is input into a feed-forward network for processing and then up-sampled for subsequent processing.
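The following sketch assembles this fig. 7 pipeline for one decoding layer under the same caveat: it is an assumed reading of the figure, not the embodiment's implementation, and all module names, the learned position encoding and the nearest-neighbour up-sampling placeholder are introduced here for illustration only.

```python
# Hypothetical sketch of one decoding-layer block following the structure shown in
# fig. 7: residual convolution, addition of a spherical position coding, layer
# normalization, fusion in the cross attention module, a feed-forward network, and
# up-sampling. All names and shapes are assumptions; the embodiment discloses no code.
import torch
import torch.nn as nn

class SphericalDecoderBlock(nn.Module):
    def __init__(self, channels: int, num_points: int, num_heads: int = 4):
        super().__init__()
        self.res_conv_enc = nn.Conv1d(channels, channels, kernel_size=1)
        self.res_conv_dec = nn.Conv1d(channels, channels, kernel_size=1)
        self.pos_encoding = nn.Parameter(torch.zeros(1, num_points, channels))
        self.norm_enc = nn.LayerNorm(channels)
        self.norm_dec = nn.LayerNorm(channels)
        self.cross_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.LayerNorm(channels),
            nn.Linear(channels, channels * 4),
            nn.GELU(),
            nn.Linear(channels * 4, channels),
        )

    def forward(self, enc_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
        # enc_feat, dec_feat: (batch, num_spherical_points, channels)
        # residual convolution, then combination with the spherical position coding
        enc = enc_feat + self.res_conv_enc(enc_feat.transpose(1, 2)).transpose(1, 2)
        dec = dec_feat + self.res_conv_dec(dec_feat.transpose(1, 2)).transpose(1, 2)
        enc, dec = enc + self.pos_encoding, dec + self.pos_encoding
        # layer normalization, then fusion in the cross attention module
        fused, _ = self.cross_attn(self.norm_dec(dec), self.norm_enc(enc), self.norm_enc(enc))
        fused = dec + fused
        # feed-forward network, then up-sampling (nearest duplication as a placeholder)
        fused = fused + self.ffn(fused)
        return fused.repeat_interleave(4, dim=1)
```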
Step 610: the first fused feature image is input into the second decoding layer, and a spherical image decoding feature map is obtained after up-sampling in the second decoding layer; the spherical image coding feature map output after down-sampling of the second coding layer and the spherical image decoding feature map obtained after up-sampling of the second decoding layer are then fused through the cross self-attention mechanism fusion module of the second decoding layer to obtain a second fused feature image.
Step 612: the second fused feature image is input into the third decoding layer, and a spherical image decoding feature map is obtained after up-sampling in the third decoding layer; the spherical image coding feature map output after down-sampling of the first coding layer and the spherical image decoding feature map obtained after up-sampling of the third decoding layer are then fused through the cross self-attention mechanism fusion module of the third decoding layer to obtain a third fused feature image.
Step 614: the third fused feature image is regressed through a depth regressor to obtain a spherical depth image, and the spherical depth image is inversely transformed through the HEALPix algorithm to obtain a panoramic depth image corresponding to the panoramic image of the house.
Step 616: and constructing a target three-dimensional model of the house according to the panoramic depth image corresponding to the panoramic image of the house.
In the navigation method based on the three-dimensional model provided in the embodiment of the present specification, a single-backbone deep neural network is used when acquiring the panoramic depth image corresponding to the panoramic image of the house, so that in the actual image processing process the number of parameters is smaller while the precision is noticeably higher. The HEALPix algorithm is adopted to convert the panoramic image into an undistorted spherical image, and the self-attention calculation and the convolution calculation are carried out on the spherical surface; since the sphere is a continuous curved surface, the left and right ends of the panoramic image are joined together on the sphere, which solves the problem of inconsistency between the left and right ends. Moreover, the HEALPix algorithm projects the features extracted from the panoramic image onto a distortion-free sphere and gives special treatment to the distortion contained in the features, so that the influence of distortion on the depth estimation task is greatly reduced and the accuracy of the panoramic depth image of the panoramic image is improved. Meanwhile, the CAF module greatly improves the context-awareness capability of the decoding layer and makes effective use of scene information, further improving the accuracy of the obtained panoramic depth image of the panoramic image. Therefore, the target three-dimensional model of the house can subsequently be constructed quickly and accurately from the panoramic depth image of higher accuracy.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a navigation device based on a three-dimensional model, and fig. 8 shows a schematic structural diagram of a navigation device based on a three-dimensional model provided in an embodiment of the present specification. As shown in fig. 8, the apparatus includes:
a model presentation module 802 configured to present a target three-dimensional model to a user through a user interaction interface in response to a presentation request of the user for the target three-dimensional model;
a location determination module 804 configured to receive a destination in the target three-dimensional model input by the user through the user interaction interface and determine a current location of a virtual object of the user in the target three-dimensional model;
a map determination module 806 configured to determine a navigation map of the virtual object moving from the current location to the destination according to the current location and the destination;
a navigation module 808 configured to guide the virtual object to move from the current position to the destination according to the navigation map, and display a guide track to the user through the user interaction interface in a three-dimensional visual manner.
Optionally, the apparatus further comprises:
a rendering module configured to:
and in the process of guiding the virtual object to move from the current position to the destination according to the navigation map, rendering the display object according to the distance between the virtual object and the display object in the target three-dimensional model.
Optionally, the apparatus further comprises:
a skip module configured to:
displaying the display article and the webpage link corresponding to the display article to the user through the user interaction interface;
and responding to a click instruction of the user for the webpage link, and jumping to an article display page corresponding to the webpage link from the target three-dimensional model.
Optionally, the apparatus further comprises:
a three-dimensional model building module configured to:
inputting a target panoramic image of a target object into an image processing model, and obtaining panoramic image characteristics of the target panoramic image through an encoding layer of the image processing model;
converting the panoramic image characteristics into spherical image coding characteristics according to a panoramic image spherical conversion algorithm;
inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining a spherical depth image of the target panoramic image through the decoding layer;
according to the spherical conversion algorithm of the panoramic image, converting the spherical depth image into a panoramic depth image of the target panoramic image;
and constructing a target three-dimensional model of the target object according to the panoramic depth image of the target panoramic image, wherein the target three-dimensional model is constructed through the image processing model operating on a spherical surface, and the image processing model is a neural network model.
Optionally, the three-dimensional model building module is further configured to:
inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining the spherical image decoding features of the target panoramic image through the decoding layer;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding features and the spherical image decoding features.
Optionally, the three-dimensional model building module is further configured to:
overlapping the coding features of the spherical images and the decoding features of the spherical images in a jumping connection mode;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding features and the spherical image decoding features after superposition.
Optionally, the three-dimensional model building module is further configured to:
performing attention calculation on the spherical image coding features and the spherical image decoding features through a cross attention mechanism fusion module of the decoding layer to obtain correction quantities of the spherical image coding features and the spherical image decoding features;
and obtaining a spherical depth image of the target panoramic image according to the correction amount of the spherical image coding feature, the correction amount of the spherical image decoding feature, the spherical image coding feature and the spherical image decoding feature.
Optionally, the three-dimensional model building module is further configured to:
according to the correction amount of the spherical image coding features and the correction amount of the spherical image decoding features, the spherical image coding features and the spherical image decoding features are subjected to feature superposition in a jumping connection mode;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding features and the spherical image decoding features after superposition.
Optionally, the three-dimensional model building module is further configured to:
S2, superimposing, by way of a skip connection, the spherical image coding features obtained by the i-th coding layer and the spherical image decoding features obtained by the j-th decoding layer,
wherein the i-th coding layer starts from the first coding layer, and the j-th decoding layer starts from the last decoding layer;
S4, judging whether the i-th coding layer is the last coding layer and whether the j-th decoding layer is the first decoding layer,
and if not, increasing i by 1, decreasing j by 1, and continuing to execute step S2 (a purely illustrative sketch of this pairing is given below).
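For illustration only, the pairing of coding and decoding layers described in steps S2 and S4 can be sketched as follows; the list-based layer representation and the fuse callable are assumptions of this sketch, not the embodiment's implementation.

```python
# Hypothetical sketch of the layer pairing in steps S2/S4: the i-th coding layer's
# spherical coding features are superimposed, via skip connection, with the j-th
# decoding layer's spherical decoding features, starting from the first coding layer
# and the last decoding layer. `fuse` stands for the superposition step, e.g. the
# cross attention fusion sketched earlier.
def pair_and_fuse(encoder_feats, decoder_feats, fuse):
    # encoder_feats: spherical coding features per coding layer, first to last
    # decoder_feats: spherical decoding features per decoding layer, first to last
    fused = []
    i, j = 0, len(decoder_feats) - 1           # first coding layer, last decoding layer
    while i < len(encoder_feats) and j >= 0:   # stop at last coding / first decoding layer
        fused.append(fuse(encoder_feats[i], decoder_feats[j]))
        i, j = i + 1, j - 1                    # S4: increase i by 1, decrease j by 1
    return fused
```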
Optionally, the three-dimensional model building module is further configured to:
acquiring a target panoramic image of a target object shot by panoramic shooting equipment; or
and acquiring at least two initial plane images of the target object, and fusing the at least two initial plane images according to a preset image fusion algorithm to obtain a target panoramic image of the target object.
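Purely as an example of one possible preset image fusion algorithm (the embodiment does not name one), at least two plane images could be fused into a panoramic image with OpenCV's stitcher; the function name and this particular choice of algorithm are assumptions of this sketch.

```python
# Hypothetical sketch: fuse several initial plane images into one panoramic image.
# OpenCV's high-level Stitcher is used here only as an illustrative choice of
# "preset image fusion algorithm"; the embodiment does not specify it.
import cv2

def fuse_plane_images(image_paths):
    images = [cv2.imread(p) for p in image_paths]
    stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
    status, panorama = stitcher.stitch(images)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return panorama
```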
Optionally, the three-dimensional model building module is further configured to:
obtaining a panoramic depth image of the target panoramic image output by the image processing model;
accordingly, the training step of the image processing model is as follows:
determining a sample panoramic image and a sample panoramic depth image corresponding to the sample panoramic image;
inputting a sample panoramic image into an image processing model, and obtaining sample panoramic image characteristics of the sample panoramic image through an encoding layer of the image processing model;
converting the sample panoramic image characteristics into sample spherical image coding characteristics according to the panoramic image spherical conversion algorithm;
inputting the sample spherical image coding features into a decoding layer of the image processing model, and obtaining a sample spherical depth image of the sample panoramic image through the decoding layer;
according to the spherical conversion algorithm of the panoramic image, converting the spherical depth image of the sample into a predicted panoramic depth image of the panoramic image of the sample;
and adjusting a loss function of the image processing model according to the sample panoramic depth image and the predicted panoramic depth image to realize the training of the image processing model.
In the navigation device based on the three-dimensional model provided in the embodiment of the present specification, a navigation map is planned for a user according to a current position and a destination of a virtual object of the user in the target three-dimensional model, the user is guided to travel in the target three-dimensional model in a three-dimensional visual presentation manner, and navigation is performed by combining a modeling technology and a three-dimensional visual guidance manner, so that navigation experience of the user is improved.
The above is an illustrative scheme of the navigation device based on a three-dimensional model of this embodiment. It should be noted that the technical solution of the navigation device based on the three-dimensional model belongs to the same concept as the technical solution of the navigation method based on the three-dimensional model, and for details of the technical solution of the navigation device that are not described in detail, reference may be made to the description of the technical solution of the navigation method based on the three-dimensional model.
FIG. 9 illustrates a block diagram of a computing device 900 provided in accordance with one embodiment of the present specification. Components of the computing device 900 include, but are not limited to, a memory 910 and a processor 920. The processor 920 is coupled to the memory 910 via a bus 930, and a database 950 is used to store data.
Computing device 900 also includes access device 940, access device 940 enabling computing device 900 to communicate via one or more networks 960. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 940 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)) whether wired or wireless, such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 900, as well as other components not shown in FIG. 9, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 9 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 900 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 900 may also be a mobile or stationary server.
Wherein the processor 920 is configured to execute computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method described above.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the navigation method based on the three-dimensional model belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the navigation method based on the three-dimensional model.
An embodiment of the present specification further provides an augmented reality AR apparatus, including:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method described above.
The above is a schematic scheme of an augmented reality AR device of this embodiment. It should be noted that the technical solution of the augmented reality AR device and the technical solution of the navigation method based on the three-dimensional model belong to the same concept, and details of the technical solution of the augmented reality AR device, which are not described in detail, can be referred to the description of the technical solution of the navigation method based on the three-dimensional model.
An embodiment of this specification further provides a virtual reality VR device, including:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method described above.
The above is a schematic scheme of a virtual reality VR device of this embodiment. It should be noted that the technical solution of the virtual reality VR device and the technical solution of the navigation method based on the three-dimensional model belong to the same concept, and details of the technical solution of the virtual reality VR device, which are not described in detail, can be referred to the description of the technical solution of the navigation method based on the three-dimensional model.
An embodiment of the present specification further provides a computer-readable storage medium storing computer-executable instructions, which when executed by a processor, implement the steps of the three-dimensional model-based navigation method described above.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the above-mentioned navigation method based on the three-dimensional model, and for details that are not described in detail in the technical solution of the storage medium, reference may be made to the description of the technical solution of the above-mentioned navigation method based on the three-dimensional model.
An embodiment of the present specification further provides a computer program, wherein when the computer program is executed in a computer, the computer program is used for executing the steps of the navigation method based on the three-dimensional model.
The above is an illustrative scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program belongs to the same concept as the technical solution of the navigation method based on the three-dimensional model, and for details that are not described in detail in the technical solution of the computer program, reference may be made to the description of the technical solution of the navigation method based on the three-dimensional model.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code which may be in the form of source code, object code, an executable file or some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier wave signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in the jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (14)

1. A navigation method based on a three-dimensional model is applied to a client and comprises the following steps:
responding to a display request of a user for a target three-dimensional model, and displaying the target three-dimensional model to the user through a user interaction interface;
receiving a destination in the target three-dimensional model, which is input by the user through the user interaction interface, and determining the current position of a virtual object of the user in the target three-dimensional model;
determining a navigation map of the virtual object moving from the current position to the destination according to the current position and the destination;
and guiding the virtual object to move from the current position to the destination according to the navigation map, and displaying a guide track to the user through the user interaction interface in a three-dimensional visual mode.
2. The three-dimensional model-based navigation method of claim 1, further comprising:
and in the process of guiding the virtual object to move from the current position to the destination according to the navigation map, rendering the display object according to the distance between the virtual object and the display object in the target three-dimensional model.
3. The three-dimensional model-based navigation method of claim 2, further comprising, after rendering the displayed item according to the distance between the virtual object and the displayed item in the target three-dimensional model:
displaying the display article and the webpage link corresponding to the display article to the user through the user interaction interface;
and responding to a click instruction of the user for the webpage link, and jumping to an article display page corresponding to the webpage link from the target three-dimensional model.
4. The three-dimensional model based navigation method according to claim 1, further comprising, before responding to a user's presentation request for the target three-dimensional model:
inputting a target panoramic image of a target object into an image processing model, and obtaining panoramic image characteristics of the target panoramic image through an encoding layer of the image processing model;
converting the panoramic image characteristics into spherical image coding characteristics according to a panoramic image spherical conversion algorithm;
inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining a spherical depth image of the target panoramic image through the decoding layer;
according to the spherical conversion algorithm of the panoramic image, converting the spherical depth image into a panoramic depth image of the target panoramic image;
and constructing a target three-dimensional model of the target object according to the panoramic depth image of the target panoramic image, wherein the target three-dimensional model is constructed through the image processing model operating on a spherical surface, and the image processing model is a neural network model.
5. The three-dimensional model-based navigation method of claim 4, wherein the inputting the spherical image coding features into a decoding layer of the image processing model, a spherical depth image of the target panoramic image being obtained through the decoding layer, comprises:
inputting the spherical image coding features into a decoding layer of the image processing model, and obtaining the spherical image decoding features of the target panoramic image through the decoding layer;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding features and the spherical image decoding features.
6. The three-dimensional model-based navigation method according to claim 5, wherein the obtaining of the spherical depth image of the target panoramic image according to the spherical image coding feature and the spherical image decoding feature comprises:
overlapping the coding features of the spherical images and the decoding features of the spherical images in a jumping connection mode;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding characteristics and the spherical image decoding characteristics after superposition.
7. The three-dimensional model-based navigation method according to claim 5, wherein the obtaining of the spherical depth image of the target panoramic image according to the spherical image encoding feature and the spherical image decoding feature comprises:
performing attention calculation on the spherical image coding features and the spherical image decoding features through a cross attention mechanism fusion module of the decoding layer to obtain correction amounts of the spherical image coding features and the spherical image decoding features;
and obtaining a spherical depth image of the target panoramic image according to the correction amount of the spherical image coding feature, the correction amount of the spherical image decoding feature, the spherical image coding feature and the spherical image decoding feature.
8. The three-dimensional model-based navigation method according to claim 7, wherein the obtaining of the spherical depth image of the target panoramic image according to the correction amount of the spherical image coding feature, the correction amount of the spherical image decoding feature, the spherical image coding feature and the spherical image decoding feature comprises:
according to the correction quantity of the spherical image coding features and the correction quantity of the spherical image decoding features, the spherical image coding features and the spherical image decoding features are subjected to feature superposition in a jumping connection mode;
and obtaining a spherical depth image of the target panoramic image according to the spherical image coding features and the spherical image decoding features after superposition.
9. The three-dimensional model-based navigation method according to claim 6 or 8, wherein the feature superposition by means of skip connection of the spherical image coding features and the spherical image decoding features comprises:
s2, overlapping the spherical image coding characteristics obtained by the ith coding layer and the spherical image decoding characteristics obtained by the jth decoding layer by jumping connection,
wherein, the initial layer of the ith coding layer is a first layer, and the initial layer of the jth decoding layer is a last layer;
s4, judging whether the ith coding layer is the last coding layer and whether the jth decoding layer is the first decoding layer,
if not, increasing i by 1, decreasing j by 1, and continuing to execute step S2.
10. The three-dimensional model based navigation method of claim 4, further comprising, before inputting the target panoramic image into the image processing model:
acquiring a target panoramic image of a target object shot by panoramic shooting equipment; or
And acquiring at least two initial plane images of the target object, and fusing the at least two initial plane images according to a preset image fusion algorithm to obtain a target panoramic image of the target object.
11. The three-dimensional model based navigation method of claim 4, further comprising, after inputting the target panoramic image of the target object into the image processing model:
obtaining a panoramic depth image of the target panoramic image output by the image processing model;
accordingly, the training step of the image processing model is as follows:
determining a sample panoramic image and a sample panoramic depth image corresponding to the sample panoramic image;
inputting a sample panoramic image into an image processing model, and obtaining sample panoramic image characteristics of the sample panoramic image through an encoding layer of the image processing model;
converting the sample panoramic image characteristics into sample spherical image coding characteristics according to the panoramic image spherical conversion algorithm;
inputting the sample spherical image coding features into a decoding layer of the image processing model, and obtaining a sample spherical depth image of the sample panoramic image through the decoding layer;
according to the spherical conversion algorithm of the panoramic image, converting the spherical depth image of the sample into a predicted panoramic depth image of the panoramic image of the sample;
and adjusting a loss function of the image processing model according to the sample panoramic depth image and the predicted panoramic depth image to realize the training of the image processing model.
12. A navigation device based on a three-dimensional model is applied to a client and comprises:
the model display module is configured to respond to a display request of a user for a target three-dimensional model, and display the target three-dimensional model to the user through a user interaction interface;
a location determination module configured to receive a destination in the target three-dimensional model input by the user through the user interaction interface and determine a current location of a virtual object of the user in the target three-dimensional model;
a map determination module configured to determine a navigation map of the virtual object moving from the current location to the destination according to the current location and the destination;
and the navigation module is configured to guide the virtual object to move from the current position to the destination according to the navigation map, and display a guide track to the user through the user interaction interface in a three-dimensional visual mode.
13. An Augmented Reality (AR) device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method of any one of claims 1 to 11.
14. A Virtual Reality (VR) device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor, implement the steps of the three-dimensional model based navigation method of any one of claims 1 to 11.
CN202211026584.0A 2022-08-25 2022-08-25 Navigation method and device based on three-dimensional model Pending CN115527011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211026584.0A CN115527011A (en) 2022-08-25 2022-08-25 Navigation method and device based on three-dimensional model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211026584.0A CN115527011A (en) 2022-08-25 2022-08-25 Navigation method and device based on three-dimensional model

Publications (1)

Publication Number Publication Date
CN115527011A true CN115527011A (en) 2022-12-27

Family

ID=84697266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211026584.0A Pending CN115527011A (en) 2022-08-25 2022-08-25 Navigation method and device based on three-dimensional model

Country Status (1)

Country Link
CN (1) CN115527011A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116858215A (en) * 2023-09-05 2023-10-10 武汉大学 AR navigation map generation method and device
CN116858215B (en) * 2023-09-05 2023-12-05 武汉大学 AR navigation map generation method and device

Similar Documents

Publication Publication Date Title
US11270460B2 (en) Method and apparatus for determining pose of image capturing device, and storage medium
CN109961507B (en) Face image generation method, device, equipment and storage medium
US20200388064A1 (en) Single image-based real-time body animation
CN109754464B (en) Method and apparatus for generating information
CN115690382B (en) Training method of deep learning model, and method and device for generating panorama
US20140160122A1 (en) Creating a virtual representation based on camera data
US11961266B2 (en) Multiview neural human prediction using implicit differentiable renderer for facial expression, body pose shape and clothes performance capture
EP4191538A1 (en) Large scene neural view synthesis
CN114782661B (en) Training method and device for lower body posture prediction model
CN115272565A (en) Head three-dimensional model reconstruction method and electronic equipment
Li et al. MonoIndoor++: Towards better practice of self-supervised monocular depth estimation for indoor environments
CN113220251A (en) Object display method, device, electronic equipment and storage medium
Chai et al. Monocular and binocular interactions oriented deformable convolutional networks for blind quality assessment of stereoscopic omnidirectional images
CN115294268A (en) Three-dimensional model reconstruction method of object and electronic equipment
CN115527011A (en) Navigation method and device based on three-dimensional model
JP2024510230A (en) Multi-view neural human prediction using implicitly differentiable renderer for facial expression, body pose shape and clothing performance capture
CN116386087B (en) Target object processing method and device
Kim et al. Deep transformer based video inpainting using fast fourier tokenization
CN115512038B (en) Real-time drawing method for free viewpoint synthesis, electronic device and readable storage medium
CN115272575B (en) Image generation method and device, storage medium and electronic equipment
Guo et al. Perspective reconstruction of human faces by joint mesh and landmark regression
CN115731344A (en) Image processing model training method and three-dimensional object model construction method
CN114419158A (en) Six-dimensional attitude estimation method, network training method, device, equipment and medium
CN115205325A (en) Target tracking method and device
Pang et al. JointMETRO: a 3D reconstruction model for human figures in works of art based on transformer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination