WO2023098737A1 - Three-dimensional reconstruction method, electronic device, and computer-readable storage medium - Google Patents

Three-dimensional reconstruction method, electronic device, and computer-readable storage medium Download PDF

Info

Publication number
WO2023098737A1
Authority
WO
WIPO (PCT)
Prior art keywords
images
key frame
panoramic
camera
panoramic image
Prior art date
Application number
PCT/CN2022/135517
Other languages
French (fr)
Chinese (zh)
Inventor
Jin Yi (靳懿)
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2023098737A1 publication Critical patent/WO2023098737A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 - Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches

Definitions

  • the present application relates to the technical field of image processing, in particular to a three-dimensional reconstruction method, electronic equipment, and a computer-readable storage medium.
  • the construction methods of purely visual 3D reconstruction are all realized based on discretely distributed image data of the survey area captured in different viewing directions; a smaller visible range of the images results in a poorer reconstruction effect.
  • the main purpose of the embodiments of the present application is to provide a three-dimensional reconstruction method, electronic equipment and computer-readable storage medium, so that the construction effect and success rate of the three-dimensional reconstruction can be improved.
  • an embodiment of the present application provides a three-dimensional reconstruction method, including: acquiring a sequence of panoramic images taken of a target area; performing camera pose estimation on the panoramic images in the sequence of panoramic images to obtain the camera poses corresponding to the panoramic images; cutting the panoramic images to obtain images of multiple orientations corresponding to each panoramic image; and performing three-dimensional reconstruction of the target area according to the images of the multiple orientations and the camera poses corresponding to the panoramic images.
  • an embodiment of the present application further provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to execute the above three-dimensional reconstruction method.
  • an embodiment of the present application further provides a computer-readable storage medium storing a computer program, and the computer program implements the above three-dimensional reconstruction method when executed by a processor.
  • Fig. 1 is a flow chart of the three-dimensional reconstruction method mentioned in the embodiment of the present application.
  • FIG. 2 is a flowchart of an implementation of step 102 mentioned in the embodiment of the present application.
  • Fig. 3 is a schematic diagram of the coordinate system involved in cutting the panoramic image mentioned in the embodiment of the present application.
  • FIG. 4 is a flowchart of an implementation of step 104 mentioned in the embodiment of the present application.
  • FIG. 5 is a flowchart of an implementation of step 302 mentioned in the embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of the electronic device mentioned in the embodiment of the present application.
  • An embodiment of the present application provides a three-dimensional reconstruction method applied to electronic equipment.
  • the flowchart of the three-dimensional reconstruction method in this embodiment may refer to FIG. 1 , and includes the following steps.
  • Step 101 Obtain a panoramic image sequence obtained by photographing a target area.
  • Step 102 Perform camera pose estimation on the panoramic images in the panoramic image sequence, and obtain the corresponding camera poses of the panoramic images.
  • Step 103 Cutting the panoramic image to obtain images corresponding to multiple orientations of the panoramic image.
  • Step 104 Perform 3D reconstruction of the target area according to the camera poses corresponding to the images in multiple orientations and the panoramic image.
  • the camera pose required in the reconstruction process is obtained by estimating the camera pose of the panoramic image.
  • the panoramic image is cut to obtain images of multiple orientations, which helps to reflect the characteristics of the target area more comprehensively from multiple different directions.
  • the 3D reconstruction is performed by combining the camera pose and the multi-directional images obtained by cutting, which is conducive to improving the construction effect and success rate of the 3D reconstruction.
  • the three-dimensional reconstruction method provided in the embodiment of the present application obtains the panoramic image sequence obtained by shooting the target area; performs camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera poses corresponding to the panoramic images; cuts the panoramic images to obtain images of multiple orientations corresponding to the panoramic images; and performs 3D reconstruction of the target area according to the images of the multiple orientations and the camera poses corresponding to the panoramic images.
  • by shooting a panoramic image it is beneficial to increase the visible range of the target area, and the 360° panoramic view can provide greater convenience for the reconstruction of the target area.
  • a panoramic camera may be used to shoot a target area to obtain a panoramic image sequence; wherein, the panoramic image sequence may be understood as several consecutive frames of panoramic images.
  • the panoramic camera can send the sequence of panoramic images to the electronic device, so that the electronic device acquires the sequence of panoramic images captured by the target area for subsequent processing.
  • the electronic device is equipped with a panoramic camera, so that the electronic device can directly obtain a sequence of panoramic images through the panoramic camera set inside it.
  • this is equivalent to using continuous images with a large field of view for 3D reconstruction, which is beneficial to improving the stability and accuracy of the 3D reconstruction.
  • the panoramic camera includes two fisheye lenses and an image splicing unit; the center positions of the two fisheye lenses coincide, their placement directions are opposite, and the viewing angle of each fisheye lens is 180°. The image splicing unit detects and extracts the features and key points of the two images collected by the two fisheye lenses, matches the descriptors between the two images, then estimates the homography matrix from the feature vectors matched by the RANSAC algorithm, and completes the splicing of the two images to obtain the panoramic view of the target area; a panoramic view is a panoramic image.
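  • the homography estimation in the splicing step can be sketched as follows. This is a minimal illustration assuming already-matched point pairs; the function names and thresholds are illustrative, not taken from the patent.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct Linear Transform: solve for the 3x3 H with dst ~ H @ src."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]          # fix the projective scale

def project(H, pts):
    """Apply H to Nx2 points (homogeneous divide included)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = (H @ pts_h.T).T
    return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, iters=200, thresh=2.0, seed=0):
    """Sample minimal 4-point sets and keep the model with the most inliers."""
    rng = np.random.default_rng(seed)
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    best_H, best_inliers = None, 0
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = int((err < thresh).sum())
        if inliers > best_inliers:
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```

With exact correspondences a single minimal four-point fit already explains all points; with real fisheye matches, the RANSAC loop is what rejects the outliers before splicing.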
  • the panoramic camera may be a helmet-mounted panoramic camera, i.e., a panoramic camera fixed to a helmet by a connecting piece.
  • the panoramic camera may also be a handle-type panoramic camera, i.e., a panoramic camera fixed to a handle by a connecting piece.
  • when the helmet-mounted panoramic camera is fixed, the two lenses should face the two sides, and the upper and lower edges of the camera should be kept as parallel to the ground as possible.
  • the vertical line through the camera center should be kept as close as possible to the vertical line through the wearer's center of gravity; the edges of the camera should be kept parallel to the ground, and the bottom edge of the video should be roughly parallel to the ground without obvious inclination.
  • step 102 the electronic device performs camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera pose corresponding to the panoramic images.
  • the electronic device can convert the panoramic images in the panoramic image sequence into equirectangular images with an aspect ratio of 2:1, and perform simultaneous localization and mapping (SLAM) based on the equirectangular images to obtain the camera poses corresponding to the panoramic images.
  • the converted equirectangular image can be transmitted to the SLAM system to obtain the camera pose corresponding to each equirectangular image in the SLAM system, which is conducive to adapting to the size requirements of the SLAM system for calculating the image of the camera pose.
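  • for concreteness, the 2:1 equirectangular convention can be expressed as a mapping between pixels and unit viewing directions. This is a sketch under the usual longitude/latitude convention; the axis choices are an assumption, not specified by the patent.

```python
import math

def pixel_to_dir(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit viewing direction.

    Assumes width == 2 * height (the 2:1 aspect ratio), u in [0, width),
    v in [0, height); longitude spans [-pi, pi), latitude [-pi/2, pi/2].
    """
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

def dir_to_pixel(d, width, height):
    """Inverse mapping: unit direction back to equirectangular pixel."""
    x, y, z = d
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y)))
    u = (lon + math.pi) / (2.0 * math.pi) * width
    v = (math.pi / 2.0 - lat) / math.pi * height
    return (u, v)
```

The two functions are inverses of each other, which is what allows both the SLAM front end and the later cutting step to move freely between panorama pixels and rays.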
  • step 102 may be implemented through the flowchart shown in FIG. 2 .
  • Step 201 Initialize the first key frame image in the panoramic image sequence.
  • Step 202 Extract features of each panoramic image in the panoramic image sequence, and determine key frame images in the panoramic image sequence except the first key frame image according to the features of each panoramic image and the first key frame image.
  • Step 203 Perform feature matching on the features of each adjacent key frame image to obtain feature matching pairs in each adjacent key frame image.
  • Step 204 According to the feature matching pair, calculate the camera pose corresponding to the subsequent key frame image in each adjacent key frame image.
  • the feature matching pairs in each pair of adjacent key frame images are obtained, which facilitates accurately calculating the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images.
  • the electronic device may initialize the first frame image in the panoramic image sequence as the first key frame image, but the present invention is not limited thereto.
  • the camera pose can be initialized, that is, the camera pose corresponding to the first key frame image is initialized.
  • the electronic device can extract the features of each panoramic image in the panoramic image sequence; the features can be ORB features. The ORB feature is a very representative image feature: it remedies the FAST detector's lack of orientation, and it uses the extremely fast binary descriptor BRIEF, which greatly speeds up the entire feature extraction process.
  • the ORB features of each panoramic image and the first key frame image determine the key frame images other than the first key frame image in the panoramic image sequence; that is to say, according to the ORB features of each panoramic image and the first key frame image, the second key frame image, the third key frame image, ..., the n-th key frame image in the panoramic image sequence may be determined in sequence.
  • ORB (Oriented FAST and Rotated BRIEF) features can be understood as the features extracted by the ORB algorithm.
  • step 203 the electronic device performs feature matching on the features of each adjacent key frame image to obtain feature matching pairs in each adjacent key frame image. For example, it is possible to continuously perform feature matching between key frame images, and to screen out reliable matching pairs, which are feature matching pairs, and the matching degree of features in the feature matching pair is greater than the preset matching degree.
  • the preset matching degree can be set according to actual needs and is used to ensure that the matching degree between the features in a feature matching pair is sufficiently high; this embodiment does not specifically limit the value of the preset matching degree.
  • Feature points can be composed of key points and descriptors. Key points refer to the position of feature points in the image, and some feature points also have information such as orientation and size.
  • a descriptor is usually a vector that describes the information of the pixels around the keypoint.
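  • as an illustration of screening reliable matching pairs, binary descriptors (such as BRIEF) can be compared by Hamming distance with a mutual-nearest check. The `max_dist` threshold stands in for the "preset matching degree"; this is an illustrative sketch, not the patent's exact criterion.

```python
def hamming(d1, d2):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(d1 ^ d2).count("1")

def match_features(desc_a, desc_b, max_dist=10):
    """Mutual-nearest-neighbour matching with a distance threshold.

    A smaller Hamming distance means a higher matching degree between
    descriptors; returns (i, j) index pairs of reliable matches.
    """
    matches = []
    for i, da in enumerate(desc_a):
        j = min(range(len(desc_b)), key=lambda k: hamming(da, desc_b[k]))
        # mutual check: the best partner of desc_b[j] must be desc_a[i] again
        i_back = min(range(len(desc_a)), key=lambda k: hamming(desc_b[j], desc_a[k]))
        if i_back == i and hamming(da, desc_b[j]) <= max_dist:
            matches.append((i, j))
    return matches
```

The mutual check is what discards one-sided matches, which is one common way to "screen out reliable matching pairs" between adjacent key frames.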
  • the electronic device calculates the camera pose corresponding to the subsequent key frame image in each adjacent key frame image according to the feature matching pair.
  • the electronic device can use epipolar geometric constraints to solve the inter-frame motion from the reliable matching pairs, i.e., the feature matching pairs, and combine the initialized camera pose to calculate the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images.
  • for example, according to the initialized camera pose corresponding to the first key frame image and the feature matching pairs between the first key frame image and the second key frame image, the camera pose corresponding to the second key frame image adjacent to the first key frame image is calculated; then, according to the camera pose corresponding to the second key frame image and the feature matching pairs between the second key frame image and the third key frame image, the camera pose corresponding to the third key frame image adjacent to the second key frame image is calculated; and so on, the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images is calculated, so as to obtain the camera poses corresponding to all key frame images.
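  • the chaining described above, propagating each newly solved relative motion from the initialized first pose, can be sketched with 4x4 homogeneous transforms. The composition convention used here is an assumption for illustration.

```python
import numpy as np

def make_pose(R, t):
    """Pack a rotation matrix and translation vector into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def chain_poses(T0, relative_motions):
    """Accumulate absolute key-frame poses from the initialized first pose
    and the relative inter-frame motions solved from feature matching pairs.

    Convention (an assumption): T_k = T_rel(k-1 -> k) @ T_(k-1).
    """
    poses = [T0]
    for T_rel in relative_motions:
        poses.append(T_rel @ poses[-1])
    return poses
```

Each relative motion only ever multiplies the previous absolute pose, which is why an error in one inter-frame estimate propagates to all later key frames; bundle adjustment (discussed below in the source text) exists to correct this drift.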
  • the camera pose includes a translation vector T characterizing the camera position and a rotation matrix R characterizing the camera orientation; in step 204, calculating the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images according to the feature matching pairs includes: calculating the essential matrix or fundamental matrix according to the pixel positions of the feature matching pairs; and calculating, according to the essential matrix or fundamental matrix, the translation vector and rotation matrix corresponding to the subsequent key frame image in each pair of adjacent key frame images.
  • T and R can be obtained by decomposing the essential matrix.
  • the essential matrix E has 5 degrees of freedom, so at least five point pairs (that is, five pairs of matched feature points) are required to solve for E.
  • E is equivalent at different scales, i.e., it is defined only up to a scale factor.
  • the classic eight-point method is used to solve E.
  • the eight-point method uses only the linear properties of E, whereas the intrinsic constraints of E are nonlinear.
  • R and T are obtained by decomposing E, which is done via singular value decomposition (SVD).
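  • the SVD decomposition step can be sketched as follows. The four-candidate construction is the standard textbook recipe; the disambiguation step (the cheirality check, which keeps the candidate for which triangulated points lie in front of both cameras) is omitted for brevity.

```python
import numpy as np

def skew(t):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Decompose an essential matrix into the four candidate (R, t) pairs.

    t is recovered only up to scale (consistent with E being scale-equivalent);
    the correct candidate is normally chosen afterwards by the cheirality check.
    """
    U, _, Vt = np.linalg.svd(E)
    # enforce proper rotations (E is defined up to sign, so this is safe)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]          # left null vector of E, i.e. the baseline direction
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

Building E from a known (R, t) and decomposing it recovers the true motion among the four candidates, up to the sign of t.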
  • calculating the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images according to the feature matching pairs in step 204 includes: performing local bundle adjustment according to the feature matching pairs in a preset local area and the prior poses of the camera in the target area, to obtain the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images; or performing global bundle adjustment according to the feature matching pairs in the global area and the prior poses of the camera in the target area, to obtain the camera pose corresponding to the subsequent key frame image in each pair of adjacent key frame images. That is to say, all feature matching pairs are selected when performing global bundle adjustment, while only part of the feature matching pairs are selected when performing local bundle adjustment.
  • the prior poses can be understood as the camera poses corresponding to the key frame images before the i-th key frame image; for example, for the third key frame image, its prior pose can be the camera pose of the second key frame image relative to the first key frame image, and for the fourth key frame image, its prior poses can include the camera pose of the third key frame image relative to the second key frame image and the camera pose of the second key frame image relative to the first key frame image.
  • step 102 the camera pose estimation is performed on the panoramic images in the panoramic image sequence to obtain the camera pose corresponding to the panoramic images.
  • the VO (visual odometry) front end initializes the first key frame image and the camera pose T, continuously performs ORB feature extraction on the equirectangular image sequence, and selects the next key frame image.
  • the camera motion estimation of the SLAM system is performed on the equirectangular image sequence, including loop closure detection, which recognizes when the camera revisits a previously seen place and, according to that position, performs bundle adjustment optimization on the preceding local trajectory.
  • the electronic device may perform SLAM camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera pose corresponding to the panoramic images.
  • the processing flow of SLAM camera pose estimation can include: feature extraction, feature matching, pose estimation, feature tracking, feature re-identification, and global and local bundle adjustment.
  • in feature tracking, the input is the initially extracted features and the output is the features tracked into the next frame image; in feature extraction, the input is a color image and the output is features; in feature re-identification, features that were lost can be found again: the input is the initially extracted features and the camera pose of the previous frame, and the output is the features after re-identification. In global bundle adjustment, global nonlinear optimization can be performed: the input is the set of all feature matches and the output is the pose; in local bundle adjustment, local nonlinear optimization can be performed: the input is the feature matches of a local area and the output is the pose.
  • the camera pose can include a spatial position and an attitude direction; based on the spatial position and attitude direction, as well as the image data, the depth information and spatial positions of most pixels can be deduced in turn.
  • the electronic device cuts the panoramic image to obtain images of multiple orientations corresponding to the panoramic image, where the images of multiple orientations are images of different orientations, such as the six images of the front, back, up, down, left, and right orientations; together, the images of the multiple orientations can form the panoramic image.
  • an algorithm for creating a perspective projection can be used, and the images in multiple directions obtained by cutting can be images taken by virtual monocular cameras in multiple directions.
  • the images in multiple orientations are not limited to the aforementioned six; there may also be seven images in seven orientations, eight images in eight orientations, and so on.
  • the above-mentioned algorithm for creating a perspective projection first considers a virtual camera located at the origin of the coordinate system.
  • the virtual camera can be a virtual monocular camera obtained by cutting a panoramic image.
  • the images in multiple orientations may be images respectively captured by the virtual monocular camera in multiple orientations.
  • This coordinate system is a right-handed coordinate system.
  • the right-handed coordinate system can have a "left" vector pointing in the positive direction of the y-axis, an "up" vector pointing in the positive direction of the z-axis, and a "right" vector pointing in the positive direction of the x-axis.
  • the projection planes corresponding to the vectors of the panoramic image in different directions may be multiple images in different orientations obtained by dividing the panoramic image.
  • the projection of the panorama image on the projection plane corresponding to the "left" vector is the left orientation image obtained by segmentation.
  • step 103 can be implemented in the following manner: according to the camera pose corresponding to the panoramic image, calculate the camera poses corresponding to the images of the multiple orientations to be segmented; the calculated camera poses are then used to cut the panoramic image to obtain the images of the multiple orientations corresponding to the panoramic image, which facilitates accurate cutting of the panoramic image.
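  • the cutting itself can be sketched under the common assumption that each orientation is rendered as a virtual pinhole view of the equirectangular panorama: for each output pixel, a ray is cast through the virtual camera and converted to a panorama coordinate. Pitch is fixed at zero here for brevity; the names and the yaw-only rotation are illustrative.

```python
import math

def cut_view(width, height, fov_deg, yaw_deg, pano_w, pano_h):
    """Build the equirectangular sampling map for one virtual pinhole view.

    Returns, for each output pixel, the (u, v) panorama coordinate to sample.
    yaw_deg selects the orientation (0 = front, 90 = left, 180 = back, ...).
    A real cutter would bilinearly sample the panorama at these coordinates.
    """
    # focal length in pixels from the horizontal field of view
    f = (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    yaw = math.radians(yaw_deg)
    mapping = []
    for j in range(height):
        row = []
        for i in range(width):
            # ray through pixel centre (i, j) in the camera frame (z forward)
            x = i - width / 2.0 + 0.5
            y = height / 2.0 - j - 0.5
            z = f
            # rotate the ray about the vertical axis by the view yaw
            xr = x * math.cos(yaw) + z * math.sin(yaw)
            zr = -x * math.sin(yaw) + z * math.cos(yaw)
            lon = math.atan2(xr, zr)
            lat = math.atan2(y, math.hypot(xr, zr))
            u = (lon + math.pi) / (2.0 * math.pi) * pano_w
            v = (math.pi / 2.0 - lat) / math.pi * pano_h
            row.append((u, v))
        mapping.append(row)
    return mapping
```

Generating six such maps at 90° yaw/pitch steps yields the six-orientation decomposition mentioned earlier; each map behaves exactly like an image "taken by a virtual monocular camera" in that orientation.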
  • step 104 the electronic device performs three-dimensional reconstruction of the target area according to the images of multiple orientations and the camera poses corresponding to the panoramic image.
  • Three-dimensional reconstruction refers to the establishment of a mathematical model suitable for computer representation and processing of three-dimensional objects. It is the basis for processing, operating and analyzing its properties in a computer environment. It is also a key technology for establishing a virtual reality that expresses the objective world in a computer.
  • step 104 may be implemented through the flowchart shown in FIG. 4 , including the following steps.
  • Step 301 Select target pixel points.
  • a target pixel point for which depth needs to be calculated can be selected in a panoramic image or an image obtained by cutting in multiple directions.
  • the number of selected target pixel points may be multiple.
  • Step 302 According to the camera poses corresponding to the images of multiple orientations and the panoramic image, determine the epipolar lines between the images of multiple orientations.
  • Step 303 Traverse each pixel point on the epipolar line, and search for a pixel point matching the target pixel point.
  • the order of traversal may be set according to actual needs, which is not specifically limited in this embodiment.
  • Step 304 Calculate the spatial position of the target pixel according to the pixel matching the target pixel.
  • the actual spatial position of the target pixel can be calculated through triangulation, so that the depth information of the target pixel can be updated according to the calculated actual spatial position.
  • Step 305 Determine the structured reconstruction information of the target area according to the spatial position of the target pixel.
  • the structured reconstruction information can be obtained according to the spatial position information of the target pixels that can characterize the structured features in the target area.
  • Step 306 Using the structured reconstruction information as a reconstruction skeleton, perform 3D reconstruction on the target area.
  • the structural reconstruction information can be used as the reconstruction skeleton, and the point cloud dense reconstruction algorithm can be used to perform 3D reconstruction of the target scene.
  • the process of obtaining the structured reconstruction information can be understood as a sparse reconstruction process, and the process of performing three-dimensional reconstruction on the target area using the structured reconstruction information as a reconstruction skeleton can be understood as a further dense reconstruction process.
  • the panoramic camera to increase the viewing range of the shooting, at the same time, by obtaining the camera pose required in the mapping process, and then combining the camera pose to obtain the structured reconstruction information, and using the structured reconstruction information as the reconstruction skeleton for further densification Reconstruction can greatly improve the stability and expansibility of the mapping process.
  • step 302 may be implemented through a flowchart as shown in FIG. 5 .
  • Step 401 Determine the vector of the line connecting the optical center of the camera and the target pixel in the i-th frame of the images of multiple orientations.
  • Step 402 According to the camera poses corresponding to the panoramic images, determine the translation vectors of the optical centers of the cameras corresponding to the images in multiple orientations.
  • the camera poses corresponding to the images of multiple orientations can be calculated, and the camera poses corresponding to the images of the multiple orientations include translation vectors of the optical centers of the cameras corresponding to the images of the multiple orientations.
  • Step 403 Determine the plane formed by the vector of the connecting line and the translation vector.
  • Step 404 Determine the epipolar line between the images of multiple orientations according to the intersection line of the plane and the (i+n)-th frame image in the images of multiple orientations; wherein i and n are both natural numbers greater than or equal to 1.
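  • steps 401 to 404 describe the epipolar plane construction: the ray through the target pixel and the baseline (translation vector) span a plane, and its trace in the second image is the epipolar line. Algebraically this is the familiar relation l2 = E @ x1; the sketch below assumes normalized image coordinates and illustrative names.

```python
import numpy as np

def epipolar_line(x1, R, t):
    """Epipolar line in the second (normalized) image for target pixel x1.

    The ray through x1 and the baseline t span the epipolar plane
    (steps 401-403); intersecting it with the second image plane
    (step 404) gives the line, returned as homogeneous coefficients,
    i.e. l2 = E @ x1 with E = [t]_x R.
    """
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])
    E = tx @ R
    x1_h = np.array([x1[0], x1[1], 1.0])
    return E @ x1_h

def on_line(line, x2, tol=1e-9):
    """A candidate pixel can match the target pixel only if it lies on the line."""
    x2_h = np.array([x2[0], x2[1], 1.0])
    return abs(line @ x2_h) < tol
```

Restricting the search of step 303 to pixels satisfying `on_line` is what turns a 2D correspondence search into a 1D traversal along the epipolar line.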
  • the panoramic camera is used to shoot the target area with a large angle of view, and the panoramic images are cut before three-dimensional reconstruction; the 360° panoramic view provides great convenience for the entire system to initially evaluate the target area, and the information it provides is highly reliable and abundant.
  • the 3D reconstruction method proposed in this embodiment can significantly improve the effect and success rate of visual mapping, enhance the practicality of the system, and avoid areas that cannot be reconstructed during the mapping process.
  • SLAM camera pose calculation can be combined in some complex scene areas, so that 3D reconstruction can be performed in scenes with complex structures or fewer features according to the camera poses calculated by SLAM, thereby greatly improving the system's applicability in different scenarios.
  • the panoramic camera used in this embodiment has low cost, which can effectively reduce the cost.
  • Step 104, performing three-dimensional reconstruction of the target area according to the images of multiple orientations and the camera poses corresponding to the panoramic images, may include: extracting the camera poses corresponding to the key frame images, and performing 3D reconstruction of the target area according to the images of multiple orientations and the camera poses corresponding to the key frame images.
  • the pose can be estimated for each panoramic image; after estimation, if the image is judged to be of good quality, that is, it contains key features, it is regarded as a key frame image.
  • if the current frame image and the previous frame image have a certain degree of overlap, that is, the camera pose corresponding to the current frame image is similar to, but not identical to, that of the previous frame image, and new environmental features are present at the same time, the current frame image can be determined to be a key frame image.
  • the key frame images are screened out in this way; compared with ordinary frame images, key frame images provide more effective and valuable information for 3D reconstruction, so performing 3D reconstruction with the camera poses corresponding to the key frame images achieves effective reconstruction while reducing the processing burden on the electronic device.
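  • a toy illustration of such screening is given below. The thresholds and the 2D-position simplification are assumptions for illustration only; a real system would also compare rotation and the overlap of tracked features.

```python
import math

def select_keyframes(poses, min_motion=0.5, max_motion=5.0):
    """Hypothetical key-frame screening on (x, y) camera positions.

    Keep a frame when it has moved far enough from the last key frame
    (so it carries new environmental features) but not so far that it
    no longer overlaps it (so matching against it is still possible).
    """
    keyframes = [0]                      # the first frame is always a key frame
    for i, (x, y) in enumerate(poses[1:], start=1):
        kx, ky = poses[keyframes[-1]]
        d = math.hypot(x - kx, y - ky)
        if min_motion <= d <= max_motion:
            keyframes.append(i)
    return keyframes
```

Frames that barely move relative to the last key frame are skipped, which is precisely how key-frame screening reduces the processing burden without losing coverage of the scene.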
  • the step division of the above methods is only for clarity of description; during implementation, steps may be combined into one step, or a step may be split into multiple steps, and as long as the same logical relationship is included, they are all within the protection scope of this patent. Adding insignificant modifications to, or introducing insignificant designs into, the algorithm or process without changing its core design is also within the protection scope of this patent.
  • An embodiment of the present application provides an electronic device, as shown in FIG. 6, including: at least one processor 501; and a memory 502 communicatively connected to the at least one processor 501; wherein the memory 502 stores instructions executable by the at least one processor 501, and the instructions are executed by the at least one processor 501, so that the at least one processor 501 can execute the above three-dimensional reconstruction method.
  • the electronic device may further include a panoramic camera 503 communicatively connected to at least one processor 501, and the panoramic camera 503 is configured to photograph a target area to obtain a panoramic image.
  • the processor 501 is connected to the panoramic camera 503, and can control the panoramic camera 503 to photograph the target area. After the panoramic image of the target area is captured by the panoramic camera 503, it can be sent to the processor 501 for the processor 501 to carry out subsequent operations according to the panoramic image. 3D reconstruction process.
  • the memory 502 and the processor 501 are connected by a bus, and the bus may include any number of interconnected buses and bridges, and the bus connects one or more processors 501 and various circuits of the memory 502 together.
  • the bus may also connect together various other circuits such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and therefore will not be further described herein.
  • the bus interface provides an interface between the bus and the transceivers.
  • a transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing means for communicating with various other devices over a transmission medium.
  • the data processed by the processor 501 is transmitted on the wireless medium through the antenna, and further, the antenna also receives the data and transmits the data to the processor 501 .
  • Processor 501 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interface, voltage regulation, power management and other control functions. And the memory 502 may be used to store data used by the processor 501 when performing operations.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program.
  • The above method embodiments are implemented when the computer program is executed by a processor.
  • A storage medium includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • An embodiment of the present application also provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to execute the method in any of the above method embodiments.


Abstract

Embodiments of the present application relate to the technical field of image processing. Disclosed are a three-dimensional reconstruction method, an electronic device, and a computer-readable storage medium. The three-dimensional reconstruction method comprises: acquiring a panoramic image sequence obtained by photographing a target area; performing camera pose estimation on a panoramic image in the panoramic image sequence to obtain a camera pose corresponding to the panoramic image; cutting the panoramic image to obtain images in multiple directions corresponding to the panoramic image; and performing three-dimensional reconstruction on the target area according to the images in the multiple directions and the camera pose corresponding to the panoramic image.

Description

Three-Dimensional Reconstruction Method, Electronic Device, and Computer-Readable Storage Medium

Technical Field
The present application relates to the technical field of image processing, and in particular to a three-dimensional reconstruction method, an electronic device, and a computer-readable storage medium.
Background
With the advancement of computer technology in recent years, AR and VR technologies have gradually become one of the hot research areas. In the field of entertainment, and especially in the large-bandwidth, low-latency scenarios enabled by 5G communication, various excellent applications have begun to appear. How to effectively reconstruct the scene of a target area is one of the most important directions. At the same time, as entertainment scenes grow larger and entertainment content becomes broader, higher requirements are placed on the size and accuracy of the reconstructed area.
In some embodiments, purely visual three-dimensional reconstruction is built from discretely distributed survey-area images captured in different viewing directions. However, because the mapping results are unstable, the area that can be reconstructed is small, and the viewing angle of a traditional monocular camera is narrow, the reconstruction quality is poor.
Summary
The main purpose of the embodiments of the present application is to provide a three-dimensional reconstruction method, an electronic device, and a computer-readable storage medium, so that the quality and success rate of three-dimensional reconstruction can be improved.
To at least achieve the above purpose, an embodiment of the present application provides a three-dimensional reconstruction method, including: acquiring a panoramic image sequence obtained by photographing a target area; performing camera pose estimation on a panoramic image in the panoramic image sequence to obtain a camera pose corresponding to the panoramic image; cutting the panoramic image to obtain images in multiple orientations corresponding to the panoramic image; and performing three-dimensional reconstruction on the target area according to the images in the multiple orientations and the camera pose corresponding to the panoramic image.
To at least achieve the above purpose, an embodiment of the present application further provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the above three-dimensional reconstruction method.
To at least achieve the above purpose, an embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the above three-dimensional reconstruction method.
Brief Description of the Drawings
FIG. 1 is a flowchart of the three-dimensional reconstruction method mentioned in an embodiment of the present application;

FIG. 2 is a flowchart of an implementation of step 102 mentioned in an embodiment of the present application;

FIG. 3 is a schematic diagram of the coordinate system involved in cutting the panoramic image mentioned in an embodiment of the present application;

FIG. 4 is a flowchart of an implementation of step 104 mentioned in an embodiment of the present application;

FIG. 5 is a flowchart of an implementation of step 302 mentioned in an embodiment of the present application;

FIG. 6 is a schematic structural diagram of the electronic device mentioned in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments of the present application are described in detail below with reference to the accompanying drawings. However, those of ordinary skill in the art will understand that many technical details are given in each embodiment to help the reader better understand the present application. Even without these technical details, and with various changes and modifications based on the following embodiments, the technical solutions claimed in the present application can still be realized. The division into the following embodiments is for convenience of description and should not constitute any limitation on the specific implementation of the present application; the embodiments may be combined with and refer to one another provided there is no contradiction.
An embodiment of the present application provides a three-dimensional reconstruction method applied to an electronic device. A flowchart of the three-dimensional reconstruction method in this embodiment is shown in FIG. 1 and includes the following steps.
Step 101: Acquire a panoramic image sequence obtained by photographing a target area.
Step 102: Perform camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera poses corresponding to the panoramic images.
Step 103: Cut the panoramic image to obtain images in multiple orientations corresponding to the panoramic image.
Step 104: Perform three-dimensional reconstruction on the target area according to the images in the multiple orientations and the camera poses corresponding to the panoramic images.
By shooting panoramic images, the embodiments of the present application help enlarge the visible range of the target area; a 360° panoramic view greatly facilitates the reconstruction of the target area. At the same time, the camera pose required in the reconstruction process is obtained by performing camera pose estimation on the panoramic image, and, in combination with the camera pose, the panoramic image is cut into images in multiple orientations, which helps reflect the characteristics of the target area more comprehensively from multiple directions. Finally, three-dimensional reconstruction is performed by combining the camera poses with the multi-orientation images obtained by cutting, which helps improve the quality and success rate of the three-dimensional reconstruction.
The three-dimensional reconstruction method provided in the embodiments of the present application acquires a panoramic image sequence obtained by photographing a target area; performs camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera poses corresponding to the panoramic images; cuts the panoramic image to obtain images in multiple orientations corresponding to the panoramic image; and performs three-dimensional reconstruction on the target area according to the images in the multiple orientations and the camera poses corresponding to the panoramic images. As noted, the panoramic view enlarges the visible range of the target area, the multi-orientation images reflect the target area more comprehensively, and combining them with the camera poses helps improve the quality and success rate of the reconstruction.
In step 101, the target area may be photographed by a panoramic camera to obtain a panoramic image sequence, where a panoramic image sequence may be understood as several consecutive frames of panoramic images. After capturing the panoramic image sequence, the panoramic camera may send it to the electronic device, so that the electronic device acquires the panoramic image sequence photographed of the target area for subsequent processing. Alternatively, the electronic device is equipped with a panoramic camera, so that the electronic device can acquire the panoramic image sequence directly through its built-in panoramic camera. When photographing the target area with a panoramic camera, the camera should be kept as stable as possible; because the viewing angle of a panoramic camera is extremely large, no additional compensation or avoidance operation is needed when an obstacle or a pedestrian partially blocks the camera at some moment. In this embodiment, continuous large-view images are in effect used for three-dimensional reconstruction, which helps improve the stability and accuracy of the reconstruction.
In one embodiment, the panoramic camera includes two fisheye lenses and an image stitching unit. The two fisheye lenses share the same center position and face opposite directions, and each fisheye lens has a 180° viewing angle. The image stitching unit is configured to detect and extract the features and key points of the two images captured by the two fisheye lenses, match the descriptors between the two images, and then estimate a homography matrix from the feature vectors matched by the RANSAC algorithm, thereby stitching the two images to obtain a visual panorama of the target area, that is, the panoramic image.
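The stitching step described above hinges on RANSAC's consensus test: a candidate homography is scored by how many matched feature pairs it maps to within a pixel threshold. The following pure-Python sketch illustrates only that scoring step, not the patent's implementation; the homography and point pairs are hypothetical.

```python
import math

def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography (row-major nested lists)."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)

def count_inliers(H, matches, thresh=2.0):
    """RANSAC consensus step: count matches whose reprojection error
    under H is below `thresh` pixels."""
    inliers = 0
    for (p, q) in matches:
        u, v = apply_homography(H, p)
        if math.hypot(u - q[0], v - q[1]) < thresh:
            inliers += 1
    return inliers

# Hypothetical candidate: a pure translation by (10, 5) pixels.
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, 5.0],
     [0.0, 0.0, 1.0]]
matches = [((0.0, 0.0), (10.0, 5.0)),   # consistent with H
           ((3.0, 4.0), (13.0, 9.0)),   # consistent with H
           ((1.0, 1.0), (50.0, 50.0))]  # outlier
print(count_inliers(H, matches))  # 2
```

RANSAC repeats this scoring over many candidate homographies estimated from minimal point samples and keeps the one with the largest inlier set.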
In one embodiment, the panoramic camera may be a helmet-mounted panoramic camera, that is, a panoramic camera fixed on a helmet through a connector. In another example, the panoramic camera may be a handle-mounted panoramic camera, that is, a panoramic camera fixed on a handle through a connector. In a specific implementation, when the helmet-mounted panoramic camera is fixed, the two lenses should face the two sides, and the upper and lower edges of the camera should be kept as parallel to the ground as possible.
In one embodiment, regardless of how the panoramic camera is fixed, the following should be observed: keep one lens facing left and the other facing right; minimize shaking during shooting; keep the camera height consistent during shooting; keep the vertical line through the camera center as close as possible to the vertical line through the body's center of gravity; and keep the camera edge parallel to the ground, so that the bottom edge of the video is roughly parallel to the ground without obvious tilt.
In step 102, the electronic device performs camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera poses corresponding to the panoramic images.
In one embodiment, the electronic device may convert the panoramic images in the panoramic image sequence into equirectangular images with an aspect ratio of 2:1, and perform simultaneous localization and mapping (SLAM) based on the equirectangular images to obtain the camera poses corresponding to the panoramic images. For example, the converted equirectangular images may be fed into a SLAM system to obtain the camera pose corresponding to each equirectangular image in the SLAM system, which helps satisfy the size requirements the SLAM system imposes on the images used for camera pose computation.
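The 2:1 equirectangular format mentioned above maps longitude linearly to the horizontal axis and latitude linearly to the vertical axis. A minimal sketch of that pixel mapping follows; the function name and image width are illustrative, not from the source.

```python
import math

def sphere_to_equirect(lon, lat, width):
    """Map a viewing direction (longitude in [-pi, pi], latitude in
    [-pi/2, pi/2]) to pixel coordinates in a width x (width/2)
    equirectangular image, i.e. an aspect ratio of 2:1."""
    height = width // 2
    u = (lon / (2.0 * math.pi) + 0.5) * width   # longitude -> column
    v = (0.5 - lat / math.pi) * height          # latitude  -> row
    return (u, v)

# The image centre corresponds to lon = 0, lat = 0; the left edge to
# lon = -pi; the top row to lat = +pi/2 (straight up).
print(sphere_to_equirect(0.0, 0.0, 2048))  # (1024.0, 512.0)
```

Because every direction on the sphere gets exactly one pixel, this format is convenient both as SLAM input and as the source for the perspective cuts described later.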
In one embodiment, step 102 may be implemented through the flowchart shown in FIG. 2.
Step 201: Initialize the first key frame image in the panoramic image sequence.

Step 202: Extract the features of each panoramic image in the panoramic image sequence, and determine the key frame images other than the first key frame image in the panoramic image sequence according to the features of each panoramic image and the first key frame image.

Step 203: Perform feature matching on the features of each pair of adjacent key frame images to obtain feature matching pairs in the adjacent key frame images.

Step 204: Calculate, according to the feature matching pairs, the camera pose corresponding to the latter key frame image in each pair of adjacent key frame images.
In this embodiment, feature matching between adjacent key frame images in the panoramic image sequence yields the feature matching pairs in the adjacent key frame images, which helps accurately calculate the camera pose corresponding to the latter key frame image in each pair of adjacent key frame images.
In step 201, the electronic device may initialize the first frame image in the panoramic image sequence as the first key frame image, although this is not a limitation. After the first key frame image is initialized, the camera pose may be initialized, that is, the camera pose corresponding to the first key frame image is initialized.
In step 202, the electronic device may extract the features of each panoramic image in the panoramic image sequence. The features may be ORB features; ORB features are highly representative image features that remedy the lack of directionality of the FAST detector and use the extremely fast binary descriptor BRIEF, which greatly accelerates the feature extraction stage. The key frame images other than the first key frame image in the panoramic image sequence are determined according to the ORB features of each panoramic image and the first key frame image; that is, the second key frame image, the third key frame image, ..., and the n-th key frame image in the panoramic image sequence may be determined in sequence. Specifically, adjacent key frame images satisfy the following relationship: compared with the preceding key frame image, the latter key frame image shares some of the same ORB features while also presenting some new ORB features. ORB (Oriented FAST and Rotated BRIEF) is an algorithm for fast feature point extraction and description, and ORB features may be understood as the features extracted by the ORB algorithm.
In step 203, the electronic device performs feature matching on the features of each pair of adjacent key frame images to obtain the feature matching pairs in the adjacent key frame images. For example, feature matching between key frame images may be performed continuously, and reliable matching pairs may be screened out; these reliable matching pairs are the feature matching pairs, and the matching degree between the features in a feature matching pair is greater than a preset matching degree. The preset matching degree may be set according to actual needs and characterizes a high matching degree between the features in a feature matching pair; this embodiment does not specifically limit its value.
The simplest feature matching method is brute-force matching, that is, for every feature point, measuring the descriptor distance to all feature points, sorting the distances, and taking the nearest one as the matching point. The descriptor distance expresses the degree of similarity between two features. However, when there are many feature points, the computational load of brute-force matching becomes very large; in that case, the fast approximate nearest-neighbor algorithm is better suited to situations with an extremely large number of matching points. A feature point may consist of a key point and a descriptor: the key point refers to the position of the feature point in the image, and some feature points also carry orientation, scale, and other information. The descriptor is usually a vector describing the pixels around the key point.
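For binary descriptors such as BRIEF, the descriptor distance used in brute-force matching is the Hamming distance. A minimal pure-Python sketch of the idea follows; the 8-bit descriptors are hypothetical (real BRIEF descriptors are typically 256-bit), and the distance threshold is illustrative.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def brute_force_match(desc1, desc2, max_dist=64):
    """For each descriptor in desc1, find the nearest descriptor in desc2
    by Hamming distance; keep the pair only if it is close enough."""
    matches = []
    for i, d1 in enumerate(desc1):
        best_j, best_d = min(
            ((j, hamming(d1, d2)) for j, d2 in enumerate(desc2)),
            key=lambda t: t[1])
        if best_d <= max_dist:
            matches.append((i, best_j, best_d))
    return matches

# Hypothetical 8-bit descriptors from two key frame images.
desc1 = [0b10110010, 0b00001111]
desc2 = [0b10110011, 0b11110000, 0b00001110]
print(brute_force_match(desc1, desc2, max_dist=2))
# [(0, 0, 1), (1, 2, 1)]
```

This is the O(n·m) scan whose cost motivates approximate nearest-neighbor indices when the feature count is large.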
In step 204, the electronic device calculates, according to the feature matching pairs, the camera pose corresponding to the latter key frame image in each pair of adjacent key frame images. Specifically, the electronic device may use the reliable matching pairs, that is, the feature matching pairs, to solve the inter-frame motion with epipolar geometry constraints and, combined with the initialized camera pose, calculate the camera pose corresponding to the latter key frame image. For example, according to the initialized camera pose corresponding to the first key frame image and the feature matching pairs between the first and second key frame images, the camera pose corresponding to the second key frame image is calculated; then, according to the camera pose corresponding to the second key frame image and the feature matching pairs between the second and third key frame images, the camera pose corresponding to the third key frame image is calculated; and so on, until the camera poses corresponding to all key frame images are obtained.
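The chaining described above composes each relative motion with the previous global pose: if the previous keyframe has global pose (R, t) and the inter-frame motion is (dR, dt), the new global pose is (R·dR, R·dt + t). A pure-Python sketch under a hypothetical planar trajectory (not the patent's implementation):

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose(pose, rel):
    """Chain a global pose (R, t) with a relative pose (dR, dt)."""
    R, t = pose
    dR, dt = rel
    moved = mat_vec(R, dt)  # express the relative step in global axes
    return (mat_mul(R, dR), [moved[i] + t[i] for i in range(3)])

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical trajectory: start at the origin; between keyframes, move
# 1 m forward along x and turn 90 degrees left, twice in a row.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pose = (identity, [0.0, 0.0, 0.0])
pose = compose(pose, (rot_z(math.pi / 2), [1.0, 0.0, 0.0]))
pose = compose(pose, (rot_z(math.pi / 2), [1.0, 0.0, 0.0]))
print([round(x, 6) for x in pose[1]])  # [1.0, 1.0, 0.0]
```

Because each pose is built on the previous one, small per-step errors accumulate, which is why the loop closure and bundle adjustment described later are needed.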
In one embodiment, the camera pose includes a translation vector T characterizing the camera position and a rotation matrix R characterizing the camera orientation. Calculating, in step 204, the camera pose corresponding to the latter key frame image according to the feature matching pairs includes: calculating an essential matrix or a fundamental matrix according to the pixel positions of the feature matching pairs; and calculating, according to the essential matrix or the fundamental matrix, the translation vector and rotation matrix corresponding to the latter key frame image in each pair of adjacent key frame images.
The following describes calculating the translation vector and rotation matrix from the essential matrix. For example, T and R can be obtained by decomposing the essential matrix. The essential matrix E has 5 degrees of freedom, so at least 5 point pairs (that is, 5 pairs of feature points) are needed to solve for E. E is equivalent up to scale; the classic eight-point algorithm is generally used to solve for E, and it exploits only the linear properties of E, whereas the intrinsic constraints of E are nonlinear. R and T are recovered from E by singular value decomposition (SVD).
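The solvers discussed here all enforce the epipolar constraint x2^T E x1 = 0 with E = [t]x R, where x1 and x2 are normalized image coordinates of the same 3D point in the two views. The pure-Python sketch below only verifies that constraint on a hypothetical motion and point; it is not the five-point/eight-point solver itself.

```python
def skew(t):
    """The cross-product (skew-symmetric) matrix [t]x."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def epipolar_residual(E, x1, x2):
    """x2^T E x1; zero for a correct correspondence."""
    Ex1 = mat_vec(E, x1)
    return sum(x2[i] * Ex1[i] for i in range(3))

# Hypothetical motion: no rotation, translation of 1 along x.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]
E = mat_mul(skew(t), R)  # essential matrix E = [t]x R

# A 3D point X = (0.5, 0.2, 4.0), expressed in normalized image
# coordinates x = (X/Z, Y/Z, 1) in each view; the second camera sits at t.
X = [0.5, 0.2, 4.0]
x1 = [X[0] / X[2], X[1] / X[2], 1.0]
x2 = [(X[0] - t[0]) / X[2], X[1] / X[2], 1.0]
print(abs(epipolar_residual(E, x1, x2)) < 1e-12)  # True
```

In practice E is estimated from many such noisy correspondences (with RANSAC), and the SVD of E then yields the four (R, t) candidates, of which the one placing the points in front of both cameras is kept.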
In one embodiment, calculating, in step 204, the camera pose corresponding to the latter key frame image according to the feature matching pairs includes: performing local bundle adjustment according to the feature matching pairs in a preset local region and the prior poses of the camera in the target area to obtain the camera pose corresponding to the latter key frame image in each pair of adjacent key frame images; or performing global bundle adjustment according to the feature matching pairs in the global region and the prior poses of the camera in the target area to obtain the camera pose corresponding to the latter key frame image. That is, all feature matching pairs are selected when performing global bundle adjustment, and a subset of feature matching pairs is selected when performing local bundle adjustment. For the i-th key frame image, the prior poses may be understood as the camera poses corresponding to the key frame images before the i-th key frame image. For example, for the third key frame image, the prior pose may be the camera pose of the second key frame image relative to the first; for the fourth key frame image, the prior poses may include the camera pose of the third key frame image relative to the second and the camera pose of the second key frame image relative to the first. When calculating the camera pose corresponding to the third key frame image, the pose of the third key frame image relative to the second may first be obtained from the feature matching pairs between them, and then local or global bundle adjustment may be performed according to this relative pose and the prior poses of the third key frame image, so as to obtain a reliable camera pose for the third key frame image. In this embodiment, performing local or global bundle adjustment according to the prior poses and the feature matching pairs helps obtain more reliable and more accurate camera poses.
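Bundle adjustment, as used above, refines poses by minimizing reprojection error over a set of matches. The toy sketch below refines a single translation parameter by gradient descent on a 1-D pinhole model; it only illustrates the least-squares idea behind bundle adjustment, not the actual local/global optimization in the SLAM system, and all landmarks and observations are hypothetical.

```python
def reprojection_error(t, points, observations):
    """Sum of squared residuals between predicted and observed image
    coordinates for a camera shifted by t along x.
    Projection model: u = (x - t) / z (pinhole, unit focal length)."""
    return sum(((x - t) / z - u) ** 2
               for (x, z), u in zip(points, observations))

def refine_translation(points, observations, t0=0.0, iters=200, step=0.5):
    """Minimal bundle-adjustment flavour: refine one pose parameter by
    gradient descent on the reprojection error."""
    t = t0
    for _ in range(iters):
        grad = sum(2.0 * ((x - t) / z - u) * (-1.0 / z)
                   for (x, z), u in zip(points, observations))
        t -= step * grad
    return t

# Hypothetical landmarks (x, z) and observations generated with the
# true translation t = 1.5, so the optimum is known.
points = [(2.0, 4.0), (-1.0, 5.0), (0.5, 3.0)]
true_t = 1.5
obs = [(x - true_t) / z for (x, z) in points]
t_hat = refine_translation(points, obs)
print(round(t_hat, 6))  # 1.5
```

Real bundle adjustment jointly optimizes all camera poses and 3D point positions (thousands of parameters) with Gauss-Newton or Levenberg-Marquardt rather than plain gradient descent, but the objective has the same shape.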
In one embodiment, performing camera pose estimation on the panoramic images in the panoramic image sequence in step 102 to obtain the camera poses corresponding to the panoramic images may be implemented as follows.
(1) Convert the panoramic images in the panoramic image sequence into equirectangular images with an aspect ratio of 2:1 to obtain an equirectangular image sequence.
(2) Initialize on the provided equirectangular image sequence; if initialization is not complete, continue initializing; once initialization is complete, enter the subsequent visual odometry (VO) stage.
(3) The VO stage initializes the first key frame image and the camera pose T, continuously performs ORB feature extraction on the equirectangular image sequence, and selects the next key frame image.
(4) Continuously perform feature matching between key frame images, screen out reliable feature matching pairs, solve the inter-frame motion using epipolar geometry constraints, and calculate the camera pose corresponding to the next key frame image.
(5) Perform camera motion estimation of the SLAM system on the equirectangular image sequence, including loop closure detection, so as to obtain the pose the camera had when it previously reached the same place, and perform bundle adjustment optimization on the previous local trajectory according to that position.
(6) Calculate the translation vector t and rotation matrix R between adjacent key frame images; t and R can be obtained by decomposing the essential matrix E. When there is only rotation and no translation between adjacent key frame images, the epipolar constraint between the two views does not hold and the fundamental matrix F is a zero matrix; in this case, the homography matrix H can be decomposed to obtain the rotation matrix R.
In one embodiment, in step 102, the electronic device may perform SLAM camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera pose corresponding to each panoramic image. The processing flow of SLAM camera pose estimation may include: feature extraction, feature matching, pose estimation, feature tracking, feature re-identification, global bundle adjustment, and local bundle adjustment. In feature extraction, the input is a color image and the output is a set of features. In feature tracking, the input is the initially extracted features and the output is the features tracked into the next frame image. In feature re-identification, previously extracted features are searched for again; the input is the initially extracted features together with the camera poses of earlier frames, and the output is the re-identified features. In global bundle adjustment, a global nonlinear optimization is performed; the input is the set of all feature matches and the output is the poses. In local bundle adjustment, a local nonlinear optimization is performed; the input is the set of feature matches within a local region and the output is the poses. Through the SLAM camera pose calculation of this embodiment, the camera pose of each panoramic image can be estimated efficiently and relatively reliably. The camera pose may include a spatial position and an orientation, and from the spatial position, the orientation, and the image data, the depth information and spatial positions of most pixels can in turn be deduced.
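The feature-matching stage of the pipeline above can be sketched as a brute-force, mutual-nearest-neighbour matcher (a hedged toy version on float descriptors; a real SLAM front end would typically use binary descriptors with Hamming distance plus a ratio test):

```python
import numpy as np

def mutual_nn_matches(desc1, desc2):
    """Return index pairs (i, j) where desc1[i] and desc2[j] are
    each other's nearest neighbours in Euclidean distance."""
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    nn12 = d.argmin(axis=1)   # best match in image 2 for each feature of image 1
    nn21 = d.argmin(axis=0)   # best match in image 1 for each feature of image 2
    return [(i, j) for i, j in enumerate(nn12) if nn21[j] == i]

# Toy demo: image-2 descriptors are a shuffled, slightly noisy copy of image-1's.
rng = np.random.default_rng(0)
desc1 = rng.normal(size=(50, 32))
perm = rng.permutation(50)
desc2 = desc1[perm] + 0.01 * rng.normal(size=(50, 32))
matches = mutual_nn_matches(desc1, desc2)
```

The mutual check discards many one-sided false matches, which keeps the subsequent pose estimation better conditioned.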
In step 103, the electronic device cuts the panoramic image to obtain images of multiple orientations corresponding to the panoramic image. Images of multiple orientations are images of different orientations, for example, six images in the six orientations front, back, up, down, left, and right; together, the images of the multiple orientations can compose one panoramic image. The panoramic image may be cut with an algorithm that creates perspective projections, and the resulting images of the multiple orientations can be regarded as images captured by virtual monocular cameras in the respective orientations. In a specific implementation, however, the images of multiple orientations are not limited to the six-orientation example above; they may also be seven images in seven orientations, eight images in eight orientations, and so on.
In one embodiment, referring to the coordinate system of FIG. 3, the above algorithm for creating perspective projections first considers a virtual camera located at the origin of that coordinate system. The virtual camera may be a virtual monocular camera, and the images of the multiple orientations obtained by cutting the panoramic image may be the images captured by the virtual monocular camera in the respective orientations. The coordinate system is right-handed: the "left" vector points along the positive y-axis, the "up" vector along the positive z-axis, and the "right" vector along the positive x-axis; not shown in FIG. 3 are the "front" vector along the negative y-axis, the "down" vector along the negative z-axis, and the "back" vector along the negative x-axis. The projections of the panoramic image onto the projection planes corresponding to the vectors of the different directions are the images of the multiple orientations obtained by dividing the panoramic image. For example, in the coordinate system of FIG. 3, the projection of the panoramic image onto the projection plane corresponding to the "left" vector is the left-orientation image obtained by the division.
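A minimal NumPy sketch of cutting one such perspective view out of an equirectangular panorama. The face size, the 90° field of view, and the longitude/latitude axis conventions here are illustrative assumptions, not the patent's exact algorithm:

```python
import numpy as np

def cube_face(pano, face_size, forward, up):
    """Sample one 90-degree-FOV face of an equirectangular panorama
    for a virtual pinhole camera at the origin looking along `forward`."""
    forward = forward / np.linalg.norm(forward)
    up = up / np.linalg.norm(up)
    right = np.cross(forward, up)
    # Pixel grid of the face, mapped to [-1, 1) in both axes.
    u = (np.arange(face_size) + 0.5) / face_size * 2.0 - 1.0
    xs, ys = np.meshgrid(u, u)
    dirs = (forward[None, None, :]
            + xs[..., None] * right[None, None, :]
            - ys[..., None] * up[None, None, :])
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    # Assumed convention: longitude from the x/y components, latitude from z.
    lon = np.arctan2(dirs[..., 0], dirs[..., 1])
    lat = np.arcsin(np.clip(dirs[..., 2], -1.0, 1.0))
    h, w = pano.shape[:2]
    px = (((lon / (2.0 * np.pi)) + 0.5) * w).astype(int) % w
    py = np.clip(((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
    return pano[py, px]     # nearest-neighbour sampling for brevity

# Toy demo on a synthetic 2:1 panorama, looking along the positive y-axis.
pano = np.zeros((128, 256, 3), dtype=np.uint8)
face = cube_face(pano, 64, forward=np.array([0.0, 1.0, 0.0]),
                 up=np.array([0.0, 0.0, 1.0]))
```

A production implementation would use bilinear interpolation and would derive `forward`/`up` for each of the six (or more) orientations from the panoramic camera pose.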
In one embodiment, step 103 may be implemented as follows: according to the camera pose corresponding to the panoramic image, compute the camera poses corresponding to the images of the multiple orientations to be obtained by the division; then, according to those camera poses, cut the panoramic image to obtain the images of the multiple orientations corresponding to the panoramic image. This is conducive to cutting the panoramic image accurately.
In step 104, the electronic device performs three-dimensional reconstruction of the target area according to the images of the multiple orientations and the camera poses corresponding to the panoramic images. Three-dimensional reconstruction refers to establishing, for a three-dimensional object, a mathematical model suitable for computer representation and processing; it is the basis for processing and operating on the object and analyzing its properties in a computer environment, and it is also a key technology for building, in a computer, a virtual reality that expresses the objective world.
In one embodiment, step 104 may be implemented through the flowchart shown in FIG. 4, which includes the following steps.
Step 301: select target pixel points.
For example, the target pixel points whose depths need to be computed may be selected in the panoramic image or in the images of the multiple orientations obtained by the cutting. Multiple target pixel points may be selected.
Step 302: according to the images of the multiple orientations and the camera poses corresponding to the panoramic image, determine the epipolar lines between the images of the multiple orientations.
Step 303: traverse the pixel points on the epipolar line and search for a pixel point that matches the target pixel point.
The order of traversal may be set according to actual needs, which is not specifically limited in this embodiment.
Step 304: compute the spatial position of the target pixel point according to the pixel point that matches it.
For example, the actual spatial position of the target pixel point can be computed by triangulation, and the depth information of the target pixel point can then be updated according to the computed spatial position.
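The triangulation in step 304 can be sketched with the standard linear (DLT) method, assuming normalized camera coordinates and two known 3x4 projection matrices (illustrative, not the patent's specific formulation):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen at x1 in view 1 and
    x2 in view 2, with 3x4 projection matrices P1, P2."""
    # Each observation contributes two rows of the homogeneous system A X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # null vector of A, homogeneous 3-D point
    return X[:3] / X[3]

# Demo: a known 3D point projected into two views and recovered.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])                   # reference view
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])   # shifted view
X_true = np.array([0.5, 0.2, 4.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_rec = triangulate(P1, P2, x1, x2)
```

The recovered point's z-coordinate is the depth that the method would write back for the target pixel.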
Step 305: determine the structured reconstruction information of the target area according to the spatial positions of the target pixel points.
The structured reconstruction information can be obtained from the spatial position information of the target pixel points that characterize the structural features of the target area.
Step 306: using the structured reconstruction information as a reconstruction skeleton, perform three-dimensional reconstruction on the target area.
For example, the structured reconstruction information can serve as the reconstruction skeleton, and a dense point-cloud reconstruction algorithm can then be used to reconstruct the target scene in three dimensions.
In this embodiment, the process of obtaining the structured reconstruction information can be understood as sparse reconstruction, and the process of reconstructing the target area in three dimensions with the structured reconstruction information as the skeleton can be understood as a further dense reconstruction. By using a panoramic camera to enlarge the visible range of the capture, obtaining the camera poses required during mapping, combining those poses to derive the structured reconstruction information, and then using that information as the skeleton for further dense reconstruction, the stability and extensibility of the mapping process can be greatly improved.
In one embodiment, step 302 may be implemented through the flowchart shown in FIG. 5.
Step 401: determine the vector of the line connecting the camera optical center of the i-th frame image among the images of the multiple orientations and the target pixel point.
Step 402: according to the camera pose corresponding to the panoramic image, determine the translation vectors of the camera optical centers corresponding to the images of the multiple orientations.
The camera poses corresponding to the images of the multiple orientations can be computed from the camera pose corresponding to the panoramic camera, and those poses include the translation vectors of the corresponding camera optical centers.
Step 403: determine the plane formed by the vector of the connecting line and the translation vector.
Step 404: determine the epipolar line between the images of the multiple orientations according to the intersection of the plane with the (i+n)-th frame image among the images of the multiple orientations, where i and n are both natural numbers greater than or equal to 1.
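Steps 401 to 404 can be sketched in NumPy: the ray through the target pixel and the baseline between the two optical centers span the epipolar plane, and the plane's normal, expressed in the second camera's frame, is the epipolar line (algebraically the same as l = E x1 with E = [t]x R; the pose conventions used here are an assumption):

```python
import numpy as np

def epipolar_line(R, t, x1):
    """Epipolar line l in view i+n for the pixel ray x1 (homogeneous,
    normalized coordinates) in view i, assuming X2 = R X1 + t."""
    n = np.cross(t, R @ x1)       # normal of the epipolar plane == E @ x1
    return n / np.linalg.norm(n[:2])

# Demo: any point along the ray through x1 projects onto the line.
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0,       1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.0, 0.0])
x1 = np.array([0.2, -0.1, 1.0])
l = epipolar_line(R, t, x1)
X2 = R @ (3.0 * x1) + t           # the pixel's 3-D point at depth 3, in view i+n
x2 = X2 / X2[2]                   # its homogeneous projection in view i+n
```

The matching search of step 303 then walks along `l` in the second image instead of scanning the whole frame.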
In this embodiment, a panoramic camera captures the target area with a wide field of view, and the panoramic images are cut before three-dimensional reconstruction. The 360° panoramic field of view provides great convenience for the whole system's preliminary evaluation of the target area, and the information it provides is highly reliable and rich. The three-dimensional reconstruction method proposed in this embodiment can markedly improve the effect and success rate of visual mapping, strengthen the purposefulness of the system, and avoid regions that cannot be reconstructed during mapping. By combining SLAM, camera poses can be computed even in regions with complex scenes, so that three-dimensional reconstruction can be performed from the SLAM-computed camera poses in scenes that are structurally complex or feature-poor, which greatly improves the system's applicability across different scenes. Moreover, the panoramic camera used in this embodiment is inexpensive, which effectively reduces cost.
In one embodiment, after the camera pose corresponding to the panoramic image is obtained, that is, after step 102, the method further includes: determining key frame images in the panoramic image sequence according to the camera poses corresponding to the panoramic images. Step 104, performing three-dimensional reconstruction of the target area according to the images of the multiple orientations and the camera poses corresponding to the panoramic images, may then include: extracting the camera poses corresponding to the key frame images, and performing the three-dimensional reconstruction of the target area according to the images of the multiple orientations and the camera poses corresponding to the key frame images. In this embodiment, a pose can be estimated for every panoramic image, and an image is regarded as a key frame image only if, after the estimation, it is judged to have key characteristics. For example, if the current frame image overlaps the previous frame image to some extent, that is, the camera pose of the current frame is similar to, but not identical to, that of the previous frame, while the current frame also contains new environmental features, then the current frame image can be determined to be a key frame image. By screening out key frame images, which provide more effective and valuable information for three-dimensional reconstruction than ordinary frame images, and performing the reconstruction with the camera poses corresponding to the key frame images, effective three-dimensional reconstruction is achieved while the processing burden of the electronic device is reduced.
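The key-frame criterion described above ("similar pose, but new content") can be sketched as a simple threshold heuristic on the relative pose; the thresholds and the (R, t) pose representation are illustrative assumptions, not the patent's exact rule:

```python
import numpy as np

def select_keyframes(poses, min_trans=0.2, min_rot_deg=10.0):
    """Keep frame i as a key frame when its motion relative to the last
    key frame exceeds a translation or rotation threshold: enough overlap
    to match against, but new environment content."""
    keys = [0]
    for i in range(1, len(poses)):
        R_key, t_key = poses[keys[-1]]
        R_i, t_i = poses[i]
        dt = np.linalg.norm(t_i - t_key)
        dR = R_key.T @ R_i    # relative rotation; its angle measures viewpoint change
        ang = np.degrees(np.arccos(np.clip((np.trace(dR) - 1.0) / 2.0, -1.0, 1.0)))
        if dt > min_trans or ang > min_rot_deg:
            keys.append(i)
    return keys

# Toy demo: a camera translating 0.25 units per frame along x.
poses = [(np.eye(3), np.array([0.25 * i, 0.0, 0.0])) for i in range(10)]
keys = select_keyframes(poses, min_trans=0.6)
```

Real systems typically also require a minimum number of tracked features and a minimum time gap before promoting a frame to key frame.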
It should be noted that the above examples in the embodiments of the present application are illustrations given for ease of understanding and do not limit the technical solution of the present invention.
The division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they fall within the protection scope of this patent. Adding insignificant modifications to an algorithm or flow, or introducing insignificant designs, without changing the core design of the algorithm and flow, also falls within the protection scope of this patent.
An embodiment of the present application provides an electronic device, as shown in FIG. 6, including: at least one processor 501; and a memory 502 communicatively connected to the at least one processor 501. The memory 502 stores instructions executable by the at least one processor 501, and the instructions are executed by the at least one processor 501 so that the at least one processor 501 can execute the above three-dimensional reconstruction method.
The electronic device may further include a panoramic camera 503 communicatively connected to the at least one processor 501; the panoramic camera 503 is configured to photograph the target area to obtain panoramic images. The processor 501 is connected to the panoramic camera 503 and can control it to photograph the target area; after the panoramic camera 503 captures a panoramic image of the target area, it can send the image to the processor 501 so that the processor 501 can carry out the subsequent three-dimensional reconstruction flow based on the panoramic image.
The memory 502 and the processor 501 are connected by a bus. The bus may include any number of interconnected buses and bridges that link the various circuits of the one or more processors 501 and the memory 502 together. The bus may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, all of which are well known in the art and are therefore not described further herein. A bus interface provides an interface between the bus and a transceiver. The transceiver may be one element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 501 is transmitted over a wireless medium through an antenna; further, the antenna also receives data and transmits the data to the processor 501.
The processor 501 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfacing, voltage regulation, power management, and other control functions. The memory 502 may be used to store data used by the processor 501 when performing operations.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the above method embodiments.
That is, those skilled in the art can understand that all or part of the steps in the methods of the above embodiments can be completed by instructing related hardware through a program. The program is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
In addition, an embodiment of the present application further provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to execute the method in any of the above method embodiments.
Those of ordinary skill in the art can understand that the above embodiments are specific examples of implementing the present invention, and that in practical applications various changes may be made to them in form and detail without departing from the spirit and scope of the present invention.

Claims (11)

  1. A three-dimensional reconstruction method, comprising:
    obtaining a panoramic image sequence captured of a target area;
    performing camera pose estimation on panoramic images in the panoramic image sequence to obtain camera poses corresponding to the panoramic images;
    cutting the panoramic image to obtain images of multiple orientations corresponding to the panoramic image; and
    performing three-dimensional reconstruction on the target area according to the images of the multiple orientations and the camera pose corresponding to the panoramic image.
  2. The three-dimensional reconstruction method according to claim 1, wherein, after the obtaining of the camera pose corresponding to the panoramic image, the method further comprises:
    determining key frame images in the panoramic image sequence according to the camera poses corresponding to the panoramic images; and
    the performing three-dimensional reconstruction on the target area according to the images of the multiple orientations and the camera pose corresponding to the panoramic image comprises:
    extracting the camera poses corresponding to the key frame images; and
    performing three-dimensional reconstruction on the target area according to the images of the multiple orientations corresponding to each panoramic image in the panoramic image sequence and the camera poses corresponding to the key frame images.
  3. The three-dimensional reconstruction method according to claim 1, wherein the performing three-dimensional reconstruction on the target area according to the images of the multiple orientations and the camera pose corresponding to the panoramic image comprises:
    selecting a target pixel point;
    determining epipolar lines between the images of the multiple orientations according to the images of the multiple orientations and the camera pose corresponding to the panoramic image;
    traversing the pixel points on the epipolar line and searching for a pixel point matching the target pixel point;
    computing the spatial position of the target pixel point according to the pixel point matching the target pixel point;
    determining structured reconstruction information of the target area according to the spatial position of the target pixel point; and
    performing three-dimensional reconstruction on the target area using the structured reconstruction information as a reconstruction skeleton.
  4. The three-dimensional reconstruction method according to claim 3, wherein the determining of the epipolar lines between the images of the multiple orientations according to the images of the multiple orientations and the camera pose corresponding to the panoramic image comprises:
    determining the vector of the line connecting the camera optical center of an i-th frame image among the images of the multiple orientations and the target pixel point;
    determining the translation vectors of the camera optical centers corresponding to the images of the multiple orientations according to the camera pose corresponding to the panoramic image;
    determining the plane formed by the vector of the connecting line and the translation vector; and
    determining the epipolar line between the images of the multiple orientations according to the intersection line of the plane with an (i+n)-th frame image among the images of the multiple orientations, wherein i and n are both natural numbers greater than or equal to 1.
  5. The three-dimensional reconstruction method according to any one of claims 1 to 4, wherein the cutting of the panoramic image to obtain the images of the multiple orientations corresponding to the panoramic image comprises:
    computing, according to the camera pose corresponding to the panoramic image, the camera poses corresponding to the images of the multiple orientations to be obtained by the division; and
    cutting the panoramic image according to the camera poses corresponding to the images of the multiple orientations to be obtained by the division, to obtain the images of the multiple orientations corresponding to the panoramic image.
  6. The three-dimensional reconstruction method according to claim 1, wherein the performing camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera poses corresponding to the panoramic images comprises:
    initializing the first key frame image in the panoramic image sequence;
    extracting features of each panoramic image in the panoramic image sequence, and determining the key frame images in the panoramic image sequence other than the first key frame image according to the features of each panoramic image and the first key frame image;
    performing feature matching on the features of adjacent key frame images to obtain feature matching pairs in the adjacent key frame images; and
    computing, according to the feature matching pairs, the camera pose corresponding to the later key frame image of each pair of adjacent key frame images.
  7. The three-dimensional reconstruction method according to claim 6, wherein the computing, according to the feature matching pairs, of the camera pose corresponding to the later key frame image of each pair of adjacent key frame images comprises:
    performing local bundle adjustment according to feature matching pairs in a preset local region and a prior pose of the camera within the target area, to obtain the camera pose corresponding to the later key frame image of each pair of adjacent key frame images; or
    performing global bundle adjustment according to feature matching pairs in the global region and a prior pose of the camera within the target area, to obtain the camera pose corresponding to the later key frame image of each pair of adjacent key frame images.
  8. The three-dimensional reconstruction method according to claim 6, wherein the camera pose comprises a translation vector for characterizing the camera position and a rotation matrix for characterizing the camera attitude; and
    the computing, according to the feature matching pairs, of the camera pose corresponding to the later key frame image of each pair of adjacent key frame images comprises:
    computing an essential matrix or a fundamental matrix according to the pixel positions of the feature matching pairs; and
    computing, according to the essential matrix or the fundamental matrix, the translation vector and the rotation matrix corresponding to the later key frame image of each pair of adjacent key frame images.
  9. The three-dimensional reconstruction method according to claim 1, wherein the performing camera pose estimation on the panoramic images in the panoramic image sequence to obtain the camera poses corresponding to the panoramic images comprises:
    converting the panoramic images in the panoramic image sequence into equirectangular images with an aspect ratio of 2:1; and
    performing simultaneous localization and mapping (SLAM) according to the equirectangular images to obtain the camera poses corresponding to the panoramic images.
  10. An electronic device, comprising: at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the three-dimensional reconstruction method according to any one of claims 1 to 9.
  11. A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the three-dimensional reconstruction method according to any one of claims 1 to 9 is implemented.
PCT/CN2022/135517 2021-11-30 2022-11-30 Three-dimensional reconstruction method, electronic device, and computer-readable storage medium WO2023098737A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111443648.2A CN116206050A (en) 2021-11-30 2021-11-30 Three-dimensional reconstruction method, electronic device, and computer-readable storage medium
CN202111443648.2 2021-11-30

Publications (1)

Publication Number Publication Date
WO2023098737A1 true WO2023098737A1 (en) 2023-06-08

Family

ID=86517794

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135517 WO2023098737A1 (en) 2021-11-30 2022-11-30 Three-dimensional reconstruction method, electronic device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN116206050A (en)
WO (1) WO2023098737A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118052952A (en) * 2024-04-16 2024-05-17 中国建筑一局(集团)有限公司 Method and device for reconstructing panoramic image of tunnel face structural surface

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106251399A (en) * 2016-08-30 2016-12-21 广州市绯影信息科技有限公司 A kind of outdoor scene three-dimensional rebuilding method based on lsd slam
WO2018150086A2 (en) * 2017-02-20 2018-08-23 Nokia Technologies Oy Methods and apparatuses for determining positions of multi-directional image capture apparatuses
CN112927362A (en) * 2021-04-07 2021-06-08 Oppo广东移动通信有限公司 Map reconstruction method and device, computer readable medium and electronic device
CN113643365A (en) * 2021-07-07 2021-11-12 紫东信息科技(苏州)有限公司 Camera pose estimation method, device, equipment and readable storage medium
CN113674416A (en) * 2021-08-26 2021-11-19 中国电子科技集团公司信息科学研究院 Three-dimensional map construction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN116206050A (en) 2023-06-02

Similar Documents

Publication Publication Date Title
WO2019161813A1 (en) Dynamic scene three-dimensional reconstruction method, apparatus and system, server, and medium
CN112435325B (en) VI-SLAM and depth estimation network-based unmanned aerial vehicle scene density reconstruction method
US20220262039A1 (en) Positioning method, electronic device, and storage medium
WO2021196294A1 (en) Cross-video person location tracking method and system, and device
US10628949B2 (en) Image processing with iterative closest point (ICP) technique
CN106251399B (en) A kind of outdoor scene three-dimensional rebuilding method and implementing device based on lsd-slam
CN109102537B (en) Three-dimensional modeling method and system combining two-dimensional laser radar and dome camera
US10789765B2 (en) Three-dimensional reconstruction method
WO2020014909A1 (en) Photographing method and device and unmanned aerial vehicle
WO2019238114A1 (en) Three-dimensional dynamic model reconstruction method, apparatus and device, and storage medium
Li et al. Large scale image mosaic construction for agricultural applications
CN108073857B (en) Dynamic visual sensor DVS event processing method and device
US20170330375A1 (en) Data Processing Method and Apparatus
CN111127524A (en) Method, system and device for tracking trajectory and reconstructing three-dimensional image
KR101891201B1 (en) Method and apparatus for acquiring depth map from all-around camera
WO2019075948A1 (en) Pose estimation method for mobile robot
CN110580720B (en) Panorama-based camera pose estimation method
US20140009503A1 (en) Systems and Methods for Tracking User Postures to Control Display of Panoramas
CN112207821B (en) Target searching method of visual robot and robot
CN111402412A (en) Data acquisition method and device, equipment and storage medium
WO2022052782A1 (en) Image processing method and related device
WO2019157922A1 (en) Image processing method and device and ar apparatus
WO2022047701A1 (en) Image processing method and apparatus
WO2023098737A1 (en) Three-dimensional reconstruction method, electronic device, and computer-readable storage medium
KR20220098895A (en) Apparatus and method for estimating the pose of the human body

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22900541

Country of ref document: EP

Kind code of ref document: A1