CN106803267B - Kinect-based indoor scene three-dimensional reconstruction method - Google Patents

Kinect-based indoor scene three-dimensional reconstruction method Download PDF

Info

Publication number
CN106803267B
CN106803267B (application CN201710014728.3A)
Authority
CN
China
Prior art keywords
point cloud
data
cloud data
frame
registration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710014728.3A
Other languages
Chinese (zh)
Other versions
CN106803267A (en)
Inventor
卢朝阳
丹熙方
李静
矫春龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201710014728.3A priority Critical patent/CN106803267B/en
Publication of CN106803267A publication Critical patent/CN106803267A/en
Application granted granted Critical
Publication of CN106803267B publication Critical patent/CN106803267B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/30 Polynomial surface description
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/08 Indexing scheme for image data processing or generation, in general, involving all processing steps from image acquisition to 3D model generation
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/10048 Infrared image

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Algebra (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)
  • Image Generation (AREA)

Abstract

The invention discloses a Kinect-based indoor scene three-dimensional reconstruction method, which addresses the technical problem of reconstructing an indoor scene three-dimensional model in real time while avoiding excessive redundant points, and comprises the following steps: acquiring object depth data with a Kinect, then denoising and downsampling the depth data; acquiring the point cloud data of the current frame and calculating the normal vector of each point in the frame; establishing a global data cube with the TSDF algorithm and calculating predicted point cloud data with a ray casting algorithm; calculating a point cloud registration matrix through an ICP (Iterative Closest Point) algorithm and the predicted point cloud data, fusing the point cloud data acquired in each frame into the global data cube, and fusing point cloud data frame by frame until a good fusion effect is obtained; and rendering the point cloud data with an iso-surface extraction algorithm to construct a three-dimensional model of the object. The invention improves registration speed and registration precision, fuses quickly with few redundant points, and can be used for real-time reconstruction of indoor scenes.

Description

Kinect-based indoor scene three-dimensional reconstruction method
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an indoor scene three-dimensional reconstruction method based on Kinect. The invention can be used in the fields of robot navigation, industrial measurement, virtual interaction and the like.
Background
The three-dimensional reconstruction technology is a hotspot and difficulty in the frontier fields of computer vision, artificial intelligence, virtual reality and the like, is one of the major challenges facing human beings in basic research and application research, and is widely applied to the fields of cultural relic digitization, biomedical imaging, animation production, industrial measurement, immersive virtual interaction and the like.
Three-dimensional reconstruction has long been studied in the scientific research field, but the high cost of the required equipment has so far kept it from becoming widespread. With the popularization of the Microsoft Kinect somatosensory camera, this cost has dropped greatly, so that ordinary users can also build models with three-dimensional reconstruction technology.
Existing three-dimensional reconstruction technology can be divided into passive and active techniques according to how depth information is acquired. Passive techniques rely on reflected natural light: images are typically captured with a camera, and the three-dimensional coordinates of the object are then computed through a series of algorithms. Passive methods are computationally heavy and slow.
Active techniques use their own light source to measure the depth information of an object directly, and are therefore easy to run in real time; examples are Time-of-Flight and structured-light techniques. Time-of-Flight sensors are very costly, which greatly limits their use. The Kinect camera uses structured light; it is inexpensive, meets the needs of ordinary users, and is widely applied to three-dimensional reconstruction.
Several related patents address three-dimensional reconstruction. For example, the patent "A contact network three-dimensional reconstruction method based on SIFT and LBP point cloud registration" (publication No. CN104299260A, application No. 201410456796.1, filing date 2014.09.10) proposes a contact-network three-dimensional reconstruction method based on SIFT and LBP. The patent "Three-dimensional space map construction method based on Kinect vision technology" (publication No. CN104794748A, application No. 201510116276.0, filing date 2015.03.17) proposes a three-dimensional space map construction method based on Kinect vision technology.
The document "Henry P, kraining M, Herbst E, et al, RGB-D mapping: Using depthcameras for dense 3D modeling of index environment [ C ]// RSS works on RGB-D cameras.2010" proposes an indoor scene three-dimensional reconstruction system based on SIFT (scale invariant feature transform) feature matching localization and TORO (Tree-based network optimization) optimization algorithm, which uses depth data and color image data, uses ICP algorithm to combine SIFT features in color images to register point cloud data of two frames, and uses TORO algorithm, an optimization algorithm for SLAM to obtain global point cloud data, and can more accurately reconstruct indoor scenes with unobvious features even dim light, but the algorithm is complex to calculate three-dimensionally and has a slow reconstruction speed.
The document "Fioraio N, Konolige K.real visual and point closed SLAM [ C ]// RSS works hop on RGB-D cameras.2011" proposes an RGBD-SLAM algorithm, the algorithm utilizes an RGB-D sensor to obtain depth data and color image data, uses a k-D tree or a projection method to search corresponding points in two frames of point cloud data, uses an ICP algorithm based on the corresponding points to realize the registration of the point cloud data, uses g2 o-an efficient nonlinear least square optimizer to carry out global optimization, and still has the problems of complex calculation and slow reconstruction speed in order to achieve a better reconstruction effect.
Both algorithms are computationally complex, demand a high hardware configuration, and produce point cloud models with many redundant points.
Disclosure of Invention
The invention aims to provide a Kinect-based indoor scene three-dimensional reconstruction method that registers more accurately and runs faster, addressing the defects of the prior art.
The invention relates to a Kinect-based indoor scene three-dimensional reconstruction method which is characterized by comprising the following steps of:
step 1, denoising and downsampling depth data:
setting a timer t and starting it, acquiring one frame of depth data of an object in the indoor scene with the Kinect, denoising the depth data with a joint bilateral filtering method that combines the color image and the depth image and fills in the missing parts of the depth image, and obtaining depth data at several resolutions by down-sampling;
step 2, acquiring point cloud data of the current frame, and calculating normal vectors of each point in the frame:
obtaining a transformation matrix from the image coordinate system to the camera coordinate system from the Kinect camera parameters, calculating the current frame point cloud data of the object in the indoor scene from the transformation matrix and the multi-resolution depth data, and calculating the normal vector of every point of the current frame point cloud data by eigenvalue estimation;
step 3, acquiring a global data cube of the current frame point cloud data and calculating and predicting the point cloud data:
converting the current frame point cloud data into voxels of a global data cube (Volume) with the Truncated Signed Distance Function (TSDF), and computing the predicted point cloud data of the global data cube and the normal vector of every point of the predicted point cloud with the ray casting algorithm (Ray Casting) combined with the initial point cloud registration matrix, where the initial point cloud registration matrix is set to the identity matrix;
step 4, fusion registration of two frames of point cloud data:
performing fusion registration on the two frames of point cloud data, wherein a point cloud registration matrix is required for the fusion registration;
4.1 moving the Kinect, returning to execute step 1 to step 2, obtaining point cloud data of the object in the indoor scene for this frame again and calculating the normal vector of every point in the frame; the point cloud data and normal vectors obtained for this frame are the current frame point cloud data and normal vectors, and the current point cloud registration matrix is the initial point cloud registration matrix;
4.2, calculating and updating the current point cloud registration matrix with the ICP (Iterative Closest Point) algorithm from the point cloud data and normal vectors acquired in the current frame together with the predicted point cloud data and the normal vector of every point of the predicted point cloud;
4.3, performing point cloud fusion by adopting a TSDF algorithm, updating the voxel of the global data cube through the current point cloud registration matrix, and fusing the current frame point cloud data into the global data cube;
4.4, computing the predicted point cloud data of the global data cube with the ray casting algorithm (Ray Casting) combined with the current point cloud registration matrix;
step 5, fusion registration of multi-frame point cloud data: and returning to the step 4, repeatedly executing the step 4, acquiring data frame by frame, fusing each newly acquired frame of point cloud data into the global data cube until the timer t reaches the set time, wherein the set time is 1-3 minutes, and stopping acquiring the point cloud data to obtain the well-registered point cloud data.
And 6, rendering the registered point cloud data, namely rendering the registered point cloud data with an iso-surface extraction algorithm (Marching Cubes), constructing a three-dimensional model of the objects in the indoor scene, and finishing the three-dimensional reconstruction of the indoor scene.
The method uses only depth data and performs the computation on the GPU with a highly parallel algorithm, so it achieves high real-time performance, and the point cloud model built on the TSDF representation contains few redundant points. Scanning and reconstructing an indoor scene with the method is therefore low in cost, fast, and able to meet real-time requirements.
Compared with the prior art, the invention has the following advantages:
1. During registration, multi-resolution depth data are used: a preliminary point cloud registration transformation matrix is computed from the low-resolution depth data and the predicted point cloud data, which greatly reduces the amount of computation and speeds it up because of the low resolution, and the transformation matrix is then refined incrementally with the higher-resolution depth data, so the registration speed is improved;
2. The method combines a ray casting algorithm and the current point cloud registration transformation matrix with the global data cube to compute predicted point cloud data, and registers the predicted point cloud data against the point cloud data obtained at the current moment, which improves the registration precision;
3. The TSDF global data cube is used for point cloud fusion; because the global data cube is composed of a fixed number of voxels, point cloud redundancy is avoided, and because the computation runs in the GPU it is fast, which accelerates the point cloud fusion.
Drawings
FIG. 1 is a flow chart of a method for three-dimensional reconstruction of an indoor scene according to the present invention;
FIG. 2 is a flow chart of point cloud registration in the present invention;
FIG. 3 is a schematic diagram of the scene depth image provided in embodiment 6 of the present invention, where FIG. 3(a) is before filtering and FIG. 3(b) is after filtering;
FIG. 4 is a schematic diagram of a scene point cloud provided in embodiment 6 of the present invention;
FIG. 5 is a scene rendering effect diagram provided in embodiment 6 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
Existing three-dimensional reconstruction technology is widely applied in the fields of robot navigation, industrial measurement, virtual interaction and the like. To achieve a good result, however, most prior-art algorithms suffer from heavy computation, low speed, many redundant points and similar problems. Aiming at this situation, the invention provides a Kinect-based indoor scene three-dimensional reconstruction method which, referring to fig. 1, comprises the following steps:
Step 1, denoising and downsampling the depth data: first, a timer t is set and started; the timer determines when to stop acquiring point cloud data and render the global point cloud, and its duration can be chosen according to the scene size. One frame of depth data of the objects in the indoor scene is acquired with the Kinect camera and denoised with a joint bilateral filtering method, which uses the depth image and the color image together: the color image, whose information is complete, is introduced while the foreground and background edges are preserved, so that the regions with missing depth are repaired during denoising. The denoised depth data are then down-sampled: the original image acquired by the Kinect has a resolution of 640 × 480, and down-sampling yields depth data at 320 × 240 and 160 × 120, forming the multi-resolution depth data used later when the point cloud registration matrix is calculated.
Step 2, acquiring the point cloud data of the current frame and calculating the normal vector of each point in the frame: a transformation matrix from the image coordinate system to the camera coordinate system is obtained from the Kinect camera parameters, the current frame point cloud data of the objects in the indoor scene are computed from this transformation matrix and the multi-resolution depth data, and the normal vector of every point of the current frame point cloud data is computed by eigenvalue estimation. Compared with the prior-art approach of computing normals from neighbouring points and a vector cross product, the eigenvalue estimation adopted by the invention is more accurate.
Step 3, establishing the global data cube for the current frame point cloud data and computing the predicted point cloud data: the current frame point cloud data are converted into voxels of a global data cube (Volume) with the Truncated Signed Distance Function (TSDF), the global data cube is constructed, and the current frame point cloud data of the objects in the indoor scene from different frames are fused into it; the data of the global data cube are stored in the video memory of the graphics processing unit (GPU) and are computed and updated by the GPU. The predicted point cloud data of the global data cube and the normal vector of every point of the predicted point cloud are then computed with the ray casting algorithm (Ray Casting) combined with the initial point cloud registration matrix. The initial point cloud registration matrix mentioned here is the identity matrix.
Step 4, fusion registration of two frames of point cloud data: and performing fusion registration on the two frames of point cloud data, wherein a point cloud registration matrix is required for the fusion registration.
4.1 moving the Kinect, returning to the step 1-step 2, obtaining point cloud data of the object in the indoor scene of a frame again and calculating normal vectors of all points in the frame; and the point cloud data and the normal vector of the object in the frame of indoor scene obtained again are the current frame point cloud data and the normal vector, and the current point cloud registration matrix is the initial point cloud registration matrix.
And 4.2, calculating and updating the current point cloud registration matrix by using an ICP (Iterative Closest Point) algorithm in combination with the point cloud data and normal vectors acquired in the current frame and the predicted point cloud data and predicted point cloud normal vectors obtained in the previous frame, so as to update the global data cube. Illustratively, after the first frame of point cloud data is obtained, the global data cube is established; at this moment the point cloud registration matrix is the initial point cloud registration matrix, namely the identity matrix, and the predicted point cloud data are computed with the ray casting algorithm. The point cloud data of the second frame are then obtained, and the point cloud registration matrix is calculated and updated through the ICP algorithm by combining the point cloud data of the second frame with the predicted point cloud data of the first frame (a point-to-plane solve of this kind is sketched after step 4.4 below).
And 4.3, performing point cloud fusion by adopting a TSDF algorithm, updating the voxel of the global data cube through the current point cloud registration matrix, and fusing the current frame point cloud data into the global data cube. The purpose of establishing the global data cube is to fuse all the acquired point cloud data together to form complete indoor scene point cloud data.
4.4, calculating the predicted point cloud data of the global data cube by utilizing the ray casting algorithm (Ray Casting) in combination with the current point cloud registration matrix; the ray casting algorithm emits rays taking the current camera position as the starting point and then, according to the current point cloud registration matrix, computes the point cloud data of the global data cube as observed from the current camera position, namely the predicted point cloud data.
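Illustratively, the following is a minimal C++ sketch of one linearised point-to-plane ICP update of the kind used in step 4.2. It assumes the corresponding point pairs between the current frame cloud and the predicted cloud have already been found (in the real system this is done by projective data association on the GPU); the function name, the use of the Eigen library and the composition convention noted in the final comment are illustrative assumptions, not the exact implementation of the invention.

```cpp
#include <Eigen/Dense>
#include <cstddef>
#include <vector>

// Sketch of one linearised point-to-plane ICP update: given corresponding
// points src[k] (current frame, already mapped by the current registration
// matrix) and dst[k] with normals nrm[k] (predicted point cloud), solve the
// 6x6 normal equations for a small motion (rx, ry, rz, tx, ty, tz) and return
// it as an incremental 4x4 rigid transform.
Eigen::Matrix4f icpStep(const std::vector<Eigen::Vector3f>& src,
                        const std::vector<Eigen::Vector3f>& dst,
                        const std::vector<Eigen::Vector3f>& nrm) {
    Eigen::Matrix<float, 6, 6> A = Eigen::Matrix<float, 6, 6>::Zero();
    Eigen::Matrix<float, 6, 1> b = Eigen::Matrix<float, 6, 1>::Zero();
    for (std::size_t k = 0; k < src.size(); ++k) {
        const Eigen::Vector3f& p = src[k];
        const Eigen::Vector3f& n = nrm[k];
        float r = n.dot(dst[k] - p);                 // point-to-plane residual
        Eigen::Matrix<float, 6, 1> J;
        J.head<3>() = p.cross(n);                    // derivative w.r.t. a small rotation
        J.tail<3>() = n;                             // derivative w.r.t. the translation
        A += J * J.transpose();
        b += J * r;
    }
    Eigen::Matrix<float, 6, 1> xi = A.ldlt().solve(b);   // (rx, ry, rz, tx, ty, tz)
    Eigen::Matrix4f T = Eigen::Matrix4f::Identity();
    // small-angle rotation and translation assembled into a rigid transform
    Eigen::AngleAxisf Rz(xi(2), Eigen::Vector3f::UnitZ()),
                      Ry(xi(1), Eigen::Vector3f::UnitY()),
                      Rx(xi(0), Eigen::Vector3f::UnitX());
    T.block<3, 3>(0, 0) = (Rz * Ry * Rx).toRotationMatrix();
    T.block<3, 1>(0, 3) = xi.tail<3>();
    return T;   // applied as newPose = T * currentPose (composition convention is an assumption)
}
```

A few such updates per frame, applied from coarse to fine resolution as described in embodiment 2 below, yield the current point cloud registration matrix.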
Step 5, fusion registration of multi-frame point cloud data: step 4 is executed repeatedly, acquiring data frame by frame and fusing every newly acquired frame of point cloud data into the global cube until the timer t reaches the set time, after which acquisition stops and the registered point cloud data are obtained; the set time is 1 to 3 minutes, in this case 1 minute. During multi-frame fusion registration the current global point cloud can be checked in real time on the computer screen; if enough point cloud data have been obtained, acquisition can be stopped manually even before the timer reaches the set time, and if the timer has reached the set time but enough point cloud data have not been obtained, acquisition can be continued manually. The method therefore has a certain adaptability and can be adjusted as needed.
And 6, rendering the registered point cloud data: rendering the registered point cloud data through an iso-surface extraction algorithm (Marching Cubes), constructing a three-dimensional model of an object in the indoor scene, and finishing the three-dimensional reconstruction of the indoor scene.
The Marching Cubes algorithm is the most commonly used method in the generation of the isosurface of the three-dimensional data field at present. It is actually a divide-and-conquer method, and the extraction of the iso-surface is distributed in each voxel. For each voxel processed, the interior iso-patches are approximated with triangular patches.
The MC algorithm mainly comprises three steps: 1. converting the point cloud data into voxel grid data; 2. extracting the iso-surface in each voxel by linear interpolation; 3. triangulating the iso-surface into a mesh, thereby reconstructing the three-dimensional model of the object.
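As an illustration of steps 1 and 2 above, the following C++ sketch shows only the per-voxel bookkeeping of Marching Cubes: building the 8-bit sign configuration index from a voxel's corner distance values and interpolating a vertex on an edge where the distance crosses zero. The standard 256-entry edge and triangle lookup tables that turn the index into triangles are omitted for brevity, and the names are illustrative.

```cpp
#include <Eigen/Dense>

// Build the 8-bit configuration index from the signs of the corner TSDF values.
// An index of 0 or 255 means the iso-surface does not cross this voxel.
int cubeIndex(const float d[8], float iso = 0.0f) {
    int idx = 0;
    for (int c = 0; c < 8; ++c)
        if (d[c] < iso) idx |= (1 << c);   // bit c set when corner c lies inside the surface
    return idx;
}

// Linear interpolation of the zero crossing between two corner positions,
// assuming d1 and d2 have opposite signs.
Eigen::Vector3f interpolateVertex(const Eigen::Vector3f& p1, const Eigen::Vector3f& p2,
                                  float d1, float d2, float iso = 0.0f) {
    float t = (iso - d1) / (d2 - d1);
    return p1 + t * (p2 - p1);
}
```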
The Kinect-based indoor scene three-dimensional reconstruction method provided by the invention improves the speed and the precision of point cloud registration, and can achieve a good effect on real-time three-dimensional reconstruction of an indoor scene.
Example 2
The Kinect-based indoor scene three-dimensional reconstruction method is the same as in embodiment 1; the multi-resolution depth data obtained by down-sampling in step 1 are used to calculate the point cloud registration transformation matrix in step 4.2, specifically as follows:
4.2.1 calculating a point cloud registration matrix with the ICP (Iterative Closest Point) algorithm from the lowest-resolution depth data and the predicted point cloud data.
And 4.2.2, on the basis of the point cloud registration matrix, by utilizing the depth data and the predicted point cloud data with higher resolution, calculating step by step to obtain a more accurate point cloud registration transformation matrix, and updating the current point cloud registration matrix.
During the calculation, the low-resolution depth data are used first to compute a preliminary point cloud registration matrix; on this basis, higher-resolution depth data are used to compute a more accurate point cloud registration matrix, until the final point cloud registration matrix is computed from the highest-resolution depth data.
Compared with computing directly from the original-resolution data, this step-by-step multi-resolution computation takes less time and is faster.
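The following C++ sketch shows only the control flow of this coarse-to-fine scheme, assuming a depth pyramid ordered from coarsest to finest and a per-level alignment routine supplied by the caller (for example the point-to-plane solve sketched in embodiment 1); the structure, names and iteration counts are illustrative.

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// One pyramid level of depth data, e.g. 160x120, 320x240 or 640x480.
struct DepthLevel { std::vector<float> depth; int width; int height; };

// Run a few alignment updates at each level, carrying the estimate forward
// from coarse to fine. 'alignOnce' performs one ICP update at a given level
// and returns an incremental transform.
Eigen::Matrix4f coarseToFineRegister(
        const std::vector<DepthLevel>& pyramidLevels,   // coarsest first
        const std::function<Eigen::Matrix4f(const DepthLevel&, const Eigen::Matrix4f&)>& alignOnce,
        const Eigen::Matrix4f& initialPose = Eigen::Matrix4f::Identity()) {
    Eigen::Matrix4f pose = initialPose;
    // fewer iterations at the coarse levels, more at the finest level
    const int iterations[] = {4, 5, 10};
    for (std::size_t level = 0; level < pyramidLevels.size(); ++level) {
        int iters = iterations[std::min<std::size_t>(level, 2)];
        for (int it = 0; it < iters; ++it) {
            // each update refines the estimate produced at the coarser level
            Eigen::Matrix4f inc = alignOnce(pyramidLevels[level], pose);
            pose = inc * pose;
        }
    }
    return pose;   // the final point cloud registration matrix
}
```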
Example 3
The Kinect-based indoor scene three-dimensional reconstruction method is similar to that of the embodiment 1-2, and point cloud fusion is performed by adopting a TSDF algorithm in the step 4.3, and the method comprises the following steps:
4.3.1 With the TSDF algorithm, three-dimensional space is represented by a grid of cubic cells, each of which stores its distance D to the surface of the object model and a weight W.
The main idea of the TSDF algorithm is to establish a virtual cube (Volume) in the graphics card memory with side length L, set to 2 metres in this example, and then divide the cube into N × N × N voxels (Voxel), with N set to 512 in this example, so that the side length of each voxel is L/N. Each voxel holds its distance D to the nearest object surface and its weight W. This example performs a three-dimensional reconstruction of a cabinet in a room.
4.3.2, the inside and the outside of the surface are distinguished by the sign of the distance: a negative distance in a voxel means the voxel currently lies inside the object, a positive distance means it lies outside the object, and a distance of 0 marks the object surface.
4.3.3 fuse the global data cube and the current point cloud data by this weight W.
In the point cloud data fusion process in the embodiment, the set time is 2 minutes, and enough point cloud data is obtained after 2 minutes, so that the three-dimensional reconstruction of the indoor cabinet scene is completed.
Point cloud fusion is carried out with the global data cube: because the cube is composed of a fixed number of voxels, point cloud redundancy is avoided, and because the computation is parallelized on the GPU it is fast, which accelerates the point cloud fusion.
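For illustration, a minimal C++ sketch of such a global data cube follows, assuming each voxel stores only the distance D (initialised to 1) and the weight W (initialised to 0); the names and memory layout are illustrative, and in the real system the array resides in GPU video memory.

```cpp
#include <Eigen/Dense>
#include <cstddef>
#include <vector>

// One voxel of the global data cube: truncated signed distance D and weight W.
struct Voxel { float D = 1.0f; float W = 0.0f; };

// The global data cube (Volume): N x N x N voxels covering a cube of side L,
// with the cube's lower corner at the world origin.
struct TsdfVolume {
    int N;                 // e.g. 512 voxels per side
    float L;               // cube side length in metres, e.g. 2 or 3
    std::vector<Voxel> data;

    TsdfVolume(int n, float side) : N(n), L(side), data((std::size_t)n * n * n) {}

    float voxelSize() const { return L / N; }          // side length of one voxel, L/N

    Voxel& at(int x, int y, int z) { return data[((std::size_t)z * N + y) * N + x]; }

    // centre of voxel (x, y, z) in the world frame
    Eigen::Vector3f centre(int x, int y, int z) const {
        float s = voxelSize();
        return Eigen::Vector3f((x + 0.5f) * s, (y + 0.5f) * s, (z + 0.5f) * s);
    }
};
```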
Example 4
The three-dimensional reconstruction method of the indoor scene based on the Kinect is the same as that of the embodiment 1-3, and the weight W and the distance D in the step 4.3.1 are calculated by the following weight formula and distance formula:
Wi(x, y, z) = min(max weight, Wi-1(x, y, z) + 1)
Di(x, y, z) = (Di-1(x, y, z) × Wi-1(x, y, z) + di(x, y, z)) / (Wi-1(x, y, z) + 1)
wherein Wi(x, y, z) is the weight of the voxel in the i-th frame global data cube, Wi-1(x, y, z) is the weight of the voxel in the (i-1)-th frame global data cube, max weight is the maximum weight, Di(x, y, z) is the distance from the voxel in the current frame global data cube to the object surface, Di-1(x, y, z) is the distance from the voxel in the previous frame global data cube to the object surface, and di(x, y, z) is the distance from the voxel in the global data cube to the object surface calculated from the current frame depth data.
The weight Wi(x, y, z) of a voxel at frame i is the minimum of the maximum weight max weight and the frame i-1 voxel weight Wi-1(x, y, z) plus 1; the distance Di(x, y, z) of a voxel at frame i is the result of fusing the frame i-1 voxel distance Di-1(x, y, z) with the distance di(x, y, z) calculated from the frame i depth data, weighted by their respective weights.
In both formulas i ≥ 2; when i = 1, the weights W1(x, y, z) of all voxels are 0 and the distances D1(x, y, z) of all voxels are 1.
For the weight formula, when i = 2, with W1(x, y, z) = 0 the formula gives W2(x, y, z) = 1.
For the distance formula, when i = 2, W1(x, y, z), W2(x, y, z), D1(x, y, z) and the distance d2(x, y, z) measured from the current frame depth data are known, so D2(x, y, z) can be calculated; from D2(x, y, z) it can be determined whether the voxel is currently inside, outside, or on the surface of the object.
The invention uses the weight to fuse the global data cube, so that the fusion result is more accurate.
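For illustration, the following sketch applies the two formulas above to fuse one depth frame into the global data cube, building on the TsdfVolume sketch from embodiment 3. The camera intrinsics, the truncation distance mu and the maximum weight are placeholder values, and in the real system this loop runs as one GPU thread per voxel.

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>
#include <vector>

// Fuse one depth frame into the volume. 'pose' maps camera coordinates to
// world coordinates (the current point cloud registration matrix);
// fx, fy, cx, cy are placeholder intrinsics; mu is the truncation distance.
void integrateFrame(TsdfVolume& vol, const std::vector<float>& depth,
                    int width, int height, float fx, float fy, float cx, float cy,
                    const Eigen::Matrix4f& pose, float mu = 0.03f, float maxWeight = 64.0f) {
    Eigen::Matrix4f worldToCam = pose.inverse();
    for (int z = 0; z < vol.N; ++z)
      for (int y = 0; y < vol.N; ++y)
        for (int x = 0; x < vol.N; ++x) {
            // voxel centre -> camera frame -> image plane
            Eigen::Vector4f pw(0, 0, 0, 1);
            pw.head<3>() = vol.centre(x, y, z);
            Eigen::Vector4f pc = worldToCam * pw;
            if (pc.z() <= 0.0f) continue;
            int u = (int)std::lround(pc.x() * fx / pc.z() + cx);
            int v = (int)std::lround(pc.y() * fy / pc.z() + cy);
            if (u < 0 || u >= width || v < 0 || v >= height) continue;
            float dMeas = depth[v * width + u];
            if (dMeas <= 0.0f) continue;                   // missing measurement
            // signed distance along the viewing ray, truncated to [-1, 1] in units of mu
            float sdf = dMeas - pc.z();
            if (sdf < -mu) continue;                       // far behind the surface: skip
            float d_i = std::min(1.0f, sdf / mu);          // d_i(x, y, z) from the current frame
            Voxel& vox = vol.at(x, y, z);
            float Wprev = vox.W;
            // D_i = (D_{i-1} * W_{i-1} + d_i) / (W_{i-1} + 1);  W_i = min(max weight, W_{i-1} + 1)
            vox.D = (vox.D * Wprev + d_i) / (Wprev + 1.0f);
            vox.W = std::min(maxWeight, Wprev + 1.0f);
        }
}
```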
Example 5
The Kinect-based indoor scene three-dimensional reconstruction method is the same as that in the embodiment 1-4, and the step 4.4 of obtaining the predicted point cloud data by adopting a ray casting algorithm comprises the following steps:
4.4.1 obtaining the predicted point cloud data of the global data cube with the ray casting algorithm and the current point cloud registration matrix; the basic idea of ray casting is to cast a ray from the centre of projection until it reaches the surface of the nearest object that blocks its further propagation.
4.4.2 the predicted point cloud data and the current frame point cloud data are registered, and the precision of point cloud registration is improved. The ray casting algorithm is used here, and the voxel with the distance value of 0 in the global data cube can be obtained by combining the current point cloud registration matrix, so that predicted point cloud data is obtained and is used for updating the point cloud registration matrix in the next frame.
Many three-dimensional reconstruction algorithms register the current frame point cloud against the previous frame point cloud; this accumulates the error of every frame and ultimately degrades the reconstruction. The invention instead registers against the predicted point cloud data, which are computed from the global data cube, so the accumulated error is greatly reduced and a better reconstruction effect is achieved.
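For illustration, a simplified sketch of this ray casting step follows, building on the TsdfVolume sketch above: for every pixel a ray is marched from the camera centre, and the first change of the stored distance D from positive to negative marks a predicted surface point. The intrinsics are placeholders, and the real implementation additionally interpolates the exact zero crossing and derives the normals from the TSDF gradient on the GPU.

```cpp
#include <Eigen/Dense>
#include <vector>

// March one ray per pixel through the volume and return the predicted point
// cloud (a zero vector where no surface was found within maxRange).
std::vector<Eigen::Vector3f> rayCast(TsdfVolume& vol, int width, int height,
                                     float fx, float fy, float cx, float cy,
                                     const Eigen::Matrix4f& pose, float maxRange = 4.0f) {
    std::vector<Eigen::Vector3f> predicted(width * height, Eigen::Vector3f::Zero());
    Eigen::Vector3f origin = pose.block<3, 1>(0, 3);          // camera centre in world frame
    Eigen::Matrix3f R = pose.block<3, 3>(0, 0);
    float step = vol.voxelSize();
    for (int v = 0; v < height; ++v)
        for (int u = 0; u < width; ++u) {
            // ray direction through pixel (u, v), rotated into the world frame
            Eigen::Vector3f dir = (R * Eigen::Vector3f((u - cx) / fx, (v - cy) / fy, 1.0f)).normalized();
            float prevD = 1.0f;
            for (float t = step; t < maxRange; t += step) {
                Eigen::Vector3f p = origin + t * dir;
                int x = (int)(p.x() / step), y = (int)(p.y() / step), z = (int)(p.z() / step);
                if (x < 0 || y < 0 || z < 0 || x >= vol.N || y >= vol.N || z >= vol.N) {
                    prevD = 1.0f;
                    continue;
                }
                float D = vol.at(x, y, z).D;
                if (prevD > 0.0f && D <= 0.0f) {             // zero crossing: predicted surface point
                    predicted[v * width + u] = p;
                    break;
                }
                prevD = D;
            }
        }
    return predicted;
}
```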
A complete specific example is given below to further illustrate the present invention.
Example 6
The Kinect-based indoor scene three-dimensional reconstruction method is the same as the embodiments 1-5, referring to fig. 1, and the technical scheme adopted by the invention is as follows:
1. image acquisition and pre-processing
The Kinect is connected to a computer through a USB 2.0 interface; the computer has a Core i7 4790 CPU and a GTX 970 graphics card, and the operating system is Windows 7 SP1 64-bit.
First, images are acquired through the Kinect: depth data and color data of the objects in the scene are captured at a resolution of 640 × 480, and the timer is started. The Kinect is a somatosensory peripheral with three lenses: an RGB color camera in the middle, and an infrared emitter and an infrared CMOS camera on the left and right. The infrared light encodes the space with structured light; once the whole space is marked in this way, the position of any object placed in it can be determined from the speckle pattern on its surface.
Due to artificial disturbance, illumination, and the accuracy limits of the Kinect, the acquired depth data contain noise and holes. A joint bilateral filtering method is adopted to remove the noise in the depth image and at the same time repair the missing depth information. The method uses the depth image and the color image together: a Gaussian kernel function is used to compute the spatial-domain weight ws of the depth image and the gray-domain weight wr of the color image, and ws and wr are then combined into the filter weight w, according to the following formulas:
w = ws × wr
ws = exp(−((x − i)² + (y − j)²) / (2σs²))
wr = exp(−(g(i, j) − g(x, y))² / (2σr²))
wherein σs and σr are the standard deviations of the Gaussian functions and determine the performance of the filter, g(i, j) is the gray value at pixel (i, j) after the color image is converted into a gray image, and g(x, y) is the gray value at pixel (x, y) of the depth image. In this example σs = 4 and σr = 4 are selected.
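For illustration, a minimal CPU sketch of this joint bilateral filter follows, assuming the depth map and a gray-scale version of the color image are stored as row-major float arrays and using the gray image as the guidance for the range term; the function name, window radius and default parameters are illustrative assumptions.

```cpp
#include <cmath>
#include <vector>

// Joint bilateral filter: depth is the input depth map (0 = missing sample),
// gray is the gray-scale color image used as guidance; both are width*height,
// row-major. Returns the filtered depth map with small holes filled.
std::vector<float> jointBilateralFilter(const std::vector<float>& depth,
                                        const std::vector<float>& gray,
                                        int width, int height,
                                        int radius = 4,
                                        float sigma_s = 4.0f,
                                        float sigma_r = 4.0f) {
    std::vector<float> out(depth.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f, wsum = 0.0f;
            for (int j = -radius; j <= radius; ++j) {
                for (int i = -radius; i <= radius; ++i) {
                    int xi = x + i, yj = y + j;
                    if (xi < 0 || xi >= width || yj < 0 || yj >= height) continue;
                    float d = depth[yj * width + xi];
                    if (d <= 0.0f) continue;               // skip missing depth samples
                    // spatial weight ws from the pixel distance
                    float ws = std::exp(-(float)(i * i + j * j) / (2.0f * sigma_s * sigma_s));
                    // range weight wr from the gray-value difference of the guidance image
                    float dg = gray[yj * width + xi] - gray[y * width + x];
                    float wr = std::exp(-(dg * dg) / (2.0f * sigma_r * sigma_r));
                    float w = ws * wr;                      // w = ws * wr
                    sum += w * d;
                    wsum += w;
                }
            }
            // a hole in the depth map is filled whenever valid neighbours exist
            out[y * width + x] = (wsum > 0.0f) ? sum / wsum : 0.0f;
        }
    }
    return out;
}
```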
Fig. 3 shows the depth data before and after filtering: the depth data before filtering are shown in Fig. 3(a) and the filtered depth data in Fig. 3(b). Fig. 3(a) contains many noisy pixels and holes, which are greatly reduced in Fig. 3(b). The filtering method adopted by the invention introduces the color image, whose information is complete, while preserving the foreground and background edges, and repairs the regions with missing depth information while denoising.
The depth data are then down-sampled: from the original 640 × 480 resolution, depth data at 320 × 240, 160 × 120 and 80 × 60 are obtained, and a depth-map pyramid is constructed in preparation for the subsequent ICP point cloud registration.
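A simple sketch of one pyramid level follows, halving the resolution by averaging each 2 × 2 block while ignoring missing (zero) depth samples; this block-averaging scheme is an illustrative simplification. Applying the function repeatedly yields the 320 × 240, 160 × 120 and 80 × 60 levels.

```cpp
#include <vector>

// Build one level of the depth pyramid by averaging each 2x2 block,
// ignoring missing (zero) samples, e.g. 640x480 -> 320x240.
std::vector<float> downsampleDepth(const std::vector<float>& depth,
                                   int width, int height) {
    int w2 = width / 2, h2 = height / 2;
    std::vector<float> out(w2 * h2, 0.0f);
    for (int y = 0; y < h2; ++y) {
        for (int x = 0; x < w2; ++x) {
            float sum = 0.0f;
            int n = 0;
            for (int j = 0; j < 2; ++j)
                for (int i = 0; i < 2; ++i) {
                    float d = depth[(2 * y + j) * width + (2 * x + i)];
                    if (d > 0.0f) { sum += d; ++n; }
                }
            out[y * w2 + x] = (n > 0) ? sum / n : 0.0f;   // hole stays a hole
        }
    }
    return out;
}
```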
2. Point cloud acquisition and normal vector computation
A camera transformation matrix is obtained from the parameters of the Kinect camera, and the depth data are converted from the image coordinate system to the camera coordinate system to obtain the point cloud data under the current viewing angle; a point cloud image of the desktop scene is shown in FIG. 4. The normal vector of each point is then estimated by eigenvalue estimation. This can be implemented with the depth-image-to-point-cloud conversion function provided in the Point Cloud Library (PCL).
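For illustration, the following C++ sketch back-projects a depth map into a camera-space point cloud and estimates each normal as the eigenvector of the local covariance matrix with the smallest eigenvalue, using the Eigen library; the intrinsics fx, fy, cx, cy and the neighbourhood radius are placeholders, and the real implementation uses PCL and the GPU.

```cpp
#include <Eigen/Dense>
#include <Eigen/Eigenvalues>
#include <vector>

struct CloudWithNormals {
    std::vector<Eigen::Vector3f> points;
    std::vector<Eigen::Vector3f> normals;
};

// Back-project a depth map into camera coordinates and estimate per-point
// normals by eigenvalue analysis of the local neighbourhood covariance.
CloudWithNormals depthToCloud(const std::vector<float>& depth, int width, int height,
                              float fx, float fy, float cx, float cy, int radius = 2) {
    CloudWithNormals cloud;
    cloud.points.assign(width * height, Eigen::Vector3f::Zero());
    cloud.normals.assign(width * height, Eigen::Vector3f::Zero());
    // image coordinates (u, v) plus depth -> camera coordinates (x, y, z)
    for (int v = 0; v < height; ++v)
        for (int u = 0; u < width; ++u) {
            float z = depth[v * width + u];
            if (z > 0.0f)
                cloud.points[v * width + u] =
                    Eigen::Vector3f((u - cx) * z / fx, (v - cy) * z / fy, z);
        }
    // normal = eigenvector of the neighbourhood covariance with the smallest eigenvalue
    for (int v = 0; v < height; ++v)
        for (int u = 0; u < width; ++u) {
            Eigen::Vector3f mean = Eigen::Vector3f::Zero();
            std::vector<Eigen::Vector3f> nbrs;
            for (int j = -radius; j <= radius; ++j)
                for (int i = -radius; i <= radius; ++i) {
                    int uu = u + i, vv = v + j;
                    if (uu < 0 || uu >= width || vv < 0 || vv >= height) continue;
                    const Eigen::Vector3f& p = cloud.points[vv * width + uu];
                    if (p.z() > 0.0f) { nbrs.push_back(p); mean += p; }
                }
            if (nbrs.size() < 3) continue;
            mean /= (float)nbrs.size();
            Eigen::Matrix3f cov = Eigen::Matrix3f::Zero();
            for (const auto& p : nbrs) cov += (p - mean) * (p - mean).transpose();
            Eigen::SelfAdjointEigenSolver<Eigen::Matrix3f> es(cov);
            Eigen::Vector3f n = es.eigenvectors().col(0);        // smallest eigenvalue first
            if (n.dot(-cloud.points[v * width + u]) < 0) n = -n; // orient towards the camera
            cloud.normals[v * width + u] = n;
        }
    return cloud;
}
```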
3. Establishing the TSDF global data cube and calculating the predicted point cloud
The point cloud data are converted into voxels of a global data cube (Volume) according to the Truncated Signed Distance Function (TSDF). In this example the cube side length is set to 3 m and the cube is divided into 512 × 512 × 512 voxels; the distance values of all voxels are initialised to 1 and the weights to 0. The lower-left corner of the cube is taken as the coordinate origin, the direction perpendicular to the front face of the cube and pointing into the cube is the positive Z axis, the horizontal direction to the right is the positive X axis, and the vertical direction is the positive Y axis. The initial position of the Kinect is set to the coordinates [1.5, 1.5, -0.3], from which the camera can cover most of the scene.
And then, calculating by combining a ray casting algorithm and a global data cube to obtain predicted point cloud data.
This step is performed only once, at the first frame; it is not repeated from the second frame onward.
4. Point cloud fusion based on ICP algorithm
The point cloud fusion process is shown in fig. 2.
(1) Return to step 1 and step 2, acquire a new frame of point cloud data and calculate its normal vectors, update the point cloud registration matrix with the ICP (Iterative Closest Point) algorithm by combining the current frame point cloud data with the predicted point cloud data obtained in the previous frame, and then update the global data cube through the updated point cloud registration matrix.
(2) When the global data cube is updated, all voxels in the global cube are traversed, the current point cloud registration matrix is used for inversely transforming the voxels into an image coordinate system, then the image coordinate system is compared with the current depth data, and if the difference value is within a certain threshold value, the distance and the weight of the voxel are updated, wherein the threshold value is set to be 10mm in the example. The size of the threshold is related to the reconstruction accuracy and can be selected by the person skilled in the art according to the needs.
(3) According to the ray casting algorithm, compute a new predicted point cloud from the updated global data cube and the current point cloud registration matrix; it will be used to update the point cloud registration matrix in the next frame.
The point cloud fusion step is executed in a loop until the timer reaches the set time, which is 3 minutes in this example; point cloud acquisition then stops and the registered point cloud data are obtained.
5. Rendering of point cloud data
The registered point cloud data are rendered with the Marching Cubes algorithm to obtain a mesh representation of the three-dimensional object: for each voxel processed, the iso-surface is extracted by linear interpolation and then triangulated into a mesh. The resulting three-dimensional model is shown in FIG. 5; the three-dimensional reconstruction of the desktop scene is clear, objects such as the display, keyboard and mouse can be recognised distinctly, and the spatial relationships among the objects are fairly accurate. Finally, the reconstructed model is stored as model data in the ".ply" format, and can later be viewed and edited at any time with the software MeshLab.
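For illustration, a minimal ASCII ".ply" writer is sketched below, so that the mesh produced by Marching Cubes can be opened in MeshLab as described; the vertex and face containers are assumed to come from the rendering step, and the function name is illustrative.

```cpp
#include <array>
#include <fstream>
#include <string>
#include <vector>
#include <Eigen/Dense>

// Save a triangle mesh (vertices plus vertex-index triples) as an ASCII PLY file.
bool savePly(const std::string& path,
             const std::vector<Eigen::Vector3f>& vertices,
             const std::vector<std::array<int, 3>>& faces) {
    std::ofstream out(path);
    if (!out) return false;
    out << "ply\nformat ascii 1.0\n";
    out << "element vertex " << vertices.size() << "\n";
    out << "property float x\nproperty float y\nproperty float z\n";
    out << "element face " << faces.size() << "\n";
    out << "property list uchar int vertex_indices\n";
    out << "end_header\n";
    for (const auto& v : vertices)
        out << v.x() << " " << v.y() << " " << v.z() << "\n";
    for (const auto& f : faces)
        out << "3 " << f[0] << " " << f[1] << " " << f[2] << "\n";
    return true;
}
```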
In short, the invention discloses a Kinect-based indoor scene three-dimensional reconstruction method comprising the following steps: 1. acquiring depth data of an object with a Kinect, then denoising and down-sampling the depth data; 2. obtaining the point cloud data of the object from the Kinect parameters and the depth data, and calculating the normal vectors; 3. establishing a global data cube with the TSDF algorithm and calculating the predicted point cloud data with a ray casting algorithm; 4. calculating the point cloud registration matrix with an ICP (Iterative Closest Point) algorithm and the predicted point cloud data, fusing the point cloud data acquired in each frame into the global data cube to realise fusion registration, and fusing frame by frame until a good fusion result is obtained; 5. rendering the point cloud data with an iso-surface extraction algorithm to construct the three-dimensional model of the object. The advantages of the invention are: during registration, multi-resolution depth data improve the registration speed; a predicted point cloud obtained by ray casting is registered against the current frame point cloud, improving the registration precision; and point cloud fusion with the TSDF algorithm is fast and produces few redundant points. The method meets the requirement of real-time reconstruction of indoor scenes and can be used in the fields of robot navigation, industrial measurement, virtual interaction and the like.

Claims (2)

1. A Kinect-based indoor scene three-dimensional reconstruction method is characterized by comprising the following steps:
step 1, denoising and downsampling depth data:
setting a timer t and starting it, acquiring one frame of depth data of an object in the indoor scene with the Kinect, denoising the depth data with a joint bilateral filtering method that combines the color image and the depth image and fills in the missing parts of the depth image, and obtaining depth data at several resolutions by down-sampling;
step 2, acquiring point cloud data of the current frame, and calculating normal vectors of each point in the frame:
obtaining a transformation matrix from the image coordinate system to the camera coordinate system from the Kinect camera parameters, calculating the current frame point cloud data of the object in the indoor scene from the transformation matrix and the multi-resolution depth data, and calculating the normal vector of every point of the current frame point cloud data by eigenvalue estimation;
step 3, acquiring a global data cube of the current frame point cloud data and calculating and predicting the point cloud data:
converting the current frame point cloud data into voxels of a global data cube with the truncated signed distance function, and computing the predicted point cloud data of the global data cube and the normal vector of every point of the predicted point cloud with the ray casting algorithm combined with the initial point cloud registration matrix, where the initial point cloud registration matrix is set to the identity matrix;
step 4, fusion registration of two frames of point cloud data:
4.1 moving the Kinect, returning to the step 1-step 2, obtaining point cloud data of the object in the indoor scene of a frame again and calculating normal vectors of all points in the frame;
4.2, using an ICP algorithm to calculate by combining point cloud data and normal vector acquired by the current frame, predicted point cloud data and predicted point cloud normal vector acquired by the previous frame, so as to obtain a point cloud registration matrix, and updating the current point cloud registration matrix by using the point cloud registration matrix, so as to calculate a point cloud registration transformation matrix, specifically comprising:
4.2.1, calculating a point cloud registration transformation matrix with the ICP (Iterative Closest Point) algorithm from the lowest-resolution depth data and the predicted point cloud data;
4.2.2 then on the basis of the point cloud registration transformation matrix, by utilizing the depth data and the predicted point cloud data with higher resolution, calculating step by step to obtain a more accurate point cloud registration transformation matrix, and updating the current point cloud registration matrix;
4.3, performing point cloud fusion by adopting a TSDF algorithm, updating the voxel of the global cube through the current point cloud registration matrix, and fusing the current frame point cloud data into the global cube;
4.4, computing the predicted point cloud data of the global data cube and the normal vector of every point of the predicted point cloud with the ray casting algorithm combined with the current point cloud registration matrix;
step 5, fusion registration of multi-frame point cloud data:
returning to the step 4, repeatedly executing the step 4, acquiring data frame by frame, fusing each newly acquired frame of point cloud data into the global cube until the timer t reaches the set time, wherein the set time is 1-3 minutes, and stopping acquiring the point cloud data to obtain the well-registered point cloud data;
and 6, rendering the registered point cloud data:
rendering the registered point cloud data through an iso-surface extraction algorithm (Marching Cubes), constructing a three-dimensional model of an object in the indoor scene, and finishing the three-dimensional reconstruction of the indoor scene.
2. The Kinect-based indoor scene three-dimensional reconstruction method according to claim 1, wherein point cloud fusion is performed in step 4.3 by using a TSDF algorithm, and comprises the steps of:
4.3.1TSDF algorithm represents 3-dimensional space with a cube grid, each grid in the cube stores the distance D and weight W of the grid to the surface of the object model;
4.3.2 the occluded and visible sides of the surface are represented by negative and positive values respectively, and the zero-crossing points are points on the surface; a positive value represents the visible side of an indoor scene object and a negative value represents the occluded side of the indoor scene object;
4.3.3 fusing the global point cloud data and the current frame point cloud data through the weight W, wherein the point cloud data are obtained by down sampling.
CN201710014728.3A 2017-01-10 2017-01-10 Kinect-based indoor scene three-dimensional reconstruction method Active CN106803267B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710014728.3A CN106803267B (en) 2017-01-10 2017-01-10 Kinect-based indoor scene three-dimensional reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710014728.3A CN106803267B (en) 2017-01-10 2017-01-10 Kinect-based indoor scene three-dimensional reconstruction method

Publications (2)

Publication Number Publication Date
CN106803267A CN106803267A (en) 2017-06-06
CN106803267B true CN106803267B (en) 2020-04-14

Family

ID=58984269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710014728.3A Active CN106803267B (en) 2017-01-10 2017-01-10 Kinect-based indoor scene three-dimensional reconstruction method

Country Status (1)

Country Link
CN (1) CN106803267B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103106688A (en) * 2013-02-20 2013-05-15 北京工业大学 Indoor three-dimensional scene rebuilding method based on double-layer rectification method
CN103279987A (en) * 2013-06-18 2013-09-04 厦门理工学院 Object fast three-dimensional modeling method based on Kinect
CN104700451A (en) * 2015-03-14 2015-06-10 西安电子科技大学 Point cloud registering method based on iterative closest point algorithm
CN105205858A (en) * 2015-09-18 2015-12-30 天津理工大学 Indoor scene three-dimensional reconstruction method based on single depth vision sensor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yang Hongzhuang et al., "Fully automatic depth-camera three-dimensional scanning system" (《全自动深度相机三维扫描***》), Journal of Computer-Aided Design & Computer Graphics, vol. 27, no. 11, Nov. 30, 2015, section 3.2, p. 2041. *
Wei Yumian, "Research on three-dimensional reconstruction based on Kinect depth images" (《基于kinect深度图像的三维重建研究》), China Master's Theses Full-text Database, Information Science and Technology, Jan. 15, 2015, sections 2.2.1-2.2.7, pp. 14-17. *
Wei Yumian, "Research on three-dimensional reconstruction based on Kinect depth images", China Master's Theses Full-text Database, Information Science and Technology, 2015, pp. I138-1237. *

Also Published As

Publication number Publication date
CN106803267A (en) 2017-06-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant