CN109903372B - Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system

Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system

Info

Publication number
CN109903372B
CN109903372B (application number CN201910079993.9A)
Authority
CN
China
Prior art keywords
depth, map, image, depth image, resolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910079993.9A
Other languages
Chinese (zh)
Other versions
CN109903372A (en)
Inventor
李建伟
高伟
吴毅红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation, Chinese Academy of Sciences
Original Assignee
Institute of Automation, Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation, Chinese Academy of Sciences
Priority to CN201910079993.9A priority Critical patent/CN109903372B/en
Publication of CN109903372A publication Critical patent/CN109903372A/en
Application granted granted Critical
Publication of CN109903372B publication Critical patent/CN109903372B/en

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a depth map super-resolution completion method and a high-quality three-dimensional reconstruction method and system, wherein the method comprises the following steps: learning from an original LR depth image to be completed through SRC-Net to obtain an HR depth image; eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image; learning from the HR color image through SRC-Net to determine a normal map and a boundary map; performing ambiguity measurement on the HR color image to obtain ambiguity information; and optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth image. The method is based on a depth super-resolution and completion network, a gradient-sensitivity outlier detection and elimination algorithm, and an ambiguity- and boundary-constrained depth image adaptive optimization algorithm; it can perform super-resolution and completion operations on the original LR depth image to obtain a completed HR depth image, which reduces the difficulty of indoor scene three-dimensional reconstruction and improves the reconstruction accuracy.

Description

Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system
Technical Field
The invention relates to an image super-resolution and completion technology in the field of image processing and a three-dimensional reconstruction technology in the field of computer vision, in particular to a depth map super-resolution completion method and a high-quality three-dimensional reconstruction method and system.
Background
High-precision three-dimensional reconstruction of indoor scenes is one of the challenging research subjects in computer vision, involving theories and technologies from multiple fields including computer vision, computer graphics, pattern recognition, and optimization.
The three-dimensional reconstruction technology aims at obtaining depth information of a scene or an object, and can be divided into two categories, passive measurement and active measurement, according to how the depth information is obtained. Passive measurement generally uses the reflection of the surrounding environment, such as natural light, acquires images with a camera, and obtains the three-dimensional spatial information of an object through a specific algorithm, i.e., vision-based three-dimensional reconstruction. Active measurement refers to emitting a light source or energy source such as laser, sound waves or electromagnetic waves toward a target object and obtaining the depth information of the object by receiving the returned signal. Active measurement methods include Time of Flight (TOF), structured light, and triangulation. In recent years, the appearance of consumer-grade RGB-D cameras has greatly promoted indoor scene three-dimensional reconstruction technology. The RGB-D camera is a novel visual sensor combining active and passive measurement: it can capture a two-dimensional color image and actively transmit a signal to a target object to obtain the object's depth information. Common consumer-grade RGB-D cameras include TOF cameras based on the time-of-flight method, and the Microsoft Kinect, ASUS Xtion, Intel RealSense, etc. based on the structured-light method. The KinectFusion algorithm proposed by Newcombe et al. uses the Kinect to obtain the depth information of each point in an image, estimates the camera pose through the Iterative Closest Point (ICP) algorithm, and performs volume data fusion through truncated signed distance function (TSDF) iteration to obtain a dense three-dimensional model.
Three-dimensional reconstruction of indoor scenes based on consumer-grade RGB-D cameras generally faces the following problems: (1) the depth images acquired by the RGB-D camera have low resolution and large noise, which causes camera pose estimation errors and makes details of object surfaces difficult to preserve; (2) transparent or highly reflective objects in the indoor scene leave holes and missing regions in the depth image acquired by the RGB-D camera; (3) the depth range of the RGB-D camera is limited, whereas the corresponding color image can provide high-resolution, complete scene information. These problems keep three-dimensional reconstruction applications based on consumer-grade RGB-D cameras relatively limited.
Disclosure of Invention
In order to solve the problems in the prior art, namely, to improve the accuracy of indoor scene reconstruction and reduce the reconstruction difficulty, the invention provides a depth map super-resolution completion method and a high-quality three-dimensional reconstruction method and system.
In order to solve the technical problems, the invention provides the following scheme:
a depth map super-resolution completion method for three-dimensional reconstruction, the method comprising:
learning from an original low-resolution (LR) depth image to be completed through the depth super-resolution and completion network SRC-Net to obtain a high-resolution (HR) depth image;
eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image;
learning from the HR color image through SRC-Net to determine a normal map and a boundary map;
performing ambiguity measurement on the HR color image to obtain ambiguity information;
and optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth image.
Optionally, eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image specifically includes:
calculating a gradient map $G_i$ by using the Sobel operator:
$$g_i(u) = \mathrm{Sobel}(u);$$
wherein $g_i(u)$ is the gradient value corresponding to pixel $u$;
calculating a mask image $M_i$ based on the gradient sensitivity:
$$m_i(u) = \begin{cases} 0, & g_i(u) \ge g_h \\ 1, & g_i(u) < g_h \end{cases}$$
wherein $m_i(u)$ is the mask value corresponding to pixel $u$ and $g_h$ is a set gradient threshold;
performing an erosion operation on the high-resolution depth map $D_i$ using the mask image $M_i$ and removing the outliers to obtain the processed HR depth image.
Optionally, performing ambiguity measurement on the HR color image to obtain ambiguity information specifically includes:
filtering the HR color image in the horizontal and vertical directions respectively through an average filter to obtain a Re-blur image;
calculating the differences in the horizontal and vertical directions between the HR color image and the Re-blur image to obtain a horizontal difference and a vertical difference;
determining a difference map of the HR color image and the Re-blur image according to the horizontal difference and the vertical difference;
summing and normalizing the difference map to obtain a processed map;
calculating the blurriness measure $\mathrm{Blur}$ of the processed map:
$$\mathrm{Blur} = \max(R_H, R_V)$$
wherein $R_H$ is the normalized horizontal-direction difference value and $R_V$ is the normalized vertical-direction difference value.
Optionally, optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth map specifically includes:
constructing an objective function according to the normal graph, the boundary graph and the ambiguity information;
and optimizing the HR depth image according to the objective function to obtain a completed HR depth image.
Optionally, optimizing the HR depth image according to the objective function to obtain a completed HR depth image specifically includes:
determining an optimization function according to the objective function; the objective function comprises a first optimization term, a second optimization term and a third optimization term, and the optimization function is the weighted sum of the three terms;
$$E = \lambda_D E_D + \lambda_S E_S + \lambda_N E_N B_n B_b;$$
$$E_D = \sum_p \left\| D(p) - D_o(p) \right\|^2;$$
$$E_S = \sum_p \sum_{q \in N(p)} \left\| D(p) - D(q) \right\|^2;$$
$$E_N = \sum_p \sum_{q \in N(p)} \left\| \left\langle v(p,q),\, N(p) \right\rangle \right\|^2;$$
wherein the first optimization term $E_D$ represents the distance between the estimated depth $D(p)$ and the observed depth $D_o(p)$ at pixel $p$; the third optimization term $E_N$ represents the consistency of the estimated depth with the predicted surface normal $N(p)$; the second optimization term $E_S$ encourages neighboring pixels to take the same depth, where $v(p,q)$ denotes the tangent vector between pixel $p$ and its neighboring pixel $q$; $B_n \in [0,1]$ weights the normal term according to the predicted probability that a pixel lies on an occlusion boundary $B(p)$; $B_b \in [0,1]$ weights the normal term according to the blurriness of the color image; and $\lambda_D$, $\lambda_S$ and $\lambda_N$ are preset weighting coefficients;
and optimizing according to the optimization function to obtain a completed HR depth map.
Optionally, $\lambda_D$ is set to 1000, $\lambda_S$ to 1 and $\lambda_N$ to 0.001.
In order to solve the technical problems, the invention provides the following scheme:
a depth map super resolution completion system for three-dimensional reconstruction, the system comprising:
the super-resolution processing unit is used for learning from an original LR depth image to be completed through SRC-Net to obtain an HR depth image;
the outlier removing unit is used for removing outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image;
the information extraction unit is used for learning from the HR color image through SRC-Net and determining a normal map and a boundary map;
the ambiguity measurement unit is used for performing ambiguity measurement on the HR color image to obtain ambiguity information;
and the optimization unit is used for optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth map.
In order to solve the technical problems, the invention provides the following scheme:
an indoor scene three-dimensional reconstruction method comprises the following steps:
calculating a three-dimensional point and a normal vector of each pixel in the completed HR depth map under the corresponding camera coordinate system;
estimating the pose of the current frame camera by the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
performing volume data fusion through Truncated Signed Distance Function (TSDF) model iteration according to the camera trajectory information to obtain fusion data;
and performing surface estimation according to the fusion data and the posture of the current frame camera to obtain an indoor scene three-dimensional model.
Optionally, the three-dimensional point $v_i(u)$ and the normal vector $n_i(u)$ under the corresponding camera coordinate system are calculated according to the following formulas:
$$v_i(u) = z_i(u)\, K^{-1} [u, 1]^T;$$
$$n_i(u) = \left( v_i(u+1, v) - v_i(u, v) \right) \times \left( v_i(u, v+1) - v_i(u, v) \right);$$
wherein $K$ is the camera intrinsic matrix obtained by calibration and $z_i(u)$ is the depth value corresponding to pixel $u$.
In order to solve the technical problems, the invention provides the following scheme:
an indoor scene three-dimensional reconstruction system, comprising:
the preprocessing unit is used for calculating three-dimensional points and normal vectors in the corresponding camera coordinate system for each pixel in the completed HR depth map;
the estimation unit is used for estimating the pose of the current frame camera by the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
the fusion unit is used for performing volume data fusion through Truncated Signed Distance Function (TSDF) model iteration according to the camera trajectory information to obtain fusion data;
and the modeling unit is used for carrying out surface estimation according to the fusion data and the posture of the current frame camera to obtain an indoor scene three-dimensional model.
According to the embodiment of the invention, the invention discloses the following technical effects:
the method is based on the depth super-resolution and completion network, the gradient sensitivity outer point detection and elimination algorithm and the ambiguity and boundary constraint depth image adaptive optimization algorithm, and can complete the original low-resolution LR depth image acquired by the RGB-D camera, so that a completed HR depth map can be obtained, the difficulty of indoor scene three-dimensional reconstruction is reduced, and the reconstruction accuracy is improved.
Drawings
FIG. 1 is a flow chart of a depth map super resolution completion method for three-dimensional reconstruction according to the present invention;
FIG. 2 is a schematic block diagram of a depth map super-resolution completion system for three-dimensional reconstruction according to the present invention;
FIG. 3 is a flow chart of a method for three-dimensional reconstruction of an indoor scene according to the present invention;
fig. 4 is a schematic block structure diagram of an indoor scene three-dimensional reconstruction system according to the present invention.
Description of the symbols:
the system comprises a super-resolution processing unit-1, an outlier rejection unit-2, an information extraction unit-3, a fuzzy measurement unit-4, an optimization unit-5, a preprocessing unit-6, an estimation unit-7, a fusion unit-8 and a modeling unit-9.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
The invention provides a depth map super-resolution completion method for three-dimensional reconstruction, which is based on a depth super-resolution and completion network, a gradient-sensitivity outlier detection and elimination algorithm, and an ambiguity- and boundary-constrained depth image adaptive optimization algorithm. It can complete the original low-resolution LR depth image acquired by an RGB-D camera to obtain a completed HR depth map, which helps reduce the difficulty of three-dimensional reconstruction of indoor scenes and improve the reconstruction accuracy.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, the depth map super-resolution completion method for three-dimensional reconstruction of the present invention includes:
Step 100: learning from an original LR depth image to be completed through SRC-Net to obtain an HR depth image;
Step 200: eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image;
Step 300: learning from the HR color image through SRC-Net to determine a normal map and a boundary map;
Step 400: performing ambiguity measurement on the HR color image to obtain ambiguity information;
Step 500: optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth image.
In step 100, the Depth super-resolution and Completion network (SRC-Net) functionally includes the following two parts:
(1) Depth super-resolution is realized by a weight-sharing Laplacian pyramid network, which takes the low-resolution depth image as input and predicts the high-frequency residual information of the super-resolution depth image at each pyramid level. Each pyramid level has a side output into which supervision information is introduced so that the network trains better. To enlarge the receptive field of the high-frequency feature map by increasing the network depth, a weight-sharing recursive network is adopted at each pyramid level, together with a local cross-channel connection strategy. Since the parameters of every convolution within a block are shared, the receptive field of the recursive network is calculated as:
$$RF = (K_{size} - 1)(N - 1) + K_{size} \quad (1);$$
wherein $K_{size}$ is the size of the convolution kernel, $N$ is the number of convolutions, and $RF$ represents the resulting receptive field.
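As a quick sanity check of formula (1), under illustrative values (not taken from the text) of a $3 \times 3$ kernel ($K_{size} = 3$) and $N = 3$ shared convolutions:
$$RF = (3 - 1)(3 - 1) + 3 = 7,$$
i.e. three stacked $3 \times 3$ convolutions see a $7 \times 7$ neighborhood, so deepening the recursion enlarges the receptive field without adding parameters, since the weights are shared.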
The cascade structure comprises two branches: a feature extraction branch and an image reconstruction branch. At each level $s$, the input image passes through an upsampling layer with a scale factor of 2; the upsampled image is then added to the residual image predicted by the feature extraction branch of the current level, and the resulting HR (high-resolution) image is fed into the next level ($s+1$).
(2) Depth completion mainly uses the HR color image to guide the optimization and completion of the HR depth image. The invention predicts the surface normals and occlusion boundaries from the HR color image separately, based on two VGG-16 networks with symmetric encoders and decoders, and then uses the learned normal and boundary information for depth optimization. The normal information is used to estimate the corresponding depth values; the occlusion boundaries provide information about depth discontinuities and help preserve boundary sharpness.
The training process of the depth super-resolution and completion network (SRC-Net) adopted by the invention is as follows:
(1) Depth super-resolution: the objective of the Laplacian pyramid network training is to learn a mapping function that generates an HR image from an LR (low-resolution) image to approximate the high-resolution ground-truth image. We select 795 depth images from the NYU-v2 dataset of real scenes and 372 depth images from three sequences (kt0, kt1 and kt3) of the virtual room of the ICL-NUIM dataset of synthetic scenes as training data. To obtain the LR depth images, the HR images are scaled down by down-sampling, with down-sampling factors set to 2, 4 and 8. For data augmentation, we apply random scaling (scaling factor range [0.5, 1]), random rotation (90, 180, 270 degrees), and horizontal and vertical flipping to the training data.
(2) Depth completion: the goal of training the two VGG-16 networks is to learn mapping functions that generate surface normals and occlusion boundaries from the HR color map, respectively, to approximate the high-resolution ground-truth normals and boundaries. We performed network training using 54755 RGB-D images of the SUNCG-RGBD dataset of synthetic scenes and 59743 rendered and completed RGB-D images from the ScanNet dataset of real scenes.
Further, for different learning tasks, the invention employs the following two loss functions:
(1) The loss function for learning the HR depth image uses the Charbonnier penalty function; since the network is cascaded, the error loss is computed on the output of each level. The loss function is defined as follows:
$$\mathcal{L} = \frac{1}{M} \sum_{m=1}^{M} \sum_{l=1}^{L} \sqrt{\left( D_m^{(l)} - D_m^{*(l)} \right)^2 + \varepsilon^2} \quad (2);$$
wherein $D$ represents the predicted depth, $D^*$ represents the ground-truth depth, $M$ is the number of samples per training batch, $L$ is the number of pyramid levels, and $\varepsilon = 10^{-3}$.
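For illustration, a minimal NumPy sketch of the Charbonnier loss of formula (2); the function names and the pyramid-list interface are hypothetical, and a real training pipeline would compute this inside an autograd framework:

```python
import numpy as np

def charbonnier(x, eps=1e-3):
    # rho(x) = sqrt(x^2 + eps^2): a smooth, differentiable L1 surrogate.
    return np.sqrt(x * x + eps * eps)

def srcnet_depth_loss(pred_pyramid, gt_pyramid):
    """Charbonnier loss averaged over M samples and summed over L levels.

    pred_pyramid, gt_pyramid: lists of (M, H_l, W_l) arrays, one per pyramid
    level, holding predicted and ground-truth HR depth maps respectively.
    """
    loss = 0.0
    for pred, gt in zip(pred_pyramid, gt_pyramid):
        # Sum the per-pixel penalty for this level, averaged over the batch.
        loss += charbonnier(pred - gt).sum(axis=(1, 2)).mean()
    return loss
```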
(2) The loss functions for learning normals and occlusion boundaries from HR color images are defined as follows:
$$\mathcal{L}_B = -\frac{1}{n} \sum_p \left[ B^*(p) \log B(p) + \left(1 - B^*(p)\right) \log\left(1 - B(p)\right) \right] \quad (3);$$
$$\mathcal{L}_N = \frac{1}{n} \sum_p \left( 1 - N(p) \cdot N^*(p) \right) \quad (4);$$
wherein $B$ denotes the predicted boundary and $B^*$ the ground-truth boundary; $N$ denotes the predicted normal and $N^*$ the ground-truth normal; $n$ is the number of valid pixels.
In order to effectively detect and remove erroneous outliers in the HR depth image, the HR depth image is normalized, its gradient is calculated, outliers are identified by detecting abrupt gradient changes, and the outliers are then removed.
In step 200, eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain the processed HR depth image specifically includes:
Step 201: calculating the gradient map $G_i$ using the Sobel operator:
$$g_i(u) = \mathrm{Sobel}(u) \quad (5);$$
wherein $g_i(u)$ is the gradient value corresponding to pixel $u$.
Step 202: calculating the mask image $M_i$ according to the gradient sensitivity:
$$m_i(u) = \begin{cases} 0, & g_i(u) \ge g_h \\ 1, & g_i(u) < g_h \end{cases} \quad (6);$$
wherein $m_i(u)$ is the mask value corresponding to pixel $u$ and $g_h$ is a set gradient threshold.
Step 203: performing an erosion operation on the high-resolution depth map $D_i$ using the mask image $M_i$ and removing the outliers to obtain the processed HR depth image.
Since motion blur in the color image is inevitable when scanning an indoor scene with a consumer-grade RGB-D camera, directly using the normal and boundary information obtained from a blurred color image to optimize the depth would degrade the optimization. To guarantee the quality of the depth image, a no-reference ambiguity measurement is performed on the color image, and the subsequent depth completion optimization is constrained by the ambiguity information.
Specifically, in step 400, performing ambiguity measurement on the HR color image to obtain ambiguity information includes:
Step 401: filtering the HR color image in the horizontal and vertical directions through an average filter to obtain a Re-blur image.
Step 402: calculating the differences in the horizontal and vertical directions between the HR color image and the Re-blur image to obtain a horizontal difference and a vertical difference.
Step 403: determining a difference map of the HR color image and the Re-blur image according to the horizontal difference and the vertical difference.
Step 404: summing and normalizing the difference map to obtain a processed map.
Step 405: calculating the blurriness measure $\mathrm{Blur}$ of the processed map:
$$\mathrm{Blur} = \max(R_H, R_V) \quad (7);$$
wherein $R_H$ is the normalized horizontal-direction difference value and $R_V$ is the normalized vertical-direction difference value.
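A sketch of steps 401-405 in NumPy/OpenCV, following the re-blur principle described above; the mean-filter size, the grayscale conversion, and the clamping of the differences are assumptions not fixed by the text:

```python
import cv2
import numpy as np

def blur_measure(color_hr):
    """No-reference blurriness of an HR color image (BGR uint8).

    Idea: re-blurring an already blurry image changes its gradients little,
    so the normalized gradient loss is high when the input is blurry.
    """
    gray = cv2.cvtColor(color_hr, cv2.COLOR_BGR2GRAY).astype(np.float32)

    # Step 401: re-blur with horizontal and vertical average filters.
    reblur_h = cv2.blur(gray, (9, 1))   # 9x1 kernel size is an assumption
    reblur_v = cv2.blur(gray, (1, 9))

    # Steps 402-403: absolute-difference (gradient) maps of the original
    # and the re-blurred images, and their clamped difference map.
    g_h = np.abs(np.diff(gray, axis=1))
    g_v = np.abs(np.diff(gray, axis=0))
    d_h = np.maximum(0.0, g_h - np.abs(np.diff(reblur_h, axis=1)))
    d_v = np.maximum(0.0, g_v - np.abs(np.diff(reblur_v, axis=0)))

    # Step 404: sum and normalize each direction.
    r_h = (g_h.sum() - d_h.sum()) / max(g_h.sum(), 1e-8)
    r_v = (g_v.sum() - d_v.sum()) / max(g_v.sum(), 1e-8)

    # Step 405: Blur = max(R_H, R_V); larger means blurrier.
    return float(max(r_h, r_v))
```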
Further, in step 500, the HR depth image is optimized according to the normal map, the boundary map and the ambiguity information to obtain the completed HR depth map, which specifically includes:
Step 501: constructing an objective function according to the normal map, the boundary map and the ambiguity information. In this embodiment, three optimization terms are obtained in total, namely the first optimization term $E_D$, the second optimization term $E_S$ and the third optimization term $E_N$.
Step 502: optimizing the HR depth image according to the objective function to obtain the completed HR depth image.
Preferably, optimizing the HR depth image according to the objective function to obtain the completed HR depth map specifically includes:
Step 5021: determining an optimization function $E$ from the objective function; the objective function comprises a first optimization term, a second optimization term and a third optimization term, and the optimization function is the weighted sum of the three terms.
Specifically, as in the following formula (8):
$$E = \lambda_D E_D + \lambda_S E_S + \lambda_N E_N B_n B_b \quad (8);$$
$$E_D = \sum_p \left\| D(p) - D_o(p) \right\|^2;$$
$$E_S = \sum_p \sum_{q \in N(p)} \left\| D(p) - D(q) \right\|^2;$$
$$E_N = \sum_p \sum_{q \in N(p)} \left\| \left\langle v(p,q),\, N(p) \right\rangle \right\|^2;$$
wherein the first optimization term $E_D$ represents the distance between the estimated depth $D(p)$ and the observed depth $D_o(p)$ at pixel $p$; the third optimization term $E_N$ represents the consistency of the estimated depth with the predicted surface normal $N(p)$; the second optimization term $E_S$ encourages neighboring pixels to take the same depth, where $v(p,q)$ denotes the tangent vector between pixel $p$ and its neighboring pixel $q$; $B_n \in [0,1]$ weights the normal term according to the predicted probability that a pixel lies on an occlusion boundary $B(p)$; $B_b \in [0,1]$ weights the normal term according to the blurriness of the color image; and $\lambda_D$, $\lambda_S$ and $\lambda_N$ are preset weighting coefficients.
Step 5022: optimizing according to the optimization function to obtain the completed HR depth map.
In this embodiment, $\lambda_D$ is set to 1000, $\lambda_S$ to 1 and $\lambda_N$ to 0.001.
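Because $E_D$ and $E_S$ are quadratic in the unknown depths, and $E_N$ becomes linear once the tangent constraint is written in point-ray form ($N(p) \cdot (D(q) r(q) - D(p) r(p)) = 0$, with $r(u) = K^{-1}[u,1]^T$), step 5022 can be posed as one sparse linear least-squares problem. The sketch below illustrates that formulation only; the neighborhood choice, the linearization and the solver are assumptions, not the patent's prescribed implementation:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr

def complete_depth(d_obs, normals, b_n, b_b, K,
                   lam_d=1000.0, lam_s=1.0, lam_n=0.001):
    """Solve E = lam_d*E_D + lam_s*E_S + lam_n*E_N*B_n*B_b for the depths.

    d_obs: HxW observed depth (0 where missing); normals: HxWx3 predicted
    normals; b_n: HxW boundary weights; b_b: scalar blur weight; K: 3x3
    intrinsics. All interfaces here are illustrative.
    """
    H, W = d_obs.shape
    n_px = H * W
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([us, vs, np.ones_like(us)], 0).reshape(3, -1)
    rays = (np.linalg.inv(K) @ pix).T          # r(u) for every pixel
    n_flat = normals.reshape(-1, 3)
    w_nb = (b_n * b_b).ravel()

    rows, cols, vals, rhs = [], [], [], []
    r = 0
    # E_D rows: sqrt(lam_d) * (D(p) - D_o(p)) over observed pixels only.
    for p in np.flatnonzero(d_obs.ravel() > 0):
        rows.append(r); cols.append(p); vals.append(np.sqrt(lam_d))
        rhs.append(np.sqrt(lam_d) * d_obs.ravel()[p]); r += 1
    for p in range(n_px):
        y, x = divmod(p, W)
        for q in ([p + 1] if x + 1 < W else []) + ([p + W] if y + 1 < H else []):
            # E_S row: sqrt(lam_s) * (D(p) - D(q)) = 0.
            rows += [r, r]; cols += [p, q]
            vals += [np.sqrt(lam_s), -np.sqrt(lam_s)]; rhs.append(0.0); r += 1
            # Linearized E_N row: N(p).(D(q) r(q) - D(p) r(p)) = 0,
            # weighted per pixel by B_n * B_b.
            w = np.sqrt(lam_n * w_nb[p])
            rows += [r, r]; cols += [p, q]
            vals += [-w * n_flat[p].dot(rays[p]), w * n_flat[p].dot(rays[q])]
            rhs.append(0.0); r += 1

    A = sparse.csr_matrix((vals, (rows, cols)), shape=(r, n_px))
    return lsqr(A, np.asarray(rhs))[0].reshape(H, W)
```

On full-resolution images this plain-Python assembly is slow; it is meant to show the structure of the normal equations, not production code.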
In addition, the invention also provides a depth map super-resolution completion system for three-dimensional reconstruction. As shown in fig. 2, the depth map super-resolution completion system for three-dimensional reconstruction according to the present invention includes a super-resolution processing unit 1, an outlier rejection unit 2, an information extraction unit 3, a fuzzy measurement unit 4, and an optimization unit 5.
Specifically, the super-resolution processing unit 1 is configured to learn from an original LR depth image to be completed through SRC-Net to obtain an HR depth image.
The outlier rejection unit 2 is configured to reject outliers in the HR depth image based on gradient sensitivity detection, so as to obtain a processed HR depth image.
The information extraction unit 3 is used for learning from the HR color image through SRC-Net, and determining a normal map and a boundary map.
The ambiguity measuring unit 4 is used for measuring ambiguity of the HR color image to obtain ambiguity information.
And the optimization unit 5 is configured to optimize the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain the completed HR depth map.
Further, the invention also provides an indoor scene three-dimensional reconstruction method. As shown in fig. 3, the method for reconstructing an indoor scene in three dimensions according to the present invention includes:
Step 600: calculating a three-dimensional point and a normal vector of each pixel in the completed HR depth map under the corresponding camera coordinate system.
Step 700: estimating the pose of the current frame camera by the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors.
Step 800: performing volume data fusion through TSDF model iteration according to the camera trajectory information to obtain fusion data.
Step 900: performing surface estimation according to the fusion data and the pose of the current frame camera to obtain the indoor scene three-dimensional model.
The original LR depth image acquired by the consumer-grade RGB-D camera is processed through steps 100 to 500 to obtain the completed HR depth image; then, for each pixel $u$ in the HR depth map, the three-dimensional point $v_i(u)$ and the normal vector $n_i(u)$ in the corresponding camera coordinate system are calculated as follows:
$$v_i(u) = z_i(u)\, K^{-1} [u, 1]^T; \qquad n_i(u) = \left( v_i(u+1, v) - v_i(u, v) \right) \times \left( v_i(u, v+1) - v_i(u, v) \right) \quad (10);$$
wherein $K$ is the camera intrinsic matrix obtained by calibration and $z_i(u)$ is the depth value corresponding to pixel $u$.
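A NumPy sketch of formula (10); the forward differences and the unit normalization of the normals are conventions assumed here:

```python
import numpy as np

def backproject(depth, K):
    """Per-pixel 3D points v_i(u) and normals n_i(u) from an HR depth map.

    depth: HxW completed depth map; K: 3x3 intrinsic matrix.
    """
    H, W = depth.shape
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1).astype(np.float64)
    # v_i(u) = z_i(u) * K^-1 [u, 1]^T
    pts = depth[..., None] * (pix @ np.linalg.inv(K).T)

    # n_i(u) = (v(u+1,v) - v(u,v)) x (v(u,v+1) - v(u,v)), via forward
    # differences; the result is one row/column smaller than the image.
    du = pts[:, 1:, :] - pts[:, :-1, :]
    dv = pts[1:, :, :] - pts[:-1, :, :]
    normals = np.cross(du[:-1, :, :], dv[:, :-1, :])
    normals /= np.maximum(np.linalg.norm(normals, axis=-1, keepdims=True), 1e-8)
    return pts, normals
```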
In step 700, the current depth map is registered with the depth map generated by ray casting the three-dimensional model under the previous frame's view angle through the Iterative Closest Point (ICP) algorithm, so as to obtain the pose of the current frame camera.
The transformation matrix $T_{g,i}$ of the current frame camera pose relative to the global coordinate system is obtained by minimizing the point-to-plane distance error $E(T_{g,i})$, with the formula as follows:
$$E(T_{g,i}) = \sum_{u} \left\| \left( T_{g,i}\, \dot{v}_i(u) - \hat{v}_{i-1}(\hat{u}) \right)^{T} \hat{n}_{i-1}(\hat{u}) \right\|^2;$$
wherein $\hat{u}$ is the projected pixel of pixel $u$, $\dot{v}_i(u)$ is the homogeneous coordinate form of the three-dimensional point $v_i(u)$, and $\hat{v}_{i-1}(\hat{u})$ and $\hat{n}_{i-1}(\hat{u})$ are the three-dimensional point and normal vector predicted from the previous frame.
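For illustration, a sketch of evaluating this error for one candidate pose using projective data association; real systems minimize it inside a coarse-to-fine Gauss-Newton loop, which is omitted. The previous frame's predicted point and normal maps are assumed here to be expressed in that frame's camera coordinates, with T the relative transform between the two frames:

```python
import numpy as np

def point_to_plane_error(T, pts_cur, pts_prev, normals_prev, K):
    """Point-to-plane ICP error for candidate transform T (4x4).

    pts_cur: Nx3 points of the current frame (camera coordinates);
    pts_prev, normals_prev: HxWx3 predicted maps of the previous frame.
    """
    H, W, _ = pts_prev.shape
    # Transform current points into the previous frame (homogeneous form).
    pts_h = np.hstack([pts_cur, np.ones((len(pts_cur), 1))])
    pts_t = (T @ pts_h.T).T[:, :3]
    # Projective association: project into the previous frame's image.
    proj = (K @ pts_t.T).T
    u = np.round(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.round(proj[:, 1] / proj[:, 2]).astype(int)
    ok = (proj[:, 2] > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Residual: distance along the predicted surface normal.
    diff = pts_t[ok] - pts_prev[v[ok], u[ok]]
    res = np.einsum('ij,ij->i', diff, normals_prev[v[ok], u[ok]])
    return float(np.sum(res ** 2))
```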
In step 800, based on the estimation result of the camera pose, the HR depth image of each frame is fused by using a TSDF (Truncated Signed Distance Function) model.
The three-dimensional space is represented by a voxel grid of resolution $m$, i.e. each dimension of the space is divided into $m$ blocks, and each voxel $v$ stores two values: the truncated signed distance function $f_i(v)$ and its weight $w_i(v)$. The truncated signed distance function $f_i(v)$ is defined as follows:
$$f_i(v) = \left[ K^{-1} z_i(u) \left[ u^T, 1 \right]^T \right]_z - \left[ v_i \right]_z \quad (11);$$
wherein $f_i(v)$ denotes the distance from the voxel to the surface of the object model, its sign indicates whether the voxel is on the occluded side or the visible side of the surface, and the zero crossings are points on the surface. In this embodiment, the weights are averaged and fixed to 1.
The iterative formula for TSDF volume data fusion is as follows:
$$F_i(v) = \frac{W_{i-1}(v)\, F_{i-1}(v) + w_i(v)\, f_i(v)}{W_{i-1}(v) + w_i(v)}, \qquad W_i(v) = W_{i-1}(v) + w_i(v) \quad (12);$$
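A minimal sketch of one application of formula (12) over a voxel volume, assuming the per-frame TSDF values $f_i(v)$ have already been computed and set to NaN where unobserved:

```python
import numpy as np

def tsdf_update(F, W, f_new, w_new=1.0):
    """One TSDF fusion step: running weighted average per voxel.

    F, W: current TSDF and weight volumes (same shape, updated in place).
    f_new: this frame's TSDF values, NaN where the voxel was not observed.
    With w_new fixed to 1, as in this embodiment, this is a running mean.
    """
    seen = ~np.isnan(f_new)
    F[seen] = (W[seen] * F[seen] + w_new * f_new[seen]) / (W[seen] + w_new)
    W[seen] += w_new
    return F, W
```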
and performing light projection on the volume data obtained by fusion under the posture of the current frame camera to obtain surface point cloud, registering the estimated surface and the depth map acquired in real time in a camera tracking part, and finally extracting through a MarchingCube algorithm to obtain a three-dimensional model.
The invention provides a depth-learning-based depth image super-resolution and completion method, comprising the depth image super-resolution and completion network (SRC-Net), the gradient-sensitivity-based outlier elimination algorithm, and the ambiguity- and boundary-constrained depth image adaptive optimization algorithm, applied to an offline indoor scene three-dimensional reconstruction system. Super-resolution and completion results on standard datasets show that the method can effectively process an original low-resolution depth image into a high-resolution completed depth image. Three-dimensional reconstruction results on indoor scene data from standard datasets show that the indoor scene three-dimensional reconstruction system can obtain complete and accurate high-quality indoor scene models, with good robustness and extensibility.
Preferably, the invention also provides an indoor scene three-dimensional reconstruction system. As shown in fig. 4, the indoor scene three-dimensional reconstruction system of the present invention includes a preprocessing unit 6, an estimating unit 7, a fusing unit 8, and a modeling unit 9.
Specifically, the preprocessing unit 6 is configured to calculate a three-dimensional point and a normal vector in the corresponding camera coordinate system for each pixel in the completed HR depth map.
The estimation unit 7 is configured to estimate the pose of the current frame camera by the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors.
The fusion unit 8 is configured to perform volume data fusion through Truncated Signed Distance Function (TSDF) model iteration according to the camera trajectory information to obtain fusion data.
The modeling unit 9 is configured to perform surface estimation according to the fusion data and the pose of the current frame camera to obtain the indoor scene three-dimensional model.
Compared with the prior art, the depth map super-resolution completion method and system for three-dimensional reconstruction and the indoor scene three-dimensional reconstruction method and system of the present invention have the same beneficial effects as described above, which are not repeated herein.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (10)

1. A depth map super-resolution completion method for three-dimensional reconstruction, the method comprising:
learning from an original low-resolution (LR) depth image to be completed through a depth super-resolution and completion network SRC-Net to obtain a high-resolution (HR) depth image;
eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image;
learning from the HR color image through SRC-Net to determine a normal map and a boundary map;
performing ambiguity measurement on the HR color image to obtain ambiguity information;
and optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth image.
2. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 1, wherein eliminating outliers in the HR depth image based on gradient sensitivity detection to obtain the processed HR depth image specifically comprises:
calculating a gradient map $G_i$ by using the Sobel operator:
$$g_i(u) = \mathrm{Sobel}(u);$$
wherein $g_i(u)$ is the gradient value corresponding to pixel $u$;
calculating a mask image $M_i$ based on the gradient sensitivity:
$$m_i(u) = \begin{cases} 0, & g_i(u) \ge g_h \\ 1, & g_i(u) < g_h \end{cases}$$
wherein $m_i(u)$ is the mask value corresponding to pixel $u$ and $g_h$ is a set gradient threshold;
performing an erosion operation on the high-resolution depth map $D_i$ using the mask image $M_i$ and removing the outliers to obtain the processed HR depth image.
3. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 1, wherein performing ambiguity measurement on the HR color image to obtain ambiguity information specifically comprises:
filtering the HR color image in the horizontal and vertical directions respectively through an average filter to obtain a Re-blur image;
calculating the differences in the horizontal and vertical directions between the HR color image and the Re-blur image to obtain a horizontal difference and a vertical difference;
determining a difference map of the HR color image and the Re-blur image according to the horizontal difference and the vertical difference;
summing and normalizing the difference map to obtain a processed map;
calculating the blurriness measure $\mathrm{Blur}$ of the processed map:
$$\mathrm{Blur} = \max(R_H, R_V)$$
wherein $R_H$ is the normalized horizontal-direction difference value and $R_V$ is the normalized vertical-direction difference value.
4. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 1, wherein optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain the completed HR depth map specifically comprises:
constructing an objective function according to the normal graph, the boundary graph and the ambiguity information;
and optimizing the HR depth image according to the objective function to obtain a completed HR depth image.
5. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 4, wherein optimizing the HR depth image according to the objective function to obtain the completed HR depth map specifically comprises:
determining an optimization function according to the objective function; the objective function comprises a first optimization term, a second optimization term and a third optimization term, and the optimization function is the weighted sum of the three terms;
$$E = \lambda_D E_D + \lambda_S E_S + \lambda_N E_N B_n B_b;$$
$$E_D = \sum_p \left\| D(p) - D_o(p) \right\|^2;$$
$$E_S = \sum_p \sum_{q \in N(p)} \left\| D(p) - D(q) \right\|^2;$$
$$E_N = \sum_p \sum_{q \in N(p)} \left\| \left\langle v(p,q),\, N(p) \right\rangle \right\|^2;$$
wherein the first optimization term $E_D$ represents the distance between the estimated depth $D(p)$ and the observed depth $D_o(p)$ at pixel $p$; the third optimization term $E_N$ represents the consistency of the estimated depth with the predicted surface normal $N(p)$; the second optimization term $E_S$ encourages neighboring pixels to take the same depth, where $v(p,q)$ denotes the tangent vector between pixel $p$ and its neighboring pixel $q$; $B_n \in [0,1]$ weights the normal term according to the predicted probability that a pixel lies on an occlusion boundary $B(p)$; $B_b \in [0,1]$ weights the normal term according to the blurriness of the color image; and $\lambda_D$, $\lambda_S$ and $\lambda_N$ are preset weighting coefficients;
and optimizing according to the optimization function to obtain the completed HR depth map.
6. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 5, wherein $\lambda_D$ is set to 1000, $\lambda_S$ to 1 and $\lambda_N$ to 0.001.
7. A depth map super resolution completion system for three-dimensional reconstruction, the system comprising:
the super-resolution processing unit is used for learning from an original LR depth image to be completed through SRC-Net to obtain an HR depth image;
the outlier removing unit is used for removing outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image;
the information extraction unit is used for learning from the HR color image through SRC-Net and determining a normal map and a boundary map;
the ambiguity measurement unit is used for performing ambiguity measurement on the HR color image to obtain ambiguity information;
and the optimization unit is used for optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a completed HR depth map.
8. An indoor scene three-dimensional reconstruction method is characterized by comprising the following steps:
acquiring a completed HR depth map by the depth map super-resolution completion method for three-dimensional reconstruction of any one of claims 1-6;
calculating a three-dimensional point and a normal vector of each pixel in the completed HR depth map under the corresponding camera coordinate system;
estimating the pose of the current frame camera by the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
performing volume data fusion through Truncated Signed Distance Function (TSDF) model iteration according to the camera trajectory information to obtain fusion data;
and performing surface estimation according to the fusion data and the posture of the current frame camera to obtain an indoor scene three-dimensional model.
9. The method of claim 8, wherein the three-dimensional point $v_i(u)$ and the normal vector $n_i(u)$ in the corresponding camera coordinate system are calculated according to the following formulas:
$$v_i(u) = z_i(u)\, K^{-1} [u, 1]^T;$$
$$n_i(u) = \left( v_i(u+1, v) - v_i(u, v) \right) \times \left( v_i(u, v+1) - v_i(u, v) \right);$$
wherein $K$ is the camera intrinsic matrix obtained by calibration and $z_i(u)$ is the depth value corresponding to pixel $u$.
10. An indoor scene three-dimensional reconstruction system, characterized in that the indoor scene three-dimensional reconstruction system comprises:
a preprocessing unit, configured to calculate three-dimensional points and normal vectors in the corresponding camera coordinate system for each pixel in a completed HR depth map obtained by the depth map super-resolution completion method for three-dimensional reconstruction according to any one of claims 1 to 6;
the estimation unit is used for estimating the pose of the current frame camera by the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
the fusion unit is used for performing volume data fusion through Truncated Signed Distance Function (TSDF) model iteration according to the camera trajectory information to obtain fusion data;
and the modeling unit is used for carrying out surface estimation according to the fusion data and the posture of the current frame camera to obtain an indoor scene three-dimensional model.
CN201910079993.9A 2019-01-28 2019-01-28 Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system Active CN109903372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910079993.9A CN109903372B (en) 2019-01-28 2019-01-28 Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910079993.9A CN109903372B (en) 2019-01-28 2019-01-28 Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system

Publications (2)

Publication Number Publication Date
CN109903372A CN109903372A (en) 2019-06-18
CN109903372B true CN109903372B (en) 2021-03-23

Family

ID=66944359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910079993.9A Active CN109903372B (en) 2019-01-28 2019-01-28 Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system

Country Status (1)

Country Link
CN (1) CN109903372B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349087B (en) * 2019-07-08 2021-02-12 华南理工大学 RGB-D image high-quality grid generation method based on adaptive convolution
CN112749594B (en) * 2019-10-31 2022-04-22 浙江商汤科技开发有限公司 Information completion method, lane line identification method, intelligent driving method and related products
CN111145094A (en) * 2019-12-26 2020-05-12 北京工业大学 Depth map enhancement method based on surface normal guidance and graph Laplace prior constraint
CN113139910B (en) * 2020-01-20 2022-10-18 复旦大学 Video completion method
CN112907748B (en) * 2021-03-31 2022-07-19 山西大学 Three-dimensional shape reconstruction method based on non-down-sampling shear wave transformation and depth image texture feature clustering
CN113096039A (en) * 2021-04-01 2021-07-09 西安交通大学 Depth information completion method based on infrared image and depth image
CN113269689B (en) * 2021-05-25 2023-08-29 西安交通大学 Depth image complement method and system based on normal vector and Gaussian weight constraint
CN114049464A (en) * 2021-11-15 2022-02-15 聚好看科技股份有限公司 Reconstruction method and device of three-dimensional model
CN114909993A (en) * 2022-04-26 2022-08-16 泰州市创新电子有限公司 High-precision laser projection visual three-dimensional measurement system
CN117853679A (en) * 2022-09-30 2024-04-09 合肥美亚光电技术股份有限公司 Curved surface fusion method and device and medical imaging equipment
CN115578516A (en) * 2022-10-19 2023-01-06 京东科技控股股份有限公司 Three-dimensional imaging method, device, equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105046743A (en) * 2015-07-01 2015-11-11 浙江大学 Super-high-resolution three dimensional reconstruction method based on global variation technology

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103871038B (en) * 2014-03-06 2014-11-05 中国人民解放军国防科学技术大学 Super-resolution omnidirectional image reconstruction method based on non-uniform measurement matrix
CN108564652B (en) * 2018-03-12 2020-02-14 中国科学院自动化研究所 High-precision three-dimensional reconstruction method, system and equipment for efficiently utilizing memory
CN108961390B (en) * 2018-06-08 2020-05-19 华中科技大学 Real-time three-dimensional reconstruction method based on depth map

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105046743A (en) * 2015-07-01 2015-11-11 浙江大学 Super-high-resolution three dimensional reconstruction method based on global variation technology

Also Published As

Publication number Publication date
CN109903372A (en) 2019-06-18


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant