CN114943776A - Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss - Google Patents

Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss

Info

Publication number
CN114943776A
CN114943776A
Authority
CN
China
Prior art keywords
cost
plane
hypothesis
image
normal vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210606204.4A
Other languages
Chinese (zh)
Inventor
朱翱宇
陈珺
罗林波
官文俊
熊永华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN202210606204.4A priority Critical patent/CN114943776A/en
Publication of CN114943776A publication Critical patent/CN114943776A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20028 Bilateral filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss, comprising the following steps: acquiring multi-view images of the scene to be reconstructed and the corresponding camera parameters; down-sampling the images and scaling down the camera intrinsic parameters; establishing an initial plane hypothesis by random initialization at the minimum scale; calculating the multi-view matching cost from the center-value normalized cross-correlation function and the normal vector loss function; propagating low-cost plane hypotheses in the neighborhood to the plane hypothesis of the current point; finding better hypotheses through random perturbation; and upsampling the optimal hypothesis of the current scale by joint bilateral upsampling to serve as the plane-hypothesis initialization of the next scale, continuing the cost calculation, hypothesis propagation, plane perturbation optimization and upsampling until the original scale of the image is reached. The method thereby improves the accuracy of multi-view dense matching in weakly textured regions and improves the accuracy and completeness of the weakly textured regions of the reconstructed point cloud.

Description

Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss
Technical Field
The invention relates to the technical field of computer vision and image processing, in particular to three-dimensional reconstruction technology, and provides a three-dimensional reconstruction method and device based on a cross-correlation function and normal vector loss.
Background
Three-dimensional reconstruction refers to accurately restoring the three-dimensional spatial shape of a scene or an object from sensor sampling data. According to the sensor used, three-dimensional reconstruction can be roughly classified into three types: reconstruction based on lidar, reconstruction based on structured light, and reconstruction based on Multi-View Stereo (MVS). MVS-based three-dimensional reconstruction places low demands on the sensor, is inexpensive, and is suitable for reconstructing large scenes. MVS photographs a three-dimensional scene from multiple views, recovers the depth information lost during imaging using solid-geometry principles, and reconstructs a point cloud from the depths and the camera parameters.
At present, MVS three-dimensional reconstruction is widely applied in fields such as geological mapping and urban mapping. The quality of the reconstructed point cloud mainly comprises completeness and accuracy, and improving point-cloud quality benefits subsequent geological disaster analysis and similar applications.
Although MVS can reconstruct a high-quality point-cloud model for most scenes, in regions with insufficient color information the texture details are few and large similar areas exist, so dense matching becomes ambiguous, the photometric consistency cost function used to evaluate depth-estimation accuracy degenerates, and unacceptable errors occur in the depth estimates. The ambiguity of MVS dense matching is especially severe when a spatial point appears in only a few of the multi-view images.
In the MVS field there are two existing basic ideas for reconstructing weakly textured regions. First, a weakly textured region may be approximated by a spatial plane: the parameters of a plane can be computed from only three points with correct depth inside the region, so approximating the region by a plane solves the problem that it otherwise cannot be reconstructed. However, such methods usually apply only to indoor scenes or man-made buildings, and are infeasible when the weakly textured region cannot be approximated by a plane. Second, the idea of enlarging the matching window: when the texture information within a fixed window is insufficient, the matching degree between windows cannot be evaluated accurately, and enlarging the window by an appropriate amount increases the color information inside it and improves the accuracy of dense matching. However, the photometric cost function actually measures the matching degree between two whole windows, which approximates that between the center pixels when the window is small; when the window is too large, this approximation introduces unacceptable errors.
Disclosure of Invention
In order to solve the problem that existing MVS methods cannot accurately reconstruct weakly textured regions, the invention provides a three-dimensional reconstruction method and device based on a cross-correlation function and normal vector loss.
According to an aspect of the present invention, there is provided a three-dimensional reconstruction method based on a cross-correlation function and normal vector loss, comprising the steps of:
S1: acquiring multi-view images of a scene to be reconstructed and the corresponding camera parameters;
S2: down-sampling the images and scaling down the camera intrinsic parameters;
S3: adjusting the images to the minimum scale and establishing an initial plane hypothesis π_p in a random initialization manner;
S4: selecting one of the multi-view images to be reconstructed in turn as the reference image, performing pairwise dense matching between the reference image and the remaining images, and calculating the photometric consistency cost among the multi-view images using the center-value normalized cross-correlation function; the remaining images are the source images;
S5: selecting, in a checkerboard fashion, the lowest-cost point in each of the eight directions in the neighborhood of the pixel point p to be processed as a candidate point p_j; the plane hypothesis corresponding to each candidate point is denoted π_pj, and the candidate plane hypotheses together with the initial plane hypothesis π_p form the candidate plane hypothesis set S_π;
S6: performing pixel-level view selection on the images using the photometric consistency cost, and calculating the weights of the cost values of the different views;
S7: forming an image pair from the reference image and a source image and performing binocular stereo matching; calculating the photometric consistency cost, depth consistency cost and normal vector consistency cost corresponding to each plane hypothesis in S_π, and weight-averaging them over the multi-view images according to the weights obtained in step S6 to obtain the comprehensive cost;
S8: selecting the plane hypothesis with the lowest comprehensive cost as the new plane hypothesis of the pixel point p;
S9: finding a lower-cost plane hypothesis through random perturbation and updating the new plane hypothesis of step S8;
S10: repeating steps S4-S9 four times at the current scale, the comprehensive cost decreasing gradually as the number of iterations increases;
S11: upsampling the optimal plane hypothesis at the current scale by the joint bilateral upsampling method to serve as the initialization result of the plane hypothesis at the next scale, and scaling up the camera intrinsic parameters;
S12: continuing the comprehensive cost calculation (S4-S7), hypothesis propagation (S8), plane perturbation optimization (S9), and iteration and upsampling (S10-S11) until the current scale reaches the original scale of the image.
Preferably, the S4 comprises:
S4.1: for the multi-view image set I to be reconstructed, selecting one image from it in turn as the reference image I_ref, the remaining images being collectively denoted as the source images I_src, and performing pairwise dense matching between I_ref and I_src;
S4.2: calculating the homography matrix H:

H_j = K_src_j ( R_src_j R_ref^T + ( R_src_j (c_ref − c_src_j) n^T ) / dist ) K_ref^{-1}

where K_ref is the camera intrinsic matrix of I_ref, K_src_j is that of the j-th source image I_src_j, R denotes the camera rotation matrix of the corresponding image (R_src_j for the source image, R_ref^T being the transpose of the camera rotation matrix of the reference image), c denotes the column vector of the corresponding camera optical center in the world coordinate system (c_ref for the reference image, c_src_j for the source image), n^T is a row vector representing the normal vector, and dist is the distance from c_ref to the plane hypothesis;
S4.3: mapping, through the homography matrix H, all pixel points x_i in a fixed-size window centered on p in I_ref to pixel points y_i in I_src_j, i.e. y_i = H x_i;
S4.4: using the weights of the joint bilateral filtering algorithm as the weights of the pixels in the window, the weight calculation formula being:

w(x_i) = exp( − ||p − x_i||_2 / (2σ_s²) − |C_p − C_{x_i}| / (2σ_c²) )

where ||p − x_i||_2 denotes the L2 distance between the coordinates of x_i and p, |C_p − C_{x_i}| denotes the absolute value of the difference between the pixel values of the two points, and σ_s and σ_c are fixed parameters; the similarity of the pixel values in the two corresponding windows of I_ref and I_src_j is finally calculated by the weighted Normalized Cross-Correlation function (NCC):

NCC_j(p, π_p) = Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − \bar{C}_{W_p})(C_{y_i} − \bar{C}_{W_{p'}}) / √( Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − \bar{C}_{W_p})² · Σ_{x_i ∈ W_p} w(x_i)(C_{y_i} − \bar{C}_{W_{p'}})² )

where j indicates that the corresponding source image is I_src_j, p is a pixel point in I_ref, π_p is the plane hypothesis of that point, W_p is an 11 × 11 window centered on p, C_p denotes the pixel value of the point, and \bar{C} denotes the mean of the pixel values within the window;
S4.5: improving the window mean in the NCC by replacing it with the pixel value of the window center, the result being named NCCC, with the calculation formula:

NCCC_j(p, π_p) = Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − C_p)(C_{y_i} − C_{p'}) / √( Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − C_p)² · Σ_{x_i ∈ W_p} w(x_i)(C_{y_i} − C_{p'})² )

where p' denotes the point to which p maps in I_src_j; the similarity results calculated by NCC and NCCC are compared, the higher value is selected as the similarity of the two points and rewritten in cost-function form, and the photometric consistency cost is calculated as:

e_p^j(p, π_p) = 1 − max( NCC_j(p, π_p), NCCC_j(p, π_p) )

where e_p^j(p, π_p) has a variation range of [0, 2].
Preferably, the S5 comprises:
S5.1: counting the number of source images I_src_j for which e_p^j(p, π_p) is less than 2, denoted N; if N is greater than zero, the photometric cost of π_p is:

e_p(p, π_p) = (1/N) Σ_{j: e_p^j(p,π_p) < 2} e_p^j(p, π_p)

If N equals zero:

e_p(p, π_p) = 2

Eight candidate hypothesis points are selected within the neighborhood of the point p.
Preferably, the S6 comprises:
S6.1: for the m source images and the eight candidate plane hypotheses corresponding to the eight candidate points, calculating the cost loss yields a cost matrix of size 8 × m:

A = (a_{i,j}) ∈ R^{8×m}, a_{i,j} = e_p(p, π_{n_i}), i = 1, 2, …, 8

where j is a positive integer not greater than m, π_{n_i} denotes the i-th of the eight plane hypotheses, and j indicates dense matching between I_ref and the j-th source image I_src_j; when a_{i,j} is less than the threshold τ(t), the point is judged a good point and added to S_g; when a_{i,j} is greater than τ_1, it is judged a bad point, t denoting the current iteration number; for a particular view src_j, the weight of its cost value is computed from the good points of its column, where |S_g(j)| denotes the number of good points of src_j and σ_v is a parameter used to adjust the magnitude of the weights.
Preferably, the S7 comprises:
S7.1: calculating the depth consistency cost between the multiple views:
first calculating the two-dimensional coordinates of the projection onto I_src of a certain pixel point in I_ref, then back-projecting the I_src coordinate point onto I_ref, and calculating the distance between the starting point and the end point as the depth consistency cost;
S7.2: calculating the normal vector loss term to obtain the normal vector consistency cost e_n^j(p, π_p), which penalizes the deviation between the normal vector hypothesized at p and the normal vector hypothesized at its corresponding point in I_src_j;
S7.3: integrating all cost terms to obtain the composite cost, namely:

e_j(p, π_p) = e_p^j(p, π_p) + λ_d e_d^j(p, π_p) + λ_n e_n^j(p, π_p)

where λ_d and λ_n are respectively the weights adjusting the depth error and the normal vector error, e_p^j is the photometric consistency cost, e_d^j is the depth consistency cost, and e_n^j is the normal vector consistency cost;
S7.4: performing cost aggregation according to the weight of each view in step S6 to obtain the comprehensive cost over the multiple views:

e(p, π_{n_i}) = Σ_{j=1}^{m} ω(src_j) e_j(p, π_{n_i}) / Σ_{j=1}^{m} ω(src_j)

where ω(src_j) is the weight of each view in S6, e_j(p, π_{n_i}) is the sum of the costs between the point p in the reference image I_ref and the corresponding point in I_src_j, i = 1, 2, …, 8, j is a positive integer not greater than m, π_{n_i} denotes the i-th of the eight plane hypotheses, and j indicates dense matching between I_ref and the j-th source image I_src_j.
Preferably, the S8 comprises: if the smallest value of e(p, π_{n_i}) is less than e(p, π_p), and the depth of the point corresponding to π_{n_i} lies within the interval (d_min, d_max) between the minimum and maximum depths, then π_{n_i} replaces π_p, namely:

π_p ← argmin_{π ∈ S_π} e(p, π)
preferably, step S9 includes: for plane hypothesis pi p =(n x ,n y ,n z D) performing a perturbation, calculating a planar assumption of the perturbation pi Corresponding to the sum of the costs in step S7.4, if π pi Has a composite cost less than the planar hypothesis pi p The composite cost of (2) is pi pi Substituted n p
According to another aspect of the present invention, there is also provided a three-dimensional reconstruction apparatus based on cross-correlation function and normal vector loss, comprising the following modules:
The data preprocessing module is used for down-sampling the original image and reducing the camera parameters;
the initialization plane hypothesis module is used for randomly initializing a plane hypothesis corresponding to each pixel point on the minimum scale;
the luminosity consistency cost calculation module is used for calculating luminosity consistency costs among the multi-view pictures according to the plane hypothesis;
the pixel-level visual angle selection module is used for calculating the weights of different visual angles according to the luminosity consistency cost;
the hypothesis propagation module is used for selecting one plane hypothesis with the minimum comprehensive cost in the candidate plane hypotheses of the eight neighborhoods;
the plane perturbation module is used for trying to find a better plane hypothesis through random perturbation;
and the up-sampling module is used for up-sampling the plane hypothesis with low scale through a joint bilateral up-sampling algorithm.
The method starts from the construction of a more accurate and reliable cost function, relying on the fact that the imaging points of a three-dimensional point under different views share the same normal vector and that their neighborhood pixel values have high similarity. First, the images are down-sampled and a plane hypothesis is randomly initialized for the down-sampled image; the quality of the plane hypothesis is evaluated by the cost function; a checkerboard propagation algorithm then propagates accurate plane hypotheses to their neighborhoods, followed by plane perturbation optimization; finally the image and the corresponding plane hypotheses are up-sampled and updated by further propagation and optimization until the image reaches its original size. When evaluating the quality of a plane hypothesis, the center-value normalized cross-correlation function evaluates the similarity between the center points of the two windows more accurately. Meanwhile, the normal vector loss among multiple views is introduced, which further alleviates the ambiguity of dense matching in weakly textured regions and improves the completeness and accuracy of the reconstructed point cloud.
The technical scheme provided by the invention has the following beneficial effects:
1. the method can solve the problem that dense matching of the image weak texture area has ambiguity, and improve the accuracy of dense matching.
2. Dense matching among multiple pictures can be completed rapidly using GPU parallel computing, generating large-scene three-dimensional point clouds containing tens of millions of points.
Drawings
The specific effects of the present invention will be further explained with reference to the drawings and examples, wherein:
FIG. 1 is a flow chart of a method for three-dimensional reconstruction based on cross-correlation function and normal vector loss in accordance with the present invention;
FIG. 2 is a general block diagram of the three-dimensional reconstruction method based on cross-correlation function and normal vector loss according to the present invention;
FIG. 3 is a screenshot of the point cloud generated by the present invention;
FIG. 4 is a depth contrast plot for the present invention and conventional methods;
fig. 5 is a diagram of normal vector comparison between the present invention and the conventional method.
Detailed Description
For a more clear understanding of the technical features, objects, and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
Referring to fig. 1 and fig. 2, this embodiment provides a more accurate dense-matching similarity measurement formula and introduces the normal vector loss between multi-view images; that is, it provides a three-dimensional reconstruction method based on cross-correlation function and normal vector loss, which mainly comprises the following steps:
S1: acquiring multi-view images of a scene to be reconstructed and the corresponding camera parameters;
Step S1 specifically includes:
The camera poses (K, R, t) and the sparse point cloud are obtained by sparse reconstruction through Structure from Motion (SfM), where K is the camera intrinsic matrix, a 3 × 3 matrix representing the correspondence between three-dimensional coordinates and their two-dimensional projections; R is the camera rotation matrix, a 3 × 3 orthonormal matrix whose three row vectors form an orthonormal basis and respectively represent the coordinates, in the world coordinate system, of unit vectors along the x, y and z axes of the camera coordinate system; and t is the translation vector, a 3 × 1 column vector representing the coordinates of the world-coordinate-system origin in the camera coordinate system.
The depth variation range (d_min, d_max) of each picture is determined from the depths of the feature points obtained by the SfM reconstruction.
The matched feature points in the multi-view images are counted to obtain the initial similarity between the multi-view images; specifically, the more feature points with the same name two images share, the higher their degree of similarity.
S2: down-sampling the images and correspondingly scaling down the camera intrinsic parameters;
Step S2 specifically comprises:
The image is down-sampled multiple times by bilinear interpolation until neither its length nor its width exceeds 1200 pixels; at each down-sampling, the focal length and the offset of the optical center on the imaging plane in the camera intrinsic parameters are multiplied by the down-sampling ratio.
S3: establishing an initial plane hypothesis π_p in a random initialization manner at the minimum scale;
Step S3 specifically comprises:
A plane hypothesis π_p contains four elements: a depth value d and a normal vector n = (n_x, n_y, n_z). Each pixel point in I_ref corresponds to one plane hypothesis; all depths, arranged by two-dimensional coordinates, form a depth map, and the three RGB channels of a three-channel normal-vector map respectively correspond to the three components of the normal vector.
The reciprocals of the maximum and minimum depths determined by the sparse reconstruction form the interval (1/d_max, 1/d_min), and the depth value is initialized by uniformly distributed random sampling within this interval, i.e.

d_init = 1 / u, u ~ U(1/d_max, 1/d_min)

Since most points are concentrated close to the camera optical center and fewer points lie far from it, uniform sampling over the inverse depth concentrates on smaller depth values and appropriately neglects larger ones.
In order to distribute the initialized normal vectors uniformly over the unit hemisphere, random samples are first drawn on the interval (−1, 1) for the initialization parameters q_1 and q_2, subject to

q_1² + q_2² < 1

The normal vector can then be initialized in the form

n = ( 2 q_1 √(1 − q_1² − q_2²), 2 q_2 √(1 − q_1² − q_2²), 1 − 2(q_1² + q_2²) )

Since an object can only be observed by the camera when the normal vector of its surface is oriented toward the camera's imaging plane, n is replaced by −n if the dot product of the normal vector and the camera's principal optical axis is greater than zero. The plane hypothesis initialization is then complete, i.e.

π_p = (n_x, n_y, n_z, d_init)
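A minimal sketch of this initialization follows; the Marsaglia-style hemisphere construction matches the q_1/q_2 description above, while the function names and the example depth range are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_depth(d_min: float, d_max: float) -> float:
    """Step S3: sample the depth uniformly in inverse-depth space, which
    favors the smaller depths where most scene points concentrate."""
    return 1.0 / rng.uniform(1.0 / d_max, 1.0 / d_min)

def init_normal(principal_axis: np.ndarray) -> np.ndarray:
    """Step S3: sample a unit normal uniformly (Marsaglia construction with
    q1, q2 drawn on (-1, 1)), then flip it to face the camera."""
    while True:
        q1, q2 = rng.uniform(-1.0, 1.0, size=2)
        s = q1 * q1 + q2 * q2
        if s < 1.0:
            break
    n = np.array([2.0 * q1 * np.sqrt(1.0 - s),
                  2.0 * q2 * np.sqrt(1.0 - s),
                  1.0 - 2.0 * s])
    if np.dot(n, principal_axis) > 0.0:  # must point toward the imaging plane
        n = -n
    return n

# One initialized plane hypothesis (n_x, n_y, n_z, d_init):
pi_p = (*init_normal(np.array([0.0, 0.0, 1.0])), init_depth(0.5, 50.0))
```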
S4: one image is selected in turn from the multi-view images to be reconstructed as the reference image; the reference image and the remaining images (the source images) are densely matched in pairs, and the photometric consistency cost is calculated using the center-value normalized cross-correlation function.
Step S4 specifically comprises the following steps:
For a multi-view image set I to be reconstructed, one image is selected from it in turn as the reference image I_ref; the remaining images are collectively denoted as the source images I_src. In order to recover the depth information lost by the camera during the three-dimensional-to-two-dimensional projection, I_ref and I_src are densely matched in pairs using the multi-view geometry principle.
From the camera poses in S1 and the initialized plane hypothesis in S3, the homography matrix H can be calculated. H is a 3 × 3 matrix that maps a pixel point p in I_ref to a pixel point p' in I_src. The formula for the homography matrix is:

H_j = K_src_j ( R_src_j R_ref^T + ( R_src_j (c_ref − c_src_j) n^T ) / dist ) K_ref^{-1}

where K_ref is the camera intrinsic matrix of I_ref, K_src_j is that of the j-th source image I_src_j, R denotes the camera rotation matrix of the corresponding image (R_src_j for the source image, R_ref^T being the transpose of the camera rotation matrix of the reference image), c is the column vector of the corresponding camera optical center in the world coordinate system (c_ref for the reference image, c_src_j for the source image), n^T is a row vector representing the normal vector of the plane hypothesis in S3, and dist is the distance from c_ref to the plane hypothesis, i.e.:

dist = d_init · n^T ( K_ref^{-1} p̃ )

where d_init is the depth initialized in S3 and p̃ is the homogeneous pixel coordinate of the point p in I_ref.
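A minimal sketch of the plane-induced homography above, assuming the world-coordinate pose convention x = K R (X − c) used in this description; the sign convention for n and dist may absorb a minus sign under other conventions, so this is an illustration rather than the definitive implementation.

```python
import numpy as np

def plane_homography(K_ref, R_ref, c_ref, K_src, R_src, c_src, n, dist):
    """Plane-induced homography H mapping reference pixels to source pixels
    for the plane hypothesis (n, dist)."""
    R_rel = R_src @ R_ref.T                 # relative rotation
    t_rel = R_src @ (c_ref - c_src)         # relative translation
    return K_src @ (R_rel + np.outer(t_rel, n) / dist) @ np.linalg.inv(K_ref)

def warp(H, p):
    """Map pixel p = (u, v) through H using homogeneous coordinates (y_i = H x_i)."""
    y = H @ np.array([p[0], p[1], 1.0])
    return y[:2] / y[2]
```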
Although the direct purpose of the invention is to evaluate the matching degree between two pixel points, the information in two single pixel values is too small to evaluate it accurately. Considering that pixel values in an image vary continuously, a window centered on a point is rich in information, and the neighborhood windows of the same point have good similarity across views; therefore, all pixel points x_i in a fixed-size window centered on p in I_ref can be mapped through the homography matrix H to pixel points y_i in I_src_j, i.e. y_i = H x_i.
The similarity of the pixel values in the two windows is calculated by the weighted Normalized Cross-Correlation (NCC). Considering that pixels in the window whose color changes markedly may have a large influence on the result, the weight of the joint bilateral filtering algorithm is used as the weight of each pixel in the window, with the weight calculation formula:

w(x_i) = exp( − ||p − x_i||_2 / (2σ_s²) − |C_p − C_{x_i}| / (2σ_c²) )

where the exponential term describes the degree of approximation between x_i and the center point p, ||p − x_i||_2 denotes the L2 distance between the coordinates of x_i and p, |C_p − C_{x_i}| denotes the absolute value of the difference between the pixel values of the two points, and σ_s and σ_c are fixed parameters. The closer x_i and p are in color and distance, the greater the contribution of x_i to the final result, and vice versa. The function finally used to calculate the similarity of the two corresponding windows in I_ref and I_src_j is:

NCC_j(p, π_p) = Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − \bar{C}_{W_p})(C_{y_i} − \bar{C}_{W_{p'}}) / √( Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − \bar{C}_{W_p})² · Σ_{x_i ∈ W_p} w(x_i)(C_{y_i} − \bar{C}_{W_{p'}})² )

The higher the similarity, the closer the corresponding plane hypothesis is to the true value; the numerator is the covariance of the elements between the two windows, and the denominator plays a normalizing role.
Because the NCC approximates the similarity between the center pixels of two windows by the similarity between the windows, this approximation inevitably brings unacceptable errors as the window size grows. In order to measure the similarity relation between the center points more accurately, the window mean in the NCC is improved: the pixel value of the window center replaces the window mean, so that the cost function evaluates the similarity between the center points more accurately. The modified formula is named the center-value Normalized Cross-Correlation function (NCCC) and is calculated as:

NCCC_j(p, π_p) = Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − C_p)(C_{y_i} − C_{p'}) / √( Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − C_p)² · Σ_{x_i ∈ W_p} w(x_i)(C_{y_i} − C_{p'})² )

where p' denotes the point to which p maps in I_src_j. Although the NCCC is more accurate, it is not as robust as the NCC; therefore, for the plane hypothesis of a given point, the similarity results calculated by NCC and NCCC are compared, the higher value is selected as the similarity of the two points, and, for convenient integration with the other cost terms, the similarity is rewritten in cost-function form, namely:

e_p^j(p, π_p) = 1 − max( NCC_j(p, π_p), NCCC_j(p, π_p) )

where e_p^j(p, π_p) has a variation range of [0, 2]; it is named the photometric consistency cost and denotes the cost of the plane hypothesis π_p calculated between I_ref and I_src_j.
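A self-contained sketch of this photometric term for a single pair of corresponding patches follows; the unsquared exponent form of the bilateral weight is an assumption reconstructed from the text, and the helper names are illustrative.

```python
import numpy as np

def bilateral_weights(patch, sigma_s=1.0, sigma_c=20.5):
    """Joint-bilateral weights over a square patch (step S4.4); the unsquared
    exponent form is an assumption reconstructed from the text."""
    r = patch.shape[0] // 2
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    d_space = np.sqrt(xx ** 2 + yy ** 2)           # ||p - x_i||_2
    d_color = np.abs(patch - patch[r, r])          # |C_p - C_{x_i}|
    w = np.exp(-d_space / (2 * sigma_s ** 2) - d_color / (2 * sigma_c ** 2))
    return w / w.sum()

def photometric_cost(ref_patch, src_patch):
    """e_p^j = 1 - max(NCC, NCCC) for two corresponding (e.g. 11x11) patches."""
    r = ref_patch.shape[0] // 2
    w = bilateral_weights(ref_patch)

    def corr(mu_a, mu_b):
        da, db = ref_patch - mu_a, src_patch - mu_b
        den = np.sqrt(np.sum(w * da ** 2) * np.sum(w * db ** 2)) + 1e-12
        return np.sum(w * da * db) / den

    ncc = corr(np.sum(w * ref_patch), np.sum(w * src_patch))   # weighted window means
    nccc = corr(ref_patch[r, r], src_patch[r, r])              # center-value means
    return 1.0 - max(ncc, nccc)                                # range [0, 2]

patch = np.random.default_rng(1).random((11, 11))
assert abs(photometric_cost(patch, patch)) < 1e-9   # identical patches -> cost ~ 0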
S5: selecting, in a checkerboard fashion, the plane hypothesis π_pj corresponding to the lowest-cost point in each of the eight directions in the neighborhood of the center point as a candidate plane hypothesis;
Step S5 specifically comprises:
For the plane hypothesis π_p in I_ref, the photometric consistency costs e_p^j(p, π_p) under the different views I_src_j are aggregated. The number of source images I_src_j for which e_p^j(p, π_p) is less than 2 is counted and denoted N. If N is greater than zero, the cost of π_p is:

e_p(p, π_p) = (1/N) Σ_{j: e_p^j(p,π_p) < 2} e_p^j(p, π_p)

If N equals zero:

e_p(p, π_p) = 2

In each of the eight candidate regions (the four directions up, down, left and right at near range, and the same four directions at far range), the point with the smallest e_p(p, π_p) is taken as the candidate hypothesis point of that region, yielding eight candidate hypothesis points, as sketched below.
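A sketch of the per-pixel photometric aggregation and of one possible checkerboard neighborhood; the far-neighbor offset of 3 pixels is an illustrative assumption, since the patent does not state the exact sampling pattern.

```python
import numpy as np

def aggregate_photometric(per_view_costs: np.ndarray) -> float:
    """Average e_p^j over the views where it is informative (< 2);
    return 2 when no view matches (step S5)."""
    valid = per_view_costs[per_view_costs < 2.0]
    return float(valid.mean()) if valid.size else 2.0

# Hypothetical checkerboard neighborhood: the four axis directions at near
# range (offset 1) and at far range (offset 3; the exact far offset is an
# illustrative assumption).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1),
           (-3, 0), (3, 0), (0, -3), (0, 3)]

def candidate_points(cost_map: np.ndarray, p):
    """Return up to eight candidate hypothesis points around pixel p,
    one representative per direction in this simplified sketch."""
    h, w = cost_map.shape
    return [(p[0] + dy, p[1] + dx) for dy, dx in OFFSETS
            if 0 <= p[0] + dy < h and 0 <= p[1] + dx < w]
```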
S6: performing pixel-level view selection using the photometric loss and calculating the weights of the different views;
Step S6 specifically comprises:
For the m source images and the eight candidate plane hypotheses, calculating the cost loss yields a cost matrix of size 8 × m:

A = (a_{i,j}) ∈ R^{8×m}, a_{i,j} = e_p(p, π_{n_i})

where π_{n_i} denotes the i-th of the eight plane hypotheses and j indicates dense matching between I_ref and the j-th source image I_src (dense matching differs from feature matching: feature matching needs to find the same feature points in the pictures, whereas dense matching tries to find, for every pixel point in the reference image, the corresponding pixel point in the other source images). When a_{i,j} is less than the threshold τ(t), the point is judged a good point and added to S_g; when a_{i,j} is greater than τ_1, it is judged a bad point, t denoting the current iteration number. For a particular view src_j, i.e. the j-th column of the matrix A: if the number of good points is greater than 2 and the number of bad points is less than 3, this view is used for multi-view stereo matching (multi-view stereo matching is in fact a set of binocular stereo matchings: the reference image is densely matched pairwise with each of the other source images), and the weight of its cost is computed from the good points of the column, where |S_g(j)| denotes the number of good points of src_j and σ_v is a parameter adjusting the weights, its value set according to experiment and experience.
If the number of good points is less than 2 and the number of bad points is less than 3, the view is still used for multi-view stereo matching, and the weight of its cost value is:

ω(src_j) = τ(t)

If the number of bad points is greater than 3, the view is not used for multi-view stereo matching and its weight is 0. A sketch of this selection rule follows.
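The selection rule can be sketched as below. The decreasing schedule for τ(t) and the exponential weight expression are assumptions (the patent only states that the weight depends on |S_g(j)| and is tuned by σ_v); the good/bad counting follows the text above.

```python
import numpy as np

def view_weights(A: np.ndarray, t: int, tau0: float = 1.2, tau1: float = 1.2,
                 sigma_v: float = 0.28) -> np.ndarray:
    """Pixel-level view selection over the 8 x m cost matrix A (step S6)."""
    tau_t = tau0 * 0.9 ** t                     # assumed decreasing threshold tau(t)
    weights = np.zeros(A.shape[1])
    for j in range(A.shape[1]):
        good = A[:, j] < tau_t                  # good points S_g(j)
        bad = A[:, j] > tau1                    # bad points
        if good.sum() > 2 and bad.sum() < 3:
            mean_cost = A[good, j].mean()
            weights[j] = (good.sum() / 8.0) * np.exp(-mean_cost ** 2 /
                                                     (2 * sigma_v ** 2))  # assumed form
        elif bad.sum() < 3:
            weights[j] = tau_t                  # fallback weight omega = tau(t)
        # else: weight stays 0 and the view is not used
    return weights
```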
S7: calculating the photometric consistency cost, depth consistency cost and normal vector consistency cost of the binocular stereo matching, and weight-averaging the costs of the multi-view images according to the weights of S6;
Step S7 specifically comprises the following steps:
The depth consistency cost between the multiple views is calculated. Both NCC and NCCC compute similarity from color information alone; in addition, the geometric relation between the depth values corresponding to the multi-view images can be used to improve the accuracy of dense matching. First, the three-dimensional coordinate of a pixel point in I_ref is computed from its two-dimensional coordinate through the depth and the camera parameters, namely:

X_ref(p) = R_ref^T ( d · K_ref^{-1} p̃ ) + c_ref

Then, using the camera parameters of I_src, the coordinates of the imaging point of the three-dimensional point X_ref(p) in I_src are calculated, namely p_src = P_src X_ref(p), where P_src = K_src [R_src | t] = K_src [R_src | −R_src c_src]. A similar operation is performed on the two-dimensional coordinate p_src to calculate its corresponding point x in I_ref, and the distance between x and p is used as the depth cost function of dense matching to improve the matching accuracy. The depth cost is calculated as:

e_d^j(p, π_p) = min( ||x − p||_2, δ )

When occlusion occurs among the multiple views, the depth cost becomes very large; a truncation threshold δ is therefore introduced to make the cost function more robust.
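A sketch of this forward-backward reprojection check, under the same pose convention as above; depth_src is a hypothetical lookup into the source depth map (bilinear interpolation of that map is one natural choice).

```python
import numpy as np

def project(K, R, c, X):
    """World point X -> pixel, pose convention x = K R (X - c)."""
    x = K @ (R @ (X - c))
    return x[:2] / x[2]

def backproject(K, R, c, p, depth):
    """Pixel p at the given depth -> world point."""
    ray = np.linalg.inv(K) @ np.array([p[0], p[1], 1.0])
    return R.T @ (depth * ray) + c

def depth_consistency_cost(cam_ref, cam_src, p, d_ref, depth_src, delta=1.0):
    """Forward-backward reprojection distance, truncated at delta (step S7.1).
    `cam_*` are (K, R, c) tuples; `depth_src` maps a source pixel to a depth."""
    X = backproject(*cam_ref, p, d_ref)
    p_src = project(*cam_src, X)
    X_back = backproject(*cam_src, p_src, depth_src(p_src))
    p_back = project(*cam_ref, X_back)
    return min(float(np.linalg.norm(p_back - np.asarray(p, float))), delta)
```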
The normal vector loss term is then calculated, yielding the normal vector consistency cost e_n^j(p, π_p), which penalizes the deviation between the normal vector hypothesized at p and the normal vector hypothesized at its corresponding point in I_src_j.
All cost terms are integrated to obtain the composite cost, namely:

e_j(p, π_p) = e_p^j(p, π_p) + λ_d e_d^j(p, π_p) + λ_n e_n^j(p, π_p)

where λ_d and λ_n are the weights adjusting the depth error and the normal vector error.
Cost aggregation is performed on the composite cost values of the multiple views according to the weight of each view in step S6, that is:

e(p, π_{n_i}) = Σ_{j=1}^{m} ω(src_j) e_j(p, π_{n_i}) / Σ_{j=1}^{m} ω(src_j)

where ω(src_j) is the weight of each view in S6, e_j(p, π_{n_i}) is the sum of the costs between the point p in the reference image I_ref and the corresponding point in I_src_j, i = 1, 2, …, 8, j is a positive integer not greater than m, π_{n_i} denotes the i-th of the eight plane hypotheses, and j indicates dense matching between I_ref and the j-th source image I_src_j;
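Combining the three terms and aggregating over views can be sketched in a few lines; the additive composite form follows the description of λ_d and λ_n as weights on the two extra error terms, and the normalized weighted average is an assumption.

```python
import numpy as np

def composite_cost(e_photo, e_depth, e_normal, lambda_d=0.8, lambda_n=0.42):
    """Per-view composite cost e^j = e_p^j + lambda_d*e_d^j + lambda_n*e_n^j."""
    return e_photo + lambda_d * e_depth + lambda_n * e_normal

def aggregate_views(costs: np.ndarray, weights: np.ndarray) -> float:
    """Normalized weighted average of the per-view composite costs (S7.4)."""
    wsum = weights.sum()
    return float((weights * costs).sum() / wsum) if wsum > 0 else 2.0
```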
S8: selecting the plane hypothesis with the lowest comprehensive cost as the new plane hypothesis of the pixel point p;
Step S8 specifically comprises:
If the smallest value of e(p, π_{n_i}) is less than e(p, π_p), and the depth of the point corresponding to π_{n_i} lies within the interval (d_min, d_max), then π_{n_i} replaces π_p, namely:

π_p ← argmin_{π ∈ S_π} e(p, π)
S9: finding a lower-cost plane hypothesis through random perturbation and updating the plane hypothesis.
Step S9 specifically comprises:
The plane hypothesis π_p = (n_x, n_y, n_z, d) is perturbed. Let d_perturb ~ U(d − γ, d + γ), where γ is the perturbation amplitude. A random depth d_rand and a random normal vector n_rand are additionally drawn in the same manner as the initialization of S3, with q_1 ~ U(−1, 1) and q_2 ~ U(−1, 1) subject to

q_1² + q_2² < 1

The perturbed normal vector is obtained as

n_perturb = R n

where R is the rotation matrix obtained by rotating the vector about the x, y and z axes by the angles θ_x, θ_y and θ_z; to speed up the computation, the three rotations are combined into a single matrix R.
The new normal vectors and depths obtained by perturbation and random sampling are combined to form new plane hypotheses:

π_p1 = (n, d_rand),
π_p2 = (n_rand, d),
π_p3 = (n_perturb, d_rand),
π_p4 = (n_perturb, d),
π_p5 = (n, d_perturb)

The composite cost corresponding to each new plane hypothesis is calculated; together with the initial π_p, the plane hypothesis with the least cost is selected as the new π_p, completing the checkerboard hypothesis propagation. A sketch of the candidate generation follows.
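A sketch of the candidate generation of step S9; the small-angle (linearized) rotation matrix and the range of the fully random depth draw are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def small_rotation(theta: float = 0.04 * np.pi) -> np.ndarray:
    """Single matrix combining small rotations about x, y, z; the
    small-angle linearization is an assumption."""
    tx, ty, tz = rng.uniform(-theta, theta, size=3)
    return np.array([[1.0, -tz,  ty],
                     [ tz, 1.0, -tx],
                     [-ty,  tx, 1.0]])

def perturb_candidates(n: np.ndarray, d: float, gamma: float = 0.2):
    """Build the five candidate hypotheses pi_p1..pi_p5 of step S9."""
    d_perturb = rng.uniform(d - gamma, d + gamma)
    d_rand = rng.uniform(0.5 * d, 1.5 * d)            # assumed random-depth range
    n_rand = rng.standard_normal(3)
    n_rand /= np.linalg.norm(n_rand)                  # random unit normal
    n_pert = small_rotation() @ n
    n_pert /= np.linalg.norm(n_pert)
    return [(n, d_rand), (n_rand, d), (n_pert, d_rand),
            (n_pert, d), (n, d_perturb)]
```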
S10, repeatedly executing the steps S4-S9 four times under the current scale, wherein the comprehensive cost is gradually reduced along with the increase of the iteration times;
S11: upsampling the optimal hypothesis of the current scale by the joint bilateral upsampling method to serve as the initialization result of the plane hypothesis at the next scale, and scaling up the camera intrinsic parameters;
Step S11 specifically includes:
The obtained plane hypotheses are jointly bilaterally upsampled. Let o be the point of the low-scale image corresponding to a pixel point p of the upsampled image, and let k be the upsampling ratio. Every point p_i in a k × k window centered on o is traversed, and the spatial similarity is calculated:

S_i = exp( − ||o − p_i||_2 / (2σ_s²) )

together with the color similarity:

C_i = exp( − |C_{p_i} − C_o| / (2σ_c²) )

where C_i is the color similarity and C_{p_i} and C_o are the pixel values of the points p_i and o, respectively. The depth of the point p is then:

d_p = Σ_i S_i C_i d_i / Σ_i S_i C_i

where d_i is the depth value corresponding to the point p_i; the three normal-vector components of the point p are calculated in the same way, e.g.:

n_x(p) = Σ_i S_i C_i n_{x_i} / Σ_i S_i C_i

where n_{x_i} is the x-component of the normal vector corresponding to p_i. The offsets of the focal length and the optical center on the imaging plane in the camera intrinsic parameters are then multiplied by the magnification.
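A direct (unoptimized) sketch of this joint bilateral upsampling for a single-channel map; the kernel exponents mirror the similarity terms above and are assumptions, and plain loops are used for clarity rather than speed.

```python
import numpy as np

def jbu_upsample(low_map, guide, k, sigma_s=1.0, sigma_c=20.5):
    """Joint bilateral upsampling of one low-scale channel (depth or a
    normal component) guided by the high-scale grayscale image (step S11)."""
    H, W = guide.shape
    h, w = low_map.shape
    out = np.zeros((H, W))
    r = k // 2
    for y in range(H):
        for x in range(W):
            oy, ox = min(y // k, h - 1), min(x // k, w - 1)   # low-scale point o
            num = den = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    py, px = oy + dy, ox + dx
                    if not (0 <= py < h and 0 <= px < w):
                        continue
                    gy, gx = min(py * k, H - 1), min(px * k, W - 1)
                    s_i = np.exp(-np.hypot(dy, dx) / (2 * sigma_s ** 2))
                    c_i = np.exp(-abs(float(guide[y, x]) - float(guide[gy, gx]))
                                 / (2 * sigma_c ** 2))
                    num += s_i * c_i * low_map[py, px]
                    den += s_i * c_i
            out[y, x] = num / den
    return out
```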
S12: the cost calculation, hypothesis propagation, plane perturbation optimization and upsampling steps are continued until the current scale reaches the original scale of the image, i.e. the maximum scale; inconsistent depth estimates are removed according to the depth maps and normal vectors, and the point cloud is synthesized from the depth maps and camera parameters to obtain the reconstructed three-dimensional map.
In order to test the performance of the method, experiments were carried out on the self-collected unmanned-aerial-vehicle surface dataset ZIGUI and on the public three-dimensional reconstruction dataset ETH3D.
The ZIGUI dataset was collected by drone aerial photography; the photographed object is the earth's surface, with more than a hundred images of an area taken from different views, most regions in the images being strongly textured. In FIG. 3, the left image shows a screenshot of the reconstructed point cloud of a pipe scene in ETH3D, demonstrating the qualitative performance of the method on an indoor dataset; the right image shows the reconstructed point cloud of an area in the ZIGUI dataset, demonstrating the performance of the method on an outdoor drone dataset.
(1) Experimental parameters
The evaluation indexes in the quantitative experiments mainly comprise three: accuracy, completeness, and F-score.
Under a tolerance ε (2 cm by default), if a point G in the ground truth falls within a sphere of radius ε centered on some point P of the reconstructed point cloud, G is judged an interior point. Let the number of points in the ground truth be N_G and the number of interior points be N_in; completeness is then defined as:

completeness = N_in / N_G

Likewise, exchanging the roles of the point cloud and the ground truth: if a point P of the point cloud falls within a sphere of radius ε centered on some point G of the ground truth, P is judged an interior point. Let the number of points in the point cloud be N_P and the number of interior points be N_in; accuracy is then defined as:

accuracy = N_in / N_P

F-score is a comprehensive assessment of accuracy and completeness, defined as:

F-score = 2 · accuracy · completeness / (accuracy + completeness)
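The three indexes can be computed with nearest-neighbor queries, as sketched below using SciPy's cKDTree; this is an evaluation-side illustration, not part of the patented method.

```python
import numpy as np
from scipy.spatial import cKDTree

def point_cloud_scores(points: np.ndarray, gt: np.ndarray, eps: float = 0.02):
    """Completeness, accuracy and F-score at tolerance eps (default 2 cm).
    `points` and `gt` are (N, 3) arrays; nearest-neighbor distances stand in
    for the within-a-sphere test described above."""
    d_gt = cKDTree(points).query(gt)[0]       # each GT point to the cloud
    completeness = float((d_gt <= eps).mean())
    d_pc = cKDTree(gt).query(points)[0]       # each cloud point to the GT
    accuracy = float((d_pc <= eps).mean())
    f_score = 2 * accuracy * completeness / (accuracy + completeness + 1e-12)
    return completeness, accuracy, f_score
```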
after multiple times of parameter debugging of experiments, the hyper-parameter in the experiments is set as sigma s =1、σ c =20.5、σ v =0.28、γ=0.2、θ x =θ y =θ z =0.04π、τ 0 =1.2、σ 1 =0.5、σ t =20、λ d =0.8、λ n =0.42。
(2) Quantitative and qualitative three-dimensional reconstruction tests
Finally, the quantitative results on ETH3D are shown in Table 1; the method provided by the invention brings clear gains in both completeness and accuracy, and the F-score improves by about two points, significantly improving the quality of the reconstructed point cloud.
Qualitative experimental results are shown in fig. 3 and fig. 4, with the results of the baseline method on the left and the results of the present method on the right; fig. 5 shows the normal vector comparison between the invention and the conventional method. The quantitative experimental results obtained are shown in Table 1:
Method                                    Completeness   Accuracy   F-score
NCC                                       0.726445       0.903786   0.805469
Center-value cost                         0.736656       0.906098   0.812638
Center-value cost + normal vector cost    0.753447       0.904288   0.822005
In conclusion, whether in depth, normal vectors, or the finally generated point cloud, the experimental results of this method are better than those of the original method; the center-value normalized cross-correlation function and the normal vector consistency cost function are very helpful to improving the quality of the reconstructed point cloud.
In some embodiments, a three-dimensional reconstruction apparatus of a center-valued normalized cross-correlation function and normal vector loss is also provided, comprising the following modules:
and the data preprocessing module is used for down-sampling the original image and reducing the camera parameters.
And the plane hypothesis initializing module is used for randomly initializing the plane hypothesis corresponding to each pixel point on the minimum scale.
The luminosity consistency cost calculation module is used for calculating luminosity consistency costs among the multi-view pictures according to the plane hypothesis;
the pixel-level visual angle selection module is used for calculating the weights of different visual angles according to the luminosity consistency cost;
the hypothesis propagation module is used for selecting one hypothesis with the minimum comprehensive cost in the candidate plane hypotheses of the eight neighborhoods;
the plane perturbation module is used for trying to find a better plane hypothesis through random perturbation;
and the upsampling module is used for upsampling the low-scale plane hypotheses to the high scale through the joint bilateral upsampling algorithm. For example, for a low-scale picture of 400 × 400, each point corresponds to one plane hypothesis, so the plane hypotheses have the same 400 × 400 dimension as the picture; when the low-scale picture is upsampled, the plane hypotheses are regarded as a "multi-channel picture" and upsampled by the same method to obtain a higher-dimensional "multi-channel picture".
The method starts from the construction of a more accurate and reliable cost function, relying on the fact that the imaging points of a three-dimensional point under different views share the same normal vector and that their neighborhood pixel values have high similarity. First, the images are down-sampled and a plane hypothesis is randomly initialized for the down-sampled image; the quality of the plane hypothesis is evaluated by the cost function; a checkerboard propagation algorithm then propagates accurate plane hypotheses to their neighborhoods, followed by plane perturbation optimization; finally the image and the corresponding plane hypotheses are up-sampled and updated by further propagation and optimization until the image reaches its original size. When evaluating the quality of a plane hypothesis, the center-value normalized cross-correlation function evaluates the similarity between the center points of the two windows more accurately. Meanwhile, the normal vector loss among multiple views is introduced, which further alleviates the ambiguity of dense matching in weakly textured regions and improves the completeness and accuracy of the reconstructed point cloud.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third and the like do not denote any order, but rather the words first, second and the like may be interpreted as indicating any order.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (8)

1. A three-dimensional reconstruction method based on cross-correlation function and normal vector loss, characterized by comprising the following steps:
S1: acquiring multi-view images of a scene to be reconstructed and the corresponding camera parameters;
S2: down-sampling the images and scaling down the camera intrinsic parameters;
S3: adjusting the images to the minimum scale and establishing an initial plane hypothesis π_p in a random initialization manner;
S4: selecting one of the multi-view images to be reconstructed in turn as the reference image, performing pairwise dense matching between the reference image and the remaining images, and calculating the photometric consistency cost among the multi-view images using the center-value normalized cross-correlation function; the remaining images are the source images;
S5: selecting, in a checkerboard fashion, the lowest-cost point in each of the eight directions in the neighborhood of the pixel point p to be processed as a candidate point p_j; the plane hypothesis corresponding to each candidate point being denoted π_pj, the candidate plane hypotheses and the initial plane hypothesis π_p forming the candidate plane hypothesis set S_π;
S6: performing pixel-level view selection on the images using the photometric consistency cost, and calculating the weights of the cost values of the different views;
S7: forming an image pair from the reference image and a source image and performing binocular stereo matching; calculating the photometric consistency cost, depth consistency cost and normal vector consistency cost corresponding to each plane hypothesis in S_π, and weight-averaging them over the multi-view images according to the weights obtained in step S6 to obtain the comprehensive cost;
S8: selecting the plane hypothesis with the lowest comprehensive cost as the new plane hypothesis of the pixel point p;
S9: finding a lower-cost plane hypothesis through random perturbation and updating the new plane hypothesis of step S8;
S10: repeating steps S4-S9 four times at the current scale, the comprehensive cost decreasing gradually as the number of iterations increases;
S11: processing the optimal plane hypothesis at the current scale by the joint bilateral upsampling method to serve as the initialization result of the plane hypothesis at the next scale, and scaling up the camera intrinsic parameters;
S12: continuing the comprehensive cost calculation of steps S4-S7, the hypothesis propagation of step S8, the plane perturbation optimization of step S9, and the iteration and upsampling operations of steps S10-S11 until the current scale reaches the original scale of the image.
2. The three-dimensional reconstruction method based on cross-correlation function and normal vector loss according to claim 1, wherein said S4 comprises:
S4.1: for the multi-view image set I to be reconstructed, selecting one image from it in turn as the reference image I_ref, the remaining images being collectively denoted as the source images I_src, and performing pairwise dense matching between I_ref and I_src;
S4.2: calculating the homography matrix H:

H_j = K_src_j ( R_src_j R_ref^T + ( R_src_j (c_ref − c_src_j) n^T ) / dist ) K_ref^{-1}

where K_ref is the camera intrinsic matrix of I_ref, K_src_j is that of the j-th source image I_src_j, R denotes the camera rotation matrix of the corresponding image (R_src_j for the source image, R_ref^T being the transpose of the camera rotation matrix of the reference image), c is the column vector of the corresponding camera optical center in the world coordinate system (c_ref for the reference image, c_src_j for the source image), n^T is a row vector representing the normal vector, and dist is the distance from c_ref to the plane hypothesis;
S4.3: mapping, through the homography matrix H, all pixel points x_i in a fixed-size window centered on p in I_ref to pixel points y_i in I_src_j, i.e. y_i = H x_i;
S4.4: using the weights of the joint bilateral filtering algorithm as the weights of the pixels in the window, the weight calculation formula being:

w(x_i) = exp( − ||p − x_i||_2 / (2σ_s²) − |C_p − C_{x_i}| / (2σ_c²) )

where ||p − x_i||_2 denotes the L2 distance between the coordinates of x_i and p, |C_p − C_{x_i}| denotes the absolute value of the difference between the pixel values of the two points, and σ_s and σ_c are fixed parameters; the similarity of the pixel values in the two corresponding windows of I_ref and I_src_j is finally calculated by the weighted normalized cross-correlation function:

NCC_j(p, π_p) = Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − \bar{C}_{W_p})(C_{y_i} − \bar{C}_{W_{p'}}) / √( Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − \bar{C}_{W_p})² · Σ_{x_i ∈ W_p} w(x_i)(C_{y_i} − \bar{C}_{W_{p'}})² )

where j indicates that the corresponding source image is I_src_j, p is a pixel point in I_ref, π_p is the plane hypothesis of that point, W_p is an 11 × 11 window centered on p, C_p denotes the pixel value of the point, and \bar{C} denotes the mean of the pixel values within the window;
S4.5: improving the window mean in the NCC by replacing it with the pixel value of the window center, the result being named NCCC, with the calculation formula:

NCCC_j(p, π_p) = Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − C_p)(C_{y_i} − C_{p'}) / √( Σ_{x_i ∈ W_p} w(x_i)(C_{x_i} − C_p)² · Σ_{x_i ∈ W_p} w(x_i)(C_{y_i} − C_{p'})² )

where p' denotes the point to which p maps in I_src_j; the similarity results calculated by NCC and NCCC are compared, the higher value is selected as the similarity of the two points and rewritten in cost-function form, and the photometric consistency cost is calculated as:

e_p^j(p, π_p) = 1 − max( NCC_j(p, π_p), NCCC_j(p, π_p) )

where e_p^j(p, π_p) has a variation range of [0, 2].
3. The three-dimensional reconstruction method based on cross-correlation function and normal vector loss according to claim 2, wherein said S5 comprises:
S5.1: counting the number of source images I_src_j for which e_p^j(p, π_p) is less than 2, denoted N; if N is greater than zero, the photometric consistency cost of π_p is:

e_p(p, π_p) = (1/N) Σ_{j: e_p^j(p,π_p) < 2} e_p^j(p, π_p)

If N equals zero:

e_p(p, π_p) = 2

Eight candidate hypothesis points are selected within the neighborhood of the point p.
4. The three-dimensional reconstruction method based on cross-correlation function and normal vector loss according to claim 1, wherein said S6 comprises:
S6.1: for the m source images and the eight candidate plane hypotheses corresponding to the eight candidate points, calculating the cost loss yields a cost matrix of size 8 × m:

A = (a_{i,j}) ∈ R^{8×m}, a_{i,j} = e_p(p, π_{n_i}), i = 1, 2, …, 8

where j is a positive integer not greater than m, π_{n_i} denotes the i-th of the eight plane hypotheses, and j indicates dense matching between I_ref and the j-th source image I_src_j; when a_{i,j} is less than the threshold τ(t), the point is judged a good point and added to S_g; when a_{i,j} is greater than τ_1, it is judged a bad point, t denoting the current iteration number; for a particular view src_j, the weight of its cost value is computed from the good points of its column, where |S_g(j)| denotes the number of good points of src_j and σ_v is a parameter used to adjust the magnitude of the weights.
5. The method for three-dimensional reconstruction based on cross-correlation function and normal vector penalty of claim 4, wherein said S7 includes:
s7.1: calculating the depth consistency cost between multiple views:
first calculate I ref A certain pixel point in (1) corresponds to src Two-dimensional coordinates of the projection of (a) followed by inverting I src Coordinate points are projected to I ref Calculating the distance between a starting point and an end point as the depth consistency cost;
s7.2: and (3) calculating a normal vector loss term to obtain the consistent cost of the normal vector as follows:
Figure FDA0003671452040000044
s7.3: and integrating all the cost items to obtain a comprehensive cost, namely:
Figure FDA0003671452040000045
wherein λ d And λ n Respectively weights for adjusting the depth error and the normal vector error,
Figure FDA0003671452040000046
in order to achieve a consistent cost in terms of luminosity,
Figure FDA0003671452040000047
in order to achieve a consistent cost in depth,
Figure FDA0003671452040000048
the cost is consistent with the normal vector;
s7.4: perform cost aggregation according to the weight of each view from step S6 to obtain the composite cost of the multiple views:

$$e(p, \pi_{n_i}) = \sum_{j=1}^{m} \omega(src_j)\, e_j(p, \pi_{n_i})$$

wherein $\omega(src_j)$ is the weight of each view from S6, $e_j(p, \pi_{n_i})$ is the sum of the costs between point $p$ on the reference image $I_{ref}$ and the corresponding point on $I_{src_j}$, $i = 1, 2, \ldots, 8$, $j$ is a positive integer no greater than $m$, $\pi_{n_i}$ denotes the $i$th of the eight plane hypotheses, and $j$ indexes the dense matching between $I_{ref}$ and the $j$th source image $I_{src_j}$.
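A sketch of the forward-backward reprojection of S7.1 and the weighted aggregation of S7.4, assuming pinhole intrinsics, a relative pose $(R, t)$ from the reference camera to the source camera, and a per-pixel source depth map; the weighted average in `multi_view_cost` (rather than a plain weighted sum) is an assumption:

```python
import numpy as np

def depth_consistency_cost(p_ref, depth_ref, K_ref, K_src, R, t,
                           depth_src_map):
    """Distance between a reference pixel and its forward-backward
    reprojection (S7.1).  Image-bounds checks are omitted for brevity."""
    u, v = p_ref
    # lift the reference pixel to 3D with its hypothesised depth
    X = depth_ref * (np.linalg.inv(K_ref) @ np.array([u, v, 1.0]))
    # project into the source view
    x_src = K_src @ (R @ X + t)
    u2, v2 = x_src[:2] / x_src[2]
    # back-project the source pixel with the source view's own depth
    d_src = depth_src_map[int(round(v2)), int(round(u2))]
    X_back = d_src * (np.linalg.inv(K_src) @ np.array([u2, v2, 1.0]))
    # map back into the reference view and measure the drift
    x_ref = K_ref @ (R.T @ (X_back - t))
    u3, v3 = x_ref[:2] / x_ref[2]
    return float(np.hypot(u3 - u, v3 - v))

def multi_view_cost(e_matrix, weights):
    """e_matrix: 8 x m per-view composite costs e_j(p, pi_ni);
    weights: the m view weights omega(src_j) from S6.  Returns the
    eight aggregated costs e(p, pi_ni)."""
    w = np.asarray(weights, dtype=float)
    w = w / max(w.sum(), 1e-12)     # normalise so the cost keeps its scale
    return e_matrix @ w
```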
6. The three-dimensional reconstruction method based on the cross-correlation function and the normal vector loss of claim 5, wherein said S8 comprises: if the smallest value of $e(p, \pi_{n_i})$ is less than $e(p, \pi_p)$, and the depth of the point corresponding to that $\pi_{n_i}$ lies within the interval $(d_{min}, d_{max})$ between the minimum and maximum depths, then $\pi_{n_i}$ replaces $\pi_p$, namely:

$$\pi_p = \mathop{\arg\min}_{\pi_{n_i}} e(p, \pi_{n_i})$$
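A minimal sketch of the propagation rule of S8, assuming the eight neighbour hypotheses, their aggregated costs $e(p, \pi_{n_i})$, and the depths they induce at $p$ are already available:

```python
import numpy as np

def propagate(pi_p, cost_p, candidates, cand_costs, cand_depths,
              d_min, d_max):
    """Adopt the cheapest neighbour hypothesis if it beats the current
    one and its induced depth lies inside (d_min, d_max)."""
    i = int(np.argmin(cand_costs))
    if cand_costs[i] < cost_p and d_min < cand_depths[i] < d_max:
        return candidates[i]      # adopt the better neighbour hypothesis
    return pi_p                   # otherwise keep the current hypothesis
```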
7. The three-dimensional reconstruction method based on the cross-correlation function and the normal vector loss of claim 6, wherein said S9 comprises: perturb the plane hypothesis $\pi_p = (n_x, n_y, n_z, d)$ and compute the composite cost of step S7.4 for the perturbed plane hypothesis $\pi_{p'}$; if the composite cost of $\pi_{p'}$ is less than the composite cost of the plane hypothesis $\pi_p$, then $\pi_{p'}$ replaces $\pi_p$.
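A sketch of the perturbation step of S9; the perturbation magnitudes `scale_n` and `scale_d` are assumptions, since the claim does not fix them:

```python
import numpy as np

def perturb(pi_p, cost_fn, scale_n=0.1, scale_d=0.05, rng=None):
    """pi_p = (nx, ny, nz, d); cost_fn evaluates the composite cost of
    S7.4 for a hypothesis.  Keep the perturbed plane only if cheaper."""
    rng = np.random.default_rng() if rng is None else rng
    n = np.asarray(pi_p[:3], dtype=float) + scale_n * rng.standard_normal(3)
    n /= np.linalg.norm(n)                    # keep the normal unit length
    cand = np.array([*n, pi_p[3] + scale_d * rng.standard_normal()])
    return cand if cost_fn(cand) < cost_fn(pi_p) else pi_p
```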
8. A three-dimensional reconstruction device based on the cross-correlation function and normal vector loss, characterized by comprising the following modules:
the data preprocessing module is used for down-sampling the original images and scaling the camera parameters accordingly;
the initialization plane hypothesis module is used for adjusting the images to the smallest scale and then randomly initializing the plane hypothesis corresponding to each pixel point;
the photometric consistency cost calculation module is used for calculating the photometric consistency costs between the multi-view images according to the plane hypotheses;
the pixel-level view selection module is used for calculating the weights of the different views according to the photometric consistency costs;
the hypothesis propagation module is used for selecting, among the candidate plane hypotheses of the eight neighborhoods, the plane hypothesis with the smallest composite cost;
the plane perturbation module is used for attempting to find a better plane hypothesis through random perturbation;
and the upsampling module is used for upsampling the low-scale plane hypotheses through a joint bilateral upsampling algorithm.
CN202210606204.4A 2022-05-31 2022-05-31 Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss Pending CN114943776A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210606204.4A CN114943776A (en) 2022-05-31 2022-05-31 Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss

Publications (1)

Publication Number Publication Date
CN114943776A true CN114943776A (en) 2022-08-26

Family

ID=82909838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210606204.4A Pending CN114943776A (en) 2022-05-31 2022-05-31 Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss

Country Status (1)

Country Link
CN (1) CN114943776A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100315412A1 (en) * 2009-06-15 2010-12-16 Microsoft Corporation Piecewise planar reconstruction of three-dimensional scenes
CN112734915A (en) * 2021-01-19 2021-04-30 北京工业大学 Multi-view stereoscopic vision three-dimensional scene reconstruction method based on deep learning
CN113160390A (en) * 2021-04-28 2021-07-23 北京理工大学 Three-dimensional dense reconstruction method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG PING; WANG SHANDONG; HUANG JINPING; ZHOU MINGMING: "Research on Building Point Cloud Reconstruction Methods Based on SFM and CMVS/PMVS", Journal of Suzhou University of Science and Technology (Natural Science Edition), no. 03, 15 September 2015 (2015-09-15) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117092206A (en) * 2023-08-09 2023-11-21 国网四川省电力公司电力科学研究院 Defect detection method for cable lead sealing area, computer equipment and storage medium
CN117456114A (en) * 2023-12-26 2024-01-26 北京智汇云舟科技有限公司 Multi-view-based three-dimensional image reconstruction method and system
CN117456114B (en) * 2023-12-26 2024-04-30 北京智汇云舟科技有限公司 Multi-view-based three-dimensional image reconstruction method and system

Similar Documents

Publication Publication Date Title
CN110738697B (en) Monocular depth estimation method based on deep learning
CN106780590B (en) Method and system for acquiring depth map
CN106910242B (en) Method and system for carrying out indoor complete scene three-dimensional reconstruction based on depth camera
Long et al. Adaptive surface normal constraint for depth estimation
CN108932536B (en) Face posture reconstruction method based on deep neural network
CN114943776A (en) Three-dimensional reconstruction method and device based on cross-correlation function and normal vector loss
US20110176722A1 (en) System and method of processing stereo images
CN110517306B (en) Binocular depth vision estimation method and system based on deep learning
CN105205858A (en) Indoor scene three-dimensional reconstruction method based on single depth vision sensor
CN103106688A (en) Indoor three-dimensional scene rebuilding method based on double-layer rectification method
CN112132958A (en) Underwater environment three-dimensional reconstruction method based on binocular vision
CN110910437B (en) Depth prediction method for complex indoor scene
CN110197505B (en) Remote sensing image binocular stereo matching method based on depth network and semantic information
WO2018133119A1 (en) Method and system for three-dimensional reconstruction of complete indoor scene based on depth camera
CN116129037B (en) Visual touch sensor, three-dimensional reconstruction method, system, equipment and storage medium thereof
CN114429555A (en) Image density matching method, system, equipment and storage medium from coarse to fine
CN107679542B (en) Double-camera stereoscopic vision identification method and system
CN110910456A (en) Stereo camera dynamic calibration algorithm based on Harris angular point mutual information matching
CN116310131A (en) Three-dimensional reconstruction method considering multi-view fusion strategy
CN116958434A (en) Multi-view three-dimensional reconstruction method, measurement method and system
CN115471749A (en) Multi-view multi-scale target identification method and system for extraterrestrial detection unsupervised learning
CN111951339A (en) Image processing method for performing parallax calculation by using heterogeneous binocular cameras
CN113781311A (en) Image super-resolution reconstruction method based on generation countermeasure network
CN117830520A (en) Multi-view three-dimensional reconstruction method based on depth residual error and neural implicit surface learning
US20230177771A1 (en) Method for performing volumetric reconstruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination