CN110033514B - Reconstruction method based on rapid fusion of point-line features - Google Patents

Reconstruction method based on rapid fusion of point-line features

Info

Publication number
CN110033514B
CN110033514B (application CN201910267055.1A)
Authority
CN
China
Prior art keywords
line segment
point
matching
line
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910267055.1A
Other languages
Chinese (zh)
Other versions
CN110033514A (en)
Inventor
张元林 (Zhang Yuanlin)
赵君 (Zhao Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN201910267055.1A
Publication of CN110033514A
Application granted
Publication of CN110033514B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a reconstruction method based on rapid fusion of point-line features, which comprises the following steps. Video preprocessing: the video is cut into images and preprocessed (focal-length extraction, downsampling, and the like) to reduce reconstruction complexity. Point feature matching: point features are extracted and matched with the scale-invariant feature transform. Fast coarse-to-fine line matching: line features are extracted and described with the line segment detector, and image segment matching pairs are obtained by applying four steps to the segment descriptors: brute-force matching, motion estimation, Hamming-distance thresholding, and length screening. Point-line feature fusion: the final segment matching pairs are converted into pixel points, these are compared against the pixel coordinate positions of the existing point features, and after duplicate points are deleted the segment pixels are fused with the point features. Camera extrinsic pose and three-dimensional point cloud computation: the essential matrix is computed from the final image point-line matching pairs, the camera extrinsic pose is solved, the three-dimensional point cloud is obtained by triangulation, and the result is optimized by bundle adjustment.

Description

Reconstruction method based on rapid fusion of point-line features
Technical Field
The invention belongs to the field of computer-vision three-dimensional reconstruction, and particularly relates to a reconstruction method based on rapid fusion of point-line features.
Background
Image-based three-dimensional reconstruction from sparse point clouds is also known as structure from motion. In the traditional reconstruction pipeline, the camera pose and the sparse three-dimensional point cloud are computed mainly from point features. Current research falls into two categories.
One is incremental reconstruction. Image point features are first extracted, generally with the scale-invariant feature transform; the camera's extrinsic pose and the three-dimensional point cloud are solved from the point features extracted and matched between image pairs, and the result is refined by least-squares optimization.
The other is global reconstruction. Image point features are first extracted and matched, then a global pose computation is performed; the main improvement is that only a single global bundle-adjustment constraint is applied, which reduces reconstruction time. In either case, feature extraction and matching still rely on point features alone.
In sparse three-dimensional reconstruction, features occupy a central position: the result of feature extraction and matching directly determines the accuracy of the reconstructed pose, and with it the accuracy of the three-dimensional point cloud. Yet images are rich in line-segment information in addition to point features. In particular, when three-dimensional reconstruction is performed on images lacking texture, few point features are obtained, the generated three-dimensional point cloud is small, and the result is unreliable.
Accuracy is generally evaluated with the average reprojection error, that is, the distance between each observed image feature and the reprojection of its reconstructed three-dimensional point into the image.
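In standard notation (a conventional formulation given here for reference; the symbols are not taken from the patent itself), the average reprojection error over N reconstructed points is

    e_{\mathrm{reproj}} = \frac{1}{N} \sum_{i=1}^{N} \left\| \mathbf{x}_i - \pi\!\left( K \left[ R \mid t \right] \mathbf{X}_i \right) \right\|_2

where \mathbf{x}_i is the observed image feature, \mathbf{X}_i the corresponding triangulated three-dimensional point, K the camera intrinsic matrix, [R | t] the camera pose, and \pi the perspective projection (division by depth).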
Disclosure of Invention
The invention aims to extract the line-segment features of an image quickly during sparse three-dimensional reconstruction, so that point and line features jointly participate in the reconstruction; the result then carries line-segment information and the reconstruction is more robust.
The technical scheme of the invention is as follows:
A reconstruction method based on rapid fusion of point-line features comprises the following steps:
Step 1: video preprocessing. Capture a frame from the video every fixed number of frames and store it; parse the stored image's EXIF information to extract the pixel width and height of the image, the camera model, and the focal length; then check the image's pixel size, and downsample the image if the product of its width and height exceeds a threshold, reducing computational complexity.
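A minimal sketch of this step in Python with OpenCV and Pillow (the patent's own experiments use the EasyExif C++ library instead; the frame interval, output naming, and pixel threshold below are assumptions for illustration, not values fixed by the patent):

    import cv2
    from PIL import Image
    from PIL.ExifTags import TAGS

    FRAME_INTERVAL = 30        # keep one frame out of every 30 (assumed value)
    MAX_PIXELS = 3_000_000     # downsample when width * height exceeds this (assumed)

    def extract_frames(video_path, out_pattern="frame_{:04d}.jpg"):
        """Capture a frame every FRAME_INTERVAL frames and save it."""
        cap = cv2.VideoCapture(video_path)
        idx = saved = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % FRAME_INTERVAL == 0:
                cv2.imwrite(out_pattern.format(saved), frame)
                saved += 1
            idx += 1
        cap.release()
        return saved

    def preprocess(image_path):
        """Read EXIF (camera model, focal length) and downsample large images."""
        img = Image.open(image_path)
        exif = img.getexif()
        info = {TAGS.get(t, t): v for t, v in exif.items()}
        # The focal length usually sits in the Exif sub-IFD (tag 0x8769).
        info.update({TAGS.get(t, t): v for t, v in exif.get_ifd(0x8769).items()})
        w, h = img.size
        if w * h > MAX_PIXELS:                 # halve resolution to cut cost
            img = img.resize((w // 2, h // 2))
        return img, info.get("Model"), info.get("FocalLength")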
Step 2: point feature matching. Detect image point features with the scale-invariant feature transform (SIFT): detect the feature points of each image and generate feature descriptors, L1-normalize the resulting 128-dimensional descriptors, i.e., divide each value of the 128-dimensional vector by the sum of the vector, and then take the element-wise square root. Experiments show that the normalized point features improve matching accuracy.
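A minimal sketch of this step, assuming opencv-python 4.4+ for cv2.SIFT_create; the L1-normalize-then-square-root step matches the well-known RootSIFT scheme, and the ratio test in the matcher is common practice rather than something the patent states:

    import cv2
    import numpy as np

    def rootsift_features(gray):
        """SIFT detection followed by L1 normalization and element-wise sqrt."""
        sift = cv2.SIFT_create()
        kps, desc = sift.detectAndCompute(gray, None)
        if desc is not None:
            desc /= desc.sum(axis=1, keepdims=True) + 1e-7  # L1 normalization
            desc = np.sqrt(desc)                            # element-wise root
        return kps, desc

    def match_points(desc1, desc2, ratio=0.8):
        """Nearest-neighbour matching with a ratio test (assumed addition)."""
        bf = cv2.BFMatcher(cv2.NORM_L2)
        pairs = bf.knnMatch(desc1, desc2, k=2)
        return [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]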
Step 3: fast coarse-to-fine line matching. First, line feature extraction: detect line segments in each image with the line segment detector (LSD), computing values in two scale spaces during detection to improve the scale invariance of the line features; then sort the line features by response value, and extract and store the binary descriptors of the first scale space of the segment features.
Second, coarse-to-fine line matching. First perform brute-force matching: for every segment feature of the first image, search the Hamming distances to the descriptors of the second image and return the segment with the smallest distance; all closest-distance pairs form the preliminary segment-matching result. Then extract the midpoint positions of all segments and screen the segment midpoints of the two images with global motion estimation, quickly filtering out inconsistent matching pairs. Next, threshold the Hamming distance of the binary-descriptor matching results, excluding matching pairs whose distance is too large. Finally, perform segment length screening: compare the pixel lengths of each segment matching pair and delete inconsistent pairs.
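A minimal sketch of the coarse-to-fine cascade, assuming the binary segment descriptors (for example, LBD descriptors from OpenCV's line_descriptor contrib module) are already available as uint8 arrays and each segment is stored as its endpoints (x1, y1, x2, y2). The median-displacement test below is a simplified stand-in for the patent's motion-estimation screening; motion_tol and length_ratio are illustrative values, while the Hamming threshold of 25 is the empirical value the patent states:

    import cv2
    import numpy as np

    def match_lines(desc1, desc2, segs1, segs2,
                    hamming_thresh=25, motion_tol=30.0, length_ratio=0.5):
        # 1) Coarse stage: brute-force nearest neighbour on Hamming distance.
        bf = cv2.BFMatcher(cv2.NORM_HAMMING)
        matches = bf.match(desc1, desc2)

        def mids_lens(segs):
            p, q = segs[:, :2], segs[:, 2:]
            return (p + q) / 2.0, np.linalg.norm(p - q, axis=1)
        m1, l1 = mids_lens(segs1)
        m2, l2 = mids_lens(segs2)

        # 2) Motion-estimation screening on segment midpoints: a correct
        #    match should move roughly with the dominant image motion.
        disp = np.array([m2[mt.trainIdx] - m1[mt.queryIdx] for mt in matches])
        dominant = np.median(disp, axis=0)
        ok = np.linalg.norm(disp - dominant, axis=1) < motion_tol
        matches = [mt for mt, k in zip(matches, ok) if k]

        # 3) Hamming-distance threshold on the binary descriptors.
        matches = [mt for mt in matches if mt.distance <= hamming_thresh]

        # 4) Length screening: drop pairs whose pixel lengths disagree.
        return [mt for mt in matches
                if min(l1[mt.queryIdx], l2[mt.trainIdx]) >
                   length_ratio * max(l1[mt.queryIdx], l2[mt.trainIdx])]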
Step 4: point-line feature fusion. Convert the final segment matching pairs into matched pixel points, i.e., store each segment as pixels; compare each segment pixel against the pixel coordinate positions of the existing point features by computing the Manhattan distance between segment pixels and point pixels. If this value is 1, the segment feature is considered to contain the point feature, and the point is deleted from the segment pixels. After all duplicate points are deleted, all remaining pixels of the segment features are added to the point-feature pixels.
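A minimal sketch of the fusion step; rasterizing each matched segment by dense sampling is an implementation choice of this sketch, and the Manhattan-distance test uses the value 1 stated in the patent:

    import numpy as np

    def segment_pixels(seg):
        """Rasterize a segment (x1, y1, x2, y2) into integer pixel coords."""
        x1, y1, x2, y2 = seg
        n = int(max(abs(x2 - x1), abs(y2 - y1))) + 1
        xs = np.linspace(x1, x2, n).round().astype(int)
        ys = np.linspace(y1, y2, n).round().astype(int)
        return np.unique(np.stack([xs, ys], axis=1), axis=0)

    def fuse_features(line_pixels, point_pixels):
        """Drop segment pixels that duplicate point features, then merge."""
        # Manhattan distance from every segment pixel to every point feature.
        d = np.abs(line_pixels[:, None, :] - point_pixels[None, :, :]).sum(axis=2)
        duplicated = (d <= 1).any(axis=1)   # segment pixel contains a point
        return np.vstack([point_pixels, line_pixels[~duplicated]])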
Step 5: compute the camera pose and the three-dimensional point cloud with epipolar geometry. Compute the essential matrix from the final image feature matching pairs, solve the camera's extrinsic pose, obtain the three-dimensional point cloud by triangulation, and optimize the result with bundle adjustment.
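A minimal sketch of the two-view computation with standard OpenCV calls; the intrinsic matrix K and the fused, matched pixel arrays are assumed given, and bundle adjustment is omitted (in practice it would be run with a solver such as Ceres or scipy.optimize.least_squares):

    import cv2
    import numpy as np

    def two_view_reconstruct(pts1, pts2, K):
        """Essential matrix -> relative pose -> triangulated 3-D points."""
        E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                          method=cv2.RANSAC, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at origin
        P2 = K @ np.hstack([R, t])
        Xh = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)  # homogeneous, 4 x N
        return R, t, (Xh[:3] / Xh[3]).T                     # Euclidean N x 3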
Compared with the prior art, the invention has the following advantages:
1. The core idea of the invention is to match segment features quickly and let them participate in the three-dimensional reconstruction. The sparse reconstruction results show that more three-dimensional points are produced and that rich line segments appear in the reconstructed details.
2. Because the point features are normalized and the segment features are matched from coarse to fine, features are extracted accurately; in the bundle-adjustment stage, the reprojection error of the reconstruction based on point-line fusion is smaller and the result is more robust.
3. In terms of time, segment feature extraction and matching are fast, giving better time performance than traditional point feature matching.
Drawings
FIG. 1 is a flow chart of sparse three-dimensional reconstruction according to the present invention.
Detailed Description
Referring to FIG. 1: the fast point-line fusion process of the invention is shown in dashed lines in FIG. 1. As FIG. 1 shows, point features are first matched on the image sequence and their descriptors normalized; segment features are then matched from coarse to fine; finally the point and line features are fused and participate in the three-dimensional reconstruction.
The method performs sparse three-dimensional reconstruction on the fountain-P11 multi-view dataset of the Swiss Federal Institute of Technology. The experimental environment is a VMware Ubuntu 16.04 virtual machine configured with a 4-core processor and 8 GB of memory, without GPU acceleration. The three-dimensional reconstruction program is written with OpenCV 4.0. The experimental procedure is explained further below.
Step 1: image preprocessing. The dataset contains 11 time-sequence images of 3072 x 2048 pixels. The EXIF information of each image is extracted with the EasyExif library, which makes the camera information easy to obtain. The images are downsampled to 1536 x 1024; this reduces reconstruction time while hardly affecting the reconstruction result.
Step 2: point feature matching. Point features are extracted and matched on adjacent images. Taking the first two images as an example, scale-invariant feature transform (SIFT) detection finds 3717 point features in image 1 and 4327 in image 2. The descriptors are normalized and square-rooted, and nearest-neighbour matching finally yields 614 matching pairs for the first two images. Detection and matching take 0.95 seconds in total.
Step 3: fast coarse-to-fine line matching. Line segment detector (LSD) features are extracted from adjacent images: the first-scale features of each image are extracted and sorted by response value. Taking the first two images as an example, 2494 segment features are obtained in the 1st image and 2891 in the 2nd image.
Brute-force coarse matching of the segment features yields all candidate pairs; for the first two images this gives 2494 matched pairs. Motion-estimation filtering of the segment midpoints then quickly deletes inconsistent pairs; with the matching threshold set to 25, 513 initial matching pairs remain. The Hamming-distance and pixel-length checks then delete the remaining inconsistent matches, finally leaving 41 matching pairs. As the table below shows, segment feature extraction and matching are faster than point feature matching.
[Table: timing comparison of segment feature extraction and matching against point features; reproduced only as an image in the original, so its figures are not recoverable here.]
Step 4: point-line feature fusion. The finally obtained segment features are fused into the point features, again taking the first two images as an example. The line features are first converted into pixel points for storage: the 1st image has 553 segment pixel points, and, since the pixels correspond one to one, so does the 2nd image. The distance between each image's segment pixels and its point features is then checked, and pixels at Manhattan distance less than 1 are deleted: 9 pixel points are deleted from the 1st image and 15 from the 2nd, finally leaving 529 segment pixel pairs. Adding the segment feature points to the SIFT points gives 1143 pixel pairs of the first two images participating in the three-dimensional reconstruction.
Step 5: compute the camera pose and the three-dimensional point cloud. Taking the first three images as an example: without segment features, 1343 three-dimensional points are obtained with an average back-projection error of 0.24 in 1.58 seconds; after feature fusion, 2588 three-dimensional points are obtained with an average back-projection error of 0.27 in 18.55 seconds. As the following table shows, as more pictures are added, the number of points in the cloud grows substantially once the matched segment points are included, while the average back-projection error increases only slightly because the matched points are of high precision.
[Table: point-cloud count, average back-projection error, and runtime with and without fused segment features as the number of images grows; reproduced only as an image in the original, so its figures are not recoverable here.]
As the table above shows, extracting the image's segment features during sparse reconstruction and fusing the robust segment feature points with the point feature points preserves the image's line-segment details, increases the number of points in the cloud, and improves the accuracy of the reconstruction result. Experiments show that three-dimensional reconstruction based on point-line feature fusion preserves straight-segment features and produces richer reconstruction detail.

Claims (2)

1. A reconstruction method based on rapid fusion of point-line features, comprising the following steps:
Step 1: video preprocessing: capturing a frame from the video every fixed number of frames and storing the captured image; parsing the stored image's EXIF information to extract the pixel width and height of the image, the camera model, and the focal length; and checking the pixel size of the image, the image being downsampled to reduce computational complexity if the product of its width and height exceeds a threshold;
Step 2: point feature matching: detecting image point features with the scale-invariant feature transform (SIFT), detecting the feature points of an image and generating feature descriptors, and performing L1 normalization on the resulting 128-dimensional descriptors, namely dividing each value of the 128-dimensional vector by the sum of the vector and then taking the element-wise square root; experiments show that the normalized point features improve matching accuracy;
Step 3: fast coarse-to-fine line matching: first, extracting line features, detecting line-segment-detector features on each image, and generating binary descriptors from the extracted features; second, matching the line features from coarse to fine by applying brute-force matching, motion estimation, and Hamming-distance thresholding to the segment descriptors; and finally, performing segment length screening by comparing the pixel lengths of the segment matching pairs and deleting inconsistent pairs;
extracting the line features comprises: detecting line segments on each image with the line segment detector (LSD), computing values of two scale spaces during detection to improve the scale invariance of the line features, sorting the line features by response value, and extracting and storing the binary descriptors of the first scale space of the segment features;
matching the line features from coarse to fine comprises: first performing brute-force matching, namely searching, for every segment feature descriptor of the first image, the Hamming distances to the descriptors of the second image and returning the segment with the smallest distance, all closest-distance pairs forming the preliminary result of segment matching; extracting the midpoint positions of all segments and screening the segment midpoints of the two images with grid motion estimation to quickly filter out inconsistent matching pairs; and then thresholding the Hamming distance of the binary-descriptor matching results, excluding matching pairs whose distance is too large, the distance threshold being set to 25 as an empirical value in the experiments;
Step 4: point-line feature fusion: converting the final segment matching pairs into pixel points, comparing them against the pixel coordinate positions of the existing point features, deleting from the segment pixels the points that duplicate point-feature positions, and finally adding all pixels of the segment features to the point-feature pixels;
Step 5: computing the camera pose and the three-dimensional point cloud with epipolar geometry: computing the essential matrix from the final image feature matching pairs, solving the camera pose, obtaining the three-dimensional point cloud by triangulation, and optimizing the result by bundle adjustment.
2. The reconstruction method based on rapid fusion of point-line features according to claim 1, wherein step 4 comprises: converting the final segment matching pairs into matched pixel points, namely converting each segment into pixels for storage; comparing each segment pixel against the pixel coordinate positions of the existing point features, namely computing the Manhattan distance between segment pixels and point pixels; if this value is 1, considering that the segment feature contains the point feature and deleting the point from the segment pixels; and after all duplicate points are deleted, adding all pixels of the final segment features to the point-feature pixels.
CN201910267055.1A 2019-04-03 2019-04-03 Reconstruction method based on rapid fusion of point-line features Active CN110033514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910267055.1A CN110033514B (en) 2019-04-03 2019-04-03 Reconstruction method based on rapid fusion of point-line features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910267055.1A CN110033514B (en) 2019-04-03 2019-04-03 Reconstruction method based on rapid fusion of point-line features

Publications (2)

Publication Number Publication Date
CN110033514A CN110033514A (en) 2019-07-19
CN110033514B (en) 2021-05-28

Family

ID=67237385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910267055.1A Active CN110033514B (en) 2019-04-03 2019-04-03 Reconstruction method based on rapid fusion of point-line features

Country Status (1)

Country Link
CN (1) CN110033514B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110473258B (en) * 2019-07-24 2022-05-13 西北工业大学 Monocular SLAM system initialization algorithm based on point-line unified framework
CN111461141B (en) * 2020-03-30 2023-08-29 歌尔科技有限公司 Equipment pose calculating method and device
CN111783800A (en) * 2020-07-09 2020-10-16 中国科学院空天信息创新研究院 Line feature description and matching method, system, device and medium
CN114170366B (en) * 2022-02-08 2022-07-12 荣耀终端有限公司 Three-dimensional reconstruction method based on dotted line feature fusion and electronic equipment
CN116309386B (en) * 2023-02-27 2024-02-20 南京航空航天大学 Space target visible light image line segment detection method under complex illumination condition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106909877B (en) * 2016-12-13 2020-04-14 浙江大学 Visual simultaneous mapping and positioning method based on dotted line comprehensive characteristics
CN109558879A (en) * 2017-09-22 2019-04-02 华为技术有限公司 A kind of vision SLAM method and apparatus based on dotted line feature

Also Published As

Publication number Publication date
CN110033514A (en) 2019-07-19

Similar Documents

Publication Publication Date Title
CN110033514B (en) Reconstruction method based on rapid fusion of point-line features
Zhang et al. Cat-det: Contrastively augmented transformer for multi-modal 3d object detection
US12020474B2 (en) Image processing apparatus, image processing method, and non-transitory computer-readable storage medium
Uittenbogaard et al. Privacy protection in street-view panoramas using depth and multi-view imagery
CN108960211B (en) Multi-target human body posture detection method and system
CN107481279B (en) Monocular video depth map calculation method
WO2015139574A1 (en) Static object reconstruction method and system
KR102095842B1 (en) Apparatus for Building Grid Map and Method there of
Lu et al. Deep learning for 3d point cloud understanding: a survey
US11367195B2 (en) Image segmentation method, image segmentation apparatus, image segmentation device
CN112967341B (en) Indoor visual positioning method, system, equipment and storage medium based on live-action image
CN110827312B (en) Learning method based on cooperative visual attention neural network
CN112200056A (en) Face living body detection method and device, electronic equipment and storage medium
CN113486887A (en) Target detection method and device in three-dimensional scene
CN115393519A (en) Three-dimensional reconstruction method based on infrared and visible light fusion image
CN115239882A (en) Crop three-dimensional reconstruction method based on low-light image enhancement
CN113989744A (en) Pedestrian target detection method and system based on oversized high-resolution image
WO2019100348A1 (en) Image retrieval method and device, and image library generation method and device
US11256949B2 (en) Guided sparse feature matching via coarsely defined dense matches
Tan et al. Mbdf-net: Multi-branch deep fusion network for 3d object detection
CN113298871B (en) Map generation method, positioning method, system thereof, and computer-readable storage medium
CN117115359B (en) Multi-view power grid three-dimensional space data reconstruction method based on depth map fusion
CN112102404B (en) Object detection tracking method and device and head-mounted display equipment
CN106909936B (en) Vehicle detection method based on double-vehicle deformable component model
CN114820987A (en) Three-dimensional reconstruction method and system based on multi-view image sequence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant