CN107220932B

CN107220932B - Panoramic image splicing method based on bag-of-words model

Info

Publication number: CN107220932B
Application number: CN201710251608.5A
Authority: CN
Inventors: 张国山; 张玉龙; 黄伟杰
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2017-04-18
Filing date: 2017-04-18
Publication date: 2020-03-20
Anticipated expiration: 2037-04-18
Also published as: CN107220932A

Abstract

The invention relates to a panoramic image stitching method based on a bag-of-words model, comprising the following steps: preparing a trained ORB bag-of-words tree; extracting ORB features from an image i in the image data set to be stitched, taking k as the key frame most recently inserted into the stitching structure graph, searching features between the two images through the forward index, and establishing the feature correspondences between the two images; obtaining the homography matrix between the two images; minimizing the reprojection error and rejecting outliers with a random-sampling maximum likelihood estimation algorithm to obtain the inlier set of the image pair; calculating the overlap percentage of each image through a boundary matrix; calculating the overlap between the image pair and, when image i is added to the stitching structure graph as a key frame, linking it to the previous key frame through a homography matrix; detecting loops; forming loops; optimizing the homography matrices; and fusing the stitching structure graph.

Description

Panoramic image splicing method based on bag-of-words model

Technical Field

The invention belongs to the field of computer vision and digital image processing.

Background

Image stitching is an important branch of computer vision and digital image processing: it seamlessly joins two or more partially overlapping images into a single image of higher resolution or wider viewing angle. The two most critical stages of image stitching are image registration and image fusion. Image fusion is mature, and existing fusion methods differ little in running time or quality. Image registration, however, directly determines the speed and success rate of stitching, so it remains the focus of current research. Common registration methods describe images with SIFT or SURF features, which are rotation-invariant, scale-invariant and robust to illumination changes. Current research concentrates on binary features such as BRIEF, ORB and BRISK, because they need less storage and are faster to compute. Most existing stitching methods still compare images frame by frame. This works well when there are few images, but as the number of images grows it becomes unsuitable for applications with strict real-time requirements.

Disclosure of Invention

The invention aims to provide a bag-of-words model-based panoramic image stitching method which is suitable for high real-time requirements.

A panoramic image splicing method based on a bag-of-words model comprises the following steps:

1) preparing a trained ORB bag-of-words tree;

2) extracting ORB features from an image i in the image data set to be stitched; descending each feature descriptor extracted from image i level by level from the root node of the ORB bag-of-words tree to a leaf node according to Hamming distance; after all features have been traversed, storing a forward index of all features of the image in the bag-of-words tree; taking k as the key frame most recently inserted into the stitching structure graph; and searching features between the two images through the forward index to establish the feature correspondences between them;

3) obtaining the homography matrix ${}^{k}H_{i}$ between the two images from the obtained feature correspondences;

4) minimizing the reprojection error and rejecting outliers with a random-sampling maximum likelihood estimation algorithm, so as to obtain the inlier set of the image pair;

5) calculating a boundary matrix of the image from the obtained inlier set, and calculating the overlap percentage of the image through the boundary matrix;

6) calculating the overlap between the image pair from the calculated overlap percentages, ${}^{k}O_{i} = \min(O_{k}, O_{i})$; if the number of inliers is greater than the threshold $\tau_{in}$ and the overlap ${}^{k}O_{i}$ is greater than the threshold $\tau_{ov}$, image i is saved as a potential key frame; if the number of inliers and the overlap obtained between the next image in the data set to be stitched and key frame k do not satisfy both thresholds, image i is added to the stitching structure graph as a key frame;

7) when image i is added to the stitching structure graph as a key frame, it must be linked to the previous key frame through a homography matrix; that is, image i becomes the (k+1)-th key frame of the stitching structure graph, so ${}^{k}H_{i}$ is written ${}^{k}H_{k+1}$, and the homography matrix ${}^{M}H_{k+1}$ of the (k+1)-th key frame is expressed as ${}^{M}H_{k+1} = {}^{M}H_{k}\,{}^{k}H_{k+1}$, where ${}^{M}H_{k}$ is the homography matrix of the k-th key frame and M is a common reference key frame defined to ensure alignment between key frames;

8) detecting loops by retrieving and matching the newly added key frame image i against all earlier key frame images in the stitching structure graph: the ORB features of the newly added key frame image enter the bag-of-words model and descend level by level from the root node of the bag-of-words tree to the leaf nodes according to Hamming distance; the frequency tf with which each leaf node, i.e. each word of the bag-of-words tree, appears in image i is calculated; all features of the newly added key frame image are looked up in the bag-of-words tree to obtain the value of each word, and these values form the description vector of the image; let the description vectors of the newly added key frame image and of a previous key frame image matched against it be $v_{1}$ and $v_{2}$ respectively; the similarity score of the two key frame images is expressed as follows:

[similarity score formula not reproduced in the source text]

the higher the score, the more similar the two key frame images; the similarity between the newly added key frame image and each previous key frame image is thus obtained, giving a list of key frames ordered from high to low similarity, which are the key frames likely to form a loop with the newly added key frame image;

9) calculating, in the order of the key frame similarity list, the homography matrix between each listed key frame and the newly added key frame; if the number of inliers obtained through a homography matrix exceeds a fixed threshold, the corresponding connection becomes part of the stitching structure graph, i.e. a loop is formed;

10) optimizing the homography matrices: bundle adjustment is used to reduce the errors introduced by the homography matrices, with the error function $\varepsilon$ as follows:

[error function not reproduced in the source text]

where the two point symbols of the error function represent corresponding feature points in the two images, and $R({}^{M}H_{i})$ is defined on the homography matrix ${}^{M}H_{i}$; to reduce the influence of outliers, a Huber loss function is introduced, $h(\varepsilon) = |\varepsilon|^{2}$ if $|\varepsilon| \le 1$ and $h(\varepsilon) = 2|\varepsilon| - 1$ if $|\varepsilon| > 1$; the resulting system of nonlinear equations is solved with a nonlinear least-squares algorithm, thereby adjusting and optimizing the homography matrices;

11) fusing the stitching structure graph.
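The key frame selection rule of steps 5) and 6) can be sketched as follows. This is a minimal illustration only, not the implementation of the invention: the names `passes_thresholds`, `select_keyframes` and the callable `measure` are hypothetical, and the sketch assumes that the first image seeds the graph, that a frame which cannot be linked to any key frame is dropped, and that a trailing candidate at the end of the sequence is left pending. None of these points is specified in the text.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (number of inliers, overlap percentage O_k, overlap percentage O_i)
Measurement = Tuple[int, float, float]


def passes_thresholds(m: Measurement, tau_in: int, tau_ov: float) -> bool:
    """Test of step 6): inliers > tau_in and min(O_k, O_i) > tau_ov."""
    n_inliers, o_k, o_i = m
    return n_inliers > tau_in and min(o_k, o_i) > tau_ov


def select_keyframes(
    image_ids: Sequence[int],
    measure: Callable[[int, int], Measurement],
    tau_in: int,
    tau_ov: float,
) -> List[int]:
    """Return the ids promoted to key frames.

    `measure(k, i)` is a hypothetical callback that registers image i
    against key frame k (steps 2 to 5) and returns a Measurement.
    """
    if not image_ids:
        return []
    keyframes = [image_ids[0]]  # assumption: the first image seeds the graph
    candidate: Optional[int] = None
    for i in image_ids[1:]:
        if passes_thresholds(measure(keyframes[-1], i), tau_in, tau_ov):
            candidate = i  # still overlaps key frame k: potential key frame
            continue
        if candidate is not None:
            # The next frame no longer satisfies both thresholds, so the
            # last potential key frame is added to the graph.
            keyframes.append(candidate)
            candidate = None
            # Assumption: re-test the current frame against the new key frame.
            if passes_thresholds(measure(keyframes[-1], i), tau_in, tau_ov):
                candidate = i
        # Assumption: a frame that links to no key frame is simply dropped.
    return keyframes
```

With this rule a key frame is inserted only when the overlap with the previous key frame is about to be lost, which keeps the number of nodes in the stitching structure graph small.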

The main advantages and the characteristics of the invention are embodied in the following aspects:

1. At present, image registration in stitching algorithms is mostly based on SIFT or SURF feature descriptors. These features are scale- and rotation-invariant and robust to illumination changes, but extracting them takes too long for real-time operation, and frame-to-frame matching further increases registration time. The retrieval structure proposed by the invention is based on a bag-of-words model and uses ORB feature descriptors. Experiments show that this registration method markedly reduces running time while maintaining registration quality.

2. At present, most image stitching algorithms are single-threaded, and their stages are strongly sequential and tightly coupled. The algorithm of the invention can adopt a multi-threaded architecture in which different parts execute simultaneously, effectively shortening running time without degrading the stitching result.
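The multi-threaded arrangement described in item 2 might look like the following sketch, in which a key frame thread and a loop detection thread run concurrently and share one graph. It is an illustration under stated assumptions, not the architecture of the invention: `run_pipeline`, `is_keyframe` and `find_loops` are hypothetical names, only two of the parts are shown, frames are assumed to be unique hashable values, and the shared graph is reduced to two lists guarded by a lock.

```python
import queue
import threading
from typing import Callable, Dict, Iterable, List


def run_pipeline(
    frames: Iterable[int],
    is_keyframe: Callable[[int], bool],
    find_loops: Callable[[int, List[int]], List[int]],
) -> Dict[str, list]:
    """Run key frame selection and loop detection in two threads."""
    graph: Dict[str, list] = {"keyframes": [], "loops": []}
    lock = threading.Lock()          # protects the shared graph
    pending: queue.Queue = queue.Queue()  # key frames awaiting loop detection
    done = object()                  # sentinel marking the end of the stream

    def keyframe_worker() -> None:
        for frame in frames:
            if is_keyframe(frame):
                with lock:
                    graph["keyframes"].append(frame)
                pending.put(frame)
        pending.put(done)

    def loop_worker() -> None:
        while True:
            frame = pending.get()
            if frame is done:
                break
            with lock:
                # Only key frames inserted before this one are candidates.
                idx = graph["keyframes"].index(frame)
                earlier = list(graph["keyframes"][:idx])
            matches = find_loops(frame, earlier)
            with lock:
                graph["loops"].extend((m, frame) for m in matches)

    threads = [
        threading.Thread(target=keyframe_worker),
        threading.Thread(target=loop_worker),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return graph
```

An optimization thread and a fusion thread could be attached to the same shared graph in the same way; they are omitted here to keep the sketch short.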

Drawings

FIG. 1 is a flow chart of the multi-threaded image stitching algorithm of the invention based on the bag-of-words model;

FIG. 2 is a graph of image stitching of a Valldemossa dataset;

FIG. 3 is a topological structure diagram of a Valldemossa data set;

FIG. 4 is an image mosaic of an Odemar dataset;

FIG. 5 is a topological structure diagram of an Odemar dataset.

Detailed Description

The invention provides a multi-threaded panoramic image stitching technique based on a bag-of-words model, described in detail below with reference to an example and the accompanying drawings.

The overall framework of the algorithm is shown in FIG. 1. The system is divided into four parts that can run in parallel; the parallel design reduces the coupling among the parts and thereby the running time of the algorithm. The four parts are connected through a structure called the stitching structure graph, which is used to estimate the topology of the stitching environment and to coordinate the parts so as to ensure real-time performance.

The stitching structure graph is a key component of the method: it represents the topology of the stitching environment and unifies the operation of the different parts. The topology of the environment describes the images that take part in the stitching and the connections between them. In the invention the topology is modelled as an undirected graph in which the nodes represent the images selected for the final mosaic and the edges represent the overlap between them. The selected images are called key frames; they are the image frames from which the final mosaic is generated.
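As a rough illustration of this data structure, the sketch below stores key frames as nodes of an undirected graph and, for each node, the homography to the reference frame M obtained by chaining relative homographies as in step 7). The class `StitchingGraph` and its methods are hypothetical names, and the sketch assumes that the first key frame serves as the reference frame M (identity homography), which the text does not state.

```python
from typing import Dict, FrozenSet, List, Optional, Set

Matrix = List[List[float]]

IDENTITY: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul3(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
        for r in range(3)
    ]


class StitchingGraph:
    """Undirected graph: nodes are key frames, edges are overlaps."""

    def __init__(self) -> None:
        self.order: List[int] = []              # key frames in insertion order
        self.abs_h: Dict[int, Matrix] = {}      # key frame id -> M_H_k
        self.edges: Set[FrozenSet[int]] = set()  # unordered pairs of ids

    def add_keyframe(self, kf_id: int, rel_h: Optional[Matrix] = None) -> None:
        """Insert a key frame; rel_h is k_H_(k+1) w.r.t. the previous one."""
        if not self.order:
            # Assumption: the first key frame is the reference frame M.
            self.abs_h[kf_id] = [row[:] for row in IDENTITY]
        else:
            if rel_h is None:
                raise ValueError("rel_h is required after the first key frame")
            prev = self.order[-1]
            # M_H_(k+1) = M_H_k * k_H_(k+1)
            self.abs_h[kf_id] = matmul3(self.abs_h[prev], rel_h)
            self.edges.add(frozenset((prev, kf_id)))
        self.order.append(kf_id)

    def add_loop(self, a: int, b: int) -> None:
        """Add an edge between two existing key frames (a detected loop)."""
        if a not in self.abs_h or b not in self.abs_h:
            raise KeyError("both key frames must already be in the graph")
        self.edges.add(frozenset((a, b)))

    def neighbours(self, kf_id: int) -> List[int]:
        """Key frames that share an edge with kf_id, in ascending order."""
        return sorted(n for e in self.edges if kf_id in e for n in e if n != kf_id)
```

Storing edges as unordered pairs makes the graph undirected by construction, so an edge added by loop detection is visible from both of its key frames.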

The other parts of the system run concurrently with the construction of the stitching structure graph. The key frame part describes each input image, passes it through the bag-of-words retrieval structure, and decides whether it is a key frame and hence a component of the final stitching structure graph. The loop closure detection part, after detecting a matching image pair, links the current frame to the matched key frame to form a loop. The optimization part adjusts the homography matrices by bundle adjustment to reduce the errors caused by mismatches. The optimized stitching structure graph then enters the fusion part, which generates the final mosaic. The specific embodiment is as follows:

1) Constructing the bag-of-words tree. The DBoW2 library provides ORB and SIFT vocabularies that were trained offline on a large image database and are available for public use. The invention uses the trained ORB bag-of-words tree from the DBoW2 library.

2) Extracting ORB features from an image i in the image data set to be stitched, descending each feature descriptor extracted from image i level by level from the root node of the ORB bag-of-words tree to a leaf node according to Hamming distance, and, after all features have been traversed, storing a forward index of all features of the image in the bag-of-words tree. Let k be the key frame most recently inserted into the stitching structure graph; features are searched between the two images through the forward index to establish the feature correspondences between them;

3) obtaining the homography matrix ${}^{k}H_{i}$ between the two images from the obtained feature correspondences;

4) minimizing the reprojection error and rejecting outliers with a random-sampling maximum likelihood estimation algorithm, so as to obtain the inlier set of the image pair;

5) calculating a boundary matrix of the image from the obtained inlier set, and calculating the overlap percentage of the image through the boundary matrix;

6) calculating the overlap between the image pair from the calculated overlap percentages, ${}^{k}O_{i} = \min(O_{k}, O_{i})$; if the number of inliers is greater than the threshold $\tau_{in}$ and the overlap ${}^{k}O_{i}$ is greater than the threshold $\tau_{ov}$, image i is saved as a potential key frame; if the number of inliers and the overlap obtained between the next image in the data set to be stitched and key frame k do not satisfy both thresholds, image i is added to the stitching structure graph as a key frame;

7) when image i is added to the stitching structure graph as a key frame, it must be linked to the previous key frame through a homography matrix; the homography matrix ${}^{M}H_{k+1}$ of the (k+1)-th key frame is then expressed as ${}^{M}H_{k+1} = {}^{M}H_{k}\,{}^{k}H_{k+1}$, where M is a common reference key frame defined to ensure alignment between key frames, and the homography matrices ${}^{M}H_{k+1}$ and ${}^{M}H_{k}$ of every key frame in the stitching structure graph are all expressed relative to key frame M;

8) detecting loops by retrieving and matching the newly added key frame image i against all earlier key frame images in the stitching structure graph: the ORB features of the newly added key frame image enter the bag-of-words model and descend level by level from the root node of the bag-of-words tree to the leaf nodes according to Hamming distance; the frequency tf with which each leaf node, i.e. each word of the bag-of-words tree, appears in image i is calculated; each leaf node of the bag-of-words tree stores an inverted index, i.e. the IDs of the images in which the word appears together with the value of the word in each image's description vector; all features of the newly added key frame image are looked up in the bag-of-words tree to obtain the value of each word, and these values form the description vector of the image; let the description vectors of the newly added key frame image and of a previous key frame image matched against it be $v_{1}$ and $v_{2}$ respectively; the similarity score of the two key frame images is expressed as follows:

[similarity score formula not reproduced in the source text]

the higher the score, the more similar the two key frame images; the similarity between the newly added key frame image and each previous key frame image is thus obtained, giving a list of key frames ordered from high to low similarity, which are the key frames likely to form a loop with the newly added key frame image.

10) optimizing the homography matrices: the key frame images are connected through homography matrices, which contain errors and therefore need to be optimized; bundle adjustment is used to reduce these errors, with the error function $\varepsilon$ as follows:

[error function not reproduced in the source text]

where the two point symbols of the error function represent corresponding feature points in the two images, and $R({}^{M}H_{i})$ is defined on the homography matrix ${}^{M}H_{i}$; to reduce the influence of outliers, a Huber loss function is introduced, $h(\varepsilon) = |\varepsilon|^{2}$ if $|\varepsilon| \le 1$ and $h(\varepsilon) = 2|\varepsilon| - 1$ if $|\varepsilon| > 1$; the resulting system of nonlinear equations is solved with a nonlinear least-squares algorithm, using the current homography matrices as initial values and iterating until convergence, thereby adjusting the homography matrices.

11) Fusing the stitching structure graph.

Fusion is the last step of the stitching algorithm and generates the final seamless mosaic. It uses the stitching functionality of the OpenCV library, including the seam-line technique and exposure compensation.
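The similarity score used for loop detection in step 8) is not reproduced in this text. As a hedged illustration, the sketch below uses the L1 score commonly associated with DBoW2, $s(v_{1}, v_{2}) = 1 - \tfrac{1}{2}\left|\tfrac{v_{1}}{|v_{1}|} - \tfrac{v_{2}}{|v_{2}|}\right|$, on the assumption that the method follows it; the score ranges from 0 (no shared words) to 1 (identical vectors). The function names are hypothetical, the description vector is built from the term frequency tf alone as described in step 8), and the leaf-node word ids are taken as given rather than computed by descending the tree.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

BowVector = Dict[int, float]  # word id -> value of the word in the image


def tf_vector(word_ids: Iterable[int]) -> BowVector:
    """Description vector from the leaf-node (word) ids of an image's features."""
    counts = Counter(word_ids)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def l1_score(v1: BowVector, v2: BowVector) -> float:
    """Assumed score: 1 - 0.5 * || v1/|v1| - v2/|v2| ||_1."""
    n1 = sum(abs(x) for x in v1.values())
    n2 = sum(abs(x) for x in v2.values())
    if n1 == 0 or n2 == 0:
        return 0.0  # assumption: an empty vector matches nothing
    words = set(v1) | set(v2)
    diff = sum(abs(v1.get(w, 0.0) / n1 - v2.get(w, 0.0) / n2) for w in words)
    return 1.0 - 0.5 * diff


def rank_keyframes(
    query: BowVector, keyframes: Dict[int, BowVector]
) -> List[Tuple[int, float]]:
    """Key frame ids with their scores, ordered from high to low similarity."""
    scored = [(kf_id, l1_score(query, vec)) for kf_id, vec in keyframes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The ranked list corresponds to the similarity list of step 8); in the method each candidate would then be verified geometrically through its homography and inlier count before a loop edge is accepted.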

To verify the effectiveness and real-time performance of the method, two data sets were selected. The Valldemossa data set was collected in an underwater environment at Valldemossa, a port town in Spain; it contains 201 images of 320 × 180 pixels taken by a downward-looking camera and includes a large closed loop. The Odemar data set was collected in an underwater environment by Miquel Massot-Campos; it contains 64 images of 480 × 270 pixels taken by a downward-looking camera and does not include a large closed loop. The experimental results are as follows:

1. The resulting mosaic of the Valldemossa data set is shown in FIG. 2 and the estimated topology of the environment in FIG. 3; the topology graph contains 76 key frames. The mosaic of the Odemar data set is shown in FIG. 4 and the topology of the environment in FIG. 5; the topology graph contains 22 key frames.

2. The second set of experiments is comparative: the method of the invention is compared with a single-threaded stitching method using ORB features on the same two data sets. The experimental data obtained are shown in Table 1.

Table 1. Comparison of the experimental data of the invention with a single-threaded image stitching algorithm based on ORB features.

Claims

1. A panoramic image splicing method based on a bag-of-words model comprises the following steps:

1) preparing a trained ORB bag-of-words tree;

5) calculating a boundary matrix from the obtained inlier set, and from it the overlap percentage of the image;

6) calculating the overlap between the image pair from the calculated overlap percentages, ${}^{k}O_{i} = \min(O_{k}, O_{i})$; if the number of inliers is greater than the threshold $\tau_{in}$ and the overlap ${}^{k}O_{i}$ is greater than the threshold $\tau_{ov}$, image i is saved as a potential key frame; if the number of inliers and the overlap obtained between the next image in the data set to be stitched and key frame k do not satisfy both thresholds, image i is added to the stitching structure graph as a key frame;

8) detecting loops by retrieving and matching the newly added key frame image i against all earlier key frame images in the stitching structure graph: the ORB features of the newly added key frame image enter the bag-of-words model and descend level by level from the root node of the bag-of-words tree to the leaf nodes according to Hamming distance; the frequency tf with which each leaf node, i.e. each word of the bag-of-words tree, appears in image i is calculated; all features of the newly added key frame image are looked up in the bag-of-words tree to obtain the value of each word, and these values form the description vector of the image; let the description vectors of the newly added key frame image and of a previous key frame image matched against it be $v_{1}$ and $v_{2}$ respectively; the similarity score of the two key frame images is expressed as follows:

[similarity score formula not reproduced in the source text]

the higher the score, the more similar the two key frame images; the similarity between the newly added key frame image and each previous key frame image is thus obtained, giving a list of key frames ordered from high to low similarity, which are the key frames likely to form a loop with the newly added key frame image;

where the two point symbols of the error function (not reproduced in the source text) represent corresponding feature points in the two images, and $R({}^{M}H_{i})$ is defined on the homography matrix ${}^{M}H_{i}$; to reduce the influence of outliers, a Huber loss function is introduced, $h(\varepsilon) = |\varepsilon|^{2}$ if $|\varepsilon| \le 1$ and $h(\varepsilon) = 2|\varepsilon| - 1$ if $|\varepsilon| > 1$; the resulting system of nonlinear equations is solved with a nonlinear least-squares algorithm, thereby adjusting and optimizing the homography matrices;

11) fusing the stitching structure graph.