CN111275702A - Loop detection method based on convolutional neural network - Google Patents

Loop detection method based on convolutional neural network

Info

Publication number
CN111275702A
Authority
CN
China
Prior art keywords
pca
similarity
neural network
convolutional neural
thr
Prior art date
Legal status
Granted
Application number
CN202010120637.XA
Other languages
Chinese (zh)
Other versions
CN111275702B (en)
Inventor
程向红
高源东
Current Assignee
Southeast University
Original Assignee
Southeast University
Priority date
Filing date
2020-02-26
Publication date
2020-06-12
Application filed by Southeast University
Priority to CN202010120637.XA
Publication of CN111275702A
Application granted
Publication of CN111275702B
Status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a loop detection method based on a convolutional neural network. First, each image to be detected is processed by convolution and pooling with an Inception V3 network trained in the open-source framework TensorFlow, and the descriptor vector output by the neural network is extracted. Second, the high-dimensional descriptor vector is reduced in dimension with a principal component analysis algorithm, and the similarity between descriptor vectors is computed with the Euclidean distance. Finally, a similarity threshold on the descriptor vectors is set, and an accuracy-recall curve is drawn to verify the performance of the loop detection algorithm. The method overcomes shortcomings of loop detection methods based on hand-crafted features, such as low accuracy, heavy computation, and poor real-time performance.

Description

Loop detection method based on convolutional neural network
Technical Field
The invention relates to a loop detection method based on a convolutional neural network, and belongs to the technical field of deep learning and SLAM.
Background
A mobile robot builds a map while localizing itself from visual sensor data in an indoor environment; this is visual SLAM (Simultaneous Localization and Mapping), a key technology for autonomous robot localization. A traditional visual SLAM system consists of four parts: visual odometry, back-end optimization, loop detection, and map construction. Loop detection is an important link in the visual SLAM system and plays an important role in reducing accumulated error and improving the accuracy of map construction. The performance of the loop detection algorithm therefore directly affects mapping accuracy, and a wrong loop can cause the entire map construction to fail.
The key to loop detection is recognizing scenes the mobile robot has already visited. An appearance-based loop detection algorithm compares the image acquired at the robot's current position with images from previous positions; once the similarity between images exceeds a set threshold, the robot is considered to have returned to a previously visited place. The essence of loop detection is thus an image matching problem. Current mature visual SLAM systems generally adopt the Bag of Words (BoW) method for loop detection, which describes image features by extracting feature points and computes the similarity between frames. Although the bag-of-words model achieves good results in open-source SLAM frameworks, it is very sensitive to changes in scene appearance, suffers from heavy computation and a high mismatch rate, and depends strongly on the size of the offline-trained vocabulary.
In recent years, deep learning theory and technology have developed rapidly and show clear advantages in image recognition and classification. A loop detection method based on a convolutional neural network can extract high-level feature information from images, is more robust, and can improve the accuracy and recall of loop detection.
Disclosure of Invention
Technical problem: the invention provides a loop detection method based on a convolutional neural network that effectively addresses the high mismatch rate and heavy computational load of traditional loop detection methods.
Technical solution: the loop detection method based on a convolutional neural network of the invention comprises the following steps:
Step 1, input each image I_i to be detected into a trained convolutional neural network, perform convolution and pooling, and extract a high-dimensional descriptor vector D_i, i = 1, 2, …, N, where the high-dimensional descriptor vector D_i has dimension d and N is the length of the sequence of images I_i to be detected;
Step 2, reduce the dimensionality of the high-dimensional descriptor vector D_i extracted in step 1 by principal component analysis to obtain a low-dimensional descriptor vector D_PCA-i, where the low-dimensional descriptor vector D_PCA-i has dimension d_pca;
Step 3, traverse the image sequence and compute the similarity between the low-dimensional descriptor vectors D_PCA-i using the Euclidean distance formula;
Step 4, set a similarity threshold; when the similarity between two low-dimensional descriptor vectors in the image sequence is smaller than the similarity threshold, the detection result is judged to be a loop;
Step 5, check the output loop information and compute the accuracy and recall of the loop detection algorithm;
Step 6, repeat steps 4 to 5, and draw an accuracy-recall curve from the multiple groups of computed accuracy and recall values to verify the performance of the loop detection algorithm.
Further, in the method of the present invention, the convolutional neural network selected in step 1 is an Inception V3 network pre-trained in the deep learning framework TensorFlow; the network consists of 42 layers.
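A minimal sketch of the descriptor extraction in step 1, assuming the Keras interface of TensorFlow and global average pooling over the last convolutional feature map, which yields the 2048-dimensional descriptor mentioned in the embodiment below; the patent does not prescribe a particular implementation, so the function name and preprocessing are illustrative:

```python
import numpy as np
import tensorflow as tf

# Assumption: the pre-trained Inception V3 from tf.keras.applications stands in for the
# "Inception V3 network pre-trained in TensorFlow"; pooling="avg" returns a 2048-d vector.
model = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def extract_descriptor(image_path):
    """Return the high-dimensional descriptor D_i (shape (2048,)) for one image."""
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(299, 299))
    x = tf.keras.preprocessing.image.img_to_array(img)
    x = tf.keras.applications.inception_v3.preprocess_input(x[np.newaxis, ...])
    return model.predict(x, verbose=0)[0]
```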
Furthermore, in the method of the present invention, in step 3 the similarity between the dimension-reduced descriptor vectors D_PCA-i is computed with the Euclidean distance formula:
Sim(m, n) = || D_PCA-m / ||D_PCA-m||_2 - D_PCA-n / ||D_PCA-n||_2 ||_2
where the similarity matrix Sim is an N-order square matrix, N is the length of the sequence of images I_i to be detected, D_PCA-m is the low-dimensional descriptor vector of the current frame image I_m, D_PCA-n is the low-dimensional descriptor vector of the historical frame image I_n corresponding to the current frame image I_m, ||D_PCA-m||_2 is the L2 norm of D_PCA-m, ||D_PCA-n||_2 is the L2 norm of D_PCA-n (the L2 norm is the modulus of the vector), and Sim(m, n) represents the similarity between image I_m and image I_n.
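A sketch of steps 2 and 3 under stated assumptions: scikit-learn's PCA stands in for the principal component analysis step (the patent only names the algorithm), d_pca = 500 follows the embodiment below, and the similarity is taken as the Euclidean distance between the L2-normalized descriptors, matching the formula and definitions given above:

```python
import numpy as np
from sklearn.decomposition import PCA

def similarity_matrix(descriptors, d_pca=500):
    """descriptors: (N, d) array stacking the high-dimensional descriptors D_i."""
    d_low = PCA(n_components=d_pca).fit_transform(descriptors)     # D_PCA-i, shape (N, d_pca)
    d_norm = d_low / np.linalg.norm(d_low, axis=1, keepdims=True)  # divide each D_PCA-i by its L2 norm
    n = len(d_norm)
    sim = np.full((n, n), np.inf)                                  # unfilled pairs stay "dissimilar"
    for m in range(n):
        for i in range(m):                                         # current frame m vs. earlier frame i
            sim[m, i] = np.linalg.norm(d_norm[m] - d_norm[i])      # Euclidean distance = Sim(m, i)
    return sim
```

Because the similarity here is a distance, smaller values mean more similar images, which is consistent with step 4, where a pair is declared a loop when its similarity falls below the threshold.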
Further, in the method of the present invention, the calculation formula for setting the similarity threshold in step 4 is as follows:
thr_k = 1 - thr_base * k, count = 1/thr_base, k = 1, 2, …, count;
where thr_k denotes a similarity threshold, count denotes the number of similarity thresholds, thr_base denotes the step size of the similarity threshold, and k denotes the index of the similarity threshold.
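For concreteness, the threshold schedule thr_k = 1 - thr_base * k can be generated as below; thr_base = 0.005 (hence count = 200) is the value used in the embodiment later in the description:

```python
thr_base = 0.005                                               # step size of the similarity threshold
count = int(1 / thr_base)                                      # number of thresholds, 200 here
thresholds = [1 - thr_base * k for k in range(1, count + 1)]   # thr_k for k = 1, 2, ..., count
```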
Further, in the method of the present invention, in step 5,
the accuracy P_k is computed as P_k = TP_k/(TP_k + FP_k), where TP_k is the number of image pairs that are actual loops and are judged to be loops by the algorithm at threshold thr_k, and FP_k is the number of image pairs that are not actual loops but are judged to be loops by the algorithm at threshold thr_k;
the recall R_k is computed as R_k = TP_k/(TP_k + FN_k), where FN_k is the number of image pairs that are actual loops but are judged not to be loops by the algorithm at threshold thr_k.
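A sketch of step 5, assuming a boolean ground-truth matrix gt in which gt[m, i] is True when images I_m and I_i (i < m) form an actual loop; TP_k, FP_k and FN_k are counted exactly as defined above, and only pairs of a current frame with an earlier frame are considered:

```python
import numpy as np

def accuracy_recall(sim, gt, thr_k):
    """sim: (N, N) similarity matrix; gt: (N, N) boolean ground-truth loop matrix."""
    detected = sim < thr_k                                 # pairs judged as loops at threshold thr_k
    lower = np.tril(np.ones(sim.shape, dtype=bool), k=-1)  # current frame vs. earlier frames only
    tp = np.sum(detected & gt & lower)                     # actual loops judged as loops
    fp = np.sum(detected & ~gt & lower)                    # non-loops judged as loops
    fn = np.sum(~detected & gt & lower)                    # actual loops the algorithm missed
    p_k = tp / (tp + fp) if (tp + fp) else 1.0             # accuracy P_k
    r_k = tp / (tp + fn) if (tp + fn) else 0.0             # recall R_k
    return p_k, r_k
```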
Beneficial effects: compared with the prior art, the loop detection method based on a convolutional neural network of the invention has the following advantages: 1. High-level feature information of the image is extracted by the convolutional neural network, which ensures the robustness of the model and improves the reliability of the prediction results; 2. The similarity between images is computed with the Euclidean distance formula, which effectively improves the discrimination between elements of the similarity matrix.
Drawings
FIG. 1 is a block flow diagram of an embodiment of the present invention;
FIG. 2 is the loop information of the City Center dataset after processing by the method;
FIG. 3 is the ground-truth loop information of the City Center dataset;
FIG. 4 is an accuracy-recall curve for the City Center-Left dataset;
FIG. 5 is an accuracy-recall curve for the City Center-Right dataset.
Detailed Description
For the purpose of illustrating the technical solutions disclosed in the present invention in detail, the following description is further provided with reference to the accompanying drawings and specific embodiments.
The implementation platform is the Windows 10 operating system, and the development environments are PyCharm 2017 and Matlab 2014.
As shown in fig. 1, the method for detecting a loop based on a convolutional neural network of the present invention comprises the following specific steps:
1) Input each image I_i to be detected into an Inception V3 network pre-trained in the deep learning framework TensorFlow, perform convolution and pooling, and extract the output vector of the pooling layer of image I_i as the high-dimensional descriptor vector D_i, i = 1, 2, …, N, where D_i has dimension d and N is the length of the image sequence; the dimension d of the high-dimensional descriptor vector is determined by the dimension of the output layer of the neural network and is 2048 in this embodiment, and N = 1237;
2) Reduce the dimensionality of the high-dimensional descriptor vector D_i output by the neural network by principal component analysis to obtain the low-dimensional descriptor vector D_PCA-i of dimension d_pca; d_pca is chosen according to the dimension d of the high-dimensional descriptor vector, and in this embodiment d_pca is set empirically to 500;
3) Compute the similarity between the low-dimensional descriptor vectors D_PCA-i using the Euclidean distance formula:
Sim(m, n) = || D_PCA-m / ||D_PCA-m||_2 - D_PCA-n / ||D_PCA-n||_2 ||_2
where the similarity matrix Sim is an N-order square matrix, N is the length of the sequence of images I_i to be detected, D_PCA-m is the low-dimensional descriptor vector of the current frame image I_m, D_PCA-n is the low-dimensional descriptor vector of the historical frame image I_n corresponding to the current frame image I_m, ||D_PCA-m||_2 is the L2 norm of D_PCA-m, ||D_PCA-n||_2 is the L2 norm of D_PCA-n (the L2 norm is the modulus of the vector), and Sim(m, n) represents the similarity between image I_m and image I_n.
4) The calculation formula for setting the similarity threshold is as follows:
thr_k = 1 - thr_base * k, count = 1/thr_base, k = 1, 2, …, count;
where thr_k is a similarity threshold, count is the number of similarity thresholds, thr_base is the step size of the similarity threshold, and k is the index of the similarity threshold; in this embodiment, trading off computation against accuracy on the basis of common empirical values, thr_base is set to 0.005 and count to 200. When the similarity between two vectors is smaller than the similarity threshold thr_k, the detection result is judged to be a loop.
5) Check the output loop information and compute the accuracy P_k and recall R_k of the loop detection algorithm:
The accuracy P_k is computed as P_k = TP_k/(TP_k + FP_k), where TP_k is the number of image pairs that are actual loops and are judged to be loops by the algorithm at threshold thr_k, and FP_k is the number of image pairs that are not actual loops but are judged to be loops by the algorithm at threshold thr_k;
The recall R_k is computed as R_k = TP_k/(TP_k + FN_k), where FN_k is the number of image pairs that are actual loops but are judged not to be loops by the algorithm at threshold thr_k.
6) Repeat steps 4) to 5), and draw an accuracy-recall curve from the multiple groups of computed accuracy P_k and recall R_k to verify the performance of the loop detection algorithm.
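Continuing the sketches above, steps 4) to 6) amount to sweeping all thresholds and plotting the resulting points; matplotlib is used here purely for illustration (the embodiment itself drew the curves in Matlab 2014):

```python
import matplotlib.pyplot as plt

points = [accuracy_recall(sim, gt, thr) for thr in thresholds]  # (P_k, R_k) for every thr_k
accuracy, recall = zip(*points)
plt.plot(recall, accuracy)
plt.xlabel("Recall")
plt.ylabel("Accuracy")
plt.title("Accuracy-recall curve")
plt.show()
```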
Specific examples are as follows:
The dataset used in this example is the City Center dataset from Oxford University, which contains 2474 images in total; odd frames form the image sequence captured by the left camera and even frames the sequence captured by the right camera. In this embodiment, the 1237 left-camera images and the 1237 right-camera images are each input into the neural network for loop detection.
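For reference, splitting the 2474 interleaved City Center images into the two 1237-image camera sequences can be written as below; the file names are hypothetical and only the odd/even convention comes from the dataset description:

```python
image_files = [f"{i:04d}.jpg" for i in range(1, 2475)]  # hypothetical names for the 2474 images, in capture order
left_camera = image_files[0::2]                         # odd frames  -> left camera (1237 images)
right_camera = image_files[1::2]                        # even frames -> right camera (1237 images)
```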
To further demonstrate the effectiveness of the algorithm of this embodiment, its performance is compared with that of the Fabmap algorithm, a classical loop detection algorithm based on the bag-of-words model. FIG. 2 is the loop information of the City Center dataset after processing by the method; FIG. 3 is the ground-truth loop information of the City Center dataset.
FIG. 4 is the accuracy-recall curve for the left-camera image sequence of the City Center dataset, and FIG. 5 is the accuracy-recall curve for the right-camera image sequence. As shown in FIG. 4, at 100% accuracy the maximum recall of the proposed algorithm reaches 45.63%, while the maximum recall of the Fabmap algorithm is 30.96%; as shown in FIG. 5, at 100% accuracy the maximum recall of the proposed algorithm reaches 68.63%, while that of the Fabmap algorithm is 36.19%. Combining the accuracy-recall information of FIG. 4 and FIG. 5, as the recall increases the accuracy of the proposed algorithm declines more gently than that of the Fabmap algorithm.
The above examples are only preferred embodiments of the present invention. It should be noted that various modifications and equivalents can be made by those skilled in the art without departing from the spirit of the invention, and such modifications and equivalents shall fall within the protection scope defined by the claims of the invention.

Claims (5)

1. A loop detection method based on a convolutional neural network is characterized by comprising the following steps:
Step 1, input each image I_i to be detected into a trained convolutional neural network, perform convolution and pooling, and take the output vector of the pooling layer of image I_i as the high-dimensional descriptor vector D_i, i = 1, 2, …, N, where the high-dimensional descriptor vector D_i has dimension d and N is the length of the sequence of images I_i to be detected;
Step 2, reduce the dimensionality of the high-dimensional descriptor vector D_i extracted in step 1 by principal component analysis to obtain a low-dimensional descriptor vector D_PCA-i, where the low-dimensional descriptor vector D_PCA-i has dimension d_pca;
Step 3, traverse the image sequence and compute the similarity between the low-dimensional descriptor vectors D_PCA-i using the Euclidean distance formula;
Step 4, set a similarity threshold; when the similarity between two low-dimensional descriptor vectors in the image sequence is smaller than the similarity threshold, the detection result is judged to be a loop;
Step 5, check the output loop information and compute the accuracy and recall of the loop detection algorithm;
Step 6, repeat steps 4 to 5, and draw an accuracy-recall curve from the multiple groups of computed accuracy and recall values to verify the performance of the loop detection algorithm.
2. The convolutional neural network-based loop detection method as claimed in claim 1, wherein the convolutional neural network selected in step 1 is an Inception V3 network pre-trained in the deep learning framework TensorFlow, and the network consists of 42 layers.
3. The convolutional neural network-based loop detection method as claimed in claim 1, wherein in step 3 the similarity between the low-dimensional descriptor vectors D_PCA-i is computed with the Euclidean distance formula:
Sim(m, n) = || D_PCA-m / ||D_PCA-m||_2 - D_PCA-n / ||D_PCA-n||_2 ||_2
wherein the similarity matrix Sim is an N-order square matrix, N is the length of the sequence of images I_i to be detected, D_PCA-m is the low-dimensional descriptor vector of the current frame image I_m, D_PCA-n is the low-dimensional descriptor vector of the historical frame image I_n corresponding to the current frame image I_m, ||D_PCA-m||_2 is the L2 norm of D_PCA-m, ||D_PCA-n||_2 is the L2 norm of D_PCA-n (the L2 norm is the modulus of the vector), and Sim(m, n) represents the similarity between the current frame image I_m and the corresponding historical frame image I_n.
4. The convolutional neural network-based loop detection method as claimed in claim 1, wherein the similarity threshold in step 4 is set according to the formula:
thr_k = 1 - thr_base * k, count = 1/thr_base, k = 1, 2, …, count;
wherein thr_k denotes a similarity threshold, count denotes the number of similarity thresholds, thr_base denotes the step size of the similarity threshold, and k denotes the index of the similarity threshold.
5. The convolutional neural network-based loop detection method as claimed in claim 1, wherein in step 5,
the accuracy P_k is computed as P_k = TP_k/(TP_k + FP_k), where TP_k is the number of image pairs that are actual loops and are judged to be loops by the algorithm at threshold thr_k, and FP_k is the number of image pairs that are not actual loops but are judged to be loops by the algorithm at threshold thr_k;
the recall R_k is computed as R_k = TP_k/(TP_k + FN_k), where FN_k is the number of image pairs that are actual loops but are judged not to be loops by the algorithm at threshold thr_k.
CN202010120637.XA 2020-02-26 2020-02-26 Loop detection method based on convolutional neural network Active CN111275702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010120637.XA CN111275702B (en) 2020-02-26 2020-02-26 Loop detection method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111275702A 2020-06-12
CN111275702B (en) 2022-11-18

Family

ID=71004133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010120637.XA Active CN111275702B (en) 2020-02-26 2020-02-26 Loop detection method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111275702B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555881A (en) * 2019-08-29 2019-12-10 桂林电子科技大学 Visual SLAM testing method based on convolutional neural network
CN110533661A (en) * 2019-09-04 2019-12-03 电子科技大学 Adaptive real-time closed-loop detection method based on characteristics of image cascade

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余宇 et al., "基于深度学习的视觉SLAM回环检测方法" (Loop closure detection method for visual SLAM based on deep learning), 《计算机工程与设计》 (Computer Engineering and Design) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070122A (en) * 2020-08-14 2020-12-11 五邑大学 Classification method and device of slam map and storage medium
CN112070122B (en) * 2020-08-14 2023-10-17 五邑大学 Classification method, device and storage medium of slam map
CN113377987A (en) * 2021-05-11 2021-09-10 重庆邮电大学 Multi-module closed-loop detection method based on ResNeSt-APW
CN117237858A (en) * 2023-11-15 2023-12-15 成都信息工程大学 Loop detection method
CN117237858B (en) * 2023-11-15 2024-03-12 成都信息工程大学 Loop detection method

Also Published As

Publication number Publication date
CN111275702B (en) 2022-11-18

Similar Documents

Publication Publication Date Title
CN109443382B (en) Visual SLAM closed loop detection method based on feature extraction and dimension reduction neural network
CN111275702B (en) Loop detection method based on convolutional neural network
CN110070074B (en) Method for constructing pedestrian detection model
CN111832484B (en) Loop detection method based on convolution perception hash algorithm
CN110781838A (en) Multi-modal trajectory prediction method for pedestrian in complex scene
CN109341703B (en) Visual SLAM algorithm adopting CNNs characteristic detection in full period
CN110555881A (en) Visual SLAM testing method based on convolutional neural network
CN107330357A (en) Vision SLAM closed loop detection methods based on deep neural network
CN112418095A (en) Facial expression recognition method and system combined with attention mechanism
CN113139470B (en) Glass identification method based on Transformer
CN109087337B (en) Long-time target tracking method and system based on hierarchical convolution characteristics
CN111881731A (en) Behavior recognition method, system, device and medium based on human skeleton
CN113313763A (en) Monocular camera pose optimization method and device based on neural network
CN112419317B (en) Visual loop detection method based on self-coding network
CN110533661A (en) Adaptive real-time closed-loop detection method based on characteristics of image cascade
CN111027555B (en) License plate recognition method and device and electronic equipment
CN109446897B (en) Scene recognition method and device based on image context information
CN111507320A (en) Detection method, device, equipment and storage medium for kitchen violation behaviors
CN113312973A (en) Method and system for extracting features of gesture recognition key points
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
CN116188825A (en) Efficient feature matching method based on parallel attention mechanism
CN114861761A (en) Loop detection method based on twin network characteristics and geometric verification
CN111578956A (en) Visual SLAM positioning method based on deep learning
CN108960347B (en) System and method for evaluating effect of handwriting recognition sequencing stability of convolutional neural network
Liu et al. Facial landmark localization in the wild by backbone-branches representation learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant