CN111275702A - Loop detection method based on convolutional neural network - Google Patents

Loop detection method based on convolutional neural network

Info

Publication number
CN111275702A
Authority
CN
China
Prior art keywords
pca
similarity
neural network
convolutional neural
thr
Prior art date
Legal status
Granted
Application number
CN202010120637.XA
Other languages
Chinese (zh)
Other versions
CN111275702B (en)
Inventor
程向红
高源东
Current Assignee
Southeast University
Original Assignee
Southeast University
Priority date
Filing date
2020-02-26
Publication date
2020-06-12
Application filed by Southeast University
Priority to CN202010120637.XA
Publication of CN111275702A
Application granted
Publication of CN111275702B
Status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a loop detection method based on a convolutional neural network. First, each image to be detected is processed by convolution and pooling with an Inception V3 network trained in the open-source framework TensorFlow, and the descriptor vector output by the neural network is extracted. Second, the high-dimensional descriptor vector is reduced in dimension with a principal component analysis algorithm, and the similarity between descriptor vectors is computed with the Euclidean distance. Finally, a similarity threshold on the descriptor vectors is set, and an accuracy-recall curve is drawn to verify the performance of the loop detection algorithm. The method overcomes shortcomings of loop detection methods based on hand-crafted features, such as low accuracy, heavy computation, and poor real-time performance.

Description

Loop detection method based on convolutional neural network
Technical Field
The invention relates to a loop detection method based on a convolutional neural network, and belongs to the technical field of deep learning and SLAM.
Background
A mobile robot builds a map while localizing itself from visual sensor data in an indoor environment; this is visual SLAM (Simultaneous Localization and Mapping), a key technology for autonomous robot localization. A traditional visual SLAM system consists of four parts: visual odometry, back-end optimization, loop detection, and map construction. Loop detection is an important link in the visual SLAM system and plays an important role in reducing accumulated error and improving the accuracy of map construction. The performance of the loop detection algorithm therefore directly affects mapping accuracy, and a wrong loop can cause the entire map construction to fail.
The key to loop detection is recognizing scenes the mobile robot has already visited. An appearance-based loop detection algorithm compares the image acquired at the robot's current position with images from previous positions; once the similarity between images exceeds a set threshold, the robot is considered to have returned to a previously visited place. The essence of loop detection is thus an image matching problem. Current mature visual SLAM systems generally adopt the Bag of Words (BoW) method for loop detection, which describes image features by extracting feature points and computes the similarity between frames. Although the bag-of-words model achieves good results in open-source SLAM frameworks, it is very sensitive to changes in scene appearance, suffers from heavy computation and a high mismatch rate, and depends strongly on the size of the offline-trained vocabulary.
In recent years, deep learning theory and technology have developed rapidly and show clear advantages in image recognition and classification. A loop detection method based on a convolutional neural network can extract high-level feature information from images, is more robust, and can improve the accuracy and recall of loop detection.
Disclosure of Invention
Technical problem: the invention provides a loop detection method based on a convolutional neural network that effectively addresses the high mismatch rate and heavy computational load of traditional loop detection methods.
Technical solution: the loop detection method based on a convolutional neural network of the invention comprises the following steps:
Step 1, input each image I_i to be detected into a trained convolutional neural network, perform convolution and pooling, and extract a high-dimensional descriptor vector D_i, i = 1, 2, …, N, where the high-dimensional descriptor vector D_i has dimension d and N is the length of the sequence of images I_i to be detected;
Step 2, reduce the dimensionality of the high-dimensional descriptor vector D_i extracted in step 1 by principal component analysis to obtain a low-dimensional descriptor vector D_PCA-i, where the low-dimensional descriptor vector D_PCA-i has dimension d_pca;
Step 3, traverse the image sequence and compute the similarity between the low-dimensional descriptor vectors D_PCA-i using the Euclidean distance formula;
Step 4, set a similarity threshold; when the similarity between two low-dimensional descriptor vectors in the image sequence is smaller than the similarity threshold, the detection result is judged to be a loop;
Step 5, check the output loop information and compute the accuracy and recall of the loop detection algorithm;
Step 6, repeat steps 4 to 5, and draw an accuracy-recall curve from the multiple groups of computed accuracy and recall values to verify the performance of the loop detection algorithm.
Further, in the method of the present invention, the convolutional neural network selected in step 1 is an Inception V3 network pre-trained in the deep learning framework TensorFlow; the network consists of 42 layers.
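A minimal sketch of the descriptor extraction in step 1, assuming the Keras interface of TensorFlow and global average pooling over the last convolutional feature map, which yields the 2048-dimensional descriptor mentioned in the embodiment below; the patent does not prescribe a particular implementation, so the function name and preprocessing are illustrative:

```python
import numpy as np
import tensorflow as tf

# Assumption: the pre-trained Inception V3 from tf.keras.applications stands in for the
# "Inception V3 network pre-trained in TensorFlow"; pooling="avg" returns a 2048-d vector.
model = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def extract_descriptor(image_path):
    """Return the high-dimensional descriptor D_i (shape (2048,)) for one image."""
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(299, 299))
    x = tf.keras.preprocessing.image.img_to_array(img)
    x = tf.keras.applications.inception_v3.preprocess_input(x[np.newaxis, ...])
    return model.predict(x, verbose=0)[0]
```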
Furthermore, in the method of the present invention, in step 3 the similarity between the dimension-reduced descriptor vectors D_PCA-i is computed with the Euclidean distance formula:
Sim(m, n) = || D_PCA-m / ||D_PCA-m||_2 - D_PCA-n / ||D_PCA-n||_2 ||_2
where the similarity matrix Sim is an N-order square matrix, N is the length of the sequence of images I_i to be detected, D_PCA-m is the low-dimensional descriptor vector of the current frame image I_m, D_PCA-n is the low-dimensional descriptor vector of the historical frame image I_n corresponding to the current frame image I_m, ||D_PCA-m||_2 is the L2 norm of D_PCA-m, ||D_PCA-n||_2 is the L2 norm of D_PCA-n (the L2 norm is the modulus of the vector), and Sim(m, n) represents the similarity between image I_m and image I_n.
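A sketch of steps 2 and 3 under stated assumptions: scikit-learn's PCA stands in for the principal component analysis step (the patent only names the algorithm), d_pca = 500 follows the embodiment below, and the similarity is taken as the Euclidean distance between the L2-normalized descriptors, matching the formula and definitions given above:

```python
import numpy as np
from sklearn.decomposition import PCA

def similarity_matrix(descriptors, d_pca=500):
    """descriptors: (N, d) array stacking the high-dimensional descriptors D_i."""
    d_low = PCA(n_components=d_pca).fit_transform(descriptors)     # D_PCA-i, shape (N, d_pca)
    d_norm = d_low / np.linalg.norm(d_low, axis=1, keepdims=True)  # divide each D_PCA-i by its L2 norm
    n = len(d_norm)
    sim = np.full((n, n), np.inf)                                  # unfilled pairs stay "dissimilar"
    for m in range(n):
        for i in range(m):                                         # current frame m vs. earlier frame i
            sim[m, i] = np.linalg.norm(d_norm[m] - d_norm[i])      # Euclidean distance = Sim(m, i)
    return sim
```

Because the similarity here is a distance, smaller values mean more similar images, which is consistent with step 4, where a pair is declared a loop when its similarity falls below the threshold.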
Further, in the method of the present invention, the calculation formula for setting the similarity threshold in step 4 is as follows:
thr_k = 1 - thr_base * k, count = 1/thr_base, k = 1, 2, …, count;
where thr_k denotes a similarity threshold, count denotes the number of similarity thresholds, thr_base denotes the step size of the similarity threshold, and k denotes the index of the similarity threshold.
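For concreteness, the threshold schedule thr_k = 1 - thr_base * k can be generated as below; thr_base = 0.005 (hence count = 200) is the value used in the embodiment later in the description:

```python
thr_base = 0.005                                               # step size of the similarity threshold
count = int(1 / thr_base)                                      # number of thresholds, 200 here
thresholds = [1 - thr_base * k for k in range(1, count + 1)]   # thr_k for k = 1, 2, ..., count
```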
Further, in the method of the present invention, in step 5,
the accuracy P_k is computed as P_k = TP_k/(TP_k + FP_k), where TP_k is the number of image pairs that are actual loops and are judged to be loops by the algorithm at threshold thr_k, and FP_k is the number of image pairs that are not actual loops but are judged to be loops by the algorithm at threshold thr_k;
the recall R_k is computed as R_k = TP_k/(TP_k + FN_k), where FN_k is the number of image pairs that are actual loops but are judged not to be loops by the algorithm at threshold thr_k.
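A sketch of step 5, assuming a boolean ground-truth matrix gt in which gt[m, i] is True when images I_m and I_i (i < m) form an actual loop; TP_k, FP_k and FN_k are counted exactly as defined above, and only pairs of a current frame with an earlier frame are considered:

```python
import numpy as np

def accuracy_recall(sim, gt, thr_k):
    """sim: (N, N) similarity matrix; gt: (N, N) boolean ground-truth loop matrix."""
    detected = sim < thr_k                                 # pairs judged as loops at threshold thr_k
    lower = np.tril(np.ones(sim.shape, dtype=bool), k=-1)  # current frame vs. earlier frames only
    tp = np.sum(detected & gt & lower)                     # actual loops judged as loops
    fp = np.sum(detected & ~gt & lower)                    # non-loops judged as loops
    fn = np.sum(~detected & gt & lower)                    # actual loops the algorithm missed
    p_k = tp / (tp + fp) if (tp + fp) else 1.0             # accuracy P_k
    r_k = tp / (tp + fn) if (tp + fn) else 0.0             # recall R_k
    return p_k, r_k
```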
Beneficial effects: compared with the prior art, the loop detection method based on a convolutional neural network of the invention has the following advantages: 1. High-level feature information of the image is extracted by the convolutional neural network, which ensures the robustness of the model and improves the reliability of the prediction results; 2. The similarity between images is computed with the Euclidean distance formula, which effectively improves the discrimination between elements of the similarity matrix.
Drawings
FIG. 1 is a block flow diagram of an embodiment of the present invention;
FIG. 2 is the loop information of the City Center dataset after processing by the method;
FIG. 3 is the ground-truth loop information of the City Center dataset;
FIG. 4 is an accuracy-recall curve for the City Center-Left dataset;
FIG. 5 is an accuracy-recall curve for the City Center-Right dataset.
Detailed Description
For the purpose of illustrating the technical solutions disclosed in the present invention in detail, the following description is further provided with reference to the accompanying drawings and specific embodiments.
The implementation platform is the Windows 10 operating system, and the development environments are PyCharm 2017 and Matlab 2014.
As shown in fig. 1, the method for detecting a loop based on a convolutional neural network of the present invention comprises the following specific steps:
1) Input each image I_i to be detected into an Inception V3 network pre-trained in the deep learning framework TensorFlow, perform convolution and pooling, and extract the output vector of the pooling layer of image I_i as the high-dimensional descriptor vector D_i, i = 1, 2, …, N, where D_i has dimension d and N is the length of the image sequence; the dimension d of the high-dimensional descriptor vector is determined by the dimension of the output layer of the neural network and is 2048 in this embodiment, and N = 1237;
2) Reduce the dimensionality of the high-dimensional descriptor vector D_i output by the neural network by principal component analysis to obtain the low-dimensional descriptor vector D_PCA-i of dimension d_pca; d_pca is chosen according to the dimension d of the high-dimensional descriptor vector, and in this embodiment d_pca is set empirically to 500;
3) Compute the similarity between the low-dimensional descriptor vectors D_PCA-i using the Euclidean distance formula:
Sim(m, n) = || D_PCA-m / ||D_PCA-m||_2 - D_PCA-n / ||D_PCA-n||_2 ||_2
where the similarity matrix Sim is an N-order square matrix, N is the length of the sequence of images I_i to be detected, D_PCA-m is the low-dimensional descriptor vector of the current frame image I_m, D_PCA-n is the low-dimensional descriptor vector of the historical frame image I_n corresponding to the current frame image I_m, ||D_PCA-m||_2 is the L2 norm of D_PCA-m, ||D_PCA-n||_2 is the L2 norm of D_PCA-n (the L2 norm is the modulus of the vector), and Sim(m, n) represents the similarity between image I_m and image I_n.
4) The calculation formula for setting the similarity threshold is as follows:
thr_k = 1 - thr_base * k, count = 1/thr_base, k = 1, 2, …, count;
where thr_k is a similarity threshold, count is the number of similarity thresholds, thr_base is the step size of the similarity threshold, and k is the index of the similarity threshold; in this embodiment, trading off computation against accuracy on the basis of common empirical values, thr_base is set to 0.005 and count to 200. When the similarity between two vectors is smaller than the similarity threshold thr_k, the detection result is judged to be a loop.
5) Check the output loop information and compute the accuracy P_k and recall R_k of the loop detection algorithm:
The accuracy P_k is computed as P_k = TP_k/(TP_k + FP_k), where TP_k is the number of image pairs that are actual loops and are judged to be loops by the algorithm at threshold thr_k, and FP_k is the number of image pairs that are not actual loops but are judged to be loops by the algorithm at threshold thr_k;
The recall R_k is computed as R_k = TP_k/(TP_k + FN_k), where FN_k is the number of image pairs that are actual loops but are judged not to be loops by the algorithm at threshold thr_k.
6) Repeat steps 4) to 5), and draw an accuracy-recall curve from the multiple groups of computed accuracy P_k and recall R_k to verify the performance of the loop detection algorithm.
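Continuing the sketches above, steps 4) to 6) amount to sweeping all thresholds and plotting the resulting points; matplotlib is used here purely for illustration (the embodiment itself drew the curves in Matlab 2014):

```python
import matplotlib.pyplot as plt

points = [accuracy_recall(sim, gt, thr) for thr in thresholds]  # (P_k, R_k) for every thr_k
accuracy, recall = zip(*points)
plt.plot(recall, accuracy)
plt.xlabel("Recall")
plt.ylabel("Accuracy")
plt.title("Accuracy-recall curve")
plt.show()
```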
Specific examples are as follows:
The dataset used in this example is the City Center dataset from Oxford University, which contains 2474 images in total; odd frames form the image sequence captured by the left camera and even frames the sequence captured by the right camera. In this embodiment, the 1237 left-camera images and the 1237 right-camera images are each input into the neural network for loop detection.
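For reference, splitting the 2474 interleaved City Center images into the two 1237-image camera sequences can be written as below; the file names are hypothetical and only the odd/even convention comes from the dataset description:

```python
image_files = [f"{i:04d}.jpg" for i in range(1, 2475)]  # hypothetical names for the 2474 images, in capture order
left_camera = image_files[0::2]                         # odd frames  -> left camera (1237 images)
right_camera = image_files[1::2]                        # even frames -> right camera (1237 images)
```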
To further demonstrate the effectiveness of the algorithm of this embodiment, its performance is compared with that of the Fabmap algorithm, a classical loop detection algorithm based on the bag-of-words model. FIG. 2 is the loop information of the City Center dataset after processing by the method; FIG. 3 is the ground-truth loop information of the City Center dataset.
FIG. 4 is the accuracy-recall curve for the left-camera image sequence of the City Center dataset, and FIG. 5 is the accuracy-recall curve for the right-camera image sequence. As shown in FIG. 4, at 100% accuracy the maximum recall of the proposed algorithm reaches 45.63%, while the maximum recall of the Fabmap algorithm is 30.96%; as shown in FIG. 5, at 100% accuracy the maximum recall of the proposed algorithm reaches 68.63%, while that of the Fabmap algorithm is 36.19%. Combining the accuracy-recall information of FIG. 4 and FIG. 5, as the recall increases the accuracy of the proposed algorithm declines more gently than that of the Fabmap algorithm.
The above examples are only preferred embodiments of the present invention. It should be noted that various modifications and equivalents can be made by those skilled in the art without departing from the spirit of the invention, and such modifications and equivalents shall fall within the protection scope defined by the claims of the invention.

Claims (5)

1. A loop detection method based on a convolutional neural network is characterized by comprising the following steps:
Step 1, input each image I_i to be detected into a trained convolutional neural network, perform convolution and pooling, and take the output vector of the pooling layer of image I_i as the high-dimensional descriptor vector D_i, i = 1, 2, …, N, where the high-dimensional descriptor vector D_i has dimension d and N is the length of the sequence of images I_i to be detected;
Step 2, reduce the dimensionality of the high-dimensional descriptor vector D_i extracted in step 1 by principal component analysis to obtain a low-dimensional descriptor vector D_PCA-i, where the low-dimensional descriptor vector D_PCA-i has dimension d_pca;
Step 3, traverse the image sequence and compute the similarity between the low-dimensional descriptor vectors D_PCA-i using the Euclidean distance formula;
Step 4, set a similarity threshold; when the similarity between two low-dimensional descriptor vectors in the image sequence is smaller than the similarity threshold, the detection result is judged to be a loop;
Step 5, check the output loop information and compute the accuracy and recall of the loop detection algorithm;
Step 6, repeat steps 4 to 5, and draw an accuracy-recall curve from the multiple groups of computed accuracy and recall values to verify the performance of the loop detection algorithm.
2. The convolutional neural network-based loop detection method as claimed in claim 1, wherein the convolutional neural network selected in step 1 is an Inception V3 network pre-trained in the deep learning framework TensorFlow, and the network consists of 42 layers.
3. The convolutional neural network-based loop detection method as claimed in claim 1, wherein in step 3 the similarity between the low-dimensional descriptor vectors D_PCA-i is computed with the Euclidean distance formula:
Sim(m, n) = || D_PCA-m / ||D_PCA-m||_2 - D_PCA-n / ||D_PCA-n||_2 ||_2
wherein the similarity matrix Sim is an N-order square matrix, N is the length of the sequence of images I_i to be detected, D_PCA-m is the low-dimensional descriptor vector of the current frame image I_m, D_PCA-n is the low-dimensional descriptor vector of the historical frame image I_n corresponding to the current frame image I_m, ||D_PCA-m||_2 is the L2 norm of D_PCA-m, ||D_PCA-n||_2 is the L2 norm of D_PCA-n (the L2 norm is the modulus of the vector), and Sim(m, n) represents the similarity between the current frame image I_m and the corresponding historical frame image I_n.
4. The convolutional neural network-based loop detection method as claimed in claim 1, wherein the similarity threshold in step 4 is set according to the formula:
thr_k = 1 - thr_base * k, count = 1/thr_base, k = 1, 2, …, count;
wherein thr_k denotes a similarity threshold, count denotes the number of similarity thresholds, thr_base denotes the step size of the similarity threshold, and k denotes the index of the similarity threshold.
5. The convolutional neural network-based loop detection method as claimed in claim 1, wherein in step 5,
the accuracy P_k is computed as P_k = TP_k/(TP_k + FP_k), where TP_k is the number of image pairs that are actual loops and are judged to be loops by the algorithm at threshold thr_k, and FP_k is the number of image pairs that are not actual loops but are judged to be loops by the algorithm at threshold thr_k;
the recall R_k is computed as R_k = TP_k/(TP_k + FN_k), where FN_k is the number of image pairs that are actual loops but are judged not to be loops by the algorithm at threshold thr_k.
CN202010120637.XA 2020-02-26 2020-02-26 Loop detection method based on convolutional neural network Active CN111275702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010120637.XA CN111275702B (en) 2020-02-26 2020-02-26 Loop detection method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111275702A 2020-06-12
CN111275702B (en) 2022-11-18

Family

ID=71004133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010120637.XA Active CN111275702B (en) 2020-02-26 2020-02-26 Loop detection method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111275702B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555881A (en) * 2019-08-29 2019-12-10 桂林电子科技大学 Visual SLAM testing method based on convolutional neural network
CN110533661A (en) * 2019-09-04 2019-12-03 电子科技大学 Adaptive real-time closed-loop detection method based on characteristics of image cascade

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余宇 et al., "基于深度学习的视觉SLAM回环检测方法" (Loop closure detection method for visual SLAM based on deep learning), 《计算机工程与设计》 (Computer Engineering and Design) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070122A (en) * 2020-08-14 2020-12-11 五邑大学 Classification method and device of slam map and storage medium
CN112070122B (en) * 2020-08-14 2023-10-17 五邑大学 Classification method, device and storage medium of slam map
CN113377987A (en) * 2021-05-11 2021-09-10 重庆邮电大学 Multi-module closed-loop detection method based on ResNeSt-APW
CN117237858A (en) * 2023-11-15 2023-12-15 成都信息工程大学 Loop detection method
CN117237858B (en) * 2023-11-15 2024-03-12 成都信息工程大学 Loop detection method

Also Published As

Publication number Publication date
CN111275702B (en) 2022-11-18

Similar Documents

Publication Publication Date Title
CN109443382B (en) Visual SLAM closed loop detection method based on feature extraction and dimension reduction neural network
CN111275702B (en) Loop detection method based on convolutional neural network
CN110070074B (en) Method for constructing pedestrian detection model
CN111832484B (en) Loop detection method based on convolution perception hash algorithm
CN110781838A (en) Multi-modal trajectory prediction method for pedestrian in complex scene
CN109341703B (en) Visual SLAM algorithm adopting CNNs characteristic detection in full period
CN110555881A (en) Visual SLAM testing method based on convolutional neural network
CN107330357A (en) Vision SLAM closed loop detection methods based on deep neural network
CN112418095A (en) Facial expression recognition method and system combined with attention mechanism
CN113139470B (en) Glass identification method based on Transformer
CN109087337B (en) Long-time target tracking method and system based on hierarchical convolution characteristics
CN111881731A (en) Behavior recognition method, system, device and medium based on human skeleton
CN113313763A (en) Monocular camera pose optimization method and device based on neural network
CN112419317B (en) Visual loop detection method based on self-coding network
CN110533661A (en) Adaptive real-time closed-loop detection method based on characteristics of image cascade
CN111027555B (en) License plate recognition method and device and electronic equipment
CN109446897B (en) Scene recognition method and device based on image context information
CN111507320A (en) Detection method, device, equipment and storage medium for kitchen violation behaviors
CN113312973A (en) Method and system for extracting features of gesture recognition key points
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
CN116188825A (en) Efficient feature matching method based on parallel attention mechanism
CN114861761A (en) Loop detection method based on twin network characteristics and geometric verification
CN111578956A (en) Visual SLAM positioning method based on deep learning
CN108960347B (en) System and method for evaluating effect of handwriting recognition sequencing stability of convolutional neural network
Liu et al. Facial landmark localization in the wild by backbone-branches representation learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant