CN113155140B

CN113155140B - Robot SLAM method and system used in outdoor characteristic sparse environment

Info

Publication number: CN113155140B
Application number: CN202110350778.5A
Authority: CN
Inventors: 陈卫东; 田琛晟; 王景川
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2022-08-02
Anticipated expiration: 2041-03-31
Also published as: CN113155140A

Abstract

The invention provides a mobile robot vision inertia SLAM method and a system used in an outdoor characteristic sparse environment, which comprises the following steps: acquiring an environment image in an outdoor characteristic sparse environment, and preprocessing the environment image to obtain a preprocessed environment image; extracting sparse features of the preprocessed environment image through a partitioning SIFT feature extraction algorithm; estimating the inter-frame motion of the robot through the pre-integration component of the inertial unit IMU, estimating the inter-frame displacement of the matching point, judging the importance of sparse features through the displacement, and calculating to obtain a visual reprojection error; obtaining an IMU error according to the variance accumulation in the IMU pre-integration process; constructing a loss function according to the visual reprojection error and the IMU error; and (3) minimizing a loss function through a nonlinear optimization method, and solving the pose transformation and map point space coordinates of the robot. The invention can reduce the influence of the difficulties of severe illumination change, sparse characteristics and the like on the positioning performance and improve the autonomy of the mobile robot in a complex outdoor environment.

Description

Robot SLAM method and system used in outdoor characteristic sparse environment

Technical Field

The invention relates to a robot instant positioning and mapping algorithm, in particular to a robot SLAM method and a system used in an outdoor characteristic sparse environment, and more particularly to a robot visual inertia instant positioning and mapping algorithm used in the outdoor characteristic sparse environment.

Background

The problem of SLAM is generally directed to scenes such as rooms and streets with relatively closed environments and rich environmental features. In an outdoor characteristic sparse environment (such as a desert, a beach, a stone beach, a planet, a moon surface and the like), the real-time positioning and mapping problems of the robot relate to autonomous positioning, safety, task planning and the like of the robot in the scene, and particularly, the autonomous positioning capability of the robot is particularly critical under the condition of no GPS satellite signal (accurate absolute positioning cannot be carried out). IMU inertial measurement unit is often used for the autonomic relative positioning problem of robot, nevertheless can produce certain drift error, uses vision and inertial sensor information fusion, then can revise the drift error that is produced by IMU through the vision, and richer environmental information can be gathered to the camera simultaneously, and the robot of being convenient for realizes more complicated function. In conclusion, a robot real-time positioning mapping algorithm aiming at outdoor characteristic sparse scenes is urgently needed.

The publication number is CN110309834A (application number: 201910393343.1), which discloses an SLAM algorithm for outdoor off-line navigation, and a certain number of feature points are obtained through a local adaptive threshold adjustment algorithm, so that the problem of insufficient accuracy of feature point extraction under the condition of objective factors such as shadow, weak illumination, sudden noise and the like existing in an image of a field scene is solved. The method does not fundamentally solve the problems of difficult feature extraction and severe illumination change existing in outdoor feature sparse environments, and aims at reducing feature instability caused by feature detection threshold values in sparse environments, so that larger errors can be caused; and the description of the visual features is fuzzy by using the multi-scale detection method, and the weight of the image features in optimization cannot be accurately measured.

Patent document CN109166149A (application number: 201810917346.6) discloses a method and system for fusing binocular camera and IMU positioning and three-dimensional wireframe structure reconstruction. On the basis of binocular vision, the method utilizes a divide-and-conquer strategy to initialize and fuse inertial measurement information, implements tracking positioning and mapping, and can stably operate in indoor and outdoor environments and complex motion conditions. And on the basis of accurate positioning, reconstructing and iteratively optimizing a three-dimensional wire frame based on the pose optimized by the key frame. And matching the straight line segments through the local features and the space geometric constraint and back projecting the straight line segments into a three-dimensional space. The straight line segments are divided into different sets by angle and distance constraints. Based on the grouping results, fitting regions are determined and straight line segments are merged. And finally outputting the three-dimensional wire frame structure.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a robot SLAM method and system used in an outdoor characteristic sparse environment.

The invention provides a robot SLAM method used in an outdoor characteristic sparse environment, which comprises the following steps:

step S1: acquiring an image in an outdoor characteristic sparse environment, and preprocessing the acquired image to obtain a preprocessed characteristic sparse environment image;

step S2: extracting sparse point features of the preprocessed feature sparse environment image through a blocking SIFT feature extraction algorithm;

step S3: judging the importance of the sparse features through the pre-integration quantity of the inertial unit IMU, and calculating to obtain a visual reprojection error;

step S4: obtaining an IMU error according to the variance accumulation in the IMU pre-integration process;

step S5: constructing a loss function according to the visual reprojection error and the IMU error;

step S6: and (3) minimizing a loss function through a nonlinear optimization method, and solving the pose transformation and map point space coordinates of the robot.

Preferably, the step S1 includes:

step S1.1: acquiring an image in an outdoor characteristic sparse environment, performing contrast stretching on a local image block by using a CLAHE local histogram equalization algorithm, eliminating a block boundary effect, and obtaining an image subjected to local equalization processing;

step S1.2: and stretching the contrast of the whole image by using a histogram equalization algorithm on the image subjected to the local equalization processing to obtain a preprocessed characteristic sparse environment image.

Preferably, the step S2 includes:

step S2.1: dividing the preprocessed feature sparse environment image into a preset number of small blocks, and extracting SIFT features from the preset number of small blocks respectively;

step S2.2: according to the extracted coordinates of the SIFT feature points under the whole image, performing quadtree distribution in the whole image until the number of the SIFT feature points in all the image blocks subjected to the quadtree distribution is not more than a preset value;

step S2.3: and calculating Harris response values for the SIFT feature points in all the image blocks, and reserving one SIFT feature point with the maximum Harris response value in each image block.

Preferably, the step S3 includes:

step S3.1: pre-integrating IMU data between front and back frames to describe pose transformation of the robot in time periods of the front and back frames;

step S3.2: according to pose transformation of the robot in time periods of front and rear frames and a camera imaging model, feature points in images of the front frame of the robot are projected to images of the rear frame, the pixel displacement distance of the feature points is calculated, and a corresponding image feature weight matrix is calculated according to the pixel displacement distance of the feature points.

Preferably, said step S3.1 comprises: using discrete IMU data pre-integration calculation to obtain the motion estimation between the previous frame and the next frame, wherein the calculation formula is as follows:

wherein w represents an estimated value of the angular velocity of the robot;

actual measurement data representing the angular velocity of the robot at time k;

representing the deviation of the robot angular velocity measurement at time k,

representing the cumulative amount of rotation of the robot from time i to time k; δ t represents the length of time from k to time k + 1;

representing a right multiplication operation in a lie algebra; a represents an estimated value of the acceleration of the robot;

actual measurement values representing robot acceleration at time k;

representing the deviation of the measured value of the robot acceleration at the moment k;

representing the displacement predicted component of the robot from the time i to the time k;

representing the robot speed pre-integral quantity at the k moment;

representing a gaussian error in the acceleration measurement unit;

representing a gaussian error in an angular velocity measurement unit; δ t ² Representing the square of the length of time from time k to time k + 1.

Preferably, said step S3.2 comprises:

step S3.2.1: a characteristic point set P monitored at the k-1 moment _k-1 ＝{U _k-1 ,V _k-1 ,D _k-1 Matching with the characteristic points monitored at the moment k, wherein:

U _k-1 ＝{u ₁ ,u ₂ ,...,u _N } (8)

V _k-1 ＝{v ₁ ,v ₂ ,...,v _N } (9)

D _k-1 ＝{d ₁ ,d ₂ ,...,d _N } (10)

wherein, U _k ，V _k ,D _k A set of pixel coordinates and depth values representing the feature points at time k;

step S3.2.2: according to the position and posture estimation at the time k obtained by pre-integration, the matched feature points at the time k-1 are projected to a camera coordinate system at the time k, and the calculation formula is as follows:

P _c,k ＝T _c,k,k-1 P _c,k-1 (11)

T _c,k,k-1 ＝T _cb T _k,k-1 T _bc (12)

T _k,k-1 ＝I _k (13)

wherein, P _c,k Representing the estimated value of the coordinates of the robot at the moment k under the coordinate system of the matched feature point camera; t is _c,k,k-1 A transformation matrix representing a robot camera coordinate system from the time k-1 to the time k; i is _k Is an attitude transformation estimate obtained by pre-integration; t is _cb A transformation matrix representing a robot body coordinate system to a camera coordinate system; t is _bc A transformation matrix representing a robot camera coordinate system to a body coordinate system; t is _bc Including a rotation transformation matrix R _bc And translating the transformed vector t _bc ；T _cb The same process is carried out; p _c,k-1 The coordinates of the characteristic point camera coordinate system at the moment of k-1 are obtained; t is t _k Representing translation transformation vectors of the robot from the k-1 moment to the k moment; -t _k Representing a translation transformation vector from the k moment to the k-1 moment;

the robot rotation transformation matrix representing the time from k time to k-1 time is a rotation transformation matrix R from k-1 time to k time _k The inverse matrix of (d);

and obtaining according to the inverse extrapolation of the camera projection model:

wherein f is _x 、f _y 、c _x 、c _y Is an intrinsic parameter of the camera;

step S3.2.3: converting the matching points projected to the k moment under the camera coordinate system into pixel coordinates, and calculating the offset between the front frame and the rear frame under the corresponding characteristic point pixel coordinate system, wherein the calculation formula is as follows:

Δ＝(U _k ,V _k )-(U _k-1 ,V _k-1 ) (18)

wherein K represents an internal reference matrix of the camera

Delta represents the offset of the corresponding characteristic point pixel coordinate system between the previous frame and the next frame;

step S3.2.4: calculating error loss weight of corresponding matching feature points according to the distance delta, and describing through the form of an information covariance matrix H _c ：

Wherein λ represents a relative IMU weight adjustment parameter; f represents a weight calculation function, and the calculation formula is as follows:

wherein s represents a visual error relative weight distribution adjustment parameter;

step S3.2.5: by means of a covariance matrix H _c Inverting to obtain a weight W _c A matrix;

preferably, the step S6 includes: solving pose transformation and map point space coordinates of the robot by a nonlinear optimization method;

wherein,

representing the visual reprojection error of the jth matching point at the moment i; w _c Is a weight matrix of visual reprojection error terms;

an IMU error representing time i;

an IMU error weight representing time i;

representing the reprojection deviation corresponding to the jth image characteristic point at the moment i;

representing the weight corresponding to the reprojection deviation of the jth image feature point at the moment i;

representing the reprojection deviation corresponding to the jth image characteristic point at the time i;

and

the same, the transposition relation is formed; k represents the total number of time instants at which the error term is incorporated; f (k) represents a set of all image feature points at time k;

a transpose of the IMU error vector representing time i;

and

similarly, a transposed form is written for matrix computation.

Preferably, the nonlinear optimization method in step S6 includes: levenberg-marquardt algorithm and gauss-newton method.

According to the invention, the robot SLAM system used in the outdoor characteristic sparse environment comprises:

module M1: acquiring an image in an outdoor characteristic sparse environment, and preprocessing the acquired image to obtain a preprocessed characteristic sparse environment image;

module M2: extracting sparse point features of the preprocessed feature sparse environment image through a blocking SIFT feature extraction algorithm;

module M3: judging the importance of the sparse features through the pre-integration quantity of the inertial unit IMU, and calculating to obtain a visual reprojection error;

module M4: obtaining an IMU error according to the variance accumulation in the IMU pre-integration process;

module M5: constructing a loss function according to the visual reprojection error and the IMU error;

module M6: and (3) minimizing a loss function through a nonlinear optimization method, and solving the pose transformation and map point space coordinates of the robot.

Preferably, said module M1 comprises:

module M1.1: acquiring an image in an outdoor characteristic sparse environment, performing contrast stretching on a local image block by using a CLAHE local histogram equalization algorithm, eliminating a block boundary effect, and obtaining an image subjected to local equalization processing;

module M1.2: stretching the contrast of the whole image by using a histogram equalization algorithm on the image subjected to the local equalization processing to obtain a preprocessed characteristic sparse environment image;

the module M2 includes:

module M2.1: dividing the preprocessed feature sparse environment image into a preset number of small blocks, and extracting SIFT features from the preset number of small blocks respectively;

module M2.2: according to the extracted coordinates of the SIFT feature points under the whole image, and performing quadtree distribution in the whole image until the number of the SIFT feature points in all the image blocks subjected to the quadtree distribution is not more than a preset value;

module M2.3: calculating Harris response values of the SIFT feature points in all the image blocks, and reserving a SIFT feature point with the maximum Harris response value in each image block;

the module M3 includes:

module M3.1: pre-integrating IMU data between front and back frames to describe pose transformation of the robot in time periods of the front and back frames;

module M3.2: according to pose transformation of the robot in time periods of front and rear frames and a camera imaging model, feature points in images of the front frame of the robot are projected to images of the rear frame, the pixel displacement distance of the feature points is calculated, and a corresponding image feature weight matrix is calculated according to the pixel displacement distance of the feature points.

Compared with the prior art, the invention has the following beneficial effects:

1. according to the robot instant positioning and mapping algorithm under the outdoor characteristic sparse environment, the influence of illumination with violent change on image characteristic extraction is effectively eliminated and weakened by designing an image preprocessing algorithm, more uniform and robust sparse characteristic points are effectively extracted by a blocking SIFT characteristic extraction algorithm, and finally the problem of information quantity provided by image characteristics in a wide-scale scene is considered, so that the weight occupied by visual loss in the pose and map point bundle optimization problem is optimized based on an IMU (inertial measurement unit) pre-integration theory;

2. the method fully utilizes the existing information in actual use, is simple in calculation, can complete the positioning of the robot and the construction of the sparse point map in real time, and has high operation efficiency;

3. the robot positioning and mapping algorithm provided by the invention has the advantages of improving the positioning capability of the robot in a characteristic sparse scene lacking GPS absolute positioning information, avoiding accidents caused by the positioning problem of the robot, effectively assisting related personnel to carry out task planning and other works, and being suitable for various fields of civil use, industry, military use and the like;

4. the visual inertia SLAM algorithm provided by the invention can improve the positioning capability of the robot in an outdoor characteristic sparse environment and reduce the influence of the system caused by the tracking failure problem caused by severe illumination change and difficult characteristic extraction;

5. the invention can be widely applied to the scenes such as deserts, sand beaches, stone beaches, planets, moon surfaces and the like, and has stronger popularization.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a system framework diagram of the present invention;

FIG. 2 is a schematic diagram of block feature detection;

FIG. 3 is a schematic diagram of SIFT feature quadtree allocation and non-maximum suppression;

FIG. 4 is a schematic displacement diagram of a far and near feature point in a linear motion under a pixel coordinate system;

fig. 5 is a schematic displacement diagram of the far and near feature points in the rotation motion under the pixel coordinate system.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the concept of the invention. All falling within the scope of the present invention.

Example 1

The invention aims at the outdoor characteristic sparse environment, provides a method for eliminating the influence of severe change of illumination conditions on scene image information by image preprocessing, obtains more robust and uniform characteristics by improving a characteristic extraction algorithm, optimizes a visual loss function by using a visual loss weight optimization algorithm based on IMU (inertial measurement unit) pre-integration on the globally extracted image characteristics and further obtains more accurate robot positioning and mapping results aiming at the problems existing in the current robot instant positioning and mapping algorithm.

According to the robot SLAM method for the outdoor characteristic sparse environment provided by the invention, as shown in figures 1 to 5, the method comprises the following steps:

according to the invention, the system needs to input the collected environment image data and the IMU inertial sensor measurement data of the robot, the visual image data is transmitted according to the frame rate of the camera, the IMU data between two frames is accumulated in the memory of the computer, and the IMU data is input together with the image data and is marked with the corresponding timestamp. After inputting the data, the algorithm will perform the following steps:

step S4: obtaining an IMU error according to the variance accumulation in the IMU pre-integration process; the IMU error weight term is transmitted through variance in the IMU pre-integration process, and the weight is obtained when the variance is larger;

Specifically, the step S1 includes:

because the characteristic sparse environment is generally influenced by the change of environmental illumination, even glare or sensor failure caused by overexposure of images can occur, the algorithm is required to have certain self-adaptive capacity aiming at the images, and the influence of the environmental illumination on the image characteristic extraction effect is eliminated or weakened as much as possible; in addition, most color features tend to be consistent in an outdoor feature sparse environment (such as sand, rock beach, planet surface and the like), the contrast is weak, and the contrast of scene images and the like needs to be improved, so that the extracted features are more stable and robust, and a targeted image preprocessing algorithm needs to be designed. According to the image requirement, the method is completed by the following steps:

step S1.1: acquiring an image in an outdoor characteristic sparse environment, performing contrast stretching on a local image block by using a CLAHE local histogram equalization algorithm, eliminating a block boundary effect, and obtaining an image subjected to local equalization processing; the method aims to eliminate the phenomenon that the image is locally over-bright and over-dark so as to obtain an image with higher local contrast;

step S1.2: and stretching the contrast of the whole image by using a histogram equalization algorithm on the image subjected to the local equalization processing to ensure that the contrast of brightness and darkness of the image is higher, so as to obtain a preprocessed characteristic sparse environment image.

After the image is processed, a sparse feature point detection algorithm is needed for the feature points to have higher uniformity and robustness. For example, when the land texture is uneven in a scene, the full-image features are often concentrated in image blocks with rich textures, and a part of information of the image is excessively depended on.

Specifically, according to the requirements of uniformity and robustness of the sparse feature points of the image, the step S2 includes:

step S2.1: dividing the preprocessed feature sparse environment image into a preset number of small blocks, and extracting SIFT features from the preset number of small blocks respectively; the purpose of extracting features in the small blocks is to ensure that each part of the image contains a certain number of feature points;

step S2.2: after extraction is finished, calculating the coordinate of each feature point under the whole image, and performing quadtree distribution in the whole image according to the extracted coordinate of the SIFT feature point under the whole image, wherein the distribution aims to solve the problem that local feature points are too concentrated;

because the length of the image is about 2 times of the width, in order to make the image blocks present a square shape as much as possible, the image is selected to be divided into a left image block and a right image block, when the number of the feature points in the image blocks is more than or equal to 4, the image is divided into four small image blocks in an average manner from left to right and from top to bottom, and as shown in fig. 3, the steps are repeated until the number of the feature points in all the image blocks is not more than 4;

In consideration of wide feature point scale distribution in a feature sparse scene, under different motion states of the robot, the information amount brought to the pose estimation process by different feature points is very different, for example, a far point in linear motion hardly moves, and a near point can well reflect the motion of the robot. It is therefore desirable to increase the weight that the visual loss of the significant feature points occupies in the motion estimation.

Specifically, the step S3 includes:

When the pose transformation of the robot is known, the information quantity provided by the feature points at different distances can be known, the motion condition of the robot is not known in advance, and the approximate motion of the robot is described by using the IMU pre-integration quantity.

In particular, said step S3.1 comprises: using discrete IMU data pre-integration calculation to obtain the motion estimation between the previous frame and the next frame, wherein the calculation formula is as follows:

wherein w represents an estimated value of the angular velocity of the robot;

indicating the deviation of the robot angular velocity measurement at time k,

representing a right multiplication operation in a lie algebra; a represents an estimated value of robot acceleration;

actual measurement values representing robot acceleration at time k;

representing the robot speed pre-integral quantity at the k moment;

representing a gaussian error in the acceleration measurement unit;

representing a gaussian error in the angular velocity measurement unit; δ t ² Representing the square of the length of time from time k to time k + 1.

In particular, said step S3.2 comprises:

step S3.2.1: a characteristic point set P monitored at the k-1 moment _k-1 ＝{U _k-1 ,V _k-1 ,D _k-1 Matching with the characteristic points monitored at the time k, wherein:

U _k-1 ＝{u ₁ ,u ₂ ,...,u _N } (8)

V _k-1 ＝{v ₁ ,v ₂ ,...,v _N } (9)

D _k-1 ＝{d ₁ ,d ₂ ,...,d _N } (10)

P _c,k ＝T _c,k,k-1 P _c,k-1 (11)

T _c,k,k-1 ＝T _cb T _k,k-1 T _bc (12)

T _k,k-1 ＝I _k (13)

wherein, P _c,k Representing the estimated value of the coordinates of the robot at the moment k under the coordinate system of the matched feature point camera; t is _c,k,k-1 A transformation matrix representing a robot camera coordinate system from the time k-1 to the time k; i is _k Is an attitude transformation estimate obtained by pre-integration; t is _cb A transformation matrix representing a robot body coordinate system to a camera coordinate system; t is _bc A transformation matrix representing a robot camera coordinate system to a body coordinate system; t is _bc Including a rotation transformation matrix R _bc And translating the transformed vector t _bc ；T _cb The same process is carried out; p is _c,k-1 The coordinates of the characteristic point camera coordinate system at the moment of k-1 are obtained; t is t _k Representing translation transformation vectors of the robot from the k-1 moment to the k moment; -t _k Representing a translation transformation vector from the k moment to the k-1 moment;

Δ＝(U _k ,V _k )-(U _k-1 ,V _k-1 ) (18)

wherein K represents an internal reference matrix of the camera

specifically, the step S6 includes: solving pose transformation and map point space coordinates of the robot by a nonlinear optimization method;

wherein,

representing the visual reprojection error of the jth matching point at the moment i; w _c Is due to visionA weight matrix for the perceptual projection error term;

representing the IMU error at time i;

an IMU error weight representing time i;

and

a transpose of the IMU error vector representing time i;

and

the same, writing the matrix calculation into a transposed form; and the optimal pose transformation and map point coordinates of the robot can be obtained by minimizing J.

Specifically, the nonlinear optimization method in step S6 includes: levenberg-marquardt algorithm and gauss-newton method. The optimization method can be selected and used according to actual conditions and effects.

module M4: obtaining an IMU error according to the variance accumulation in the IMU pre-integration process; the IMU error weight term is transmitted through variance in the IMU pre-integration process, and the weight is obtained when the variance is larger;

Specifically, the module M1 includes:

because the characteristic sparse environment is generally influenced by the change of environmental illumination, even glare or sensor failure caused by overexposure of images can occur, the algorithm has certain self-adaption capability aiming at the images, and the influence of the environmental illumination on the image characteristic extraction effect is eliminated or weakened as much as possible; in addition, most color features tend to be consistent in an outdoor feature sparse environment (such as sand, rock beach, planet surface and the like), the contrast is weak, and the contrast of scene images and the like needs to be improved, so that the extracted features are more stable and robust, and a targeted image preprocessing algorithm needs to be designed.

Module M1.1: acquiring an image in an outdoor characteristic sparse environment, performing contrast stretching on a local image block by using a CLAHE local histogram equalization algorithm, eliminating a block boundary effect, and obtaining an image subjected to local equalization processing; the method aims to eliminate the phenomenon that the image is locally over-bright and over-dark so as to obtain an image with higher local contrast;

module M1.2: and stretching the contrast of the whole image by using a histogram equalization algorithm on the image subjected to the local equalization processing to ensure that the contrast of brightness and darkness of the image is higher, so as to obtain a preprocessed characteristic sparse environment image.

Specifically, according to the requirements of uniformity and robustness of the sparse feature points of the image, the module M2 includes:

module M2.1: dividing the preprocessed feature sparse environment image into a preset number of small blocks, and extracting SIFT features from the preset number of small blocks respectively; the purpose of extracting features in the small blocks is to ensure that each part of the image contains a certain number of feature points;

module M2.2: after extraction is finished, calculating the coordinate of each feature point under the whole image, and performing quadtree distribution in the whole image according to the extracted coordinate of the SIFT feature point under the whole image, wherein the distribution aims to solve the problem that local feature points are too concentrated;

because the length of the image is about 2 times of the width, in order to make the image blocks present a square shape as much as possible, the image is selected to be divided into a left image block and a right image block, when the number of feature points in the image blocks is greater than or equal to 4, the image is divided into four small image blocks in an average manner from left to right and from top to bottom, and as shown in fig. 3, the starting execution is repeated until the number of feature points in all the image blocks is not greater than 4;

module M2.3: and calculating Harris response values for the SIFT feature points in all the image blocks, and reserving one SIFT feature point with the maximum Harris response value in each image block.

Considering that the feature points in the feature sparse scene are widely distributed, the information quantity brought to the pose estimation process by different feature points is very different under different motion states of the robot, for example, a far point in linear motion hardly moves, and a near point can well reflect the motion of the robot. It is therefore desirable to increase the weight that the visual loss of significant feature points occupies in motion estimation.

In particular, the module M3 comprises:

In particular, said module M3.1 comprises: using discrete IMU data pre-integral calculation to obtain the motion estimation between the previous frame and the next frame, wherein the calculation formula is as follows:

wherein w represents an estimated value of the angular velocity of the robot;

indicating the deviation of the robot angular velocity measurement at time k,

actual measurement values representing robot acceleration at time k;

the displacement prediction component represents the displacement of the robot from the moment i to the moment k;

representing the robot speed pre-integral quantity at the k moment;

representing a gaussian error in the acceleration measurement unit;

In particular, said module M3.2 comprises:

module M3.2.1: a characteristic point set P monitored at the k-1 moment _k-1 ＝{U _k-1 ,V _k-1 ,D _k-1 Matching with the characteristic points monitored at the moment k, wherein:

U _k-1 ＝{u ₁ ,u ₂ ,...,u _N } (8)

V _k-1 ＝{v ₁ ,v ₂ ,...,v _N } (9)

D _k-1 ＝{d ₁ ,d ₂ ,...,d _N } (10)

module M3.2.2: according to the position and posture estimation at the time k obtained by pre-integration, the matched feature points at the time k-1 are projected to a camera coordinate system at the time k, and the calculation formula is as follows:

P _c,k ＝T _c,k,k-1 P _c,k-1 (11)

T _c,k,k-1 ＝T _cb T _k,k-1 T _bc (12)

T _k,k-1 ＝I _k (13)

wherein, P _c,k Representing the estimated value of the coordinates of the robot at the moment k under the coordinate system of the matched feature point camera; t is a unit of _c,k,k-1 A transformation matrix representing a robot camera coordinate system from the time k-1 to the time k; i is _k Is an attitude transformation estimate obtained by pre-integration; t is _cb A transformation matrix representing a robot body coordinate system to a camera coordinate system; t is _bc A transformation matrix representing a robot camera coordinate system to a body coordinate system; t is _bc Including a rotation transformation matrix R _bc And translating the transformed vector t _bc ；T _cb The same process is carried out; p _c,k-1 The coordinates of the characteristic point camera coordinate system at the moment of k-1 are obtained; t is t _k Representing translation transformation vectors of the robot from the k-1 moment to the k moment; -t _k Representing a translation transformation vector from the k moment to the k-1 moment;

module M3.2.3: converting the matching points projected to the k moment under the camera coordinate system into pixel coordinates, and calculating the offset between the front frame and the rear frame under the corresponding characteristic point pixel coordinate system, wherein the calculation formula is as follows:

Δ＝(U _k ,V _k )-(U _k-1 ,V _k-1 ) (18)

wherein K represents an internal reference matrix of the camera

module M3.2.4: calculating error loss weight of corresponding matching feature points according to the distance delta, and describing through the form of an information covariance matrix H _c ：

module M3.2.5: by means of a covariance matrix H _c Inverting to obtain a weight W _c A matrix;

specifically, the module M6 includes: solving pose transformation and map point space coordinates of the robot by a nonlinear optimization method;

wherein,

representing the IMU error at time i;

an IMU error weight representing time i;

and

a transpose of the IMU error vector representing time i;

and

Specifically, the nonlinear optimization method in the module M6 includes: levenberg-marquardt algorithm and gauss-newton method. The optimization method can be selected and used according to actual conditions and effects.

Example 2

Example 2 is a modification of example 1

The positioning and mapping algorithm comprises (1) image preprocessing in an outdoor characteristic sparse environment; (2) extracting the blocking SIFT features; (3) visual loss weight optimization three parts based on IMU pre-integration are composed of the following steps:

step 1.1: a CLAHE local histogram equalization algorithm is used for the image in the outdoor characteristic sparse environment;

step 1.2: and using a histogram equalization algorithm for the image after the local equalization processing.

Step 2.1: dividing the image into small blocks, and extracting SIFT features in the small blocks respectively;

step 2.2: performing quadtree distribution on all the features, wherein a left image block and a right image block exist in an initial image, when the number of feature points in the image blocks is not less than 4, the image blocks are divided into four small image blocks in an average manner from top to bottom, and the steps are repeated until the number of the feature points in all the image blocks is not more than 4;

step 2.3: calculating Harris response values for the distributed characteristic points, and requiring that at most one characteristic point with the maximum response value can be reserved in each small image block;

step 3.1: estimating pose transformation of the robot in an inter-frame time period of two frames of images, specifically comprising:

the discrete IMU data is used for pre-integral calculation to obtain the motion estimation between the previous frame and the next frame, and the calculation formula is as follows:

in the formula: omega represents an estimate of the angular velocity of the robot,

actual measurement data representing the angular velocity of the robot at time k,

represents the deviation of the robot angular velocity measurement at time k (subject to a gaussian walking model),

indicating the cumulative amount of rotation of the robot from time i to time i + k, deltat indicating the length of time from k to time k +1,

represents the right multiplication operation in the lie algebra, a is the estimated value of the robot acceleration,

representing the actual measurement of the robot acceleration at time k,

indicating the deviation of the robot acceleration measurements at time k (clothes)From a gaussian walk model),

represents the predicted component of the displacement of the robot from the moment i to the moment i + k (by recursion calculation),

indicating the robot velocity (calculated by recursion) at time k.

Step 3.2: and projecting the feature points in the front frame image of the robot to the rear frame image according to the pose transformation relation and the camera imaging model, calculating the pixel displacement distance of the feature points, and finally calculating a corresponding image feature weight matrix according to the pixel displacement distance of the feature points. Specifically, the method comprises the following steps:

step 3.2.1: a characteristic point set P monitored at the k-1 moment _k-1 ＝{U _k-1 ,V _k-1 ,D _k-1 Matching the feature points monitored in the image at the time k, wherein:

U _k-1 ＝{u ₁ ,u ₂ ,…,u _N },V _k-1 ＝{v ₁ ,v ₂ ,…,v _N },D _k-1 ＝{d ₁ ,d ₂ ,…,d _N } (8)

U _k 、V _k 、D _k a set of pixel coordinates and depth values representing the feature points at time k.

Step 3.2.2: according to the position and posture estimation at the time k obtained by pre-integration, the matched feature points at the time k-1 are projected to a camera coordinate system at the time k, and the calculation formula is as follows:

P _c,k ＝T _c,k,k-1 P _c,k-1 (9)

wherein:

T _c,k,k-1 ＝T _cb T _k,k-1 T _bc ，T _k,k-1 ＝I _k (10)

P _c,k an estimated value T representing coordinates of the robot at the moment k under a coordinate system of a camera matched with the feature points _c,k,k-1 Transformation matrix representing the coordinate system of the robot camera from time k-1 to time k, I _k Is an attitude transformation estimate, T, obtained by pre-integration _cb Transformation matrix, T, representing the coordinate system of a robot to the coordinate system of a camera _bc A transformation matrix, T, representing the robot camera coordinate system to the body coordinate system _bc Including a rotation transformation matrix R _bc And translating the transformed vector t _bc ，T _cb The same is true. P _c,k-1 The coordinates of the feature points at the time k-1 in the camera coordinate system can be obtained by inverse extrapolation of the camera projection model:

wherein f is _x 、f _y 、c _x 、c _y Is an intrinsic parameter of the camera.

Step 3.2.3: converting the matching points projected to the k moment under the camera coordinate system into pixel coordinates, and calculating the offset between the front frame and the rear frame under the corresponding characteristic point pixel coordinate system, wherein the calculation formula is as follows:

Δ＝(U _k ,V _k )-(U _k-1 ,V _k-1 ) (15)

and delta represents the offset of the corresponding characteristic point pixel coordinate system between the previous frame and the next frame.

Step 3.2.4: calculating error loss weight of corresponding matching feature points according to the distance delta, and describing through the form of an information covariance matrix H _c ：

Wherein λ is a relative IMU weight adjustment parameter, f is a weight calculation function, and is a programmable function, where the form of f used here is as follows:

s is a visual error relative weight distribution adjustment parameter.

After the information covariance matrix is obtained through calculation, the covariance matrix H is used _c Inverting to obtain a weight W _c Matrix:

and 4, step 4: and constructing a vision and IMU error bundle set optimization problem according to the weight. The optimization objective function is:

wherein

Representing the visual reprojection error, W, of the jth matching point at time i _c Is the weight matrix of the image error terms calculated in step 3,

indicating the IMU error at time i,

and an IMU error item information matrix representing the i moment. And solving the poses of the robot at k moments and the key points of the map through nonlinear optimization. The optimization method uses the Levenburg-Marquardt algorithm and the Gauss-Newton method. Specifically, the optimization method can be based on the actualAnd selecting and using the conditions and the effects.

The invention will be further described in detail with reference to the accompanying drawings and implementation examples, and an algorithm framework of the invention is shown in a figure 1. Considering here a robot configuration using a binocular camera + IMU inertial sensor, the input data includes left and right eye camera images and acceleration, angular velocity information measured by the IMU sensor.

This example is illustrated using the ROS system to build a test environment. Through a subscription. In the preprocessing process, a CLAHE local histogram equalization algorithm is used, a threshold parameter is set to be 4.0, after illumination influence is eliminated, histogram equalization processing is carried out for the first time, then blocking SIFT feature detection is carried out on the preprocessed image, a schematic diagram of image blocking is shown in a diagram 2, and a schematic diagram of feature point quadtree allocation and non-maximum suppression is shown in a diagram 3.

And (3) obtaining approximate motion estimation between image frames according to IMU pre-integration, wherein the displacement motion of the far and near points of linear motion in pixel coordinates is shown in a graph 4, and the displacement motion of the far and near points of rotary motion in pixel coordinates is shown in a graph 5, so that the weight occupied by the far and near points in the bundle set optimization is calculated through the motion estimation and the spatial positions of the feature points. Here, the relative IMU weight adjustment parameter is set to 13, the visual error relative weight parameter is set to 0.5, and the visual information matrix is:

and constructing a vision and IMU combined optimization objective function through a vision information matrix and an IMU error matrix, and optimizing and solving by using a Levenberg-Marquardt algorithm. When setting the weight adjusting parameters, the adjustment needs to be performed according to the actually used scene, if the scene area is large, a large relative weight parameter is used, but correspondingly, the relative weight parameter of the IMU needs to be reduced, so that the weight of the entire IMU basically remains unchanged relative to the IMU.

Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims

1. A robot SLAM method for use in outdoor feature sparse environments, comprising:

step S6: minimizing a loss function through a nonlinear optimization method, and solving pose transformation and map point space coordinates of the robot;

the step S3 includes:

step S3.2: according to pose transformation of the robot in time periods of front and rear frames and a camera imaging model, projecting feature points in a front frame image of the robot to a rear frame image, calculating a feature point pixel displacement distance, and calculating a corresponding image feature weight matrix according to the feature point pixel displacement distance;

said step S3.2 comprises:

U _k-1 ＝{u ₁ ,u ₂ ,...,u _N } (8)

V _k-1 ＝{v ₁ ,v ₂ ,...,v _N } (9)

D _k-1 ＝{d ₁ ,d ₂ ,...,d _N } (10)

P _c,k ＝T _c,k,k-1 P _c,k-1 (11)

T _c,k,k-1 ＝T _cb T _k,k-1 T _bc (12)

T _k,k-1 ＝I _k (13)

wherein, P _c,k Representing the estimated value of the coordinates of the robot at the moment k under the coordinate system of the matched feature point camera; t is _c,k,k-1 A transformation matrix representing a robot camera coordinate system from the time k-1 to the time k; i is _k Is an attitude transformation estimate obtained by pre-integration; t is a unit of _cb A transformation matrix representing a robot body coordinate system to a camera coordinate system; t is _bc A transformation matrix representing a robot camera coordinate system to a body coordinate system; t is _bc Including a rotation transformation matrix R _bc And translating the transformed vector t _bc ；T _cb The same process is carried out; p _c,k-1 The coordinates of the characteristic point camera coordinate system at the moment of k-1 are obtained; t is t _k Representing translation transformation vectors of the robot from the k-1 moment to the k moment; -t _k Representing a translation transformation vector from the k moment to the k-1 moment;

Δ＝(U _k ,V _k )-(U _k-1 ,V _k-1 ) (18)

wherein K represents an internal reference matrix of the camera

2. the robot SLAM method for outdoor sparse feature environment of claim 1, wherein said step S1 comprises:

3. The robot SLAM method for outdoor sparse feature environment of claim 1, wherein said step S2 comprises:

4. The robot SLAM method for use in outdoor feature sparse environments of claim 1, wherein said step S3.1 comprises: using discrete IMU data pre-integration calculation to obtain the motion estimation between the previous frame and the next frame, wherein the calculation formula is as follows:

wherein w represents an estimated value of the angular velocity of the robot;

indicating the deviation of the robot angular velocity measurement at time k,

actual measurement values representing robot acceleration at time k;

representing the robot speed pre-integral quantity at the k moment;

representing a gaussian error in the acceleration measurement unit;

5. The robot SLAM method for outdoor sparse feature environment of claim 1, wherein said step S6 comprises: solving pose transformation and map point space coordinates of the robot by a nonlinear optimization method;

wherein,

representing the IMU error at time i;

an IMU error weight representing time i;

and

a transpose of the IMU error vector representing time i;

and with

Similarly, a transposed form is written for matrix computation.

6. The robot SLAM method for outdoor sparse feature environments of claim 5, wherein the nonlinear optimization method in step S6 comprises: levenberg-marquardt algorithm and gauss-newton method.

7. A robotic SLAM system for use in outdoor feature sparse environments, comprising:

module M6: minimizing a loss function through a nonlinear optimization method, and solving pose transformation and map point space coordinates of the robot;

the module M3 includes:

module M3.2: according to pose transformation of the robot in time periods of front and rear frames and a camera imaging model, projecting feature points in a front frame image of the robot to a rear frame image, calculating a feature point pixel displacement distance, and calculating a corresponding image feature weight matrix according to the feature point pixel displacement distance;

said module M3.2 comprises:

module M3.2.1: a characteristic point set P monitored at the k-1 moment _k-1 ＝{U _k-1 ,V _k-1 ,D _k-1 Matching with the characteristic points monitored at the time k, wherein:

U _k-1 ＝{u ₁ ,u ₂ ,...,u _N } (8)

V _k-1 ＝{v ₁ ,v ₂ ,...,v _N } (9)

D _k-1 ＝{d ₁ ,d ₂ ,...,d _N } (10)

P _c,k ＝T _c,k,k-1 P _c,k-1 (11)

T _c,k,k-1 ＝T _cb T _k,k-1 T _bc (12)

T _k,k-1 ＝I _k (13)

wherein, P _c,k Representing the estimated value of the coordinates of the robot at the moment k under the coordinate system of the matched feature point camera; t is _c,k,k-1 A transformation matrix representing a robot camera coordinate system from the time k-1 to the time k; i is _k Is an attitude transformation estimate obtained by pre-integration; t is a unit of _cb A transformation matrix representing a robot body coordinate system to a camera coordinate system; t is _bc A transformation matrix representing a robot camera coordinate system to a body coordinate system; t is _bc Including a rotation transformation matrix R _bc And translating the transformed vector t _bc ；T _cb In the same way; p is _c,k-1 The coordinates of the characteristic point camera coordinate system at the moment of k-1 are obtained; t is t _k Representing translation transformation vectors of the robot from the k-1 moment to the k moment; -t _k Representing a translation transformation vector from the k moment to the k-1 moment;

Δ＝(U _k ,V _k )-(U _k-1 ,V _k-1 ) (18)

wherein K represents an internal reference matrix of the camera

8. the robotic SLAM system for use in outdoor feature sparse environments of claim 7, wherein the module M1 comprises:

the module M2 includes: