CN113706626B - Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction - Google Patents

Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction

Info

Publication number
CN113706626B
CN113706626B (application CN202110874239.1A)
Authority
CN
China
Prior art keywords
pose
dimensional code
positioning
camera
mapping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110874239.1A
Other languages
Chinese (zh)
Other versions
CN113706626A (en)
Inventor
史晓军 (Shi Xiaojun)
胡佳祥 (Hu Jiaxiang)
张小栋 (Zhang Xiaodong)
梅雪松 (Mei Xuesong)
姚鑫 (Yao Xin)
王迎新 (Wang Yingxin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Yunchi Zhitong Technology Co ltd
Original Assignee
Xi'an Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Jiaotong University filed Critical Xi'an Jiaotong University
Priority to CN202110874239.1A priority Critical patent/CN113706626B/en
Publication of CN113706626A publication Critical patent/CN113706626A/en
Application granted granted Critical
Publication of CN113706626B publication Critical patent/CN113706626B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404Methods for optical code recognition
    • G06K7/1408Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K7/14172D bar codes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker
    • G06T2207/30208Marker matrix

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Electromagnetism (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Toxicology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention discloses a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction, which belongs to the field of computer vision and comprises the following steps: completing visual-inertial fusion with an RGB-D camera and an IMU (inertial measurement unit) based on a tight-coupling method and the EPnP method, so that an estimated pose is obtained from the RGB-D-IMU visual-inertial module; performing pose fusion with an extended Kalman filter to construct a pose constraint graph, which constitutes the front-end processing; and, at the back end, optimizing with the L-M (Levenberg-Marquardt) algorithm, acquiring the pose data stored in ground two-dimensional codes with a two-dimensional code scanning camera, and correcting the optimized pose, thereby completing positioning and mapping based on multi-sensor fusion and two-dimensional code correction. The method effectively addresses the low mapping accuracy of single-sensor positioning and mapping systems and their tendency to lose tracking in degraded environments, and achieves simultaneous positioning and mapping with high accuracy and high robustness.

Description

Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction
Technical Field
The invention belongs to the field of computer vision, and relates to a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction.
Background
Simultaneous Localization and Mapping (SLAM) refers to placing a robot in an unknown environment, having it move from an unknown position, and collecting and processing the data of the various sensors it carries so as to generate both a localization of the robot's position and attitude and a map of the scene. SLAM technology is crucial to the autonomy and interaction capability of intelligent robots and is widely applied in fields such as autonomous driving, robotics, unmanned aerial vehicles, three-dimensional reconstruction and AR/VR. To address the accuracy and robustness problems of mobile robots during mapping, many researchers have proposed laser SLAM methods based on filtering and on graph optimization, and visual SLAM methods based on monocular, binocular and RGB-D cameras, such as Google's open-source Cartographer algorithm in laser SLAM and the open-source ORB-SLAM2 algorithm in visual SLAM. These methods can meet the accuracy and robustness requirements of positioning and mapping to a certain extent, but they still have many shortcomings, in particular the tendency to lose tracking in degraded environments.
For pure laser SLAM, the main problems are: (1) it is not good at mapping in dynamic environments (for example, with many people moving in the measured environment); (2) it is not good at mapping in unstructured environments (for example, long straight corridors), where loop detection is highly prone to errors that cause the mapping to collapse. Because of its poor relocalization capability, laser SLAM has difficulty returning to a working state after tracking is lost, so the constructed map becomes severely distorted. For pure visual SLAM, the main problems are: (1) tracking loss occurs very easily in texture-less environments (such as white walls) and environments with transparent walls; (2) in environments with particularly weak illumination, the features extracted by the camera deviate far from reality, which also causes tracking loss. Therefore, existing single-sensor positioning based on pure laser or pure vision suffers from low accuracy and easy tracking loss in degraded environments, and cannot achieve simultaneous positioning and mapping with high accuracy and high robustness.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction, which effectively addresses the low mapping accuracy of single-sensor positioning and mapping systems and their tendency to lose tracking in degraded environments, and achieves simultaneous positioning and mapping with high accuracy and high robustness.
In order to achieve the above purpose, the invention adopts the following technical scheme:
The invention discloses a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction, which uses a lidar, an RGB-D camera and an IMU (inertial measurement unit) and, based on a graph optimization method, comprises front-end processing that constructs the pose and constraint graph and back-end processing that optimizes the pose and constraint graph and forms a 2.5D map;
wherein, the front-end processing comprises the following steps:
the RGB-D camera performs image feature detection and extraction on the captured image data using ORB features to obtain feature extraction information, computes the depth information of the extracted features, estimates the camera pose from the obtained depth information and generates a bag of words (BoW); a dictionary corresponding to the key frames is constructed from the visual features of the BoW, matching is performed and the image similarity is computed to complete loop detection; based on a tight-coupling method, a motion equation and an observation equation are constructed with the RGB-D camera and the IMU inertial measurement unit, the motion relation and pose transformation of adjacent frames are estimated with the EPnP method to obtain the relative motion relation of adjacent frames, the received point clouds are then registered according to this relative motion relation and the data are integrated, completing visual-inertial fusion, so that the estimated pose of the visual-inertial module is obtained from the RGB-D-IMU visual-inertial module; the pose obtained by laser scan matching and the pose estimated by the visual-inertial module are fused with an extended Kalman filter to construct the pose constraint graph;
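As an informal illustration (not part of the patent text), the image-similarity scoring used for loop detection with a bag of visual words can be sketched as follows; representing each key frame as a normalized histogram of visual-word IDs and comparing histograms with cosine similarity is a simplifying assumption, not the specific DBoW3 scoring used by the invention.

```python
# Hedged sketch: bag-of-words similarity between two key frames.
# Assumes word IDs lie in [0, vocab_size); real systems such as DBoW3 use a
# weighted vocabulary tree, so this is only a simplified illustration.
import numpy as np

def bow_vector(word_ids, vocab_size):
    """L2-normalized histogram of visual-word occurrences for one key frame."""
    v = np.bincount(np.asarray(word_ids, dtype=np.int64), minlength=vocab_size).astype(np.float64)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def bow_similarity(word_ids_a, word_ids_b, vocab_size=1000):
    """Cosine similarity in [0, 1]; a high score flags a loop-closure candidate."""
    return float(bow_vector(word_ids_a, vocab_size) @ bow_vector(word_ids_b, vocab_size))
```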
wherein, the back-end processing comprises the following steps:
first, the pose obtained by laser scan matching and the estimated pose of the RGB-D-IMU visual-inertial module are optimized with the L-M optimization algorithm to obtain an optimized pose; then, the pose data stored in the ground two-dimensional codes are acquired with a two-dimensional code scanning camera and used to correct the optimized pose, completing positioning and mapping based on multi-sensor fusion and two-dimensional code correction.
Preferably, the ORB features are used to detect and extract the image features; the ORB method adds descriptions of scale and rotation to the FAST corners;
the image is down-sampled at different levels to obtain images of different resolutions and an image pyramid is constructed, corners are detected on each layer of the constructed pyramid, scale invariance is achieved by matching images across different layers, and rotation of the features is realized by the gray-level centroid method.
Further preferably, the rotation of the features is realized by a gray centroid method, which specifically comprises the following steps:
defining a small image block with moments:
m_{pq} = \sum_{x,y \in B} x^p y^q I(x,y), \quad p, q \in \{0,1\};
where (x, y) denotes the image pixel coordinates, I(x, y) denotes the gray value at (x, y), and m_{pq} denotes the moment of the image block B;
finding the centroid of the image block by the moments:
C = \left( \frac{m_{10}}{m_{00}}, \ \frac{m_{01}}{m_{00}} \right);
where C denotes the centroid of the image block and m_{pq} denotes the moments of the image block;
connecting the geometric center O of the image block with the centroid C gives the direction vector
\overrightarrow{OC};
the direction of the feature point is then defined as:
\theta = \arctan\left( \frac{m_{01}}{m_{10}} \right);
where \theta denotes the direction of the feature point and m_{pq} denotes the moments of the image block;
whereby rotation of the features is achieved by the grayscale centroid method.
Preferably, the pose fusion is performed with an extended Kalman filter, comprising a prediction phase and an update phase, with the following steps:
in the prediction phase, the observation is not considered, and the pose state and its covariance at the current moment are predicted respectively:
x_{k|k-1} = f(x_{k-1|k-1}, u_k);
P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k;
in the update phase, the observation residual is computed:
y_k = z_k - h(x_{k|k-1});
the approximate Kalman gain K_k is calculated:
K_k = P_{k|k-1} H_k^T \left( H_k P_{k|k-1} H_k^T + R_k \right)^{-1};
the prediction is corrected with the observation to obtain the state estimate x_{k|k} updated at time k:
x_{k|k} = x_{k|k-1} + K_k y_k;
finally, the posterior covariance estimate P_{k|k} is updated:
P_{k|k} = (I - K_k H_k) P_{k|k-1};
in the formulas, x_k denotes the camera pose at time k, u_k denotes the control input at time k, Q_k denotes the covariance matrix of the system noise, R_k denotes the covariance matrix of the observation noise, K_k denotes the Kalman gain, F_k denotes the first-order Jacobian matrix of the state transition equation, and H_k denotes the first-order Jacobian matrix of the observation equation, where
F_k = \left.\frac{\partial f}{\partial x}\right|_{x_{k-1|k-1},\,u_k}, \quad H_k = \left.\frac{\partial h}{\partial x}\right|_{x_{k|k-1}}.
preferably, optimizing the pose obtained by matching the laser scanning and the estimated pose of the visual inertia module by using an L-M optimization algorithm, and comprising the following operations:
firstly, judging the difference between a first-order Taylor expansion approximate model and an actual function: let the ratio ρ of the actual drop value of the function to the approximate drop value represent this degree of difference:
Figure BDA0003189772420000044
when rho is close to 1, the Taylor approximation is considered to be good, when rho is too small, the actual drop value is far smaller than the approximate drop value, the approximate range needs to be narrowed, otherwise, the approximate range needs to be enlarged;
let the initial optimization radius be μ, the optimization problem becomes:
min Δx ||f(x)+J(x)Δx|| 2 ,||DΔx|| 2 ≤μ;
where D is taken as the identity matrix. The unconstrained optimization problem for the above formula construction is:
Figure BDA0003189772420000045
where λ is the lagrange multiplier, the above equation is expanded and the first order partial derivative is made to be 0, and the resulting incremental equation is:
(H+λD T D)Δx=g;
it can be known that when λ is small, the incremental solution approaches gauss-newton method; when λ is large, it approaches a gradient descent method.
Preferably, a two-dimensional code pose correction method is adopted, which requires two-dimensional codes to be arranged on the environment ground;
wherein the arrangement is not uniform;
the codes are densely arranged on the ground in non-texture, dim, dynamic or unstructured environments, and sparsely arranged on the ground in well-textured, bright, structured and static environments.
Preferably, when the system detects that it is currently in a non-texture or dim environment, it sends a signal to the RGB-D-IMU visual-inertial module and no longer receives the data transmitted by the RGB-D camera and the IMU inertial measurement unit; the lidar then completes the positioning and mapping work alone.
Preferably, when the system detects that it is currently in a dynamic or unstructured environment, it sends a signal to the lidar and no longer receives the scanning data transmitted by the lidar; the RGB-D-IMU visual-inertial module then completes the positioning and mapping work alone.
Preferably, a world coordinate system W, a camera coordinate system C and a two-dimensional code coordinate system Q are constructed based on a laser radar, an RGB-D camera and an IMU inertial measurement unit.
Further preferably, the two-dimensional code scanning camera is used to acquire the pose data stored in the ground two-dimensional code and to correct the pose, comprising the following steps:
i) calculating the matrix transformation relation of the initial pose P_0 relative to the world coordinate system W,
T^{W}_{P_0};
ii) the two-dimensional code scanning camera arranged on the robot chassis shoots an image of the target two-dimensional code when passing over a two-dimensional code calibration position, and the target image is extracted;
iii) obtaining the center coordinate Q_j of the current two-dimensional code, i.e. the current position of the camera in the world coordinate system, and, from the deflection angle \theta_j of the two-dimensional code image captured at the current two-dimensional code relative to the positive direction, calculating the matrix transformation relation of the two-dimensional code about the positive direction,
R(\theta_j),
i.e. the attitude of the current camera in the world coordinate system;
iv) from the transformation relation between the camera coordinate system C and the world coordinate system W,
T^{W}_{C},
and the transformation relation between the current two-dimensional code coordinate system Q and the world coordinate system W,
T^{W}_{Q},
obtaining the pose of the current two-dimensional code in the camera coordinate system, i.e. the pose P_j of the robot at this moment;
v) overwriting and correcting the estimated pose in the current pose constraint graph with the obtained pose P_j.
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a positioning and mapping method based on multi-sensor fusion and two-dimension code correction, which can effectively solve the problems of motion blur of a camera when the camera moves too fast, too few overlapping regions between two frames to finish feature matching and data drift in IMU inertial navigation by fusing RGB-D camera vision and an IMU inertial measurement unit, and can reduce the influence of a dynamic object on visual positioning to a certain extent; the estimated pose obtained by scanning and matching the laser radar is fused with the estimated pose obtained by a vision inertia module corresponding to an RGB-D camera and an IMU inertia measurement unit by using an extended Kalman filtering method, the characteristics of high measurement precision of the laser SLAM, good point cloud matching effect and the like are combined with the characteristics of abundant texture information of the vision SLAM, larger size and better effect in a dynamic environment, and the characteristics are mutually supplemented, so that the positioning and mapping effects of the system in the environment which is not suitable for pure laser SLAM or pure vision SLAM are improved; by using the method for correcting and estimating the pose by the aid of the two-dimensional code with pose information, the accumulated error generated by the system in positioning and mapping is greatly reduced, and the positioning and mapping precision of the system is obviously improved; therefore, the method solves the problems that the positioning and mapping precision of the existing pure laser or pure vision single sensor is not high, the tracking loss is easy to occur in a degraded environment and the like, and realizes the positioning and mapping method which is suitable for the SALM system and has high precision and high robustness and is provided with the auxiliary two-dimensional code pose correction by combining the fusion data of the laser radar-RGB-D-IMU inertial measurement unit.
Furthermore, the pose obtained by fusing the laser radar module and the visual inertia module is optimized by adopting an L-M optimization algorithm, so that the problems that the traditional gradient descent method is low in convergence speed and easy to vibrate and the Gauss-Newton method cannot converge are effectively solved, the optimal value is found more quickly and more efficiently, and the searching efficiency and the real-time performance of the method can be improved.
Furthermore, the invention adopts a method of sparsely arranging the two-dimension codes on the environmental ground, utilizes the two-dimension code scanning camera arranged at the chassis of the robot to acquire images, calculates the current pose state, realizes the real-time pose correction of the two-dimension codes, can accurately correct the estimated pose of the robot at the current moment, and improves the precision of the method.
Furthermore, the multi-module cooperative work system adopted by the invention can automatically change from the cooperative work mode to the mode that other modules complete the positioning and mapping tasks when one module fails, can achieve better positioning and mapping effects even in an unstructured environment and a non-texture environment, and improves the robustness of the method.
Drawings
FIG. 1 is a schematic flow chart of a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to the present invention;
FIG. 2 is an exemplary diagram of a pose correction two-dimensional code in the present invention;
FIG. 3 is a schematic flow chart of the present invention when the RGB-D-IMU visual-inertial module fails;
FIG. 4 is a schematic diagram illustrating a process of the present invention when the corresponding scanning matching module of the lidar fails.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
referring to fig. 1 to 4, a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction includes the following steps:
1) A laser vision inertial odometer is formed on the basis of a laser radar-RGB-D camera-IMU inertial measurement unit, and the process is as follows:
the invention provides a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction, which is based on a laser vision inertial odometer consisting of a laser radar, an RGB-D camera and an IMU inertial measurement unit to realize positioning and mapping, and the flow schematic diagram is shown in figure 1. When the method is carried out, the external parameters between the internal parameters of the RGB-D camera and each sensor are known, and the sensors have time synchronism. And constructing a world coordinate system W, a camera coordinate system C and a two-dimensional code coordinate system Q.
The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction adopts a graph optimization method, and the front end is a pose and constraint graph constructing process.
The RGB-D camera detects and extracts image features of shot image data by utilizing ORB features, calculates depth information of the features, respectively constructs and generates a bag of words BoW according to the depth information and finishes estimation of camera pose; constructing a dictionary corresponding to the key frame by using the visual characteristics of a bag of words BoW by using a DBoW3 method, matching, calculating the similarity of images, and finishing loop detection; based on a data fusion principle, firstly preprocessing IMU data, secondly constructing a motion equation and an observation equation by using a tightly coupled RGB-D-IMU visual inertia module (comprising an RGB-D camera and an IMU inertia measurement unit) method, estimating the motion relation and pose transformation of adjacent frames by using an EPnP method to obtain the relative motion relation of the adjacent frames, finally registering point clouds received in the period according to the obtained relative motion relation of the adjacent frames, integrating data, finishing visual inertia fusion, and realizing that the estimated pose of the visual inertia module is measured based on the RGB-D-IMU visual inertia module (comprising the RGB-D camera and the IMU inertia measurement unit); and finally, scanning and matching the data scanned by the laser radar by using an extended Kalman filtering method, carrying out pose fusion on the estimated pose of the laser radar module obtained by scanning and matching and the estimated pose of the visual inertia module obtained by the RGB-D-IMU visual inertia module, generating a pose constraint graph formed by pose and constraint, and finishing the construction of the pose constraint graph.
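As an informal illustration (not part of the patent text), the EPnP step between adjacent frames can be sketched as follows; the function name, the use of OpenCV's solvePnP, and the way correspondences are obtained (3D points from the previous RGB-D depth image matched to 2D ORB features in the current frame) are assumptions made for the example.

```python
# Hedged sketch of the EPnP step: recover the relative camera pose from
# 3D-2D correspondences (requires at least 4 valid correspondences).
import cv2
import numpy as np

def relative_pose_epnp(points_3d, points_2d, K, dist_coeffs=None):
    """points_3d: (N,3) points from the previous frame (via its depth image).
    points_2d: (N,2) matched pixel coordinates in the current frame.
    K: 3x3 camera intrinsic matrix. Returns (R, t) of the current frame."""
    if dist_coeffs is None:
        dist_coeffs = np.zeros(4)
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(points_3d, dtype=np.float64),
        np.asarray(points_2d, dtype=np.float64),
        K, dist_coeffs, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("EPnP failed; not enough valid correspondences")
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> rotation matrix
    return R, tvec
```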
In the graph optimization method adopted by the invention, the pose and constraint graph is optimized and the 2.5D map is formed at the back end. First, the pose obtained by fusing the lidar module and the visual-inertial module is optimized with the L-M optimization algorithm; then, the pose data stored in the ground two-dimensional codes are acquired with the two-dimensional code scanning camera and used to correct the pose optimized by the L-M method, completing the mapping.
The main details of the positioning and mapping method based on multi-sensor fusion and two-dimensional code correction are as follows:
(1) ORB features: since FAST corners lack orientation and scale, the ORB method adopted by the invention adds descriptions of scale and rotation to the FAST corners. Scale invariance is achieved by constructing an image pyramid (down-sampling the image at different levels to obtain images of different resolutions) and detecting corners on each level of the pyramid, while the rotation of the ORB features is realized by the gray-level centroid method. The bottom level of the pyramid is the original image, and each higher level is a fixed-ratio scaling of the original image, yielding images of different resolutions; scale invariance is realized by matching images across different levels. The gray-level centroid method first defines a small image block B, whose moments are:
m_{pq} = \sum_{x,y \in B} x^p y^q I(x,y), \quad p, q \in \{0,1\};
where (x, y) denotes the image pixel coordinates and I(x, y) denotes the gray value at (x, y);
the centroid of the image block can be found from the moments:
C = \left( \frac{m_{10}}{m_{00}}, \ \frac{m_{01}}{m_{00}} \right);
where C denotes the centroid of the image block and m_{pq} denotes the moments of the image block;
connecting the geometric center O of the image block with the centroid C gives the direction vector
\overrightarrow{OC};
the direction of the feature point can then be defined as:
\theta = \arctan\left( \frac{m_{01}}{m_{10}} \right);
where \theta denotes the direction of the feature point and m_{pq} denotes the moments of the image block;
by this method, scale and rotation information is added to the FAST corners, which greatly improves the robustness of their representation across different images.
(2) Extended Kalman filtering method: the invention uses an extended Kalman filter to fuse the vision and laser positioning results. Specifically, the laser scan data are scan-matched, and the pose obtained by scan matching is fused with the pose obtained by the RGB-D-IMU visual-inertial module. The extended Kalman filter is divided into a prediction part and an update part. In the prediction phase, the observation is not considered, and the pose and covariance at the current moment are predicted respectively:
x_{k|k-1} = f(x_{k-1|k-1}, u_k);
P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k;
the observation residual is computed in the update phase:
y_k = z_k - h(x_{k|k-1});
the approximate Kalman gain K_k is calculated:
K_k = P_{k|k-1} H_k^T \left( H_k P_{k|k-1} H_k^T + R_k \right)^{-1};
the prediction is corrected with the observation to obtain the state estimate x_{k|k} updated at time k:
x_{k|k} = x_{k|k-1} + K_k y_k;
finally, the posterior covariance estimate P_{k|k} is updated:
P_{k|k} = (I - K_k H_k) P_{k|k-1};
in the formulas, x_k denotes the camera pose at time k, u_k denotes the control input at time k, Q_k denotes the covariance matrix of the system noise, R_k denotes the covariance matrix of the observation noise, K_k denotes the Kalman gain, F_k denotes the first-order Jacobian matrix of the state transition equation, and H_k denotes the first-order Jacobian matrix of the observation equation, where
F_k = \left.\frac{\partial f}{\partial x}\right|_{x_{k-1|k-1},\,u_k}, \quad H_k = \left.\frac{\partial h}{\partial x}\right|_{x_{k|k-1}}.
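As an informal illustration (not part of the patent text), one EKF predict/update cycle for fusing the scan-matched lidar pose with the visual-inertial pose can be sketched as follows; the state representation, the models f and h and the noise values are left abstract, and all names are assumptions.

```python
# Hedged sketch: one extended Kalman filter predict/update step.
import numpy as np

def ekf_step(x, P, u, z, f, h, F, H, Q, R):
    """x, P: previous state estimate and covariance.
    u: control input; z: observation (e.g. the pose from the other module).
    f, h: motion and observation models; F, H: their Jacobians at the estimate.
    Q, R: process and observation noise covariances."""
    # Prediction: propagate state and covariance without the observation.
    x_pred = f(x, u)
    F_k = F(x, u)
    P_pred = F_k @ P @ F_k.T + Q
    # Update: correct the prediction with the observation residual.
    H_k = H(x_pred)
    y = z - h(x_pred)                            # residual
    S = H_k @ P_pred @ H_k.T + R
    K = P_pred @ H_k.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(P.shape[0]) - K @ H_k) @ P_pred
    return x_new, P_new
```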
(3) L-M optimization algorithm: to address the slow convergence and tendency to oscillate of the gradient descent method and the possible non-convergence of the Gauss-Newton method, the invention optimizes the pose obtained by fusing the lidar module and the visual-inertial module with the L-M optimization algorithm, thereby reducing the pose estimation error. First, the difference between the first-order Taylor-expansion approximate model and the actual function is judged: let the ratio \rho of the actual decrease of the function to the approximate decrease represent this difference:
\rho = \frac{f(x + \Delta x) - f(x)}{J(x)\,\Delta x};
when \rho is close to 1, the Taylor approximation is considered good; when \rho is too small, the actual decrease is far smaller than the approximate decrease and the approximation range needs to be narrowed; otherwise, the approximation range is enlarged. Let the initial optimization radius be \mu; the optimization problem becomes:
\min_{\Delta x} \|f(x) + J(x)\Delta x\|^2, \quad \text{s.t.}\ \|D\Delta x\|^2 \le \mu;
where D is taken as the identity matrix. The unconstrained optimization problem constructed from the above formula is:
\min_{\Delta x} \frac{1}{2}\|f(x) + J(x)\Delta x\|^2 + \frac{\lambda}{2}\|D\Delta x\|^2;
where \lambda is the Lagrange multiplier; expanding the above formula and setting the first-order partial derivative to 0 gives the incremental equation:
(H + \lambda D^T D)\Delta x = g;
it can be seen that when \lambda is small the incremental solution approaches the Gauss-Newton method, and when \lambda is large it approaches the gradient descent method; the L-M optimization algorithm thus provides a more stable and accurate incremental solution.
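As an informal illustration (not part of the patent text), a single L-M trial step with the damping update implied by the gain ratio \rho can be sketched as follows; D is taken as the identity matrix as in the text, while the gain-ratio thresholds and scaling factors are common defaults assumed for the example.

```python
# Hedged sketch: one Levenberg-Marquardt trial step for min ||f(x)||^2.
import numpy as np

def lm_step(x, residual, jacobian, lam):
    """residual(x) -> f (1-D array); jacobian(x) -> J; lam is the damping factor."""
    f = residual(x)
    J = jacobian(x)
    H = J.T @ J                                   # Gauss-Newton Hessian approximation
    g = -J.T @ f
    dx = np.linalg.solve(H + lam * np.eye(H.shape[0]), g)
    # Gain ratio rho: actual cost decrease vs. decrease predicted by the model.
    f_new = residual(x + dx)
    actual = f.dot(f) - f_new.dot(f_new)
    predicted = f.dot(f) - np.linalg.norm(f + J @ dx) ** 2
    rho = actual / predicted if abs(predicted) > 1e-12 else 0.0
    if rho > 0.75:
        lam *= 0.5        # good approximation -> enlarge the approximation range
    elif rho < 0.25:
        lam *= 2.0        # poor approximation -> narrow the approximation range
    x_next = x + dx if rho > 0 else x             # accept only if the cost decreased
    return x_next, lam
```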
2) Two-dimensional code pose correction; the process is as follows:
The two-dimensional code pose correction method adopted by the invention requires two-dimensional codes to be laid on the environment ground. The arrangement is not uniform: codes are densely arranged on the ground in non-texture, dim, dynamic or unstructured environments, and sparsely arranged on the ground in well-textured, bright, structured and static environments, so that high-precision map construction is realized more effectively.
In the invention, each two-dimensional code stores its own pose in the world coordinate system; the two-dimensional code pattern is shown in Fig. 2. The robot chassis carries a two-dimensional code camera that scans vertically downwards, which can directly read the coordinates of the current two-dimensional code and judge the current heading, i.e. the attitude of the robot, from the deflection of the captured two-dimensional code image. Let P_i denote the pose of the camera at time i, E_j denote the coordinate of the j-th two-dimensional code, the X-axis direction of the world coordinate system be defined as the positive direction, and \theta_j denote the angle between the j-th two-dimensional code captured by the camera and the positive direction. The method is implemented as follows:
i) calculating the matrix transformation relation of the initial pose P_0 relative to the world coordinate system W,
T^{W}_{P_0};
ii) the two-dimensional code scanning camera arranged on the robot chassis shoots an image of the target two-dimensional code when passing over a two-dimensional code calibration position, and the target image is extracted;
iii) obtaining the center coordinate Q_j of the current two-dimensional code, i.e. the current position of the camera in the world coordinate system, and, from the deflection angle \theta_j of the two-dimensional code image captured at the current two-dimensional code relative to the positive direction, calculating the matrix transformation relation of the two-dimensional code about the positive direction,
R(\theta_j),
i.e. the attitude of the current camera in the world coordinate system;
iv) from the transformation relation between the camera coordinate system C and the world coordinate system W,
T^{W}_{C},
and the transformation relation between the current two-dimensional code coordinate system Q and the world coordinate system W,
T^{W}_{Q},
the pose of the current two-dimensional code in the camera coordinate system, i.e. the pose P_j of the robot at this moment, can be obtained;
v) the obtained pose P_j is used to overwrite and correct the pose obtained by the L-M optimization, which greatly improves the positioning accuracy.
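As an informal illustration (not part of the patent text), the following sketch assembles the world-frame pose encoded by a ground code into a homogeneous transform; the planar-floor (yaw-only) assumption and the payload format (world x, y plus a deflection angle) are assumptions made for the example.

```python
# Hedged sketch: build the robot pose in the world frame from a scanned ground
# two-dimensional code (assumes a flat floor, so only yaw matters).
import numpy as np

def pose_from_ground_code(code_center_xy, theta_j):
    """code_center_xy: world (x, y) stored in the code, i.e. the camera position.
    theta_j: deflection of the captured code image w.r.t. the world +X axis.
    Returns a 4x4 homogeneous pose used to overwrite the current estimate."""
    c, s = np.cos(theta_j), np.sin(theta_j)
    T = np.eye(4)
    T[:2, :2] = np.array([[c, -s], [s, c]])   # planar rotation (yaw) about +Z
    T[:2, 3] = code_center_xy                 # translation = code center coordinates
    return T
```

In the pipeline described above, the returned transform would simply replace (overwrite) the L-M-optimized pose of the current node in the pose constraint graph.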
3) Robustness enhancement processing, the process is as follows:
according to the characteristics of the laser SLAM and the vision SLAM, when the system is in a non-texture environment or a dim environment, tracking loss can occur, and a vision module can be invalid; when the system is in a dynamic environment or an unstructured environment, the scanning matching of the laser radar module is easy to make mistakes. When one module fails, the multi-module cooperative work system provided by the invention can automatically change from cooperative work to the completion of positioning and drawing tasks by other modules.
As shown in FIG. 1, when a laser radar module corresponding to a laser radar and an RGB-D-IMU vision inertial module (including an RGB-D camera and an IMU inertial measurement unit work simultaneously, firstly, the RGB-D camera performs feature extraction on each pixel of a shot image, calculates depth information of the pixel, directly estimates a camera pose according to the depth information, then performs image feature detection and extraction on image data by using ORB features to generate a bag of words (BoW) and finish estimation of the camera pose, then uses a DBoW3 method to construct a dictionary corresponding to a key frame by using the visual features of the BoW, performs matching and calculates similarity of the image to finish loop detection, and further, based on a data fusion principle, performs preprocessing on the IMU data and fuses with loop detection matching results to obtain a vision inertial module estimation pose, and finally fuses with the laser radar module estimation pose obtained by matching the vision inertial module estimation pose with laser radar scanning to generate a pose constraint map formed by pose and constraint.
As shown in fig. 3, when the system detects that it is currently in a non-textured or dim environment, it sends a signal to the RGB-D-IMU visual inertial module (including the RGB-D camera and the IMU inertial measurement unit) to no longer receive data from the RGB-D camera and the IMU inertial measurement unit, i.e., to disable the module. At the moment, the laser radar module independently completes positioning and mapping work.
As shown in fig. 4, when the system detects that it is currently in a dynamic environment or an unstructured environment, it sends a signal to the lidar to no longer receive the scan data from the lidar. And at the moment, the vision inertia module corresponding to the RGB-D-IMU inertia measurement unit independently finishes the positioning and mapping work.
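As an informal illustration (not part of the patent text), the degradation-aware switching described above can be sketched as a simple selection function; the boolean environment flags and the fallback when both sources are flagged at once are assumptions, since the patent does not specify them.

```python
# Hedged sketch: choose which front-end sources feed the pose fusion.
def select_active_modules(low_texture_or_dim: bool, dynamic_or_unstructured: bool):
    """Return (use_visual_inertial, use_lidar) for the current environment."""
    use_visual_inertial = not low_texture_or_dim    # drop RGB-D + IMU data
    use_lidar = not dynamic_or_unstructured         # drop lidar scan data
    if not (use_visual_inertial or use_lidar):
        # Not covered by the patent: if both are flagged, keep both as a fallback.
        use_visual_inertial = use_lidar = True
    return use_visual_inertial, use_lidar
```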
In summary, the invention discloses a positioning and mapping method based on multi-sensor fusion and two-dimensional code correction, mainly involving improvements to the technical foundations of lidar-RGB-D camera-IMU sensor data fusion, SLAM, and two-dimensional-code-assisted positioning. Based on a graph optimization method, the invention fuses the lidar, RGB-D camera and IMU data to obtain a consistent pose estimate, which better resolves motion blur in pure visual positioning and environment degradation in pure laser positioning. The SLAM system corrects the estimated pose from the lidar-visual-inertial data fusion by extracting the features of two-dimensional code landmarks carrying pose information, obtaining higher positioning and mapping accuracy. In addition, when the lidar positioning module or the visual-inertial module cannot work normally, the system removes the failed module, and the other module completes pose estimation, positioning and mapping alone. The invention can therefore achieve highly robust and highly accurate positioning and mapping.
Compared with the prior positioning and mapping technology, the invention has the advantages that:
(1) By fusing the data of the RGB-D camera and the IMU inertial measurement unit, the invention effectively alleviates motion blur when the camera moves too fast, failure of feature matching when the overlap between two frames is too small, and data drift in IMU inertial navigation; the IMU can sense its own motion, which reduces the influence of dynamic objects on visual positioning to a certain extent;
(2) The invention fuses the pose data obtained by lidar scan matching with the estimated pose of the RGB-D-IMU visual-inertial module using an extended Kalman filter, combining the high measurement accuracy and good point cloud matching of laser SLAM with the rich texture information, larger scene scale and better behaviour in dynamic environments of visual SLAM, so that the two complement each other and the positioning and mapping performance of the system is improved in environments unsuitable for pure laser SLAM or pure visual SLAM;
(3) The invention provides a method that uses two-dimensional codes carrying pose information to assist in correcting the estimated pose, which greatly reduces the accumulated error of the system in positioning and mapping and thus significantly improves the positioning and mapping accuracy;
(4) The multi-module cooperative system adopted by the invention automatically switches from cooperative operation to having the remaining modules complete the positioning and mapping task when one module fails, achieves good positioning and mapping results even in unstructured and texture-less environments, and improves the robustness of the SLAM system.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. A positioning and mapping method based on multi-sensor fusion and two-dimensional code correction, characterized by comprising front-end processing for constructing a pose constraint graph and back-end processing for forming a map;
wherein the front-end processing comprises the following steps: performing image feature detection and extraction on the image data captured by an RGB-D camera to obtain feature extraction information, calculating the depth information of the obtained feature extraction information, estimating the camera pose from the obtained depth information and generating a bag of words (BoW); constructing a dictionary corresponding to the key frames with the bag of words (BoW), performing matching and calculating the similarity to complete loop detection; based on a tight-coupling method, constructing a motion equation and an observation equation with the RGB-D camera and an IMU inertial measurement unit, estimating the motion relation and pose transformation of adjacent frames based on the EPnP method to obtain the relative motion relation of adjacent frames, then performing integration according to the obtained relative motion relation of adjacent frames to complete visual-inertial fusion, so that the estimated pose of the visual-inertial module is obtained from the RGB-D-IMU visual-inertial module; and fusing the pose obtained by laser scan matching with the pose estimated by the visual-inertial module using an extended Kalman filtering method to construct a pose constraint graph;
wherein the back-end processing comprises the following steps: optimizing the pose obtained by laser scan matching and the estimated pose of the RGB-D-IMU visual-inertial module with an L-M optimization algorithm to obtain an optimized pose; and acquiring the pose data in the ground two-dimensional code with a two-dimensional code scanning camera and correcting the obtained optimized pose, completing positioning and mapping based on multi-sensor fusion and two-dimensional code correction.
2. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to claim 1, wherein ORB features are used to detect and extract image features from the image data captured by the RGB-D camera, the ORB method adding descriptions of scale and rotation to the FAST corners;
the image is down-sampled at different levels to obtain images of different resolutions and an image pyramid is constructed, corners are detected on each layer of the constructed image pyramid, scale invariance is achieved by matching images across different layers, and rotation of the features is realized by the gray-level centroid method.
3. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to claim 2, wherein feature rotation is realized by a gray scale centroid method, specifically comprising the following steps:
defining a small image block with moments:
m_{pq} = \sum_{x,y \in B} x^p y^q I(x,y), \quad p, q \in \{0,1\};
where (x, y) denotes the image pixel coordinates, I(x, y) denotes the gray value at (x, y), and m_{pq} denotes the moment of the image block B;
finding the centroid of the image block by the moments:
C = \left( \frac{m_{10}}{m_{00}}, \ \frac{m_{01}}{m_{00}} \right);
where C denotes the centroid of the image block and m_{pq} denotes the moments of the image block;
connecting the geometric center O of the image block with the centroid C gives the direction vector
\overrightarrow{OC};
the direction of the feature point is defined as:
\theta = \arctan\left( \frac{m_{01}}{m_{10}} \right);
where \theta denotes the direction of the feature point and m_{pq} denotes the moments of the image block;
whereby rotation of the features is achieved by the grayscale centroid method.
4. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to claim 1, characterized in that the pose fusion is performed by using extended kalman filtering, specifically comprising a prediction stage and an update stage, and the steps are as follows:
in the prediction stage, the observation effect is not considered, and the pose state and the covariance thereof at the current moment are predicted respectively:
x_{k|k-1} = f(x_{k-1|k-1}, u_k);
P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k;
the observation residual is computed in the update phase:
y_k = z_k - h(x_{k|k-1});
the approximate Kalman gain K_k is calculated:
K_k = P_{k|k-1} H_k^T \left( H_k P_{k|k-1} H_k^T + R_k \right)^{-1};
the prediction is corrected with the observation to obtain the state estimate x_{k|k} updated at time k:
x_{k|k} = x_{k|k-1} + K_k y_k;
finally, the posterior covariance estimate P_{k|k} is updated:
P_{k|k} = (I - K_k H_k) P_{k|k-1};
in the formulas, x_k denotes the camera pose at time k, u_k denotes the control input at time k, Q_k denotes the covariance matrix of the system noise, R_k denotes the covariance matrix of the observation noise, K_k denotes the Kalman gain, F_k denotes the first-order Jacobian matrix of the state transition equation, and H_k denotes the first-order Jacobian matrix of the observation equation, where
F_k = \left.\frac{\partial f}{\partial x}\right|_{x_{k-1|k-1},\,u_k}, \quad H_k = \left.\frac{\partial h}{\partial x}\right|_{x_{k|k-1}}.
5. the positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to claim 1, wherein the pose obtained by laser scanning matching and the estimated pose of the visual inertia module are optimized by using an L-M optimization algorithm, and the method comprises the following operations:
first, judging the difference between the first-order Taylor-expansion approximate model and the actual function: let the ratio \rho of the actual decrease of the function to the approximate decrease represent the degree of difference:
\rho = \frac{f(x + \Delta x) - f(x)}{J(x)\,\Delta x};
when \rho is close to 1, the Taylor approximation is considered good; when \rho is too small, the actual decrease is far smaller than the approximate decrease and the approximation range needs to be narrowed; otherwise, the approximation range is enlarged;
letting the initial optimization radius be \mu, the optimization problem becomes:
\min_{\Delta x} \|f(x) + J(x)\Delta x\|^2, \quad \text{s.t.}\ \|D\Delta x\|^2 \le \mu;
wherein D is taken as the identity matrix; the unconstrained optimization problem constructed from the above formula is:
\min_{\Delta x} \frac{1}{2}\|f(x) + J(x)\Delta x\|^2 + \frac{\lambda}{2}\|D\Delta x\|^2;
where \lambda is the Lagrange multiplier; expanding the above formula and setting the first-order partial derivative to 0 gives the incremental equation:
(H + \lambda D^T D)\Delta x = g;
it can be seen that when \lambda is small the incremental solution approaches the Gauss-Newton method, and when \lambda is large it approaches the gradient descent method.
6. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to claim 1, wherein a two-dimensional code pose correction method is adopted, which requires two-dimensional codes to be arranged on the environment ground;
wherein the arrangement is not uniform;
the codes are densely arranged on the ground in non-texture, dim, dynamic or unstructured environments, and sparsely arranged on the ground in well-textured, bright, structured and static environments.
7. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction as claimed in claim 1, wherein when the system detects that it is currently in a non-textured environment or a dim environment, it sends a signal to the RGB-D-IMU visual inertial module, and no longer receives data from the RGB-D camera and IMU inertial measurement unit; at the moment, the laser radar independently completes the positioning and mapping work.
8. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction as claimed in claim 1, wherein when the system detects that it is currently in a dynamic environment or an unstructured environment, it sends a signal to the lidar and no longer receives scanning data from the lidar; at the moment, the RGB-D-IMU vision inertia module independently completes the positioning and mapping work.
9. The positioning and mapping method based on multi-sensor fusion and two-dimension code correction as claimed in claim 1, wherein a world coordinate system W, a camera coordinate system C and a two-dimension code coordinate system Q are constructed based on a laser radar, an RGB-D camera and an IMU inertial measurement unit.
10. The positioning and mapping method based on multi-sensor fusion and two-dimensional code correction according to claim 9, wherein a two-dimensional code scanning camera is used to acquire pose data in a ground two-dimensional code and correct a pose, and the method comprises the following specific steps:
i) calculating the matrix transformation relation of the initial pose P_0 relative to the world coordinate system W,
T^{W}_{P_0};
ii) a two-dimensional code scanning camera arranged on the robot chassis shoots an image of the target two-dimensional code when passing over a two-dimensional code calibration position, and the target image is extracted;
iii) obtaining the center coordinate Q_j of the current two-dimensional code, i.e. the current position of the camera in the world coordinate system, and, from the deflection angle \theta_j of the two-dimensional code image captured at the current two-dimensional code relative to the positive direction, calculating the matrix transformation relation of the two-dimensional code about the positive direction,
R(\theta_j),
i.e. the attitude of the current camera in the world coordinate system;
iv) from the transformation relation between the camera coordinate system C and the world coordinate system W,
T^{W}_{C},
and the transformation relation between the current two-dimensional code coordinate system Q and the world coordinate system W,
T^{W}_{Q},
obtaining the pose of the current two-dimensional code in the camera coordinate system, i.e. the pose P_j of the robot at this moment;
v) overwriting and correcting the estimated pose in the current pose constraint graph with the obtained pose P_j.
CN202110874239.1A 2021-07-30 2021-07-30 Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction Active CN113706626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110874239.1A CN113706626B (en) 2021-07-30 2021-07-30 Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110874239.1A CN113706626B (en) 2021-07-30 2021-07-30 Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction

Publications (2)

Publication Number Publication Date
CN113706626A CN113706626A (en) 2021-11-26
CN113706626B true CN113706626B (en) 2022-12-09

Family

ID=78651038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110874239.1A Active CN113706626B (en) 2021-07-30 2021-07-30 Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction

Country Status (1)

Country Link
CN (1) CN113706626B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114234967B (en) * 2021-12-16 2023-10-20 浙江大学 Six-foot robot positioning method based on multi-sensor fusion
CN115936029B (en) * 2022-12-13 2024-02-09 湖南大学无锡智能控制研究院 SLAM positioning method and device based on two-dimensional code
CN116026335B (en) * 2022-12-26 2023-10-03 广东工业大学 Mobile robot positioning method and system suitable for unknown indoor environment
CN116228870B (en) * 2023-05-05 2023-07-28 山东省国土测绘院 Mapping method and system based on two-dimensional code SLAM precision control
CN117113284B (en) * 2023-10-25 2024-01-26 南京舜云智慧城市科技有限公司 Multi-sensor fusion data processing method and device and multi-sensor fusion method
CN117422692A (en) * 2023-11-02 2024-01-19 华润数字科技有限公司 Visual image detection method and training method of image measurement model
CN117168441B (en) * 2023-11-02 2024-02-20 西安因诺航空科技有限公司 Multi-sensor fusion SLAM positioning and reconstructing method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110261870A (en) * 2019-04-15 2019-09-20 浙江工业大学 It is a kind of to synchronize positioning for vision-inertia-laser fusion and build drawing method
CN111583136A (en) * 2020-04-25 2020-08-25 华南理工大学 Method for simultaneously positioning and establishing image of autonomous mobile platform in rescue scene
CN112347840A (en) * 2020-08-25 2021-02-09 天津大学 Vision sensor laser radar integrated unmanned aerial vehicle positioning and image building device and method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110261870A (en) * 2019-04-15 2019-09-20 浙江工业大学 It is a kind of to synchronize positioning for vision-inertia-laser fusion and build drawing method
CN111583136A (en) * 2020-04-25 2020-08-25 华南理工大学 Method for simultaneously positioning and establishing image of autonomous mobile platform in rescue scene
CN112347840A (en) * 2020-08-25 2021-02-09 天津大学 Vision sensor laser radar integrated unmanned aerial vehicle positioning and image building device and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Map recovery and fusion technology in monocular simultaneous localization and mapping; Zhang Jianhua et al.; Journal of Image and Graphics (中国图象图形学报); 2018-03-16 (No. 03); full text *

Also Published As

Publication number Publication date
CN113706626A (en) 2021-11-26

Similar Documents

Publication Publication Date Title
CN113706626B (en) Positioning and mapping method based on multi-sensor fusion and two-dimensional code correction
CN112634451B (en) Outdoor large-scene three-dimensional mapping method integrating multiple sensors
CN111983639B (en) Multi-sensor SLAM method based on Multi-Camera/Lidar/IMU
Rozenberszki et al. LOL: Lidar-only Odometry and Localization in 3D point cloud maps
WO2021233029A1 (en) Simultaneous localization and mapping method, device, system and storage medium
CN110261870B (en) Synchronous positioning and mapping method for vision-inertia-laser fusion
CN109993113B (en) Pose estimation method based on RGB-D and IMU information fusion
CN112985416B (en) Robust positioning and mapping method and system based on laser and visual information fusion
CN111882612B (en) Vehicle multi-scale positioning method based on three-dimensional laser detection lane line
CN108406731B (en) Positioning device, method and robot based on depth vision
Huang Review on LiDAR-based SLAM techniques
CN112734852B (en) Robot mapping method and device and computing equipment
CN111664843A (en) SLAM-based intelligent storage checking method
CN112734765A (en) Mobile robot positioning method, system and medium based on example segmentation and multi-sensor fusion
CN109579825A (en) Robot positioning system and method based on binocular vision and convolutional neural networks
CN113658337B (en) Multi-mode odometer method based on rut lines
CN208323361U (en) A kind of positioning device and robot based on deep vision
CN115936029B (en) SLAM positioning method and device based on two-dimensional code
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN115272596A (en) Multi-sensor fusion SLAM method oriented to monotonous texture-free large scene
CN114088081B (en) Map construction method for accurate positioning based on multistage joint optimization
CN112669354A (en) Multi-camera motion state estimation method based on vehicle incomplete constraint
CN111998862A (en) Dense binocular SLAM method based on BNN
CN117367427A (en) Multi-mode slam method applicable to vision-assisted laser fusion IMU in indoor environment
CN116468786A (en) Semantic SLAM method based on point-line combination and oriented to dynamic environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230717

Address after: 710076 D104, Floor 26, Building 5, Digital China Science Park, No. 20, Zhangbasi Road, High tech Zone, Xi'an, Shaanxi

Patentee after: Xi'an Yunchi Zhitong Technology Co.,Ltd.

Address before: 710049 No. 28 West Xianning Road, Shaanxi, Xi'an

Patentee before: XI'AN JIAOTONG University