CN113052855A - Semantic SLAM method based on visual-IMU-wheel speed meter fusion - Google Patents

Semantic SLAM method based on visual-IMU-wheel speed meter fusion

Info

Publication number
CN113052855A
CN113052855A (application CN202110218495.5A)
Authority
CN
China
Prior art keywords
imu
wheel speed
speed meter
semantic
trolley
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110218495.5A
Other languages
Chinese (zh)
Other versions
CN113052855B (en)
Inventor
李威
李晓馨
朴松昊
陈立国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Maisijie Intelligent Technology Co ltd
Original Assignee
Suzhou Maisijie Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Maisijie Intelligent Technology Co ltd filed Critical Suzhou Maisijie Intelligent Technology Co ltd
Priority to CN202110218495.5A priority Critical patent/CN113052855B/en
Publication of CN113052855A publication Critical patent/CN113052855A/en
Application granted granted Critical
Publication of CN113052855B publication Critical patent/CN113052855B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/215Motion-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Multimedia (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of computer vision and discloses a semantic SLAM method based on visual-IMU-wheel speed meter fusion. The estimated pose of each image frame, the corresponding depth map and the 2D semantic segmentation result are input into a 3D mapping module to build a global 3D semantic mesh map. Road condition information is passed to the localization module at the front end of the SLAM through the 2D semantic segmentation result: whether the trolley is currently moving abnormally is judged from the road condition in front of the trolley, the predicted-position error between the IMU and the wheel speed meter, and the alignment error between the IMU and wheel speed meter measurements. If abnormal motion is detected, the wheel speed meter pre-integration measurement corresponding to the current frame is actively deleted from the state estimation equation, which improves the robustness of pose estimation in complex scenes.

Description

Semantic SLAM method based on visual-IMU-wheel speed meter fusion
Technical Field
The invention relates to the technical field of computer vision, in particular to a semantic SLAM method based on vision-IMU-wheel speed meter fusion.
Background
Fusing inertial sensors with visual data can compensate for the scale uncertainty and the poor fast-motion tracking inherent in purely visual pose estimation, but the combined method is ineffective in texture-less or low-light environments where the visual sensors cannot obtain usable information. In that case, the visual-inertial method degrades to dead reckoning based on inertial navigation alone, and the attitude error grows rapidly over time. For a mobile robot equipped with a wheel speed sensor, the camera, the inertial sensor and the wheel speed sensor can be fused to improve the robustness of pose estimation in complex scenes.
Ground robots often undergo restricted motion while navigating indoors (approximately planar, mostly moving along arcs or straight lines at constant velocity or constant acceleration), which changes the observability of the VINS and makes some additional degrees of freedom unobservable. When the local acceleration is constant, the magnitude of the true IMU acceleration cannot be distinguished from the magnitude of the accelerometer bias, since both are at least temporarily constant; the magnitude of the true IMU acceleration can therefore be arbitrary, which leads to scale ambiguity. When there is no rotational motion, the direction of the local gravitational acceleration cannot be distinguished from the direction of the accelerometer bias, again because both are at least temporarily constant; the roll and pitch angles therefore become ambiguous. Of these two unobservable cases, the second is easily eliminated by letting the robot deviate from a straight path, but making the scale observable is much harder, since it would require the robot to constantly change its acceleration, increasing the wear of its drive system. We therefore address this problem and ensure the observability of the VINS by extending it with the measurements provided by the robot's wheel speed meter.
However, the wheel speed meter gives correct measurements only when the trolley moves in a plane and no motion anomaly occurs: if the robot moves on inclined or irregular terrain (e.g. an uneven road surface, a slope, a pit in the road, a speed bump) or the wheels slip, erroneous wheel speed meter measurements prevent the SLAM algorithm from correctly estimating the robot's pose (the scale estimate becomes inaccurate) and localization may fail.
Disclosure of Invention
The purpose of the invention is as follows: the invention provides a semantic SLAM method based on vision-IMU-wheel speed meter fusion, which solves the problems in the prior art.
The technical scheme is as follows: the invention provides a semantic SLAM method based on vision-IMU-wheel speed meter fusion, which comprises the following steps:
s1: fixedly mounting a depth camera on a trolley, mounting a wheel speed meter on the trolley, and acquiring the color image and depth image, the IMU (inertial measurement unit) measurements and the wheel speed meter readings at the previous moment and the current moment;
s2: calculating pre-integrals of the IMU and wheel speed meter for the IMU measurements and wheel speed meter readings in S1;
s3: calculating a visual reprojection error, an IMU pre-integration error and a wheel speed meter pre-integration error between two adjacent frames;
s4: performing pose estimation, namely performing nonlinear optimization on the visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error in S3 to solve the poses of all frames in the sliding window;
s5: performing 2D semantic segmentation on the road surface by using a convolutional neural network, segmenting an uneven area of the road surface, and obtaining a 2D road surface semantic segmentation result map;
s6: inputting the pose of each frame in S4, the 2D road surface semantic segmentation result map in S5 and the depth map in S1 into a global 3D semantic map building module to build a global 3D semantic mesh map with semantic labels;
s7: detecting abnormal movement of the trolley;
s8: performing pose optimization, namely actively deleting the wheel speed meter pre-integration measurement of the current frame from the state estimation equation when abnormal motion of the trolley is detected.
Further, the pre-integration of the IMU and the wheel speed meter in S2 is calculated as follows:
[The six pre-integration formulas are provided as images in the original publication and are not reproduced here.]
where i and j denote the instants at which the camera captures the k-th and (k+1)-th images, and l and l+1 are two instants between i and j. The extrinsic parameter between the IMU and the wheel speed meter is held at its current estimate during the pre-integration stage. The inputs are the readings of the accelerometer, the gyroscope and the wheel speed meter; δt_l is the time interval between instants l and l+1; the accelerometer bias and gyroscope bias at a given time t are used to correct the raw readings. The IMU pre-integration results can be regarded as the displacement, velocity change and relative rotation of the IMU, and the wheel speed meter pre-integration result as the displacement measured by the wheel speed meter.
Further, the visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error in S3 are calculated as follows:
[The residual formulas are provided as images in the original publication and are not reproduced here.]
where g_w is the gravitational acceleration in the world coordinate system and Δt_k is the time interval between image frames k and k+1; the pre-integration terms entering the residuals are the results of S2 compensated for the changes in the IMU biases and for the IMU-wheel speed meter extrinsic parameters.
Further, the pose estimation in S4 is performed as follows:
[The cost function and its terms are provided as images in the original publication and are not reproduced here.]
where c(x) denotes the cost function, l is the index of a map point, and B_l is the set of image frames in which map point l is observed; K is the number of frames in the sliding window. The cost sums the visual reprojection errors, the IMU-wheel speed meter errors and the marginalization residual e_m, with W_r the information matrix of all reprojection terms. The quantity x to be optimized contains the state x_k of each frame, the inverse depth λ_l of each map point, the rotation and translation from the camera to the IMU, and the rotation and translation from the wheel speed meter to the IMU, where the state x_k of each frame comprises the displacement, velocity and rotation of the IMU with respect to the world coordinate system at the time of that frame, together with the accelerometer bias and gyroscope (angular velocity) bias of the IMU.
Further, constructing the three-dimensional map in S6 comprises: generating a three-dimensional point cloud from the RGB-D data, obtaining a TSDF on each key frame by ray casting, and extracting a mesh from the TSDF with the marching cubes algorithm; specifically:
1) inputting the 2D road surface semantic segmentation result with semantic labels for each key frame, and attaching a label to each three-dimensional point;
2) projecting the semantic labels during ray casting as well: for each ray, a label probability vector is established according to the frequency of the labels observed along that ray;
3) propagating this information only within the TSDF truncation distance, i.e. near the surface, to save computation time;
4) updating the label probability of each voxel with a Bayesian method;
5) after semantic ray casting, each voxel holds a label probability vector, from which the label with the highest probability is selected;
6) extracting the semantic mesh from the highest-probability labels of step 5) with the marching cubes algorithm (a minimal sketch of steps 5) and 6) follows this list).
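For illustration only, the sketch below shows how steps 5) and 6) could look in Python using NumPy and scikit-image's marching_cubes; the patent provides no source code, and the array layouts, the voxel_size parameter and the nearest-voxel label lookup are assumptions.

```python
import numpy as np
from skimage.measure import marching_cubes

def extract_semantic_mesh(tsdf, label_probs, voxel_size=0.05):
    """Select the most probable label per voxel (step 5) and extract a mesh (step 6).

    tsdf        : (X, Y, Z) float array of truncated signed distances.
    label_probs : (X, Y, Z, n_classes) per-voxel label probability vectors.
    voxel_size  : edge length of one voxel in metres (assumed value).
    """
    # Step 5: label with the highest probability for every voxel.
    labels = label_probs.argmax(axis=-1)

    # Step 6: zero-crossing surface of the TSDF via marching cubes.
    verts, faces, normals, _ = marching_cubes(tsdf, level=0.0)

    # Attach to each vertex the label of the voxel it falls into.
    idx = np.clip(np.round(verts).astype(int), 0, np.array(tsdf.shape) - 1)
    vert_labels = labels[idx[:, 0], idx[:, 1], idx[:, 2]]

    return verts * voxel_size, faces, normals, vert_labels
```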
Further, the Bayesian voxel label probability update in 4) is:
[The update formula is provided as an image in the original publication and is not reproduced here.]
where k is the index of the observed image frame, I_k is the observed frame, i is the index of the semantic class, and P(·) is the probability of the i-th class; u_(s,k) denotes the pixel coordinates of the s-th point in the k-th frame; o is a vector of n elements, where n is the number of classes; taking its i-th element yields the probability map of class l_i, and the probability value at position u is multiplied into the stored probability to perform one update.
Further, the detection of abnormal trolley motion in S7 comprises:
1) detection based on semantic segmentation of the road surface: when an uneven road surface is detected in front of the trolley, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state;
2) detection based on the consistency of inertial navigation and the wheel speed meter: the position, velocity and attitude of the IMU in the world coordinate system at the time the last camera frame was received are obtained from the state estimator; starting from this known initial attitude and velocity, the inertial dead-reckoning algorithm predicts the short-term real-time attitude and velocity of the trolley from the IMU measurements with the gravitational acceleration removed;
the wheel speed meter pre-integration algorithm likewise predicts the real-time attitude and velocity of the trolley from the motion state of the previous frame; the robot position is propagated from both the IMU and the wheel speed meter pre-integration and, using the position covariance obtained from the wheel speed meter pre-integration, the Mahalanobis distance between the two predictions is computed; if it exceeds 1.5, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state;
3) detection based on the alignment of the wheel speed meter and the IMU: the IMU pre-integration and the wheel speed meter pre-integration are directly aligned linearly, and the deviation between the IMU prediction and the wheel speed meter prediction is computed for the latest frame; if the Mahalanobis distance of this deviation exceeds 1.5, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state.
Beneficial effects:
1. The wheel speed meter measurements are integrated into the visual-IMU odometry, which resolves the initial scale ambiguity of the VINS when the trolley moves on flat ground and improves localization robustness.
2. The invention constructs a semantic map and updates the probability distribution of the class of each map node with an incremental Bayesian method, which resolves the inconsistency of class probabilities that arises when the same spatial point is observed repeatedly.
3. The method detects abnormal motion of the trolley by a semantic method, which avoids the adverse effect of erroneous wheel speed meter measurements on the pose estimation of the robot.
Drawings
FIG. 1 is a flow chart of the system of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
On the basis of the traditional visual inertial navigation SLAM, the method integrates wheel speed meter measurement by using a pre-integration and optimization-based method, performs road surface semantic segmentation on a key frame, and constructs a global three-dimensional semantic map. And (3) combining the road surface semantic information to detect abnormal movement of the trolley, and actively deleting the wheel speed meter pre-integral measurement value of the current frame from the state estimation equation when the abnormal movement of the trolley is detected.
FIG. 1 shows a semantic SLAM method flow diagram based on visual-IMU-wheel speed meter fusion, according to one embodiment of the present invention. As shown in fig. 1, the method comprises the steps of:
s1: the depth camera is fixedly installed on a trolley, a wheel speed meter is installed on the trolley, and a color image and a depth image, an IMU (inertial measurement unit) measured value and a wheel speed meter reading at the previous moment and the current moment are acquired.
S2: pre-integration of the IMU and wheel speed are calculated for IMU measurements and wheel speed readings in S1.
The calculation of the pre-integration of the IMU and the wheel speed meter is specifically:
[The six pre-integration formulas are provided as images in the original publication and are not reproduced here.]
where i and j denote the instants at which the camera captures the k-th and (k+1)-th images, and l and l+1 are two instants between i and j. The extrinsic parameter between the IMU and the wheel speed meter is held at its current estimate during the pre-integration stage. The inputs are the readings of the accelerometer, the gyroscope and the wheel speed meter; δt_l is the time interval between instants l and l+1; the accelerometer bias and gyroscope bias at a given time t are used to correct the raw readings. The IMU pre-integration results can be regarded as the displacement, velocity change and relative rotation of the IMU, and the wheel speed meter pre-integration result as the displacement measured by the wheel speed meter.
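The pre-integration formulas themselves are published only as images. As a hedged illustration of how the quantities described above (IMU displacement, velocity change, relative rotation, and wheel speed meter displacement) are typically accumulated between frames i and j, the sketch below uses simple Euler integration in the style of VINS-type systems; the function and variable names, the use of rotation matrices instead of quaternions, and the sample format are assumptions, not the patent's notation.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def preintegrate(imu_samples, wheel_samples, ba, bg, R_ob):
    """Accumulate IMU and wheel speed meter pre-integration between frames i and j.

    imu_samples   : list of (dt, accel, gyro) tuples between the two frames.
    wheel_samples : list of (dt, v_wheel) body-frame wheel velocities at the same timestamps.
    ba, bg        : accelerometer and gyroscope biases (held fixed during pre-integration).
    R_ob          : 3x3 rotation from the wheel speed meter frame to the IMU frame (extrinsic).
    """
    alpha = np.zeros(3)   # pre-integrated IMU displacement
    beta = np.zeros(3)    # pre-integrated IMU velocity change
    gamma = np.eye(3)     # pre-integrated relative rotation (IMU frame at i -> current instant)
    eta = np.zeros(3)     # pre-integrated wheel speed meter displacement

    for (dt, a, w), (_, v_o) in zip(imu_samples, wheel_samples):
        a = np.asarray(a) - ba                 # bias-corrected specific force
        w = np.asarray(w) - bg                 # bias-corrected angular rate
        alpha += beta * dt + 0.5 * gamma @ a * dt**2
        beta += gamma @ a * dt
        eta += gamma @ (R_ob @ np.asarray(v_o)) * dt
        gamma = gamma @ Rotation.from_rotvec(w * dt).as_matrix()

    return alpha, beta, gamma, eta
```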
S3: and calculating the visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error between two adjacent frames.
The visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error are calculated as follows:
[The residual formulas are provided as images in the original publication and are not reproduced here.]
where g_w is the gravitational acceleration in the world coordinate system and Δt_k is the time interval between image frames k and k+1; the pre-integration terms entering the residuals are compensated for the changes in the IMU biases and for the IMU-wheel speed meter extrinsic parameters.
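The residual definitions are likewise only available as images. The sketch below shows a standard VINS-style form of a combined IMU/wheel pre-integration residual between two frames (position, velocity, rotation and wheel-displacement terms), which is consistent with the description above but is not guaranteed to match the patent's exact definitions; the reprojection residual, information-matrix weighting and bias terms are omitted.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def imu_wheel_residual(state_i, state_j, preint, g_w, dt):
    """Illustrative pre-integration residual between frames i and j.

    state_* : dict with 'p' (position), 'v' (velocity), 'R' (3x3 world rotation).
    preint  : (alpha, beta, gamma, eta) from the pre-integration sketch above.
    g_w     : gravity vector in the world coordinate system.
    dt      : time interval between the two image frames.
    """
    alpha, beta, gamma, eta = preint
    Ri_t = state_i['R'].T

    r_p = Ri_t @ (state_j['p'] - state_i['p'] - state_i['v'] * dt + 0.5 * g_w * dt**2) - alpha
    r_v = Ri_t @ (state_j['v'] - state_i['v'] + g_w * dt) - beta
    r_R = Rotation.from_matrix(gamma.T @ Ri_t @ state_j['R']).as_rotvec()  # rotation error
    r_o = Ri_t @ (state_j['p'] - state_i['p']) - eta                       # wheel displacement term

    return np.concatenate([r_p, r_v, r_R, r_o])
```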
S4: and (4) pose estimation, namely performing nonlinear optimization on the visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error in S3 to solve the poses of all frames in the sliding window.
The pose estimation method comprises the following steps:
[The cost function and its terms are provided as images in the original publication and are not reproduced here.]
where c(x) denotes the cost function, l is the index of a map point, and B_l is the set of image frames in which map point l is observed; K is the number of frames in the sliding window. The cost sums the visual reprojection errors, the IMU-wheel speed meter errors and the marginalization residual e_m, with W_r the information matrix of all reprojection terms. The quantity x to be optimized contains the state x_k of each frame, the inverse depth λ_l of each map point, the rotation and translation from the camera to the IMU, and the rotation and translation from the wheel speed meter to the IMU, where the state x_k of each frame comprises the displacement, velocity and rotation of the IMU with respect to the world coordinate system at the time of that frame, together with the accelerometer bias and gyroscope (angular velocity) bias of the IMU.
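A minimal sketch of the sliding-window optimization of S4: all residual blocks (visual reprojection, IMU-wheel speed meter pre-integration, marginalization prior) are stacked into one vector and minimized with a robust nonlinear least-squares solver. The state packing and the use of scipy.optimize.least_squares with a Huber loss are illustrative assumptions; the patent does not specify a particular solver.

```python
import numpy as np
from scipy.optimize import least_squares

def solve_window(x0, residual_blocks):
    """Jointly optimize all frame states in the sliding window.

    x0              : initial stacked parameter vector (frame states, inverse depths, extrinsics).
    residual_blocks : list of callables, each mapping the stacked vector to a residual array
                      (visual reprojection, IMU/wheel pre-integration, marginalization prior).
    """
    def cost(x):
        # Stack every residual block into a single vector for the solver.
        return np.concatenate([r(x) for r in residual_blocks])

    result = least_squares(cost, x0, method='trf', loss='huber')  # robust nonlinear least squares
    return result.x
```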
S5: 2D semantic segmentation of the road surface is performed with a convolutional neural network to segment uneven road areas, yielding a 2D road surface semantic segmentation result map.
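As an illustration of S5, the sketch below runs a trained PyTorch segmentation network on a color frame and extracts the mask of the class treated as "uneven road surface". The model, class index and preprocessing are hypothetical placeholders; the patent only states that a convolutional neural network is used and does not name an architecture.

```python
import torch

def segment_road(model, rgb, uneven_class_id=1):
    """Run a trained segmentation CNN on an HxWx3 uint8 color image.

    model           : torch.nn.Module returning per-pixel class logits of shape (1, C, H, W).
    rgb             : numpy uint8 image from the depth camera's color stream.
    uneven_class_id : hypothetical index of the 'uneven road surface' class.
    """
    x = torch.from_numpy(rgb).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    with torch.no_grad():
        logits = model(x)                    # (1, C, H, W) class scores
    labels = logits.argmax(dim=1)[0]         # per-pixel semantic label map
    uneven_mask = labels == uneven_class_id  # binary mask of uneven road areas
    return labels.cpu().numpy(), uneven_mask.cpu().numpy()
```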
S6: inputting the pose of each frame in S4, the 2D road surface semantic segmentation result map in S5 and the depth map in S1 into a global 3D semantic map building module to build a global 3D semantic mesh map with semantic labels.
Constructing the three-dimensional map further comprises: generating a three-dimensional point cloud from the RGB-D data, obtaining a TSDF on each key frame by ray casting, and extracting a mesh from the TSDF with the marching cubes algorithm; specifically:
1) The 2D road surface semantic segmentation result with semantic labels is input for each key frame, and a label is attached to each three-dimensional point.
2) Semantic labels are also projected during ray casting, and for each ray in the ray casting, a label probability vector is established according to the frequency of the observed labels in the beam.
3) This information is only propagated within the TSDF truncation distance, i.e. close to the surface, to save computation time.
4) Using a Bayesian approach, the label probability of each voxel is updated (a minimal sketch of this update follows the list below).
The voxel label probability updating formula based on the Bayesian method is as follows:
[The update formula is provided as an image in the original publication and is not reproduced here.]
where k is the index of the observed image frame, I_k is the observed frame, i is the index of the semantic class, and P(·) is the probability of the i-th class; u_(s,k) denotes the pixel coordinates of the s-th point in the k-th frame; o is a vector of n elements, where n is the number of classes; taking its i-th element yields the probability map of class l_i, and the probability value at position u is multiplied into the stored probability to perform one update.
5) After semantic ray casting, each voxel has a label probability vector from which the label with the highest probability is selected.
6) The semantic mesh is extracted from the highest-probability labels of step 5) with the marching cubes algorithm.
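A minimal sketch of the Bayesian voxel label update of step 4), as referenced in the list above: each observed voxel's stored label probability vector is multiplied element-wise by the class probabilities observed at the pixel the ray was cast from, then renormalized. The data layout and the one-ray-per-voxel assumption are illustrative; the patent's exact update formula is only available as an image.

```python
import numpy as np

def update_voxel_labels(label_probs, voxel_ids, pixel_probs):
    """One Bayesian label update after the semantic ray cast of a key frame.

    label_probs : (V, n_classes) stored probability vector for every voxel.
    voxel_ids   : (M,) indices of the voxels hit by rays in this frame
                  (assumed unique, i.e. each voxel hit at most once per frame).
    pixel_probs : (M, n_classes) class probabilities observed at the pixels u_(s,k)
                  from which the rays were cast.
    """
    label_probs[voxel_ids] *= pixel_probs                       # multiply in the new observation
    norm = label_probs[voxel_ids].sum(axis=1, keepdims=True)
    label_probs[voxel_ids] /= norm                              # renormalize to a probability vector
    return label_probs

# Step 5 then reduces to: per_voxel_label = label_probs.argmax(axis=1)
```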
S7: Abnormal motion of the trolley is detected.
The detection of the abnormal movement of the trolley comprises the following steps:
1) Detection based on semantic segmentation of the road surface: if an uneven road surface is detected in front of the trolley, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state.
2) Detection based on the consistency of inertial navigation and the wheel speed meter: the position, velocity and attitude of the IMU in the world coordinate system at the time the last camera frame was received are obtained from the state estimator. Starting from this known initial attitude and velocity, the inertial dead-reckoning algorithm predicts the short-term real-time attitude and velocity of the trolley from the IMU measurements with the gravitational acceleration removed.
The wheel speed meter pre-integration algorithm likewise predicts the real-time attitude and velocity of the trolley from the motion state of the previous frame. The robot position is propagated from both the IMU and the wheel speed meter pre-integration and, using the position covariance obtained from the wheel speed meter pre-integration, the Mahalanobis distance between the two predictions is computed; if it is greater than 1.5 (a corresponding probability of about 13.4%), the wheel speed meter pre-integration result is regarded as abnormal and the trolley is in an abnormal motion state.
3) Detection based on the alignment of the wheel speed meter and the IMU: the IMU pre-integration and the wheel speed meter pre-integration are directly aligned linearly, and the deviation between the IMU prediction and the wheel speed meter prediction is computed for the latest frame; if the Mahalanobis distance of this deviation is greater than 1.5 (a corresponding probability of about 13.4%), the wheel speed meter pre-integration result is regarded as abnormal and the trolley is in an abnormal motion state. (A minimal sketch of the checks in 2) and 3) follows below.)
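A hedged sketch of the consistency checks in 2) and 3): the position predicted by IMU dead reckoning and the position predicted by wheel speed meter pre-integration are compared through a Mahalanobis distance under the wheel speed meter position covariance, and a distance above 1.5 marks the current frame's wheel pre-integration as abnormal. Function names and the use of a single shared covariance are assumptions.

```python
import numpy as np

ABNORMAL_THRESHOLD = 1.5  # Mahalanobis-distance threshold stated in the patent text

def wheel_motion_abnormal(p_imu, p_wheel, cov_wheel):
    """Return True if the wheel speed meter prediction is inconsistent with the IMU prediction.

    p_imu     : position predicted by IMU dead reckoning since the last camera frame.
    p_wheel   : position predicted by wheel speed meter pre-integration over the same span.
    cov_wheel : position covariance from the wheel speed meter pre-integration.
    """
    diff = np.asarray(p_imu) - np.asarray(p_wheel)
    d = np.sqrt(diff @ np.linalg.solve(cov_wheel, diff))  # Mahalanobis distance
    return d > ABNORMAL_THRESHOLD
```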
S8: Pose optimization: when abnormal motion of the trolley is detected, the wheel speed meter pre-integration measurement of the current frame is actively deleted from the state estimation equation.
The above embodiments are merely illustrative of the technical concepts and features of the present invention, and the purpose of the embodiments is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (9)

1. A semantic SLAM method based on visual-IMU-wheel speed meter fusion is characterized by comprising the following steps:
s1: fixedly mounting a depth camera on a trolley, mounting a wheel speed meter on the trolley, and acquiring the color image and depth image, the IMU (inertial measurement unit) measurements and the wheel speed meter readings at the previous moment and the current moment;
s2: calculating pre-integrals of the IMU and wheel speed meter for the IMU measurements and wheel speed meter readings in S1;
s3: calculating a visual reprojection error, an IMU pre-integration error and a wheel speed meter pre-integration error between two adjacent frames;
s4: performing pose estimation, namely performing nonlinear optimization on the visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error in S3 to solve the poses of all frames in the sliding window;
s5: performing 2D semantic segmentation on the road surface by using a convolutional neural network, segmenting an uneven area of the road surface, and obtaining a 2D road surface semantic segmentation result map;
s6: inputting the pose of each frame in S4, the 2D road surface semantic segmentation result map in S5 and the depth map in S1 into a global 3D semantic map building module to build a global 3D semantic mesh map with semantic labels;
s7: detecting abnormal movement of the trolley;
s8: performing pose optimization, namely actively deleting the wheel speed meter pre-integration measurement of the current frame from the state estimation equation when abnormal motion of the trolley is detected.
2. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of claim 1, wherein the pre-integration of the IMU and the wheel speed meter in S2 is specifically:
[The six pre-integration formulas are provided as images in the original publication and are not reproduced here.]
wherein i and j denote the instants at which the camera captures the k-th and (k+1)-th images, and l and l+1 are two instants between i and j; the extrinsic parameter between the IMU and the wheel speed meter is held at its current estimate during the pre-integration stage; the inputs are the readings of the accelerometer, the gyroscope and the wheel speed meter; δt_l is the time interval between instants l and l+1; the accelerometer bias and gyroscope bias at a given time t are used to correct the raw readings; the IMU pre-integration results can be regarded as the displacement, velocity change and relative rotation of the IMU, and the wheel speed meter pre-integration result as the displacement measured by the wheel speed meter.
3. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of claim 1, wherein the visual reprojection error, the IMU pre-integration error and the wheel speed meter pre-integration error in S3 are calculated as follows:
[The residual formulas are provided as images in the original publication and are not reproduced here.]
wherein g_w is the gravitational acceleration in the world coordinate system and Δt_k is the time interval between image frames k and k+1; the pre-integration terms entering the residuals are compensated for the changes in the IMU biases and for the IMU-wheel speed meter extrinsic parameters.
4. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of claim 1, wherein the pose estimation in S4 is performed as follows:
[The cost function and its terms are provided as images in the original publication and are not reproduced here.]
wherein c(x) denotes the cost function, l is the index of a map point, and B_l is the set of image frames in which map point l is observed; K is the number of frames in the sliding window; the cost sums the visual reprojection errors, the IMU-wheel speed meter errors and the marginalization residual e_m, with W_r the information matrix of all reprojection terms; the quantity x to be optimized contains the state x_k of each frame, the inverse depth λ_l of each map point, the rotation and translation from the camera to the IMU, and the rotation and translation from the wheel speed meter to the IMU, wherein the state x_k of each frame comprises the displacement, velocity and rotation of the IMU with respect to the world coordinate system at the time of that frame, and the accelerometer bias and gyroscope (angular velocity) bias of the IMU.
5. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of claim 1, wherein constructing the three-dimensional map in S6 further comprises: generating a three-dimensional point cloud from the RGB-D data, obtaining a TSDF on each key frame by ray casting, and extracting a mesh from the TSDF with the marching cubes algorithm; specifically:
1) inputting the 2D road surface semantic segmentation result with semantic labels for each key frame, and attaching a label to each three-dimensional point;
2) projecting the semantic labels during ray casting as well, wherein for each ray a label probability vector is established according to the frequency of the labels observed along that ray;
3) propagating this information only within the TSDF truncation distance, i.e. near the surface, to save computation time;
4) updating the label probability of each voxel with a Bayesian method;
5) after semantic ray casting, each voxel holds a label probability vector, from which the label with the highest probability is selected;
6) extracting the semantic mesh from the highest-probability labels of step 5) with the marching cubes algorithm.
6. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of claim 5, wherein the voxel label probability updating formula based on the Bayesian method in 4) is as follows:
[The update formula is provided as an image in the original publication and is not reproduced here.]
wherein k is the index of the observed image frame, I_k is the observed frame, i is the index of the semantic class, and P(·) is the probability of the i-th class; u_(s,k) denotes the pixel coordinates of the s-th point in the k-th frame; o is a vector of n elements, where n is the number of classes; taking its i-th element yields the probability map of class l_i, and the probability value at position u is multiplied into the stored probability to perform one update.
7. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of any one of claims 1 to 6, wherein the detection of abnormal trolley motion in S7 comprises:
detection of abnormal trolley motion based on semantic segmentation of the road surface: if an uneven road surface is detected in front of the trolley, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state.
8. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of any one of claims 1 to 6, wherein the detection of abnormal trolley motion in S7 further comprises:
detection of abnormal trolley motion based on the consistency of inertial navigation and the wheel speed meter: the position, velocity and attitude of the IMU in the world coordinate system at the time the last camera frame was received are obtained from the state estimator; starting from this known initial attitude and velocity, the inertial dead-reckoning algorithm predicts the short-term real-time attitude and velocity of the trolley from the IMU measurements with the gravitational acceleration removed;
the wheel speed meter pre-integration algorithm likewise predicts the real-time attitude and velocity of the trolley from the motion state of the previous frame; the robot position is propagated from both the IMU and the wheel speed meter pre-integration and, using the position covariance obtained from the wheel speed meter pre-integration, the Mahalanobis distance between the two predictions is computed; if it is greater than 1.5, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state.
9. The semantic SLAM method based on visual-IMU-wheel speed meter fusion of any one of claims 1 to 6, wherein the detection of abnormal trolley motion in S7 further comprises:
detection of abnormal trolley motion based on the alignment of the wheel speed meter and the IMU: the IMU pre-integration and the wheel speed meter pre-integration are directly aligned linearly, and the deviation between the IMU prediction and the wheel speed meter prediction is computed for the latest frame; if the Mahalanobis distance of this deviation is greater than 1.5, the pre-integration result of the wheel speed meter is regarded as abnormal and the trolley is in an abnormal motion state.
CN202110218495.5A 2021-02-26 2021-02-26 Semantic SLAM method based on visual-IMU-wheel speed meter fusion Active CN113052855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110218495.5A CN113052855B (en) 2021-02-26 2021-02-26 Semantic SLAM method based on visual-IMU-wheel speed meter fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110218495.5A CN113052855B (en) 2021-02-26 2021-02-26 Semantic SLAM method based on visual-IMU-wheel speed meter fusion

Publications (2)

Publication Number Publication Date
CN113052855A true CN113052855A (en) 2021-06-29
CN113052855B CN113052855B (en) 2021-11-02

Family

ID=76509163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110218495.5A Active CN113052855B (en) 2021-02-26 2021-02-26 Semantic SLAM method based on visual-IMU-wheel speed meter fusion

Country Status (1)

Country Link
CN (1) CN113052855B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793381A (en) * 2021-07-27 2021-12-14 武汉中海庭数据技术有限公司 Monocular visual information and wheel speed information fusion positioning method and system
CN114485653A (en) * 2022-02-23 2022-05-13 广州高新兴机器人有限公司 Positioning method, device, medium and equipment based on fusion of vision and wheel type odometer
CN115164918A (en) * 2022-09-06 2022-10-11 联友智连科技有限公司 Semantic point cloud map construction method and device and electronic equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101344391A (en) * 2008-07-18 2009-01-14 北京工业大学 Lunar vehicle pose self-confirming method based on full-function sun-compass
US20110130926A1 (en) * 2006-08-30 2011-06-02 Ford Global Technologies Integrated control system for stability control of yaw, roll and lateral motion of a driving vehicle using an integrated sensing system with pitch information
CN106054896A (en) * 2016-07-13 2016-10-26 武汉大学 Intelligent navigation robot dolly system
CN106274308A (en) * 2015-06-24 2017-01-04 通用汽车环球科技运作有限责任公司 Integrated sensing unit and the method being used for determining wheel of vehicle speed and tire pressure
CN110345944A (en) * 2019-05-27 2019-10-18 浙江工业大学 Merge the robot localization method of visual signature and IMU information
CN111854770A (en) * 2019-04-30 2020-10-30 北京初速度科技有限公司 Vehicle positioning system and method
CN111882607A (en) * 2020-07-14 2020-11-03 中国人民解放军军事科学院国防科技创新研究院 Visual inertial navigation fusion pose estimation method suitable for augmented reality application
CN111968129A (en) * 2020-07-15 2020-11-20 上海交通大学 Instant positioning and map construction system and method with semantic perception
CN112197770A (en) * 2020-12-02 2021-01-08 北京欣奕华数字科技有限公司 Robot positioning method and positioning device thereof
CN112219087A (en) * 2019-08-30 2021-01-12 深圳市大疆创新科技有限公司 Pose prediction method, map construction method, movable platform and storage medium
CN112344923A (en) * 2021-01-11 2021-02-09 浙江欣奕华智能科技有限公司 Robot positioning method and positioning device thereof
CN112734852A (en) * 2021-03-31 2021-04-30 浙江欣奕华智能科技有限公司 Robot mapping method and device and computing equipment
CN113223161A (en) * 2021-04-07 2021-08-06 武汉大学 Robust panoramic SLAM system and method based on IMU and wheel speed meter tight coupling

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110130926A1 (en) * 2006-08-30 2011-06-02 Ford Global Technologies Integrated control system for stability control of yaw, roll and lateral motion of a driving vehicle using an integrated sensing system with pitch information
CN101344391A (en) * 2008-07-18 2009-01-14 北京工业大学 Lunar vehicle pose self-confirming method based on full-function sun-compass
CN106274308A (en) * 2015-06-24 2017-01-04 通用汽车环球科技运作有限责任公司 Integrated sensing unit and the method being used for determining wheel of vehicle speed and tire pressure
CN106054896A (en) * 2016-07-13 2016-10-26 武汉大学 Intelligent navigation robot dolly system
CN111854770A (en) * 2019-04-30 2020-10-30 北京初速度科技有限公司 Vehicle positioning system and method
CN110345944A (en) * 2019-05-27 2019-10-18 浙江工业大学 Merge the robot localization method of visual signature and IMU information
CN112219087A (en) * 2019-08-30 2021-01-12 深圳市大疆创新科技有限公司 Pose prediction method, map construction method, movable platform and storage medium
CN111882607A (en) * 2020-07-14 2020-11-03 中国人民解放军军事科学院国防科技创新研究院 Visual inertial navigation fusion pose estimation method suitable for augmented reality application
CN111968129A (en) * 2020-07-15 2020-11-20 上海交通大学 Instant positioning and map construction system and method with semantic perception
CN112197770A (en) * 2020-12-02 2021-01-08 北京欣奕华数字科技有限公司 Robot positioning method and positioning device thereof
CN112344923A (en) * 2021-01-11 2021-02-09 浙江欣奕华智能科技有限公司 Robot positioning method and positioning device thereof
CN112734852A (en) * 2021-03-31 2021-04-30 浙江欣奕华智能科技有限公司 Robot mapping method and device and computing equipment
CN113223161A (en) * 2021-04-07 2021-08-06 武汉大学 Robust panoramic SLAM system and method based on IMU and wheel speed meter tight coupling

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
ABHAYJEET JUNEJA et al.: "A Comparative Study of SLAM Algorithms for Indoor Navigation of Autonomous Wheelchairs", 2019 IEEE International Conference on Cyborg and Bionic Systems (CBS) *
ZHANG CHAO et al.: "Vision-based online correction of inertial navigation errors", Navigation Positioning and Timing *
PIAO SONGHAO et al.: "A survey of visual SLAM", 《智能***学报》 *
LIN PENG: "Research on map building algorithms for indoor mobile robots fusing an IMU and an RGB-D camera", China Master's Theses Full-text Database, Information Science and Technology *
WANG YANDONG: "Research on stereo visual-inertial odometry with multi-pose information fusion", China Doctoral Dissertations Full-text Database, Information Science and Technology *
BAI SHIYU et al.: "A plug-and-play multi-sensor factor graph fusion method based on IMU/ODO pre-integration", Journal of Chinese Inertial Technology *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793381A (en) * 2021-07-27 2021-12-14 武汉中海庭数据技术有限公司 Monocular visual information and wheel speed information fusion positioning method and system
CN114485653A (en) * 2022-02-23 2022-05-13 广州高新兴机器人有限公司 Positioning method, device, medium and equipment based on fusion of vision and wheel type odometer
CN115164918A (en) * 2022-09-06 2022-10-11 联友智连科技有限公司 Semantic point cloud map construction method and device and electronic equipment

Also Published As

Publication number Publication date
CN113052855B (en) 2021-11-02

Similar Documents

Publication Publication Date Title
US20220082386A1 (en) Vision-aided inertial navigation
CN113052855B (en) Semantic SLAM method based on visual-IMU-wheel speed meter fusion
CN110243358B (en) Multi-source fusion unmanned vehicle indoor and outdoor positioning method and system
CN113781582B (en) Synchronous positioning and map creation method based on laser radar and inertial navigation combined calibration
US9071829B2 (en) Method and system for fusing data arising from image sensors and from motion or position sensors
CN108731670A (en) Inertia/visual odometry combined navigation locating method based on measurement model optimization
US8761439B1 (en) Method and apparatus for generating three-dimensional pose using monocular visual sensor and inertial measurement unit
CN109991636A (en) Map constructing method and system based on GPS, IMU and binocular vision
CN112815939B (en) Pose estimation method of mobile robot and computer readable storage medium
CN107909614B (en) Positioning method of inspection robot in GPS failure environment
CN111795686A (en) Method for positioning and mapping mobile robot
CN114526745A (en) Drawing establishing method and system for tightly-coupled laser radar and inertial odometer
CN114019552A (en) Bayesian multi-sensor error constraint-based location reliability optimization method
CN114323033A (en) Positioning method and device based on lane lines and feature points and automatic driving vehicle
CN113503873B (en) Visual positioning method for multi-sensor fusion
CN117739972B (en) Unmanned aerial vehicle approach stage positioning method without global satellite positioning system
CN115453599A (en) Multi-sensor-cooperated pipeline robot accurate positioning method
Wu et al. AFLI-Calib: Robust LiDAR-IMU extrinsic self-calibration based on adaptive frame length LiDAR odometry
CN117710476A (en) Monocular vision-based unmanned aerial vehicle pose estimation and dense mapping method
CN117387604A (en) Positioning and mapping method and system based on 4D millimeter wave radar and IMU fusion
CN112923934A (en) Laser SLAM technology suitable for combining inertial navigation in unstructured scene
CN116202509A (en) Passable map generation method for indoor multi-layer building
Chen et al. River: A Tightly-coupled Radar-inertial Velocity Estimator Based on Continuous-time Optimization
Reina et al. Iterative path reconstruction for large-scale inertial navigation on smartphones
CN113298796B (en) Line characteristic SLAM initialization method based on maximum posterior IMU

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant