CN116299525A - Dynamic environment RGB-D vision SLAM method based on point cloud region correlation - Google Patents

Dynamic environment RGB-D vision SLAM method based on point cloud region correlation

Info

Publication number
CN116299525A
CN116299525A (application CN202310176750.3A)
Authority
CN
China
Prior art keywords
point cloud
point
sub
dynamic
correlation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310176750.3A
Other languages
Chinese (zh)
Inventor
禹鑫燚
郑万财
欧林林
周利波
魏岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202310176750.3A priority Critical patent/CN116299525A/en
Publication of CN116299525A publication Critical patent/CN116299525A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Theoretical Computer Science (AREA)
  • Electromagnetism (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A dynamic environment RGB-D visual SLAM method based on point cloud region correlation can serve as front-end preprocessing for an ORB-SLAM2/3 system, filtering out dynamic regions and thereby reducing erroneous data associations. The system takes the RGB image and depth map of the current frame as input and generates a point cloud from them; once a frame has been processed it is temporarily stored in the system as a reference frame and is not processed again. A clustering algorithm segments the depth map into sub-point clouds, and the sub-pose change between each sub-point cloud and the reference frame is then computed in turn. The method also introduces the idea of dynamic/static correlation between sub-point clouds; combining this idea with the sub-poses makes it possible to identify which sub-point clouds are dynamic without relying on epipolar geometric constraints and without knowing the approximate camera pose in advance. The invention effectively reduces the influence of dynamic objects on robot localization and improves the robustness of the system and its generalization across scenes.

Description

Dynamic environment RGB-D vision SLAM method based on point cloud region correlation
Technical Field
The invention relates to the field of global positioning of robots, in particular to a dynamic environment RGB-D vision SLAM method based on point cloud region correlation.
Background
Today, visual simultaneous localization and mapping (SLAM) is widely studied, and attempts are being made to apply it as the core of localization in a variety of devices, such as aids for the blind, augmented reality, robots, and autonomous driving. After roughly 20 years of development the overall framework of visual SLAM has gradually matured, and many advanced visual positioning systems with high accuracy and high robustness have been proposed. These vision-based systems can estimate the camera's own motion solely from successive images of the surrounding environment and generate a continuous trajectory representing the camera's motion. However, to simplify the estimation problem, most state-of-the-art systems are designed under the assumption that the surrounding environment is static. If dynamic objects exist in the environment, irregularly moving landmarks are introduced into the actual observations, so the system cannot distinguish whether it is moving itself or whether objects in the environment are moving, which ultimately leads to inaccurate positioning and serious deviations in map construction. To address the poor performance of visual SLAM in dynamic environments, researchers have taken a number of measures: for example, patent CN115393538A extracts features from the image information with the GCNv2 algorithm and removes dynamic feature points with an optical-flow-based dynamic point removal algorithm; patent CN114283198A detects dynamic targets in the image with the object detection network YOLOv5 and, combined with the image depth information, performs pixel-level segmentation. These existing methods are based on deep learning; however, deep-learning-based methods require prior knowledge of which objects are moving and which objects appear in the device's field of view, and the computational cost of model inference is huge, making them difficult to apply in low-cost scenarios with high real-time requirements.
At present, robot localization in dynamic environments suffers from the following defects and shortcomings: 1) unknown feature points can only be further identified and judged on the basis of pose-estimation information, and in severe environments this directly causes system failure; 2) deep-learning methods require prior knowledge of which objects are moving and which objects appear in the device's field of view, and the computational cost of model inference is huge, making them difficult to apply in low-cost scenarios with high real-time requirements; 3) geometry-based methods are prone to false negatives (static regions marked as dynamic); 4) dynamic objects are unavoidable in real environments, and their presence greatly degrades the positioning accuracy of the device, which is one of the main obstacles to the further development of visual SLAM.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a dynamic environment RGB-D visual SLAM method based on point cloud region correlation, which improves positioning accuracy and robustness in dynamic real-world environments. The invention is an independent segmentation method that does not need to take the camera motion into account and dispenses with the projective transformation most current algorithms rely on; in addition, dynamic regions can be identified without training samples or prior information about dynamic objects.
In order to achieve the above purpose, the present invention provides the following technical solution: a dynamic environment RGB-D visual SLAM method based on point cloud region correlation, comprising the following steps:
S1: performing two-dimensional plane downsampling on the RGB image and generating a point cloud by combining a depth map;
the point cloud is generated by the following specific steps:
S11: for the current frame, emulating the data-acquisition pattern of a rotating laser radar by downsampling the gray image on the two-dimensional plane and generating a point cloud, so as to accelerate construction of the KD-tree.
S12: the downsampled point cloud is rasterized with a grid of side length l; for the i-th grid cell $G_i$, the mean gray value of all points inside the cell is computed and assigned as the intensity value of the cell's center of gravity:

$$I(G_i)=\frac{1}{|G_i|}\sum_{(x,y)\in G_i} I(x,y)$$

where $|G_i|$ denotes the number of points in grid cell $G_i$ and $I(x,y)$ denotes the gray value of the pixel at coordinates $(x,y)$.
S2: dividing the point cloud generated in step S1 into sub-point clouds;
the sub-point clouds are generated by the following specific steps:
S21: removing the depth-unstable region: according to the noise model of the depth camera and empirical values obtained in experiments, the depth value of any point in the depth image is set to 0 when it exceeds a given threshold.
S22: since the point cloud obtained from the raw depth of the RGB-D camera exhibits a layered structure and holes, the point cloud obtained in S1 is smoothed using moving least squares.
S23: and clustering the point clouds by using a K-Means clustering algorithm to obtain K sub-point clouds.
S3: setting the previous frames of the current frame as reference frames;
s4: an accurate transformation matrix is obtained by adopting a method of fusion of gray level auxiliary nearest neighbor iteration and traditional nearest neighbor iteration, comprising the following steps: carrying out gray nearest neighbor iteration on each sub-point cloud of the S2 and the target point cloud of the reference frame to calculate a rough transformation matrix and a corresponding point, and then calculating an accurate transformation matrix by utilizing the conventional nearest neighbor iteration to register the point cloud;
the intensity-assisted nearest neighbor iterative registration method specifically comprises the following steps:
s41: estimating and predicting initial pose according to uniform motion model of ORBSLAM2
$\hat{T}_{\mathrm{init}}$, obtained by composing the pose-transformation increment $T_{\mathrm{pre}}$ between the last two ordinary frames with the camera pose $T_{\mathrm{last}}$ of the previous frame and the pose $T_{\mathrm{ref}}$ of the reference frame, all expressed in the world coordinate system. Providing the initial pose effectively prevents the iteration from falling into a locally optimal solution and accelerates the iteration.
S42: according to the initial pose
$\hat{T}_{\mathrm{init}}$, the $\alpha$ points of the target point cloud closest to each point of the sub-point cloud are searched as candidate corresponding points.
S43: the candidate points are scored and the candidate with the highest score $S(i,k)$ is selected as the corresponding point, where $\Delta T$ denotes the incremental transformation matrix of each iteration, $r_G(\cdot)$ the geometric residual, $r_I(\cdot)$ the intensity residual, and $w(\cdot)$ the weight conversion function.
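To make the candidate search and scoring of S42–S43 concrete, the following sketch warps the source points with the current transform estimate, retrieves the α nearest target points per source point with a KD-tree, and keeps the candidate with the best combined score. It is only an illustration: the score form (a negated weighted sum of the geometric and intensity residuals) and the weight w_int are assumptions, not the patent's exact expressions.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_correspondences(src_xyz, src_int, tgt_xyz, tgt_int, T_init, alpha=10, w_int=0.1):
    """For each source point, pick the best of its alpha nearest target candidates."""
    R, t = T_init[:3, :3], T_init[:3, 3]
    warped = src_xyz @ R.T + t                        # apply the current transform estimate
    dist, idx = cKDTree(tgt_xyz).query(warped, k=alpha)
    r_g = dist                                        # geometric residual, shape (N, alpha)
    r_i = np.abs(src_int[:, None] - tgt_int[idx])     # intensity residual, shape (N, alpha)
    score = -(r_g + w_int * r_i)                      # higher is better (assumed score form)
    best = np.argmax(score, axis=1)
    return idx[np.arange(len(warped)), best]          # chosen target index per source point
```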
S431: wherein the weight transformation function of S43 is assumed herein to obey the t-distribution, which weight transformation function can be expressed as:
a robust weighting parameterized by the degree of freedom $\nu$ of the distribution, the mean $\mu_{(*)}$ of the geometric or intensity residuals, and the corresponding variance $\sigma_{(*)}$.
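One common Student-t robust weight is sketched below for illustration; the patent's exact expression is not reproduced here, so the form (ν + 1)/(ν + ((r − μ)/σ)²) is an assumption, with ν, μ and σ corresponding to the symbols ν, μ_(*) and σ_(*) above.

```python
import numpy as np

def t_weight(residuals, nu=5.0):
    """Student-t influence function: large residuals are smoothly down-weighted."""
    r = np.asarray(residuals, dtype=np.float64)
    mu = r.mean()                 # mean of the geometric or intensity residuals
    sigma2 = r.var() + 1e-12      # their variance, guarded against zero
    return (nu + 1.0) / (nu + (r - mu) ** 2 / sigma2)
```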
S44: and S4, calculating the pairing points of the points, and calculating a gain matrix by using the pairing points to ensure that the total Euclidean distance sum is minimum:
$\Delta T=\arg\min_{\Delta T}\sum_{i}\lVert \Delta T\,p_i-q_i\rVert^{2}$, where $p_i$ and $q_i$ denote a pair of corresponding points.
S45: the transformation matrix is updated: $T^{*} \leftarrow \Delta T\, T^{*}$.
S5: transforming the source point clouds according to the transformation matrix calculated in the step S4, and estimating the degree of correlation between every two sub point clouds;
the correlation degree calculation includes:
s51: find the i-th point
of the point cloud obtained by transforming the source point cloud with the pose transformation matrix $T_{B_j}$ of the $j$-th sub-point cloud, and its closest point $P_{tgt}(j)$ in the target point cloud; the Euclidean distance between them is used as the main index and the difference of their intensity values as an additional penalty term, and the scores of all points are accumulated and averaged to give the correlation degree score.
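The sketch below shows one way such a correlation score could be computed: transform the whole source cloud with the j-th sub-pose, query each point's nearest neighbour in the target cloud, and average a Euclidean-distance term plus an intensity penalty. The penalty weight beta and the exact way the two terms are combined are illustrative assumptions rather than the patent's formula.

```python
import numpy as np
from scipy.spatial import cKDTree

def correlation_score(src_xyz, src_int, tgt_xyz, tgt_int, T_bj, beta=0.1):
    """Average per-point score of the source cloud warped by the j-th sub-pose T_bj (4x4)."""
    R, t = T_bj[:3, :3], T_bj[:3, 3]
    warped = src_xyz @ R.T + t                    # apply the j-th sub-pose to the source cloud
    dist, idx = cKDTree(tgt_xyz).query(warped, k=1)
    penalty = np.abs(src_int - tgt_int[idx])      # intensity difference as penalty term
    return float(np.mean(dist + beta * penalty))  # accumulated and averaged over all points
```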
S6: calculating a threshold divides all sub-point clouds into two groups: a dynamic correlation group and a static correlation group;
s61: a correlation degree score { R (j) } of the sub-point cloud calculated according to S5 j∈K Setting a threshold epsilon as a zone static sub-point cloud, the threshold can be determined by the following formula:
S62: according to the threshold, it can be effectively judged which sub-point clouds belong to the static point clouds: each sub-point cloud is assigned, by comparing its score with $\epsilon$, either to the static sub-point cloud set S or to the dynamic sub-point cloud set D, and all points of the dynamic sub-point cloud set are projected onto the camera plane to generate a mask.
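A sketch of the grouping step follows. The patent derives the threshold ε from the correlation scores with its own formula, which is not shown here; the mean-plus-one-standard-deviation rule and the assumption that lower scores indicate static sub-point clouds are illustrative stand-ins.

```python
import numpy as np

def split_static_dynamic(scores):
    """Split sub-point-cloud indices into static set S and dynamic set D by a score threshold."""
    scores = np.asarray(scores, dtype=np.float64)
    eps = scores.mean() + scores.std()            # assumed threshold rule, not the patent's formula
    static_idx = np.where(scores <= eps)[0]       # assumed: low score = well correlated = static
    dynamic_idx = np.where(scores > eps)[0]
    return static_idx.tolist(), dynamic_idx.tolist()
```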
S7: projecting the point cloud of the dynamic correlation group to a camera plane, drawing a circle by taking the point cloud as a center, and filling the point cloud by morphological geometry to finally form a required pixel mask;
s8: the feature points in the mask are not added into pose estimation of the camera;
s9: the processed frame is stored in the system as a reference frame and is used as an alternative frame for updating the reference frame;
s10: and refining the mask generated by the key frames by using the historical observation data to generate more stable map points.
S101, when the key frame is generated, taking the previous N frames as historical observation data, not all historical data, and estimating the pixel point weight value by the following formula:
the weight accumulates the observations $I_i(x,y)$ of the pixel at coordinates $(x,y)$ over the $N$ frames, each weighted by the pixel weight proportion $\rho$; a pixel whose accumulated weight exceeds $\lambda$ times the total observation weight is regarded as a dynamic pixel.
s102: since feature points are easily generated at the edges of the object, but these edge points are extremely unstable, the mask generated by the refined key frame will be properly expanded so that the edge points do not participate in the camera pose estimation process.
The invention can be used as front-end preprocessing for an ORB-SLAM2/3 system to filter out dynamic regions, thereby reducing erroneous data associations. The system takes the RGB image and depth map of the current frame as input and generates a point cloud from them; once a frame has been processed it is temporarily stored in the system as a reference frame and is not processed a second time. K-Means clustering segments the depth map into sub-point clouds, and the sub-pose change between each sub-point cloud and the reference frame is then computed in turn. The invention further proposes the idea of dynamic/static correlation between sub-point clouds; combining this idea with the sub-poses makes it possible to identify which sub-point clouds are dynamic without relying on epipolar geometric constraints and without knowing the approximate camera pose in advance. The invention effectively reduces the influence of dynamic objects on robot localization and improves the robustness of the system and its generalization across scenes.
Compared with the prior art, the invention has the following beneficial effects:
the disclosed dynamic environment RGB-D visual SLAM method based on point cloud region correlation removes the limitation of conventional dynamic visual SLAM, which can only judge dynamic regions by taking the camera motion into account, and it requires no training samples or prior knowledge of dynamic objects. At the same time, it improves the positioning accuracy and robustness of visual SLAM in dynamic environments.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
fig. 2a to 2b are schematic diagrams of detection effects of a dynamic region according to an embodiment of the disclosure, where fig. 2a is an original image and fig. 2b is a schematic diagram of detection results of the dynamic region;
FIGS. 3 a-3 b are schematic diagrams of the present invention in comparison to a true trajectory and a trajectory of the ORB-SLAM2 algorithm on a TUM dataset, wherein FIG. 3a is a translational movement of the camera along the X-Y-Z axis, FIG. 3b is a movement of the camera along a hemisphere with a radius of 1 meter, a blue trajectory is a positioning result trajectory of the present invention, a dashed line is a movement trajectory of the true camera, and a green trajectory is a positioning result trajectory of the ORB-SLAM 2;
FIGS. 4a to 4c are diagrams illustrating the effect of key-frame refinement in an embodiment of the disclosure, wherein FIG. 4a is the sequence of historical observation data frames, FIG. 4b is the original key frame before refinement, and FIG. 4c is the key frame after refinement;
fig. 5a to 5b are schematic views of the dynamic region detection effect under an actual scene according to the disclosed embodiment of the invention, wherein fig. 5a is an original image, and fig. 5b is a schematic view of the dynamic region detection result.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings. It should be understood that the examples are presented for the purpose of illustrating the invention only and are not intended to limit the invention.
The flow of the dynamic environment RGB-D visual SLAM method based on point cloud region correlation disclosed by the invention is shown in FIG. 1 and comprises the following steps.
S1: performing two-dimensional plane downsampling on the RGB image and generating a point cloud by combining a depth map;
the point cloud is generated by the following specific steps:
S11: for the current frame, emulating the data-acquisition pattern of a rotating laser radar: the gray image is sampled every 4 pixels horizontally and vertically and combined with the depth image to generate a point cloud, thereby downsampling on the two-dimensional plane and accelerating construction of the KD-tree.
S12: the downsampled point cloud is rasterized with a grid of side length 0.05; for the i-th grid cell $G_i$, the mean gray value of all points inside the cell is computed and taken as the intensity value of the cell's center of gravity:

$$I(G_i)=\frac{1}{|G_i|}\sum_{(x,y)\in G_i} I(x,y)$$

where $|G_i|$ denotes the number of points in grid cell $G_i$ and $I(x,y)$ denotes the gray value of the pixel at coordinates $(x,y)$.
S2: dividing the point cloud generated in the step S1 into sub-point clouds;
the sub-point clouds are generated by the following specific steps:
S21: removing the depth-unstable region: according to the noise model of the depth camera and empirical values obtained in experiments, the depth value of any point in the depth image is set to 0 when it exceeds 6 meters.
S22: since the point cloud obtained from the raw depth of the RGB-D camera exhibits a layered structure and holes, the point cloud obtained in S1 is smoothed using moving least squares.
S23: and clustering the point clouds by using a K-Means clustering algorithm to obtain K=10 sub-point clouds.
S3: setting the 3 frames preceding the current frame as reference frames;
S4: an accurate transformation matrix is obtained by fusing gray-level-assisted nearest-neighbor iteration with conventional nearest-neighbor iteration: for each sub-point cloud of S2 and the target point cloud of the reference frame, 5 gray-level nearest-neighbor iterations are first performed to calculate a coarse transformation matrix and corresponding points, and 5 conventional nearest-neighbor iterations are then performed to calculate the accurate transformation matrix and register the point clouds;
the intensity-assisted nearest neighbor iterative registration method specifically comprises the following steps:
s41: estimating and predicting initial pose according to uniform motion model of ORBSLAM2
$\hat{T}_{\mathrm{init}}$, obtained by composing the pose-transformation increment $T_{\mathrm{pre}}$ between the last two ordinary frames with the camera pose $T_{\mathrm{last}}$ of the previous frame and the pose $T_{\mathrm{ref}}$ of the reference frame, all expressed in the world coordinate system. Providing the initial pose effectively prevents the iteration from falling into a locally optimal solution and accelerates the iteration.
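The sketch below shows one way the constant-velocity initial guess of S41 could be assembled from the quantities named above (T_pre, T_last, T_ref as 4×4 homogeneous matrices in world coordinates). The composition order is an assumption made for illustration; only the idea of predicting the current pose and expressing it relative to the reference frame is taken from the text.

```python
import numpy as np

def predict_initial_pose(T_pre, T_last, T_ref):
    """Constant-velocity guess: predict the current pose, then express it relative
    to the reference frame (composition order assumed)."""
    T_cur_pred = T_pre @ T_last                   # predicted current camera pose (4x4 SE(3))
    return np.linalg.inv(T_ref) @ T_cur_pred      # initial transform for registering to the reference frame
```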
S42: according to the initial pose
$\hat{T}_{\mathrm{init}}$, the $\alpha = 10$ points of the target point cloud closest to each point of the sub-point cloud are searched as candidate corresponding points.
S43: the candidate points are scored and the candidate with the highest score $S(i,k)$ is selected as the corresponding point, where $\Delta T$ denotes the incremental transformation matrix of each iteration, $r_G(\cdot)$ the geometric residual, $r_I(\cdot)$ the intensity residual, and $w(\cdot)$ the weight conversion function.
S431: wherein the weight transformation function of S43 is assumed herein to obey the t-distribution, which weight transformation function can be expressed as:
a robust weighting parameterized by the degree of freedom $\nu$ of the distribution (set to 5 in the experiments), the mean $\mu_{(*)}$ of the geometric or intensity residuals, and the corresponding variance $\sigma_{(*)}$.
S44: and S4, calculating the pairing points of the points, and calculating a gain matrix by using the pairing points to ensure that the total Euclidean distance sum is minimum:
$\Delta T=\arg\min_{\Delta T}\sum_{i}\lVert \Delta T\,p_i-q_i\rVert^{2}$, where $p_i$ and $q_i$ denote a pair of corresponding points.
S45: the transformation matrix is updated: $T^{*} \leftarrow \Delta T\, T^{*}$.
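For the minimization in S44–S45, the rigid increment that minimizes the summed squared Euclidean distance over matched pairs has the standard SVD (Kabsch) closed form sketched below, after which it is left-multiplied onto the running transform as in S45. Whether the patent uses exactly this closed form is not stated; the code is an illustrative assumption.

```python
import numpy as np

def best_rigid_increment(src, tgt):
    """Return the 4x4 transform dT minimizing sum ||dT @ src_i - tgt_i||^2 over matched pairs."""
    mu_s, mu_t = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - mu_s).T @ (tgt - mu_t)             # cross-covariance of the centred pairs
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                      # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    dT = np.eye(4)
    dT[:3, :3] = R
    dT[:3, 3] = mu_t - R @ mu_s
    return dT

# S45: T_star = best_rigid_increment(matched_src, matched_tgt) @ T_star
```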
S5: transforming the source point clouds according to the transformation matrix calculated in the step S4, and estimating the degree of correlation between every two sub point clouds;
the correlation degree calculation includes:
find the i-th point
of the point cloud obtained by transforming the source point cloud with the pose transformation matrix of the $j$-th sub-point cloud, and its closest point $P_{tgt}(j)$ in the target point cloud; the Euclidean distance between them is used as the main index and the difference of their intensity values as an additional penalty term, and the scores of all points are accumulated and averaged to give the correlation degree score.
S6: calculating a threshold divides all sub-point clouds into two groups: a dynamic correlation group and a static correlation group;
s61: a correlation degree score { R (j) } of the sub-point cloud calculated according to S5 j∈K Setting a threshold epsilon as a zone static sub-point cloud, the threshold can be determined by the following formula:
S62: according to the threshold, it can be effectively judged which sub-point clouds belong to the static point clouds: each sub-point cloud is assigned, by comparing its score with $\epsilon$, either to the static sub-point cloud set S or to the dynamic sub-point cloud set D, and all points of the dynamic sub-point cloud set are projected onto the camera plane to generate a mask.
S7: projecting the point cloud of the dynamic correlation group to a camera plane, drawing a circle by taking the point cloud as a center, and filling the point cloud by morphological geometry to finally form a required pixel mask;
s8: the feature points in the mask are not added into pose estimation of the camera;
s9: the processed frames are stored in a system and used as alternative frames updated by reference frames;
s10: and refining the mask generated by the key frames by using the historical observation data to generate more stable map points.
S101, when the key frame is generated, the first n=7 frames are taken as history observation data, not all history data, and the pixel weight is estimated by the following formula:
the weight accumulates the observations $I_i(x,y)$ of the pixel at coordinates $(x,y)$ over the $N$ frames, each weighted by the pixel weight proportion $\rho$; in the experiments the weight $\rho$ of a key frame is 3, the weight $\rho$ of an ordinary frame is 2, and $N$ is set to 7. A pixel whose accumulated weight exceeds $\lambda = 0.6$ times the total observation weight is regarded as a dynamic pixel.
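The weighted voting of S101 can be sketched as follows with the embodiment's values (N = 7, ρ = 3 for key frames, ρ = 2 for ordinary frames, λ = 0.6). The exact voting formula is not reproduced by the source, so this weighted-sum form is an assumption; the dilation of S102 would be applied to the returned mask afterwards.

```python
import numpy as np

def refine_keyframe_mask(history_masks, is_keyframe, lam=0.6):
    """history_masks: list of HxW boolean dynamic masks from the previous N = 7 frames."""
    weights = [3.0 if kf else 2.0 for kf in is_keyframe]   # rho = 3 for key frames, 2 for ordinary frames
    votes = sum(w * m.astype(np.float32) for w, m in zip(weights, history_masks))
    # a pixel stays dynamic only if its weighted vote exceeds lambda of the total weight
    return votes > lam * sum(weights)
```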
s102: since feature points are easily generated at the edges of the object, but these edge points are extremely unstable, the mask generated by the refined key frame will be properly expanded so that the edge points do not participate in the camera pose estimation process.
FIGS. 2a and 2b show dynamic-region identification for a pedestrian on the TUM dataset; the dynamic region is well identified by the invention and is little affected by camera rotation. FIGS. 3a and 3b compare the absolute positioning accuracy of the invention with the ground-truth trajectory and the trajectory of the ORB-SLAM2 algorithm: the trajectory of the invention closely fits the ground truth, whereas the trajectory of ORB-SLAM2 deviates severely, which demonstrates that the invention improves the positioning accuracy and robustness of a SLAM system in dynamic environments. FIGS. 4a to 4c visualize the key-frame refinement: the original key frame before refinement in FIG. 4b is filtered by the sequence of historical observation data frames in FIG. 4a, finally yielding the refined key frame in FIG. 4c. Since the mask area of the refined key frame is greatly reduced, the influence of false negatives is effectively reduced and the robustness of the system is improved. FIGS. 5a and 5b visualize dynamic-region identification in a real environment, which fully demonstrates the reliability and practicality of the invention.
It should be emphasized that the embodiments described herein are illustrative rather than limiting; the invention is not restricted to the specific embodiments described herein, and other embodiments derived by those skilled in the art from the teachings of the invention likewise fall within its scope of protection.

Claims (8)

1. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation is characterized by comprising the following steps:
S1: performing two-dimensional plane downsampling on the RGB image and generating a point cloud by combining a depth map;
S2: dividing the point cloud generated in the step S1 into sub-point clouds;
S3: setting the frames preceding the current frame as reference frames;
S4: obtaining an accurate transformation matrix by fusing gray-level-assisted nearest-neighbor iteration with conventional nearest-neighbor iteration, comprising: performing gray-level nearest-neighbor iterations on each sub-point cloud of S2 and the target point cloud of the reference frame to calculate a coarse transformation matrix and corresponding points, and then calculating an accurate transformation matrix by conventional nearest-neighbor iterations to register the point clouds;
S5: transforming the source point clouds according to the transformation matrix calculated in the step S4, and estimating the degree of correlation between every two sub-point clouds;
S6: calculating a threshold that divides all sub-point clouds into two groups: a dynamic correlation group and a static correlation group;
S7: projecting the points of the dynamic correlation group onto the camera plane, drawing a circle centered on each projected point, and filling the result with morphological operations to finally form the required pixel mask;
S8: the feature points in the mask are not added into the pose estimation of the camera;
S9: storing the processed frame in the system as a candidate frame for updating the reference frame;
S10: refining the mask generated for key frames by using the historical observation data to generate more stable map points.
2. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the point cloud generation in step S1 comprises:
S11: for the current frame, emulating the data-acquisition pattern of a rotating laser radar by downsampling the gray image on the two-dimensional plane and generating a point cloud, so as to accelerate construction of the KD-tree;
S12: rasterizing the downsampled point cloud with a grid of side length l, and for the i-th grid cell $G_i$ computing the mean gray value of all points inside the cell and taking it as the intensity value of the cell's center of gravity:

$$I(G_i)=\frac{1}{|G_i|}\sum_{(x,y)\in G_i} I(x,y)$$

where $|G_i|$ denotes the number of points in grid cell $G_i$ and $I(x,y)$ denotes the gray value of the pixel at coordinates $(x,y)$.
3. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the generating of the sub-point clouds in step S2 comprises:
S21: removing the depth-unstable region: according to the noise model of the depth camera and empirical values obtained in experiments, setting the depth value of any point in the depth image to 0 when it exceeds a given threshold;
S22: since the point cloud obtained from the raw depth of the RGB-D camera exhibits a layered structure and holes, smoothing the point cloud obtained in S1 using moving least squares;
S23: clustering the point cloud with a K-Means clustering algorithm to obtain K sub-point clouds.
4. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the gray-level-assisted nearest-neighbor iterative computation of the transformation matrix in step S4 comprises:
S41: estimating a predicted initial pose $\hat{T}_{\mathrm{init}}$ according to the uniform-motion model of ORB-SLAM2, obtained by composing the pose-transformation increment $T_{\mathrm{pre}}$ between the last two ordinary frames with the camera pose $T_{\mathrm{last}}$ of the previous frame and the pose $T_{\mathrm{ref}}$ of the reference frame, all expressed in the world coordinate system; providing the initial pose effectively prevents the iteration from falling into a locally optimal solution and accelerates the iteration;
S42: according to the initial pose $\hat{T}_{\mathrm{init}}$, searching for each point of the sub-point cloud the $\alpha$ points of the target point cloud closest to it as candidate corresponding points;
S43: scoring the candidate points and selecting the candidate with the highest score $S(i,k)$ as the corresponding point, wherein $\Delta T$ denotes the incremental transformation matrix of each iteration, $r_G(\cdot)$ the geometric residual, $r_I(\cdot)$ the intensity residual, and $w(\cdot)$ the weight conversion function;
S431: wherein the weight conversion function of S43 is assumed to obey a t-distribution and is parameterized by the degree of freedom $\nu$ of the distribution, the mean $\mu_{(*)}$ of the geometric or intensity residuals, and the corresponding variance $\sigma_{(*)}$;
S44: using the point pairings computed above, calculating the incremental matrix that minimizes the total sum of Euclidean distances between the paired points;
S45: updating the transformation matrix: $T^{*} \leftarrow \Delta T\, T^{*}$.
5. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the calculating of the degree of correlation between sub-point clouds in step S5 comprises:
finding, for the $i$-th point of the point cloud obtained by transforming the source point cloud with the pose transformation matrix of the $j$-th sub-point cloud, its closest point $P_{tgt}(j)$ in the target point cloud; using the Euclidean distance between them as the main index and the difference of their intensity values as an additional penalty term; and accumulating and averaging the scores of all points to obtain the correlation degree score.
6. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the calculating of the threshold that divides the dynamic and static correlation groups in step S6 comprises:
S61: from the correlation degree scores $\{R(j)\}_{j\in K}$ of the sub-point clouds calculated in S5, setting a threshold $\epsilon$ for separating the static sub-point clouds, the threshold being determined from the distribution of these scores;
S62: judging according to the threshold which sub-point clouds belong to the static point clouds: each sub-point cloud is assigned, by comparing its score with $\epsilon$, either to the static sub-point cloud set S or to the dynamic sub-point cloud set D, and all points belonging to the set D are projected onto the camera plane to generate a mask.
7. The method of claim 1, wherein in step S7, all the points of the dynamic correlation group are projected onto the camera plane, a circle is drawn centered on each projected point, and the result is then filled with morphological operations to finally form the required pixel mask.
8. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein step S10 comprises:
S101: when a key frame is generated, taking the previous N frames (rather than all historical data) as historical observation data and estimating a weight for each pixel, wherein $I_i(x,y)$ denotes the observation of the pixel at coordinates $(x,y)$ in the $i$-th data frame and $\rho$ is the weight proportion of that pixel; a pixel whose accumulated weight exceeds $\lambda$ times the total observation weight is regarded as a dynamic pixel;
S102: since feature points are easily generated at object edges, yet such edge points are unstable, the mask generated for the refined key frame is dilated so that edge points do not participate in the camera pose estimation.
CN202310176750.3A 2023-02-28 2023-02-28 Dynamic environment RGB-D vision SLAM method based on point cloud region correlation Pending CN116299525A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310176750.3A CN116299525A (en) 2023-02-28 2023-02-28 Dynamic environment RGB-D vision SLAM method based on point cloud region correlation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310176750.3A CN116299525A (en) 2023-02-28 2023-02-28 Dynamic environment RGB-D vision SLAM method based on point cloud region correlation

Publications (1)

Publication Number Publication Date
CN116299525A true CN116299525A (en) 2023-06-23

Family

ID=86802418

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310176750.3A Pending CN116299525A (en) 2023-02-28 2023-02-28 Dynamic environment RGB-D vision SLAM method based on point cloud region correlation

Country Status (1)

Country Link
CN (1) CN116299525A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117190900A (en) * 2023-11-07 2023-12-08 中铁八局集团第二工程有限公司 Tunnel surrounding rock deformation monitoring method
CN117190900B (en) * 2023-11-07 2024-01-02 中铁八局集团第二工程有限公司 Tunnel surrounding rock deformation monitoring method

Similar Documents

Publication Publication Date Title
CN111563442B (en) Slam method and system for fusing point cloud and camera image data based on laser radar
CN111798475B (en) Indoor environment 3D semantic map construction method based on point cloud deep learning
CN110097553B (en) Semantic mapping system based on instant positioning mapping and three-dimensional semantic segmentation
CN111325843B (en) Real-time semantic map construction method based on semantic inverse depth filtering
CN110232350B (en) Real-time water surface multi-moving-object detection and tracking method based on online learning
CN110363816B (en) Mobile robot environment semantic mapping method based on deep learning
CN112396595B (en) Semantic SLAM method based on point-line characteristics in dynamic environment
CN111899334A (en) Visual synchronous positioning and map building method and device based on point-line characteristics
CN113506318B (en) Three-dimensional target perception method under vehicle-mounted edge scene
CN113570629B (en) Semantic segmentation method and system for removing dynamic objects
CN113345008A (en) Laser radar dynamic obstacle detection method considering wheel type robot position and posture estimation
CN104715251A (en) Salient object detection method based on histogram linear fitting
CN111998862A (en) Dense binocular SLAM method based on BNN
CN111914832B (en) SLAM method of RGB-D camera under dynamic scene
CN114549549B (en) Dynamic target modeling tracking method based on instance segmentation in dynamic environment
CN113406659A (en) Mobile robot position re-identification method based on laser radar information
CN116299525A (en) Dynamic environment RGB-D vision SLAM method based on point cloud region correlation
Liu et al. KMOP-vSLAM: Dynamic visual SLAM for RGB-D cameras using K-means and OpenPose
CN113689459B (en) Real-time tracking and mapping method based on GMM and YOLO under dynamic environment
CN113487631A (en) Adjustable large-angle detection sensing and control method based on LEGO-LOAM
CN117496401A (en) Full-automatic identification and tracking method for oval target points of video measurement image sequences
CN116429126A (en) Quick repositioning method for ground vehicle point cloud map in dynamic environment
CN111239761B (en) Method for indoor real-time establishment of two-dimensional map
CN112446885A (en) SLAM method based on improved semantic optical flow method in dynamic environment
CN116363218B (en) Lightweight visual SLAM method suitable for dynamic environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination