CN116625380B - Path planning method and system based on machine learning and SLAM - Google Patents

Path planning method and system based on machine learning and SLAM

Info

Publication number
CN116625380B
Authority
CN
China
Prior art keywords
point
image
points
path
optical flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310921547.4A
Other languages
Chinese (zh)
Other versions
CN116625380A (en)
Inventor
陈洪佳
陈炜楠
管贻生
朱蕾
陈世浪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202310921547.4A priority Critical patent/CN116625380B/en
Publication of CN116625380A publication Critical patent/CN116625380A/en
Application granted granted Critical
Publication of CN116625380B publication Critical patent/CN116625380B/en


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/269 Analysis of motion using gradient-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a path planning method and system based on machine learning and SLAM, belonging to the technical field of unmanned aerial vehicles. The method comprises the following steps: performing optical flow estimation on path points in the acquired image by using FlowNetS; screening candidate path points according to the optical flow of the path points; judging whether a candidate path point meets the decision index, and if so, setting the candidate path point as a final path point, otherwise regenerating the candidate path point; and generating a feasible path according to the final path points. The invention solves the problem of the excessive number of key frames required by the visual SLAM direct method when predicting optical flow, improves image tracking performance by improving the accuracy of selecting candidate points from reference frames, improves the real-time performance of the system, and avoids the high communication requirements and large computational load of edge-cloud collaboration in traditional SLAM systems.

Description

Path planning method and system based on machine learning and SLAM
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to a path planning method and system based on machine learning and SLAM.
Background
Simultaneous localization and mapping (SLAM) is a challenging task whose goal is to build a map of an unknown environment while localizing the device within that map. This capability is an important component of many modern emerging technologies, such as indoor robots, outdoor autonomous vehicles, and drones. With the rapid development of computer vision and machine learning, optical flow estimation and visual SLAM systems have broad application prospects in computer vision, robotics, autonomous driving, and related fields. Optical flow estimation is a key technique for estimating pixel displacement between adjacent image frames, which helps recover the motion and depth information of objects in a scene. Visual SLAM is a key technology for constructing a map of an unknown environment and localizing within it, and is very important for applications such as robot navigation, three-dimensional reconstruction, and augmented reality.
One common SLAM approach is to align images directly: frame-to-frame tracking is accomplished by selecting a set of pixels and minimizing the photometric error between the corresponding points, which is known as the direct method. The performance of this method depends on the quality of the photometric information and the accuracy of the image matching. Direct and keyframe-based methods rely on optical flow for tracking and therefore impose severe constraints on frame-to-frame displacement. To maintain good tracking performance, the direct method generally requires more key frames, resulting in a larger map; maintaining such large maps is a major bottleneck in robotics projects and long-term collaborative applications. Existing work mainly focuses on redundancy detection for SLAM map sparsification, with little consideration of environmental structure. Other methods handle map compression by retaining data useful for place recognition and discarding information that does not affect SLAM performance, but the computation required for the pruning process is large, and it is desirable to select better data for preliminary processing. Still other work has explored global positioning systems and learned optical flow models, but such approaches may not be reliable when the robot moves rapidly. Furthermore, since conventional visual localization and mapping methods generally perform computation locally on the robot, the computing resources of the local device limit the performance and scalability of the system as the complexity and data volume of robot tasks increase. To address this problem, the concept of edge-cloud collaboration has been proposed, which aims to distribute computing tasks between edge devices and cloud resources so as to achieve collaborative and optimized computation. However, in visual localization and mapping, directly applying conventional edge-cloud collaboration methods faces several challenges. For example, conventional methods typically localize and map based on feature point matching and descriptors, but some direct methods (such as LDSO) have no explicit feature points or descriptors, so conventional edge-cloud collaboration methods cannot be directly applied.
Disclosure of Invention
The invention aims to provide a path planning method and system based on machine learning and SLAM, which solve the problem of the excessive number of key frames required by the visual SLAM direct method when predicting optical flow, improve image tracking performance by improving the accuracy of selecting candidate points from reference frames, improve the real-time performance of the system, and avoid the high communication requirements and large computational load of edge-cloud collaboration in traditional SLAM systems.
A machine learning and SLAM-based path planning method, comprising:
performing optical flow estimation on path points in the acquired image by using FlowNetS;
screening candidate path points according to the optical flow of the path points;
judging whether the candidate path point meets the decision index, if yes, setting the candidate path point as a final path point, and if not, regenerating the candidate path point;
and generating a feasible path according to the final path point.
Preferably, the method further comprises: when the edge computer cannot determine whether the candidate path points meet the decision index, the candidate path points are sent to the cloud computer;
determining, with the aid of the cloud computer, whether the candidate path points meet the decision index;
and the cloud computer sends the determination result back to the edge computer.
Preferably, the optical flow estimation of the path points in the acquired image using FlowNetS includes:
inputting the reference image and the target image into the FlowNetS, and estimating the positions of pixel points in the reference image and the target image by adopting a flow vector, wherein the formula is as follows:
$$p' = p + (u, v)$$

wherein p is the pixel position in the reference image, with corresponding coordinates (x, y), x being the abscissa of p and y being the ordinate of p; p' is the estimated position in the target image, with corresponding coordinates (x', y'), x' being the abscissa of p' and y' being the ordinate of p'; u represents the component of the optical flow vector in the horizontal direction, namely the displacement of the pixel point on the X axis; v represents the component of the optical flow vector in the vertical direction, namely the displacement of the pixel point on the Y axis;

calculating the photometric error for motion estimation using point-to-point projection, the photometric error of a point p in the reference frame $I_i$ observed at point p' in the target frame $I_j$ being defined as:

$$E_{pj} = \sum_{p \in N_p} w_p \left\| \left(I_j[p'] - b_j\right) - \frac{t_j e^{a_j}}{t_i e^{a_i}} \left(I_i[p] - b_i\right) \right\|$$

wherein $w_p$ represents the weight of the pixel point p, used to adjust the contribution of different pixel points to the error; $a_i$ represents the photometric affine transformation parameter of image $I_i$; $a_j$ represents the photometric affine transformation parameter of image $I_j$; $t_i$ represents the exposure time of image $I_i$; $t_j$ represents the exposure time of image $I_j$; $b_i$ represents the camera response parameter of image $I_i$; $b_j$ represents the camera response parameter of image $I_j$; $I_j[p']$ represents the pixel value at position p' in image $I_j$, computed through the optical flow field; $I_i[p]$ represents the pixel value of the pixel point p in image $I_i$; by minimizing the photometric error $E_{pj}$, an estimate of the optical flow and of the pixel position in the target image is obtained.
Preferably, minimizing the photometric error $E_{pj}$ comprises the following steps:
initializing the camera pose estimate by setting an initial camera pose estimation value: a rotation matrix R and a translation vector t;
according to the current camera pose estimation, calculating a jacobian matrix J of the photometric error on the camera pose, wherein the calculation formula of the jacobian matrix J is as follows:
$$J = \frac{\partial E}{\partial T}$$

wherein E represents the photometric error, T represents the camera pose, and $\partial E/\partial T$ represents the partial derivative of the photometric error E with respect to the camera pose T;
according to the jacobian matrix J and the luminosity error E, an increment equation is constructed, wherein the increment equation is in the form of:
$$J^{\top} J \, \Delta T = -J^{\top} E$$

wherein $J^{\top}$ represents the transpose of the Jacobian matrix J, and ΔT represents the increment of the camera pose;
solving an increment equation by using a numerical optimization method to obtain the increment of the pose of the camera
applying the increment ΔT to the current camera pose estimate and updating the camera pose: T ← T ⊕ ΔT;
judging whether the change of the pose of the camera is smaller than a set threshold value, and stopping iteration if the convergence condition is met; otherwise, continuing iteration until convergence.
Preferably, screening the candidate path points according to the optical flow of the path points includes:
the LDSO is used to screen candidate waypoints by image gradients.
Preferably, determining whether the candidate path point satisfies the decision index, if so, setting the candidate path point as a final path point, and if not, regenerating the candidate path point includes:
introducing a decision index α, wherein 1 < α < 2; if the photometric error of the current frame exceeds α times that of the previous frame, that is, if $E_{cur} > \alpha \cdot E_{pre}$,

image alignment fails and the image whose alignment failed is processed by FlowNetS, wherein $E_{cur}$ represents the photometric error of the current frame and $E_{pre}$ represents the photometric error of the previous frame;

if $E_{cur} \leq \alpha \cdot E_{pre}$,
The image is directly aligned using a point-to-point projection method by a set of selected pixels and minimizing photometric errors between the pixels.
Preferably, the determining whether the candidate path point meets the decision index with the aid of the cloud computer comprises:
by means of the decision index α, evaluating the coarse tracking performance of the edge, and if image alignment fails, starting the cloud computer to perform tracking alignment of the images, while the edge performs tracking and mapping of the real-time images in parallel;
dividing sub-pictures based on camera views, wherein frames of each camera view are divided into one sub-picture;
constructing data association of a sub-graph constructed by a cloud and a sub-graph constructed by an edge, realizing sub-graph alignment and realizing global tracking modeling;
after receiving data from the edge computer, the cloud computer parallelly executes FlowNetS optical flow estimation and descriptor extraction, and returns the optimized pose to the edge computer for edge cloud fusion;
and after receiving the candidate path point data transmitted by the edge end computer, the cloud end computer performs feature matching with the pixel points obtained by optical flow estimation of the cloud end computer, and if the matching is successful, the final path point obtained after the matching is compressed and then transmitted to the edge end computer.
A machine learning and SLAM based path planning system, comprising:
the path point processing module is used for carrying out optical flow estimation on the path points in the acquired images by using the FlowNetS;
the path point screening module is used for screening candidate path points according to the optical flows of the path points;
the judging module is used for judging whether the candidate path points meet the decision index, if so, setting the candidate path points as final path points, and if not, regenerating the candidate path points;
and the path generation module is used for generating a feasible path according to the final path point.
The FlowNetS-based optical flow prediction method provided by the invention can obtain more accurate optical flow and pixel position estimates, so the number of selected key frames can be reduced while the tracking performance of the system is maintained. The point selection method avoids selecting candidate points that do not match the reference frame, ensuring better tracking between frames and improving the tracking performance of the system. The established decision index α determines when FlowNetS is used for optical flow estimation, which reduces time loss and ensures the real-time performance of the system. The edge-cloud collaborative VSLAM method, which uses sub-graphs as the unit of computing task allocation and tracking performance as the trigger, effectively overcomes the high communication requirements of existing edge-cloud collaborative VSLAM methods and enables edge-cloud collaboration under low bandwidth.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of optical flow estimation based on FlowNetS according to the present invention;
FIG. 3 is a decision index and decision flow chart of the present invention;
FIG. 4 is a diagram of an edge cloud collaboration system framework in accordance with the present invention;
FIG. 5 is a flow chart of the sub-graph division method of the present invention;
FIG. 6 is a flow chart of the local descriptor construction of the present invention;
FIG. 7 is a flow chart of global descriptor construction in accordance with the present invention;
FIG. 8 is a graph comparing experimental results of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that all directional indicators (such as up, down, left, right, front, and rear … …) in the embodiments of the present invention are merely used to explain the relative positional relationship, movement, etc. between the components in a particular posture (as shown in the drawings), and if the particular posture is changed, the directional indicator is changed accordingly.
Furthermore, the description of "first," "second," etc. in this disclosure is for descriptive purposes only and is not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but it is necessary to base that the technical solutions can be realized by those skilled in the art, and when the technical solutions are contradictory or cannot be realized, the combination of the technical solutions should be considered to be absent and not within the scope of protection claimed in the present invention.
Existing work mainly focuses on redundancy detection for SLAM map sparsification, with little consideration of environmental structure. Other methods handle map compression by retaining data useful for place recognition and discarding information that does not affect SLAM performance, but the computation required for the pruning process is large, and it is desirable to select better data for preliminary processing. Still other work has explored global positioning systems and learned optical flow models, but such approaches may not be reliable when the robot moves rapidly. To solve the above problems, the key of the present invention is to reduce the number of selected key frames.
Further, since conventional visual localization and mapping methods generally perform computation locally on the robot, the computing resources of the local device limit the performance and scalability of the system as the complexity and data volume of robot tasks increase. To address this problem, the concept of edge-cloud collaboration has been proposed, which aims to distribute computing tasks between edge devices and cloud resources so as to achieve collaborative and optimized computation. However, in visual localization and mapping, directly applying conventional edge-cloud collaboration methods faces several challenges. For example, conventional methods typically localize and map based on feature point matching and descriptors, but some direct methods (such as LDSO) have no explicit feature points or descriptors, so conventional edge-cloud collaboration methods cannot be directly applied. Therefore, the invention provides an edge-cloud collaborative SLAM method and system based on the direct method, which aims to improve the performance, real-time capability and scalability of direct-method SLAM systems and enable a robot to perform well in more complex environments. The FlowNetS-based optical flow prediction method provided by the invention can obtain more accurate optical flow and pixel position estimates, so the number of selected key frames can be reduced while the tracking performance of the system is maintained. The point selection method avoids selecting candidate points that do not match the reference frame, ensuring better tracking between frames and improving the tracking performance of the system. The established decision index α determines when FlowNetS is used for optical flow estimation, which reduces time loss and ensures the real-time performance of the system. The edge-cloud collaborative VSLAM method, which uses sub-graphs as the unit of computing task allocation and tracking performance as the trigger, effectively overcomes the high communication requirements of existing edge-cloud collaborative VSLAM methods and enables edge-cloud collaboration under low bandwidth.
Example 1
A machine learning and SLAM-based path planning method, referring to fig. 1, comprising:
s100, performing optical flow estimation on path points in the acquired image by using FlowNetS;
s200, screening candidate path points according to the optical flows of the path points;
s300, judging whether the candidate path points meet decision indexes, if so, setting the candidate path points as final path points, and if not, regenerating the candidate path points;
s400, generating a feasible path according to the final path point.
Preferably, the method further comprises: when the edge computer cannot determine whether the candidate path points meet the decision index, the candidate path points are sent to the cloud computer;
determining, with the aid of the cloud computer, whether the candidate path points meet the decision index;
and the cloud computer sends the determination result back to the edge computer.
Preferably, the optical flow estimation of the path points in the acquired image using FlowNetS comprises:
inputting the reference image and the target image into the FlowNetS, and estimating the positions of pixel points in the reference image and the target image by adopting a flow vector, wherein the formula is as follows:
$$p' = p + (u, v)$$

wherein p is the pixel position in the reference image, with corresponding coordinates (x, y), x being the abscissa of p and y being the ordinate of p; p' is the estimated position in the target image, with corresponding coordinates (x', y'), x' being the abscissa of p' and y' being the ordinate of p'; u represents the component of the optical flow vector in the horizontal direction, namely the displacement of the pixel point on the X axis; v represents the component of the optical flow vector in the vertical direction, namely the displacement of the pixel point on the Y axis;

calculating the photometric error for motion estimation using point-to-point projection, the photometric error of a point p in the reference frame $I_i$ observed at point p' in the target frame $I_j$ being defined as:

$$E_{pj} = \sum_{p \in N_p} w_p \left\| \left(I_j[p'] - b_j\right) - \frac{t_j e^{a_j}}{t_i e^{a_i}} \left(I_i[p] - b_i\right) \right\|$$

wherein $w_p$ represents the weight of the pixel point p, used to adjust the contribution of different pixel points to the error; $a_i$ represents the photometric affine transformation parameter of image $I_i$; $a_j$ represents the photometric affine transformation parameter of image $I_j$; $t_i$ represents the exposure time of image $I_i$; $t_j$ represents the exposure time of image $I_j$; $b_i$ represents the camera response parameter of image $I_i$; $b_j$ represents the camera response parameter of image $I_j$; $I_j[p']$ represents the pixel value at position p' in image $I_j$, computed through the optical flow field; $I_i[p]$ represents the pixel value of the pixel point p in image $I_i$; by minimizing the photometric error $E_{pj}$, an estimate of the optical flow and of the pixel position in the target image is obtained.
For optical flow estimation, FlowNetS and pre-trained weights are used. FlowNetS is a generic network consisting only of convolutional layers. This model is advantageous because it is small and runs much faster than other state-of-the-art algorithms while achieving comparable performance. FlowNetS also generalizes better to natural scenes and performs well on datasets with large displacements (e.g., KITTI).
To predict optical flow, FlowNetS takes as input a stacked image pair, which in this embodiment consists of the reference image (i.e., the most recently selected key frame) and the target image (i.e., the current image). The output of the model is a three-channel image in which the first two channels describe the horizontal and vertical pixel motion. The flow vector can be used to estimate the location of a pixel point as follows:
$$p' = p + (u, v), \quad \text{i.e.} \quad (x', y') = (x + u,\; y + v)$$

wherein:

p = (x, y) is the pixel position in the reference image (history frame);

p' = (x', y') is the estimated position in the target image (new frame);

u represents the component of the optical flow vector in the horizontal direction, namely the displacement of the pixel point on the X axis;

v represents the component of the optical flow vector in the vertical direction, namely the displacement of the pixel point on the Y axis.
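As an illustrative sketch of this warping step (the helper name and array layout are assumptions, not part of the original disclosure; the flow field is taken to be a two-channel array at the input resolution), reference-frame pixel positions can be propagated into the target frame as follows:

```python
import numpy as np

def propagate_points(points, flow):
    """Estimate target-frame positions p' = p + (u, v) for reference-frame pixels.

    points : (N, 2) array of (x, y) pixel coordinates in the reference image.
    flow   : (H, W, 2) optical flow field, where flow[y, x, 0] = u (horizontal
             displacement) and flow[y, x, 1] = v (vertical displacement).
    Returns an (N, 2) array of estimated (x', y') positions in the target image.
    """
    xs = points[:, 0].astype(int)
    ys = points[:, 1].astype(int)
    u = flow[ys, xs, 0]
    v = flow[ys, xs, 1]
    return np.stack([points[:, 0] + u, points[:, 1] + v], axis=1)
```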
In LDSO, a direct point-to-point projection method is used to calculate the photometric error for motion estimation. The photometric error of a point p in the reference frame $I_i$, observed at point p' in the target frame $I_j$, is defined as

$$E_{pj} = \sum_{p \in N_p} w_p \left\| \left(I_j[p'] - b_j\right) - \frac{t_j e^{a_j}}{t_i e^{a_i}} \left(I_i[p] - b_i\right) \right\|$$

wherein:

$w_p$ represents the weight of the pixel point p, used to adjust the contribution of different pixel points to the error;

$a_i$ represents the photometric affine transformation parameter of image $I_i$;

$a_j$ represents the photometric affine transformation parameter of image $I_j$;

$t_i$ represents the exposure time of image $I_i$;

$t_j$ represents the exposure time of image $I_j$;

$b_i$ represents the camera response parameter of image $I_i$;

$b_j$ represents the camera response parameter of image $I_j$;

$I_j[p']$ represents the pixel value at position p' in image $I_j$, where p' is computed through the optical flow field;

$I_i[p]$ represents the pixel value of the pixel point p in image $I_i$.

By minimizing the photometric error $E_{pj}$, more accurate estimates of the optical flow and pixel positions are obtained, and points between frames can be better tracked. Fewer key frames are needed for state estimation because the connectivity between frames is better estimated. In this way, the size of the map can be reduced.
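The following sketch evaluates this residual for a single point, per the formula as reconstructed above (nearest-neighbour pixel lookup instead of sub-pixel interpolation, no robust norm; the function name and argument layout are illustrative assumptions):

```python
import numpy as np

def photometric_residual(I_i, I_j, p, p_prime, a_i, b_i, a_j, b_j, t_i, t_j, w_p=1.0):
    """Photometric error of reference-frame pixel p observed at p' in the target frame.

    I_i, I_j  : grayscale images (H, W) of the reference and target frames.
    p, p_prime: (x, y) position in I_i and its flow-warped position in I_j.
    a_*, b_*  : per-image photometric affine / camera response parameters.
    t_*       : per-image exposure times.
    w_p       : weight controlling the pixel's contribution to the total error.
    """
    xi, yi = int(round(p[0])), int(round(p[1]))
    xj, yj = int(round(p_prime[0])), int(round(p_prime[1]))
    # Brightness transfer between the two frames (exposure + affine correction).
    gain = (t_j * np.exp(a_j)) / (t_i * np.exp(a_i))
    residual = (float(I_j[yj, xj]) - b_j) - gain * (float(I_i[yi, xi]) - b_i)
    return w_p * abs(residual)
```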
Preferably, minimizing the photometric error $E_{pj}$ comprises the following steps:
initializing the camera pose estimate by setting an initial camera pose estimation value: a rotation matrix R and a translation vector t;
according to the current camera pose estimation, calculating a jacobian matrix J of the photometric error on the camera pose, wherein the calculation formula of the jacobian matrix J is as follows:
$$J = \frac{\partial E}{\partial T}$$

wherein E represents the photometric error, T represents the camera pose, and $\partial E/\partial T$ represents the partial derivative of the photometric error E with respect to the camera pose T;
according to the jacobian matrix J and the luminosity error E, an increment equation is constructed, wherein the increment equation is in the form of:
$$J^{\top} J \, \Delta T = -J^{\top} E$$

wherein $J^{\top}$ represents the transpose of the Jacobian matrix J, and ΔT represents the increment of the camera pose;
solving an increment equation by using a numerical optimization method to obtain the increment of the pose of the camera
applying the increment ΔT to the current camera pose estimate and updating the camera pose: T ← T ⊕ ΔT;
judging whether the change of the pose of the camera is smaller than a set threshold value, and stopping iteration if the convergence condition is met; otherwise, continuing iteration until convergence.
The steps for minimizing photometric errors are as follows:
initializing camera pose estimation:
an initial camera pose estimation value is set, including a rotation matrix R and a translation vector t.
Iterative optimization process: repeating the following steps until convergence:
jacobian matrix (Jacobian) calculation:
and calculating a jacobian matrix J of the photometric error on the camera pose according to the current camera pose estimation. The jacobian matrix describes the derivative of the photometric error with respect to the pose of the camera. The jacobian matrix J has the following formula:
$$J = \frac{\partial E}{\partial T}$$

wherein E represents the photometric error, T represents the camera pose, and $\partial E/\partial T$ represents the partial derivative of the photometric error E with respect to the camera pose T.
And (3) constructing an increment equation:
and constructing an incremental equation according to the jacobian matrix J and the luminosity error E. The form of the delta equation is:
$$J^{\top} J \, \Delta T = -J^{\top} E$$

wherein $J^{\top}$ represents the transpose of the Jacobian matrix J, and ΔT represents the increment of the camera pose.
Solving an increment equation:
solving the incremental equation using a numerical optimization method (e.g., Gauss-Newton) to obtain the camera pose increment ΔT.
Updating the camera pose estimation:
increment is increasedThe camera pose estimation method is applied to current camera pose estimation and updates the camera pose:
checking convergence conditions:
checking whether the change of the pose of the camera is smaller than a set threshold value, and stopping iteration if the convergence condition is met; otherwise, continuing the iteration.
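The iteration above can be summarized by the following Gauss-Newton sketch (the residual and Jacobian callables, the additive pose update, and the 6-DoF vector parameterization are simplifying assumptions; a full implementation would update the pose on SE(3)):

```python
import numpy as np

def gauss_newton_pose(residual_fn, jacobian_fn, T0, max_iters=20, tol=1e-6):
    """Minimize the photometric error over the camera pose with Gauss-Newton.

    residual_fn(T) -> (M,)   stacked photometric residuals E at pose T.
    jacobian_fn(T) -> (M, 6) Jacobian J = dE/dT for a 6-DoF pose.
    T0             -> initial pose vector (rotation + translation parameters).
    """
    T = np.asarray(T0, dtype=float)
    for _ in range(max_iters):
        E = residual_fn(T)
        J = jacobian_fn(T)
        # Normal (increment) equation: J^T J * dT = -J^T E
        dT = np.linalg.solve(J.T @ J, -(J.T @ E))
        T = T + dT                          # apply the increment to the pose
        if np.linalg.norm(dT) < tol:        # pose change below threshold -> converged
            break
    return T
```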
Preferably, screening the candidate path points according to the optical flow of the path points includes:
the LDSO is used to screen candidate waypoints by image gradients.
LDSO selects candidate points by image gradients in order to track the surrounding environment. Based on this approach, the present invention designs a point selection method that uses a machine learning model to predict a per-pixel optical flow map; a point is selected only when it has significant optical flow predicted by the model (e.g., certain optical flow smoothness is satisfied). Specifically, a candidate point is finally accepted only when the preliminary candidate point selected by LDSO and the pixel point estimated by FlowNetSimple meet the decision index; otherwise, the candidate point is rejected and a new preliminary candidate point is reselected.
Unlike the direct method, which processes pixels of the entire image, the FlowNetSimple does not necessarily match all pixels of the target frame to the reference frame. Through the above procedure, selection of candidate points that do not match the reference frame can be avoided to ensure better tracking between frames.
By the candidate point selection method, candidate points which have significant optical flows and are matched with the reference frame can be screened out, and nonsensical or unmatched candidate points are removed. This reduces the redundant information in the key frames, leaving only frames with significant motion and features as key frames, thereby reducing the number of key frames.
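A minimal sketch of this joint selection rule (the gradient pre-selection is a simplified stand-in for LDSO's point sampler, and both thresholds are illustrative):

```python
import numpy as np

def select_candidate_points(image, flow, grad_thresh=30.0, flow_thresh=1.0):
    """Keep points with both a strong image gradient (LDSO-style pre-selection)
    and a significant predicted optical flow (FlowNetS-based confirmation).

    image: (H, W) grayscale reference image.
    flow : (H, W, 2) optical flow predicted between the reference and target images.
    Returns an (N, 2) array of (x, y) candidate point coordinates.
    """
    gy, gx = np.gradient(image.astype(float))
    grad_mag = np.hypot(gx, gy)                   # image gradient magnitude
    flow_mag = np.linalg.norm(flow, axis=2)       # predicted flow magnitude
    ys, xs = np.where(grad_mag > grad_thresh)     # preliminary candidates (gradient)
    keep = flow_mag[ys, xs] > flow_thresh         # confirm with the learned flow
    return np.stack([xs[keep], ys[keep]], axis=1)
```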
Preferably, as shown in fig. 3, determining whether the candidate path point satisfies the decision index, if so, setting the candidate path point as a final path point, and if not, regenerating the candidate path point includes:
introducing a decision index α, wherein 1 < α < 2; if the photometric error of the current frame exceeds α times that of the previous frame, that is, if $E_{cur} > \alpha \cdot E_{pre}$,

image alignment fails and the image whose alignment failed is processed by FlowNetS, wherein $E_{cur}$ represents the photometric error of the current frame and $E_{pre}$ represents the photometric error of the previous frame;

if $E_{cur} \leq \alpha \cdot E_{pre}$,
The image is directly aligned using a point-to-point projection method by a set of selected pixels and minimizing photometric errors between the pixels.
Running the machine learning model on all pairs of frames in a sequence can greatly increase time consumption. To implement a real-time system, the model of the method is only used to process pairs of frames with poor direct image registration.
LDSO assumes that alignment fails if the residual (photometric error) of the current frame exceeds twice that of the previous frame, which means that the motion between the current frame and the previous frame is large or there is a large scene change. In this case, the intensity variation between pixels is also large and the accuracy of the direct method may be affected, whereas FlowNetSimple can better handle optical flow estimation under large motion or scene changes and provide more accurate results. The invention therefore aims to support the tracking process with FlowNetSimple when the alignment performance is poor, introducing a decision index α, where 1 < α < 2 can be adjusted according to actual needs so that the system outperforms traditional methods such as the plain direct method. If the residual (photometric error) of the current frame exceeds α times that of the previous frame, i.e., if

$$E_{cur} > \alpha \cdot E_{pre}$$

the image alignment fails, wherein $E_{cur}$ represents the photometric error (residual) of the current frame and $E_{pre}$ represents the photometric error of the previous frame, both of which can be obtained from the photometric error formula above.
At this time, the FlowNetS is started to perform optical flow estimation of the next frame, otherwise, a direct method is still used, so that the robustness of the system can be improved, and unnecessary time consumption is reduced.
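The switching rule can be written in a few lines (the function name and the default α value of 1.17, taken from the experimental section, are illustrative):

```python
def should_use_flownets(err_current, err_previous, alpha=1.17):
    """Return True when direct image alignment is considered failed, i.e. the
    photometric error grew by more than a factor of alpha, in which case
    FlowNetS is started for the next frame; otherwise the plain direct
    method is kept."""
    return err_current > alpha * err_previous

# Example: the error grows from 0.8 to 1.1 -> 1.1 > 1.17 * 0.8, so FlowNetS is used.
```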
Preferably, the determining whether the candidate path point meets the decision index with the aid of the cloud computer comprises:
by means of the decision index α, evaluating the rough tracking performance of the edge, and if the image alignment fails, starting the cloud computer to perform image tracking alignment, and simultaneously executing tracking and mapping of real-time images by the edge in parallel;
dividing sub-pictures based on camera views, wherein frames of each camera view are divided into one sub-picture;
constructing data association of a sub-graph constructed by a cloud and a sub-graph constructed by an edge, realizing sub-graph alignment and realizing global tracking modeling;
after receiving data from the edge computers, the cloud computer executes FlowNetS optical flow estimation and descriptor extraction in parallel as shown in FIG. 3, and returns the optimized pose to the edge computers for edge cloud fusion;
and after receiving the candidate path point data transmitted by the edge end computer, the cloud end computer performs feature matching with the pixel points obtained by optical flow estimation of the cloud end computer, and if the matching is successful, the final path point obtained after the matching is compressed and then transmitted to the edge end computer.
It should be noted that, in order to cope with more complex environments, the present invention also proposes an edge cloud collaboration method based on the above method, as shown in fig. 4.
Next, a specific implementation method of the edge cloud collaboration system is described with reference to the accompanying drawings.
Data acquisition. A camera is installed on the mobile robot, and direct-method SLAM such as LDSO is run on the current frame captured by the camera and the historical frames stored on the edge device.
Preliminary candidate point selection. The candidate points selected by LDSO are taken as preliminary candidate points, compressed, and sent to the cloud device.
Edge tracking performance evaluation. The coarse tracking performance of the edge is evaluated using the decision index α described above. If $E_{cur} \leq \alpha \cdot E_{pre}$, the tracking and image alignment performance of the edge meets the requirement and cloud computing does not need to be started; if the computing capacity of the edge is insufficient to process the current image and edge tracking fails, the cloud GPU is started to carry out the tracking. Meanwhile, the edge performs tracking and mapping of the real-time image in parallel, ensuring the real-time performance of the system.
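An illustrative control-flow sketch of this trigger (the edge tracker and cloud client interfaces are hypothetical placeholders; only the decision logic follows the description above):

```python
def process_frame(frame, edge_tracker, cloud_client, alpha=1.17):
    """Edge-first tracking with cloud offloading triggered by the decision index."""
    result = edge_tracker.track(frame)                 # coarse tracking on the edge
    if result.error <= alpha * edge_tracker.prev_error:
        return result.pose                             # edge alignment is good enough
    # Edge tracking failed: compress the candidate points and offload to the cloud,
    # while the edge keeps tracking/mapping the live stream in parallel.
    payload = edge_tracker.compress_candidates(result.candidates)
    optimized_pose = cloud_client.track_and_optimize(frame, payload)
    return edge_tracker.fuse(optimized_pose)           # edge-cloud fusion of the pose
```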
Sub-graph division. The division is based on camera views, wherein the frames of each camera view are partitioned into one sub-graph. The specific partitioning method is shown in FIG. 5.
local descriptors and global descriptors are constructed. Since the direct method has no descriptor, it takes time to construct a descriptor for each frame. Thus, only the local descriptor and the global descriptor are constructed for the image uploaded from the edge to the cloud. Therefore, the data association of the sub-graph constructed by the cloud and the sub-graph constructed by the side can be constructed, the alignment of the sub-graphs is realized, and the global tracking modeling is realized. The construction method of the local descriptor and the global descriptor is as shown in fig. 6 and 7:
and after the cloud end receives the data from the edge, performing the flowets optical flow estimation and descriptor extraction in parallel. And finally, returning the optimized pose to the edge end to perform edge cloud fusion.
And after receiving the LDSO candidate point data transmitted by the side end, the cloud end performs feature matching with the pixel points obtained by optical flow estimation in the previous step, and if the matching is successful, the final candidate point data obtained after the matching is successful is compressed and then is transmitted to the side end.
In the experiments, four data sets are selected and each sequence is run 5 times to obtain statistical conclusions, so as to account for the non-deterministic behavior of the learning model and of the SLAM system.
Tables three, four and five show the results of adjusting and selecting the decision index α. Table six shows the percentage of frames using the model for each test sequence. In addition to evaluating the accuracy of the system, the trade-off between the number of key frames and the time consumption is considered. In detail, using FlowNetS can greatly reduce the number of key frames; however, running the model on every processed frame is time-consuming. Thus, all evaluation metrics (i.e., APE RMSE, number of key frames, and time consumption) were examined to select the most representative value. To improve the readability of the results, only the two values with the most prominent performance in this section are presented (α = 1.17 and α = 1.2).
As shown in table three, the present invention works better than the original LDSO system in 9 out of 10 KITTI sequences. In particular, α=1.17 achieved lower errors in 5 of them.
It is also worth noting that for some sequences, running the model on all frames reduces system accuracy, whereas using the model selectively (i.e., with an appropriate α) improves accuracy. For example, for sequence 07, setting α to 0 (running the model on every frame) produces an error almost 4 times that of the original system. When the learning model is used for 32.15% of the frames (α = 1.17), the RMSE is reduced by more than half.
As shown in table four, the number of selected key frames is reduced by means of the learned optical flow while maintaining comparable accuracy. Especially for sequence 08, when α is set to 0, the number of selected key frames is reduced by about 70% relative to the original system, and the accuracy is improved by 12.2%. However, as shown in table five, the run time increased by 41.31%. Comparing tables four, six and five, the more frames the model is used on, the fewer key frames are selected. This behavior is expected, because one of the key frame selection criteria is the average optical flow. Since the learned optical flow is very dense, the flow is averaged over a large set of pixels, which results in a smaller average than in the original method. In this way, the number of selected key frames is reduced without negatively impacting performance. Although the time consumption increases, since not all frames require the learning method, the added time can be reduced by using the model more judiciously.
In Table seven, the present invention is compared with ORB-SLAM3. α = 1.17 is selected, as it not only demonstrates the good performance of the present invention but also yields a reasonable number of selected key frames and time consumption. The invention outperforms LDSO and ORB-SLAM3 in 4 out of 10 sequences. Some trajectory plots are illustrated in FIG. 8; the trajectory estimated with the present invention is closer to the ground truth than that of LDSO.
Example 2
A machine learning and SLAM based path planning system, comprising:
the path point processing module is used for carrying out optical flow estimation on the path points in the acquired images by using the FlowNetS;
the path point screening module is used for screening candidate path points according to the optical flows of the path points;
the judging module is used for judging whether the candidate path points meet the decision index, if so, setting the candidate path points as final path points, and if not, regenerating the candidate path points;
and the path generation module is used for generating a feasible path according to the final path point.
The FlowNetS-based optical flow prediction method provided by the invention can obtain more accurate optical flow and pixel position estimates, so the number of selected key frames can be reduced while the tracking performance of the system is maintained. The point selection method avoids selecting candidate points that do not match the reference frame, ensuring better tracking between frames and improving the tracking performance of the system. The established decision index α determines when FlowNetS is used for optical flow estimation, which reduces time loss and ensures the real-time performance of the system. The edge-cloud collaborative VSLAM method, which uses sub-graphs as the unit of computing task allocation and tracking performance as the trigger, effectively overcomes the high communication requirements of existing edge-cloud collaborative VSLAM methods and enables edge-cloud collaboration under low bandwidth.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. A machine learning and SLAM-based path planning method, comprising:
performing optical flow estimation on path points in the acquired image by using FlowNetS;
screening candidate path points according to the optical flow of the path points;
judging whether the candidate path point meets the decision index, if yes, setting the candidate path point as a final path point, and if not, regenerating the candidate path point;
when the side end computer cannot judge whether the candidate path points meet the decision index, the candidate path points are sent to the cloud end computer;
judging whether the candidate path points meet decision indexes or not by adopting cloud computer assistance;
the cloud computer sends the judged result to the side computer;
judging whether the candidate path point meets the decision index, if yes, setting the candidate path point as a final path point, and if not, regenerating the candidate path point comprises the following steps:
introducing a decision index α, wherein 1 < α < 2; if the photometric error of the current frame exceeds α times that of the previous frame, that is, if $E_{cur} > \alpha \cdot E_{pre}$,

image alignment fails and the image whose alignment failed is processed by FlowNetS, wherein $E_{cur}$ represents the photometric error of the current frame and $E_{pre}$ represents the photometric error of the previous frame;

if $E_{cur} \leq \alpha \cdot E_{pre}$,
Directly aligning the image by a point-to-point projection method through a group of selected pixel points and minimizing luminosity errors among the pixel points;
the step of adopting the cloud computer to assist in judging whether the candidate path point meets the decision index comprises the following steps:
by means of the decision index α, evaluating the rough tracking performance of the edge, and if the image alignment fails, starting the cloud computer to perform image tracking alignment, and simultaneously executing tracking and mapping of real-time images by the edge in parallel;
dividing sub-pictures based on camera views, wherein frames of each camera view are divided into one sub-picture;
constructing data association of a sub-graph constructed by a cloud and a sub-graph constructed by an edge, realizing sub-graph alignment and realizing global tracking modeling;
after receiving data from the edge computer, the cloud computer parallelly executes FlowNetS optical flow estimation and descriptor extraction, and returns the optimized pose to the edge computer for edge cloud fusion;
the cloud computer receives the candidate path point data transmitted by the edge computer, performs feature matching with the pixel points obtained by optical flow estimation of the cloud computer, and if the matching is successful, compresses the final path point obtained after the matching is successful and sends the compressed final path point to the edge computer;
and generating a feasible path according to the final path point.
2. The machine learning and SLAM-based path planning method of claim 1, wherein using FlowNetS to perform optical flow estimation on the path points in the acquired image comprises:
inputting the reference image and the target image into the FlowNetS, and estimating the positions of pixel points in the reference image and the target image by adopting a flow vector, wherein the formula is as follows:
$$p' = p + (u, v)$$

wherein p is the pixel position in the reference image, with corresponding coordinates (x, y), x being the abscissa of p and y being the ordinate of p; p' is the estimated position in the target image, with corresponding coordinates (x', y'), x' being the abscissa of p' and y' being the ordinate of p'; u represents the component of the optical flow vector in the horizontal direction, namely the displacement of the pixel point on the X axis; v represents the component of the optical flow vector in the vertical direction, namely the displacement of the pixel point on the Y axis;

calculating the photometric error for motion estimation using point-to-point projection, the photometric error of a point p in the reference frame $I_i$ observed at point p' in the target frame $I_j$ being defined as:

$$E_{pj} = \sum_{p \in N_p} w_p \left\| \left(I_j[p'] - b_j\right) - \frac{t_j e^{a_j}}{t_i e^{a_i}} \left(I_i[p] - b_i\right) \right\|$$

wherein $w_p$ represents the weight of the pixel point p, used to adjust the contribution of different pixel points to the error; $a_i$ represents the photometric affine transformation parameter of image $I_i$; $a_j$ represents the photometric affine transformation parameter of image $I_j$; $t_i$ represents the exposure time of image $I_i$; $t_j$ represents the exposure time of image $I_j$; $b_i$ represents the camera response parameter of image $I_i$; $b_j$ represents the camera response parameter of image $I_j$; $I_j[p']$ represents the pixel value at position p' in image $I_j$, computed through the optical flow field; $I_i[p]$ represents the pixel value of the pixel point p in image $I_i$; by minimizing the photometric error $E_{pj}$, an estimate of the optical flow and of the pixel position in the target image is obtained.
3. The machine learning and SLAM-based path planning method of claim 2, wherein minimizing the photometric error $E_{pj}$ comprises the following steps:
initializing the camera pose estimate by setting an initial camera pose estimation value: a rotation matrix R and a translation vector t;
according to the current camera pose estimation, calculating a jacobian matrix J of the photometric error on the camera pose, wherein the calculation formula of the jacobian matrix J is as follows:
$$J = \frac{\partial E}{\partial T}$$

wherein E represents the photometric error, T represents the camera pose, and $\partial E/\partial T$ represents the partial derivative of the photometric error E with respect to the camera pose T;
according to the jacobian matrix J and the luminosity error E, an increment equation is constructed, wherein the increment equation is in the form of:
$$J^{\top} J \, \Delta T = -J^{\top} E$$

wherein $J^{\top}$ represents the transpose of the Jacobian matrix J, and ΔT represents the increment of the camera pose;
solving the increment equation by using a numerical optimization method to obtain the camera pose increment ΔT;
applying the increment ΔT to the current camera pose estimate and updating the camera pose: T ← T ⊕ ΔT;
judging whether the change of the pose of the camera is smaller than a set threshold value, and stopping iteration if the convergence condition is met; otherwise, continuing iteration until convergence.
4. The machine learning and SLAM-based path planning method of claim 1, wherein the screening out candidate path points based on optical flow of the path points comprises:
the LDSO is used to screen candidate waypoints by image gradients.
5. A machine learning and SLAM-based path planning system, comprising:
the path point processing module is used for carrying out optical flow estimation on the path points in the acquired images by using the FlowNetS;
the path point screening module is used for screening candidate path points according to the optical flows of the path points;
the judging module is used for judging whether the candidate path points meet the decision index, if so, setting the candidate path points as final path points, and if not, regenerating the candidate path points;
when the side end computer cannot judge whether the candidate path points meet the decision index, the candidate path points are sent to the cloud end computer;
judging whether the candidate path points meet decision indexes or not by adopting cloud computer assistance;
the cloud computer sends the judged result to the side computer;
judging whether the candidate path point meets the decision index, if yes, setting the candidate path point as a final path point, and if not, regenerating the candidate path point comprises the following steps:
introducing a decision index α, wherein 1 < α < 2; if the photometric error of the current frame exceeds α times that of the previous frame, that is, if $E_{cur} > \alpha \cdot E_{pre}$,

image alignment fails and the image whose alignment failed is processed by FlowNetS, wherein $E_{cur}$ represents the photometric error of the current frame and $E_{pre}$ represents the photometric error of the previous frame;

if $E_{cur} \leq \alpha \cdot E_{pre}$,
Directly aligning the image by a point-to-point projection method through a group of selected pixel points and minimizing luminosity errors among the pixel points;
the step of adopting the cloud computer to assist in judging whether the candidate path point meets the decision index comprises the following steps:
by means of the decision index α, evaluating the rough tracking performance of the edge; if the image alignment fails, starting the cloud computer to track and align the images, and simultaneously executing the tracking and mapping of the real-time images by the edge side in parallel;
dividing sub-pictures based on camera views, wherein frames of each camera view are divided into one sub-picture;
constructing data association of a sub-graph constructed by a cloud and a sub-graph constructed by an edge, realizing sub-graph alignment and realizing global tracking modeling;
after receiving data from the edge computer, the cloud computer parallelly executes FlowNetS optical flow estimation and descriptor extraction, and returns the optimized pose to the edge computer for edge cloud fusion;
the cloud computer receives the candidate path point data transmitted by the edge computer, performs feature matching with the pixel points obtained by optical flow estimation of the cloud computer, and if the matching is successful, compresses the final path point obtained after the matching is successful and sends the compressed final path point to the edge computer;
and the path generation module is used for generating a feasible path according to the final path point.
CN202310921547.4A 2023-07-26 2023-07-26 Path planning method and system based on machine learning and SLAM Active CN116625380B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310921547.4A CN116625380B (en) 2023-07-26 2023-07-26 Path planning method and system based on machine learning and SLAM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310921547.4A CN116625380B (en) 2023-07-26 2023-07-26 Path planning method and system based on machine learning and SLAM

Publications (2)

Publication Number Publication Date
CN116625380A CN116625380A (en) 2023-08-22
CN116625380B true CN116625380B (en) 2023-09-29

Family

ID=87613932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310921547.4A Active CN116625380B (en) 2023-07-26 2023-07-26 Path planning method and system based on machine learning and SLAM

Country Status (1)

Country Link
CN (1) CN116625380B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761737A (en) * 2014-01-22 2014-04-30 北京工业大学 Robot motion estimation method based on dense optical flow
CN108986037A (en) * 2018-05-25 2018-12-11 重庆大学 Monocular vision odometer localization method and positioning system based on semi-direct method
CN111707281A (en) * 2020-06-30 2020-09-25 华东理工大学 SLAM system based on luminosity information and ORB characteristics
CN113076988A (en) * 2021-03-25 2021-07-06 重庆邮电大学 Mobile robot vision SLAM key frame self-adaptive screening method based on neural network
WO2022187753A1 (en) * 2021-03-18 2022-09-09 Innopeak Technology, Inc. Slam-guided monocular depth refinement system using self-supervised online learning
CN116242374A (en) * 2023-04-27 2023-06-09 厦门大学 Direct method-based multi-sensor fusion SLAM positioning method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10733474B2 (en) * 2018-07-03 2020-08-04 Sony Corporation Method for 2D feature tracking by cascaded machine learning and visual tracking

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761737A (en) * 2014-01-22 2014-04-30 北京工业大学 Robot motion estimation method based on dense optical flow
CN108986037A (en) * 2018-05-25 2018-12-11 重庆大学 Monocular vision odometer localization method and positioning system based on semi-direct method
CN111707281A (en) * 2020-06-30 2020-09-25 华东理工大学 SLAM system based on luminosity information and ORB characteristics
WO2022187753A1 (en) * 2021-03-18 2022-09-09 Innopeak Technology, Inc. Slam-guided monocular depth refinement system using self-supervised online learning
CN113076988A (en) * 2021-03-25 2021-07-06 重庆邮电大学 Mobile robot vision SLAM key frame self-adaptive screening method based on neural network
CN116242374A (en) * 2023-04-27 2023-06-09 厦门大学 Direct method-based multi-sensor fusion SLAM positioning method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Hausser, Caner Hazırbas, Vladimir Golkov. FlowNet: Learning Optical Flow with Convolutional Networks. 2015 IEEE International Conference on Computer Vision. 2015, 2758-2766. *
Pose estimation of square-shaped objects based on line structured light vision; Ye Wenda; Hou Yuhan; Chen Hongjia; Li Yifan; Mechanical Engineering & Automation (No. 2); 21-24 *
A visual SLAM algorithm with minimized photometric error prior; Han Jianying; Wang Hao; Fang Baofu; Journal of Chinese Computer Systems; Vol. 41, No. 10; 2177-2183 *

Also Published As

Publication number Publication date
CN116625380A (en) 2023-08-22

Similar Documents

Publication Publication Date Title
Cvišić et al. SOFT‐SLAM: Computationally efficient stereo visual simultaneous localization and mapping for autonomous unmanned aerial vehicles
CN112902953B (en) Autonomous pose measurement method based on SLAM technology
CN108242079B (en) VSLAM method based on multi-feature visual odometer and graph optimization model
CN105809687B (en) A kind of monocular vision ranging method based on point information in edge in image
CN111563442A (en) Slam method and system for fusing point cloud and camera image data based on laser radar
CN111462207A (en) RGB-D simultaneous positioning and map creation method integrating direct method and feature method
CN108615246B (en) Method for improving robustness of visual odometer system and reducing calculation consumption of algorithm
CN113674416B (en) Three-dimensional map construction method and device, electronic equipment and storage medium
CN111899280B (en) Monocular vision odometer method adopting deep learning and mixed pose estimation
Taketomi et al. Real-time and accurate extrinsic camera parameter estimation using feature landmark database for augmented reality
CN109035329A (en) Camera Attitude estimation optimization method based on depth characteristic
CN111812978B (en) Cooperative SLAM method and system for multiple unmanned aerial vehicles
Tian et al. Resilient and distributed multi-robot visual slam: Datasets, experiments, and lessons learned
He et al. Online semantic-assisted topological map building with LiDAR in large-scale outdoor environments: Toward robust place recognition
KR101803340B1 (en) Visual odometry system and method
Zhang et al. A visual-inertial dynamic object tracking SLAM tightly coupled system
Merzić et al. Map quality evaluation for visual localization
CN114792338A (en) Vision fusion positioning method based on prior three-dimensional laser radar point cloud map
Li et al. BA-LIOM: tightly coupled laser-inertial odometry and mapping with bundle adjustment
CN116625380B (en) Path planning method and system based on machine learning and SLAM
Wang et al. Real-time omnidirectional visual SLAM with semi-dense mapping
CN115239899B (en) Pose map generation method, high-precision map generation method and device
CN114707611B (en) Mobile robot map construction method, storage medium and equipment based on graph neural network feature extraction and matching
CN114202579B (en) Dynamic scene-oriented real-time multi-body SLAM system
CN113158816B (en) Construction method of visual odometer quadric road sign for outdoor scene object

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant