WO2024096819A1 - System and method for map building and map- and image-based object control - Google Patents

System and method for map building and map- and image-based object control

Info

Publication number
WO2024096819A1
Authority
WO
WIPO (PCT)
Prior art keywords
map
image
area
level
vehicle
Prior art date
Application number
PCT/SG2023/050726
Other languages
French (fr)
Inventor
Shaojun CAI
Original Assignee
National University Of Singapore
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University Of Singapore filed Critical National University Of Singapore
Publication of WO2024096819A1 publication Critical patent/WO2024096819A1/en


Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/20Control system inputs
    • G05D1/24Arrangements for determining position or orientation
    • G05D1/243Means capturing signals occurring naturally from the environment, e.g. ambient optical, acoustic, gravitational or magnetic signals
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30Map- or contour-matching
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/20Control system inputs
    • G05D1/24Arrangements for determining position or orientation
    • G05D1/246Arrangements for determining position or orientation using environment maps, e.g. simultaneous localisation and mapping [SLAM]
    • G05D1/2462Arrangements for determining position or orientation using environment maps, e.g. simultaneous localisation and mapping [SLAM] using feature-based mapping
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D2107/00Specific environments of the controlled vehicles
    • G05D2107/60Open buildings, e.g. offices, hospitals, shopping areas or universities
    • G05D2107/63Offices, universities or schools
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D2109/00Types of controlled vehicles
    • G05D2109/10Land vehicles
    • G05D2109/12Land vehicles with legs
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D2111/00Details of signals used for control of position, course, altitude or attitude of land, water, air or space vehicles
    • G05D2111/10Optical signals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis


Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Navigation (AREA)

Abstract

Disclosed are systems and methods for building maps of an area and navigating by those maps. The methods include forming a multi-level map, in which a first level is formed using images of the area that are aligned based on image features appearing in two or more of the images, with co-visibility and traversability determined for each location at which a said image was captured. A second level of the map is formed by defining clusters of the locations based on the co-visibility and traversability of the locations, and assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster. The map is then uploaded to a control system for controlling an object through the area.

Description

System and Method For Map Building And Map- And Image-Based Object Control
Technical Field
The present invention relates, in general terms, to systems for building maps of areas through which robots and autonomous vehicles can be controlled. The present invention also relates to navigating a robot or vehicle through an area using such a map, and to methods relating to building maps and navigation control.
Background
Guide dogs navigate in complex and large-scale environments to help visually impaired individuals become more independent and mobile. For example, a typical use case of guide dogs is to lead the user from their home to a nearby shopping mall. However, unlike a real guide dog, it is possible to program a robot to perform specific tasks in environments it has never seen before.
Robotic mapping is a discipline related to computer vision and cartography. Most current robotic mapping and navigation applications rely on expensive LiDAR sensors and Simultaneous Localization and Mapping (SLAM). These methods may work reasonably well in small-scale environments, such as for service robots in restaurants or home robot vacuums, but they do not suffice for the complex environments frequently encountered in guiding applications.
Using currently available methods for mapping such as SLAM, a device can simultaneously localize (locate itself in a map) and map (create a virtual map of the location). However, complex environments such as rooms or street blocks connected by narrow passages with doors and intersections require a more complex solution.
It would be desirable to alleviate at least one of the above-described problems with existing robotic mapping and navigation systems.
Summary
Described herein is a new topological map representation system. The map can be divided into two levels. In the low (first) level, images are collected from robot on-board cameras and those images are connected based on co-visibility and traversability. In the high (second) level, a clustering algorithm groups low-level locations (at which images were captured) into a set of topological nodes. A local controller is used to travel from one topological node to the next. Fields of application include navigation for robotic systems and autonomous cars.
Disclosed is a system for building a map of an area, the system comprising: memory; a vehicle comprising an image capture device; an image analysis engine; at least one processor; and a control system for controlling the vehicle, wherein the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to: form a first level of the map by: controlling the vehicle, using the control system, to navigate through the area; capturing images of the area using the image capture device, each image corresponding to a location in the area; aligning the images based on image features that appear in two or more of the images, and determining co-visibility and traversability for each location at which a said image was captured; form a second level of the map by: defining clusters of the locations based on the co-visibility and traversability of the locations; assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster; and upload the map to the control system.
Also disclosed is a navigation control system for navigating an object through an area, comprising: an input device; a controller comprising memory, the memory storing a map comprising: a first level formed from images captured at locations in the area, the images being aligned based on image features appearing in two or more of the images and a co-visibility and traversability for each location; a second level formed by defining clusters of the locations based on the co-visibility and traversability of the locations, and assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster; an image capture device; a path planning module; and at least one processor, the at least one processor being programmed to: receive a further image from the image capture device, the further image being captured at a current location of the object; localise the object in the map based on image features in the further image and the first level; receive a destination from the input device; and use the path planning module to determine a path from the current location to the destination via one or more of the switch nodes.
Also disclosed is a method for building a map of an area, the method comprising: forming a first level of the map by: controlling a vehicle to navigate through the area; capturing images of the area using an image capture device on the vehicle, each image corresponding to a location in the area; aligning the images based on image features that appear in two or more of the images, and determining co-visibility and traversability for each location at which a said image was captured; forming a second level of the map by: defining clusters of the locations based on the co-visibility and traversability of the locations; assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster; and uploading the map to a control system for controlling an object through the area.
Also disclosed is a method for navigating an object through an area using a map built according to the method described above, the method comprising: receiving a further image captured at a current location of the object; localising the object in the map based on image features in the further image and the first level; receiving a destination; and using a path planning module to determine a path from the current location to the destination via one or more of the switch nodes.
Advantageously, the map is a topometric map representation of the area. This removes the need for accurate map reconstruction in traditional simultaneous localisation and mapping (SLAM)-based robot systems.
Advantageously, the systems use a learning-based local controller. Such a controller can be generalised to different environments without the need for customisation.
Brief description of the drawings
Embodiments of the present disclosure will now be described, by way of nonlimiting example, with reference to the drawings in which:
Figure 1 illustrates the overall topometric map construction and usage;
Figure 2 illustrates progressive steps in building a topometric map in accordance with an embodiment of the invention;
Figure 3 illustrates local sensorimotor control in accordance with an embodiment of the invention;
Figure 4 is a visualisation of map clustering results in accordance with an embodiment of the invention;
Figure 5 illustrates a block diagram of a system for building a map of an area in accordance with an embodiment of the invention;
Figure 6 illustrates a flowchart for a method for building a map of an area according to an embodiment of the invention;
Figure 7 illustrates a block diagram of a navigation control system in accordance with an embodiment of the invention; and
Figure 8 illustrates a flowchart for a method for navigating an object through an area according to an embodiment of the invention.
Detailed description
Described below are systems and methods for building maps of areas through which robots and autonomous vehicles can be controlled. Also described is the process of navigating a robot or vehicle through an area using such a map, and methods relating to map building and navigation control by tackling the limitations of traditional SLAM methods using abstraction. The abstract topology of environments is captured with just a few traversals. The abstraction typically contains a set of sub-spaces, such as rooms and street blocks, that are connected by narrow passages, such as doors and intersections.
Hereinafter, the system and method for map building and map- and image-based object control according to the present invention will be described in detail with reference to Figure 1 to Figure 8 according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and it is envisioned that modifications may be made without departing from the scope of the appended claims.
As used hereinafter, the term "co-visibility" denotes the number of places that a point can see and the number of places from which the point can be seen, and the term "traversability" denotes whether or not a trajectory can be traversed directly between two points that can see each other.
Thus, described are methods for building a map of an area. The methods broadly comprise forming a two-level map and outputting the map. The first level of the map is a low-level graph, and the second level of the map is a high-level graph. The low-level graph is formed by controlling a vehicle to navigate through the area over which the map is to be generated and capturing images of the area using an image capture device, such as a camera, on the vehicle. Each image corresponds to a location in the area, such as the location from which the image was captured. The images are then aligned based on image features that appear in two or more of the images - i.e. co-registration of image features. The co-visibility and traversability for each location is then determined. The high-level graph can then be formed by defining clusters of the locations ascertained during generation of the low-level graph, based on the co-visibility and traversability of the locations. Switch nodes are then assigned to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster. This accounts for, for example, cases where a direct line of sight exists between two points but a robot travelling along that direct line will impact an object disposed slightly laterally (to the side) of the line of sight. In such circumstances the co-visibility of those points will be lower than for the allocated switching node through which the robot navigates between clusters. The switching node, having higher co-visibility, has greater space on either side of the line of sight to the switching node of the cluster to which the robot is to navigate. Once the low- and high-level graphs are generated, the resulting map can be uploaded to a control system for controlling an object, such as the robot, through the area.
With reference to the navigation flowchart 100 of Figure 1, a low-level (metric) graph is generated. The low-level graph 102, denoted as Gl = (Vl, El), is directly constructed from the pre-exploration or exploration sequence 104. During the exploration sequence 104 the robot (or at least an image capture device) moves through the area or environment, capturing images 106 from multiple locations. The low-level graph 102 is constructed to include nodes each denoted as a place node Vi (corresponding to a location at which the corresponding image was captured) with an associated red, green, blue and depth (RGB-D) panoramic image Ii. Place node Vi represents a previously visited place (i.e. a location) in the environment. An edge (or link) in the graph indicates that the two views have sufficient similarity, and that the vehicle or mapping agent (e.g. robot) is able to travel from one node to the other through a simple path - e.g. a linear path. The similarity between two images is measured by the number of matched Scale-Invariant Feature Transform (SIFT) features between the two images. The similarity is sufficient for creation of an edge if the number of matched SIFT features (or image features determined by an alternative image analysis model) is above a predetermined threshold - that threshold may be set by a user or may be empirically determined based on past traversals between nodes, made by an object such as a robot or vehicle. Then, determination of a simple path between the two images is done by comparing the ratio of the geodesic distance to the Euclidean distance between the two images. A simple path exists if the ratio is below 1.2.
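By way of illustration, the edge-creation logic described above can be sketched in Python as follows. This is a minimal sketch, not the disclosed implementation: the helper names (match_sift, geodesic_fn), the match threshold value and the use of OpenCV's SIFT are assumptions.

```python
# Illustrative sketch of low-level graph construction, assuming OpenCV SIFT
# and networkx. `geodesic_fn` (e.g. path length along the exploration
# trajectory) and the match threshold are assumed values, not disclosed ones.
import itertools

import cv2
import networkx as nx
import numpy as np

SIFT_MATCH_THRESHOLD = 50   # assumed; the text leaves the threshold user-set
SIMPLE_PATH_RATIO = 1.2     # geodesic/Euclidean ratio below which a simple path exists

def match_sift(img_a, img_b):
    """Count SIFT matches between two panoramas (Lowe's ratio test)."""
    sift = cv2.SIFT_create()
    _, des_a = sift.detectAndCompute(img_a, None)
    _, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0
    pairs = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
    return sum(1 for p in pairs if len(p) == 2 and p[0].distance < 0.75 * p[1].distance)

def build_low_level_graph(places, geodesic_fn):
    """places: list of dicts with 'image' (grayscale panorama) and 'pos' (x, y)."""
    g = nx.Graph()
    g.add_nodes_from(range(len(places)))
    for i, j in itertools.combinations(range(len(places)), 2):
        if match_sift(places[i]["image"], places[j]["image"]) < SIFT_MATCH_THRESHOLD:
            continue  # insufficient co-visibility for an edge
        euclid = float(np.linalg.norm(np.subtract(places[i]["pos"], places[j]["pos"])))
        if geodesic_fn(i, j) / max(euclid, 1e-6) < SIMPLE_PATH_RATIO:
            g.add_edge(i, j)  # a simple (near-linear) path exists
    return g
```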
After construction of the low-level graph, a high-level graph 108 is constructed. To achieve this, normalized cut 110 is performed iteratively on the low-level graph Gl. Each normalised cut cuts through bottlenecks corresponding to the narrow passages. A narrow passage may be a passage that has a line of sight to a destination location, but has obstacles off the line of sight that would impact the vehicle if the vehicle were to traverse along the line of sight.
The cuts result in a high-level graph Gh = (C, Eh) that contains a number of clusters 112. Each cluster Ci corresponds to a sub-space that is termed a "context". Due to the definitions in the low-level graph 102, the place nodes in each context share similar visual information (co-visibility) and permit travel between each other (traversability). Based on the graph, a switch node that connects two neighboring contexts is defined. For an edge between two neighboring contexts Ci and Cj, the switch node is defined as the place node in Ci that has the most low-level edges to nodes in Cj. For each context, six place nodes are randomly sampled, and these nodes are defined as the local map of the context. Downsampling place nodes reduces redundancy and improves the computational efficiency of localization.
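The clustering and switch-node assignment could be approximated as below. This sketch substitutes scikit-learn's spectral clustering (a normalized-cut style partition with a fixed cluster count) for the iterative normalized cut described above; the fixed n_contexts and all function names are illustrative assumptions.

```python
# Approximation of the high-level graph step: a normalized-cut style spectral
# partition of the low-level graph, switch nodes chosen as the place node with
# the most cross-context low-level edges, and six nodes sampled per context.
# Fixing `n_contexts` up front differs from the iterative cut in the text.
import random

import networkx as nx
from sklearn.cluster import SpectralClustering

def build_high_level_graph(g_low, n_contexts):
    adj = nx.to_numpy_array(g_low)
    labels = SpectralClustering(
        n_clusters=n_contexts, affinity="precomputed", assign_labels="discretize"
    ).fit_predict(adj)

    g_high = nx.Graph()
    g_high.add_nodes_from(range(n_contexts))
    switch_nodes = {}
    for ci in range(n_contexts):
        for cj in range(n_contexts):
            if ci == cj:
                continue
            # For every place node in context ci, count low-level edges into cj.
            cross = {v: sum(1 for u in g_low.neighbors(v) if labels[u] == cj)
                     for v in g_low.nodes if labels[v] == ci}
            node, n_edges = max(cross.items(), key=lambda kv: kv[1], default=(None, 0))
            if n_edges > 0:
                g_high.add_edge(ci, cj)
                switch_nodes[(ci, cj)] = node  # place node in ci bordering cj

    # Local maps: six randomly sampled place nodes per context.
    local_maps = {c: random.sample([v for v in g_low.nodes if labels[v] == c],
                                   k=min(6, int((labels == c).sum())))
                  for c in range(n_contexts)}
    return g_high, switch_nodes, local_maps
```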
The two-level graph serves multiple purposes. Firstly, the mapping agent can use the most relevant information (i.e., place nodes in its immediate context) for localization, which improves efficiency and accuracy. Secondly, switch nodes signal the need for context change and serve as goals that guide local navigation. Thirdly, the map acts as an abstraction that decouples low-level control from the environment, enabling the controller to learn robust obstacle avoidance skills that are orthogonal to global environments.
Figure 1 illustrates this concept. The leftmost inset shows panoramas 106 taken at locations in the area over which the topometric map is to be built. The centre inset illustrates edges connecting co-visible ones of the locations between which a simple path is defined. The rightmost inset illustrates the switch nodes generated for each cluster, that connect various regions or sub-areas of the map - e.g. different rooms or other areas within which there is essentially little or no difficulty navigating.
Figure 2 illustrates the use of the map generated in accordance with Figure 1. A path planning module (see 714 in Figure 7) uses the map generated in accordance with Figure 1 to produce a control sequence for controlling an autonomous vehicle or robot through the area. To achieve this, the navigation control system (see 700 in Figure 7) obtains the map 200 and receives a further image from the image capture device. The further image will be captured at a current location of the object (i.e. the autonomous vehicle or robot). The object is localised, per step 202, in the map based on image features in the further image and the first level map. Localisation involves identifying the cluster in which the object is located, per 204, and computing the coordinate of the object in the cluster, per 206. A destination is received from an input device, the destination being the location in the area to which the robot must navigate. The path planning module then determines a path, per 208, from the current location, reflected in the current cluster and current coordinates, to the destination via one or more of the switch nodes. The path may be the shortest topological path between the current position and the destination. Where the path is not calculated by a processor in the object, the path generated by the path planning module can be uploaded to the object and a controller can then control the object to move from the current location to the destination along the path, per 210.
Localisation can involve any appropriate method. In the present embodiment, localisation involves context and metric localisation. Context localization is formulated as a place recognition problem. Pre-trained AlexNet, or some other feature recognition framework, is used to extract features from visual observations, and cosine similarity between features is used as a distance measure. To find the current context of the agent, the place node that is most similar to the current agent observation is identified among local maps of all contexts. The context of the most similar place node is the context of the agent.
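A minimal sketch of this place-recognition step, assuming torchvision's pre-trained AlexNet as the feature extractor and pre-computed features for each local-map node, might look as follows; the layout of local_maps is an assumption.

```python
# Context localisation as place recognition: AlexNet convolutional features
# compared by cosine similarity. `local_maps` maps each context id to a list
# of (node_id, unit-norm feature) pairs; this layout is an assumption.
import torch
import torch.nn.functional as F
from torchvision import models

alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()

def extract_features(batch):
    """batch: normalised (1, 3, 224, 224) RGB tensor."""
    with torch.no_grad():
        feats = alexnet.features(batch)
    return F.normalize(feats.flatten(1), dim=1)  # unit vector, so dot = cosine

def localise_context(observation, local_maps):
    obs = extract_features(observation)
    ctx, node, _ = max(
        ((c, n, float(obs @ feat.T)) for c, nodes in local_maps.items()
         for n, feat in nodes),
        key=lambda t: t[2],
    )
    return ctx, node  # context of the most similar place node, and that node
```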
To perform metric localisation, the metric pose of the current observation is calculated with respect to the switch node with the help of the local map. The object is thus localised based on features in the image corresponding to the current location of the object, and the path between that current location and the location of the nearest switching node (or the switching node in the same cluster as the object's current location) is calculated. Relatedly, SIFT feature matching is performed between the current observation and the features of the nodes in the local map. The matched map features are unprojected into 3D space and transformed into the coordinate frame of the switch node. Then, a perspective-n-point (PnP) algorithm with RANSAC is performed to estimate the metric pose between the current observation and the switch node. If there are insufficient inlier matches, the localization fails and the previously successfully estimated pose is used instead.
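The metric-localisation step can be illustrated with OpenCV's PnP-with-RANSAC solver. In this sketch, the correspondence arrays, the camera intrinsics and the inlier threshold are assumptions.

```python
# Metric localisation sketch: estimate the camera pose relative to the switch
# node from 2D-3D correspondences, falling back to the last successful pose
# when inliers are insufficient. The inlier threshold (10) is assumed.
import cv2
import numpy as np

def metric_localise(points_3d, points_2d, camera_matrix, last_pose):
    """points_3d: (N, 3) map points in the switch node's frame; points_2d: (N, 2)."""
    if len(points_3d) < 4:
        return last_pose  # PnP needs at least four correspondences
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float64),
        np.asarray(points_2d, dtype=np.float64),
        camera_matrix, None,  # None: assume undistorted images
    )
    if not ok or inliers is None or len(inliers) < 10:
        return last_pose  # localisation failed; reuse the previous estimate
    rotation, _ = cv2.Rodrigues(rvec)  # rotation matrix from Rodrigues vector
    return rotation, tvec
```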
Localisation of the object on the map affords path planning and subsequent control of the object to the destination. For global path planning, a weighted A-star (A*) search is used to find a path from the initial pose (current location) to the goal (destination) on the high-level map in the path planning module. The resultant path passes through a sequence of switch nodes Si, Si+1, ..., Sn. For the controller, the immediate next switch node Si is used as the local navigation target.
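As a sketch, the weighted A* search could be run with networkx, inflating a Euclidean heuristic between per-context coordinates; the coords layout, the edge "weight" attribute and the inflation factor are assumptions.

```python
# Weighted A* over the high-level graph: the heuristic is inflated by
# eps >= 1, trading optimality for search speed. `coords` holds an assumed
# metric position per context (e.g. the position of its switch node).
import networkx as nx
import numpy as np

def plan_switch_sequence(g_high, coords, start_ctx, goal_ctx, eps=1.5):
    def heuristic(a, b):
        return eps * float(np.linalg.norm(np.subtract(coords[a], coords[b])))
    return nx.astar_path(g_high, start_ctx, goal_ctx,
                         heuristic=heuristic, weight="weight")
```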
As illustrated in Figure 3, the controller takes as input the current depth observation Id 300 and a relative pose Psi that contains the relative distance t and angle (direction vector 302) from the object's current position to the next switch node Si. The output of the controller is a discrete action, instructing the object to move along a predetermined trajectory to the switching node Si. As a pre-processing step, the distance is discretised with bins of 0.75 m and the angle with bins of 5 degrees. Other bin sizes can be used where the object is a different size, affecting traversability, where gaps through which the object must navigate are of a different size to those of the present area, or where the area includes obstacles.
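The stated bin sizes suggest a pre-processing helper along these lines; the bin origins and rounding convention are assumed details.

```python
# Discretise the relative pose into the stated bins: 0.75 m for distance,
# 5 degrees for angle. The rounding convention is an assumed detail.
def discretise_pose(distance_m, angle_deg, dist_bin=0.75, angle_bin=5.0):
    return int(distance_m // dist_bin), int(round(angle_deg / angle_bin))
```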
The controller 304 is parameterized with a deep neural network (e.g. a convolutional neural network (CNN)) and trained with imitation learning. Based on the direction vector 302 and the depth 300, when processed through the CNN 306, a navigation instruction 308 is issued by which the object can autonomously move to the switching node - for subsequent movements, the instruction will generally direct the object to the next switching node, or the destination, along the path to the destination. The learned policy can zero-shot transfer to the real world.
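A minimal PyTorch sketch of such a controller is given below. The layer sizes and the four-way action set are assumptions, and training would be behaviour cloning against expert actions rather than the specific training regime used in the disclosure.

```python
# Local controller sketch: a small CNN over the depth image, fused with the
# discretised relative pose, emitting logits over discrete actions. Trained
# with imitation learning, e.g. cross-entropy against expert demonstrations.
import torch
import torch.nn as nn

class LocalController(nn.Module):
    def __init__(self, n_actions=4):  # assumed action set, e.g. forward/left/right/stop
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 + 2, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, depth, rel_pose):
        """depth: (B, 1, H, W) depth image; rel_pose: (B, 2) distance/angle bins."""
        z = self.encoder(depth)
        return self.head(torch.cat([z, rel_pose], dim=1))

# Behaviour-cloning step (sketch):
#   loss = nn.CrossEntropyLoss()(controller(depth, rel_pose), expert_actions)
```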
Figure 4 shows the visualization of map clustering results. In Figure 4, image (a), a low-level map is shown in a condition prior to down-sampling (i.e. clustering and allocation of switching nodes), where each dot is a place node (location at which an image was captured). In Figure 4, image (b), the down-sampled low-level map is shown, where each color denotes one context. Notably, very little of the metric map is covered by switching nodes since many locations are suboptimal for traversal through the area.
The main navigation results are shown in Table I. In short, the system of embodiments of the present invention has a success rate of 82.0% and 66.3% in the standard and curved setting respectively, significantly outperforming baselines.
Path Type   Model                      Succ   SPL
standard    NoMetric                   57.9   50.1
standard    NoTopo-EndGoal             69.9   62.8
standard    NoTopo-PlaceGoal           78.0   66.2
standard    NoTopo-EndGoal-GTPose      87.6   86.0
standard    NoTopo-PlaceGoal-GTPose    90.0   74.9
standard    Ours-GTPose                89.5   87.7
standard    Ours                       82.0   77.9
curved      NoMetric                   47.1   32.9
curved      NoTopo-EndGoal             48.0   38.1
curved      NoTopo-PlaceGoal           45.0   36.1
curved      NoTopo-EndGoal-GTPose      65.3   56.2
curved      NoTopo-PlaceGoal-GTPose    68.3   50.2
curved      Ours-GTPose                80.0   68.5
curved      Ours                       66.3   53.9
TABLE I: Navigation performance in the Habitat simulator.
The metric information remains highly relevant. In the absence of metric information, the baseline navigation system has a much lower success rate compared to the present methodology. This is because it estimates directly the direction vector between the current observation and the switch node. This performs particularly poorly in environments where the switch nodes are spread out. For example, in a long corridor where the switch nodes are only at the two ends, the direction estimation is inherently impractical due to the limited overlap between the two images.
Similarly, high-level topological information is important. In the absence of topological information, the baseline experiences a significant drop in success rate from the present methodology. This is because it estimates a global vector pointing directly towards the goal and thus lacks global topology information.
Figure 5 illustrates a block diagram of a system for building a map of an area 500 in accordance with an embodiment of the invention. According to Figure 5, the system comprises memory 502, a vehicle 504 comprising an image capture device 506, an image analysis engine 516, at least one processor 508 and a control system 510 for controlling the vehicle. The memory 502 comprises instructions executable by the at least one processor 508 to cause the at least one processor to perform at least a part of the method of the embodiments described herein. The instructions cause the at least one processor 508 to form a first level of the map 512 before then forming a second level of the map 514 and outputting the map to a control system.
Some methods used herein are embodied by Figure 6, in which the method 600 for building a map of an area involves the at least one processor 508 forming a first level of the map (step 602). This is achieved by controlling the vehicle, using the control system, to navigate through the area (604). While doing so, the vehicle captures images of the area using the image capture device, with each image corresponding to a location in the area (606). The images are then aligned based on image features appearing in two or more of the images as discussed above. Once aligned, the co-visibility and traversability can be determined for each location at which one of the images was captured (608). As discussed previously, traversability is determined by checking for a simple path between two images, by comparing the ratio of the geodesic distance to the Euclidean distance between them. A simple path exists if the ratio is below 1.2.
Further, the at least one processor 508 forms a second level of the map (step 610) by defining clusters of the locations based on the co-visibility and traversability of the locations (612). This is achieved by performing normalized cuts (defined, e.g., in https://ieeexplore.ieee.org/abstract/document/868688) iteratively on the low-level graph, each cut cutting through bottlenecks corresponding to the narrow passages. Further, a switch node is assigned to each cluster, with each switch node corresponding to a passage for the vehicle to traverse the respective cluster (614). The switch node may be the node with the highest covisibility and/or traversability, where traversability may be favoured over covisibility, particularly for undulating maps with step changes in floor height that do not affect covisibility. The switch node may be located at a location at which an image was captured during generation of the low-level graph, or may result from interpolation between two or more locations in the low-level graph where those locations have the highest traversability and covisibility of the nodes in the relevant cluster. Finally, the processor 508 uploads the map to the control system (616) for controlling an object through the area.
Figure 7 illustrates a block diagram of a navigation control system 700 in accordance with an embodiment of the invention. According to Figure 7, the system comprises an input device 716 for requesting or supplying a destination, a controller 706 comprising memory 708, in which the memory stores a first level graph or map 710 and a second level graph or map 712. The system further comprises an image capture device 702, a path planning module 714 and at least one processor 704. The at least one processor 704 is programmed to perform at least a part of the method of the embodiments described herein.
Some methods used herein are embodied by Figure 8, in which the method 800 involves the at least one processor being programmed to receive a further image from the image capture device, the further image being captured at a current location of the object (step 802). Then, the at least one processor localises the object in the map based on image features in the further image and the first level of the map (804). A destination from the input device is then received by the at least one processor in step 806. Finally, the at least one processor uses the path planning module to determine a path from the current location to the destination via one or more of the switch nodes (808).
As an object navigates through the environment or area, based on the map, control may be maintained by the object supplying, and the navigation control system receiving, a sequence of images (e.g. a video feed) from the image capture device of the object as the object moves through the environment or area. From these images, the navigation control system calculates, from each image in the sequence, a depth observation and relative pose to the switch node that is next along the path. The navigation control system then controls the object using discrete actions to repeatedly align the object with the path.
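The closed loop described above might be organised as follows; the frame interface, arrival radius and helper names are illustrative assumptions rather than disclosed details.

```python
# Closed-loop control sketch: each frame yields a depth observation and a
# relative pose to the next switch node, which the controller maps to a
# discrete action; the loop advances to the next switch node on arrival.
def navigation_loop(frames, switch_sequence, controller, compute_relative_pose):
    idx = 0
    for frame in frames:  # frame: assumed object exposing a .depth image
        rel = compute_relative_pose(frame, switch_sequence[idx])
        if rel.distance < 0.5:  # assumed arrival radius in metres
            idx += 1
            if idx >= len(switch_sequence):
                return  # destination reached
            continue
        yield controller(frame.depth, rel)  # discrete action realigning with the path
```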
The present methodologies have been evaluated using a quadruped robot in a school of computing which contains both indoor and outdoor environments. The methods disclosed herein can navigate robustly at places with limited textures where SLAM-based methods fail most of the time.
While SLAM is used in some prior art methodologies, it has its limitations. Firstly, SLAM is sensitive to motion blur. Unlike common wheeled robots, quadruped robots unavoidably have more vibrations due to their locomotion. The vibration causes motion blur in the camera views, especially during rapid turns, which causes feature matching to fail. It is found that SLAM is not able to successfully map past some intersections. Secondly, Real-Time Appearance Based Mapping (RTAB-Map) almost certainly fails during self-rotations, which happen when the robot reaches the end of a path or a corridor and needs to take a 180-degree turn back.
Due to difficulty in feature matching, RTAB-Map can only localize in feature-rich environments. Localisation can be brittle in environments with fewer distinctive image features by which to localise the object. Furthermore, the system cannot recover once the localization is lost, thus yielding poor performance in large-scale navigation.
The projected 2D occupancy grid map produced using RTAB-Map contains spurious obstacles, i.e., dots along the corridor at locations that are not actually occupied. This can result from inaccurate point clouds generated during the mapping process, e.g., where the depth sensor readings are noisy. When the occupancy map is too noisy, the planner is unable to find a feasible path to the goal, which causes navigation to fail.
The methodologies presented herein outperform RTAB-Map SLAM in all tested routes. In particular, when the routes cover featureless environments, e.g., the indoor COM1 building of the National University of Singapore, RTAB-Map could not even successfully map the environment, causing the baseline to fail completely. In contrast, the present methodologies do not require reconstruction or metric maps and are thus free from such failures. This is made possible by the data-driven local controller, which learns robust obstacle avoidance skills from data so that the map does not need to capture the notion of free space.
It will be appreciated that many further modifications and permutations of various aspects of the described embodiments are possible. Accordingly, the described aspects are intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.

Claims
1. A system for building a map of an area, the system comprising: memory; a vehicle comprising an image capture device; an image analysis engine; at least one processor; and a control system for controlling the vehicle, wherein the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to: form a first level of the map by: controlling the vehicle, using the control system, to navigate through the area; capturing images of the area using the image capture device, each image corresponding to a location in the area; aligning the images based on image features that appear in two or more of the images, and determining co-visibility and traversability for each location at which a said image was captured; form a second level of the map by: defining clusters of the locations based on the co-visibility and traversability of the locations; assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster; and upload the map to the control system for controlling an object through the area.
2. The system of claim 1, wherein each location in the first level of the map is a node with an associated panoramic image captured by the image capture device at the location.
3. The system of claim 2, wherein the at least one processor assembles the nodes into a graph, wherein an edge between two said nodes in the graph indicates that the vehicle can navigate between the locations corresponding to the two nodes along a simple path.
4. The system of claim 3, wherein a simple path comprises a path with a ratio of geodesic distance to Euclidean distance of less than 1.2.
5. The system of any one of claims 1 to 4, wherein defining clusters comprises defining a context for a sub-space of the area, the sub-space comprising locations that share image features and between which the vehicle can directly traverse.
6. The system of any one of claims 1 to 5, wherein assigning a switch node to each cluster comprises assigning the switch node for each cluster based on a location with a highest number of edges to a neighbouring cluster.
7. A navigation control system for navigating an object through an area, comprising: an input device; a controller comprising memory, the memory storing a map comprising: a first level formed from images captured at locations in the area, the images being aligned based on image features appearing in two or more of the images and a co-visibility and traversability for each location; a second level formed by defining clusters of the locations based on the co-visibility and traversability of the locations, and assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster; an image capture device; a path planning module; and at least one processor, the at least one processor being programmed to: receive a further image from the image capture device, the further image being captured at a current location of the object; localise the object in the map based on image features in the further image and the first level; receive a destination from the input device; and use the path planning module to determine a path from the current location to the destination via one or more of the switch nodes.
8. The navigation control system of claim 7, further comprising the object, the controller controlling the object from the current location to the destination along the path.
9. The navigation control system of claim 7 or 8, wherein localising the vehicle comprises performing context localisation and metric localisation.
10. The navigation control system of claim 9, wherein performing context localisation is based on cosine similarity between features in the further image and image features of the locations in the first level.
11. The navigation control system of claim 9 or 10, wherein performing metric localisation comprises estimating a pose between the current location and the switch node of a cluster by solving a perspective-n-point (PnP) problem based on the image features.
12. The navigation control system of any one of claims 7 to 11, wherein the path planning module determines the path using a weighted A-star search of the switch nodes.
13. The navigation control system of claim 8, wherein the controller: receives a sequence of images from the image capture device; calculates, from each image in the sequence, a depth observation and relative pose to the switch node that is next along the path; and controls the object using discrete actions to repeatedly align the object with the path.
14. A method for building a map of an area, the method comprising: forming a first level of the map by: controlling a vehicle to navigate through the area; capturing images of the area using an image capture device on the vehicle, each image corresponding to a location in the area; aligning the images based on image features that appear in two or more of the images, and determining co-visibility and traversability for each location at which a said image was captured; forming a second level of the map by: defining clusters of the locations based on the co-visibility and traversability of the locations; assigning a switch node to each cluster, each switch node corresponding to a passage for the vehicle to traverse the respective cluster; and uploading the map to a control system for controlling an object through the area.
15. The method of claim 14, wherein each location in the first level of the map is a node with an associated panoramic image captured by the image capture device at the location, the method comprising assembling the nodes into a graph, wherein an edge between two said nodes in the graph indicates that the vehicle can navigate between the locations corresponding to the two nodes along a simple path.
16. The method of claim 14 or 15, wherein defining clusters comprises defining a context for a sub-space of the area, the sub-space comprising locations that share image features and between which the vehicle can directly traverse.
17. The method of any one of claims 14 to 16, wherein assigning a switch node to each cluster comprises assigning the switch node for each cluster based on a location with a highest number of edges to a neighbouring cluster.
18. A method for navigating an object through an area using a map built according to the method of any one of claims 14 to 17, comprising: receiving a further image captured at a current location of the object; localising the object in the map based on image features in the further image and the first level; receiving a destination; and using the path planning module to determine a path from the current location to the destination via one or more of the switch nodes.
19. The method of claim 18, further comprising controlling the object from the current location to the destination along the path.
20. The method of claim 18 or 19, wherein localising the vehicle comprises performing context localisation based on cosine similarity between features in the further image and image features of the locations in the first level, and performing metric localisation by estimating a pose between the current location and the switch node of a cluster by solving a perspective-n-point (PnP) problem based on the image features.
PCT/SG2023/050726 2022-11-02 2023-11-01 System and method for map building and map- and image-based object control WO2024096819A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10202251590R 2022-11-02

Publications (1)

Publication Number Publication Date
WO2024096819A1 true WO2024096819A1 (en) 2024-05-10

Family

ID=90931697

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2023/050726 WO2024096819A1 (en) 2022-11-02 2023-11-01 System and method for map building and map- and image-based object control

Country Status (1)

Country Link
WO (1) WO2024096819A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220028156A1 (en) * 2019-10-28 2022-01-27 Zillow, Inc. Generating Floor Maps For Buildings From Automated Analysis Of Visual Data Of The Buildings' Interiors
US20220164493A1 (en) * 2019-08-28 2022-05-26 Zillow, Inc. Automated Tools For Generating Mapping Information For Buildings
US20220196409A1 (en) * 2020-12-23 2022-06-23 Here Global B.V. Method, apparatus, and computer program product for establishing three-dimensional correspondences between images
US20220196405A1 (en) * 2020-12-17 2022-06-23 Adobe Inc. Systems for Generating Indications of Traversable Paths



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23886448

Country of ref document: EP

Kind code of ref document: A1