CN117252999A - Dense map construction method based on semantic tags and sparse point cloud - Google Patents

Publication number
CN117252999A
Authority
CN
China
Prior art keywords
semantic
pixel
pixels
distance
pixel points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311064044.6A
Other languages
Chinese (zh)
Inventor
李健华
王茂帅
谢恩鹏
秦西运
王辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong New Generation Information Industry Technology Research Institute Co Ltd
Original Assignee
Shandong New Generation Information Industry Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dense map construction method based on semantic tags and sparse point clouds, and relates to the technical field of three-dimensional map construction. The method comprises: step 1: collecting data; step 2: performing semantic segmentation and point cloud registration; step 3: performing joint optimization of the local map based on the point cloud information; step 4: fusing the local maps, and performing loop closure detection and optimization; step 5: completing the dense map. The dense map is thereby constructed rapidly and accurately.

Description

Dense map construction method based on semantic tags and sparse point cloud
Technical Field
The invention relates to the technical field of three-dimensional map construction, and in particular to a dense map construction method based on semantic tags and sparse point clouds.
Background
Existing map construction methods focus mainly on geometric accuracy and acquire geometric information about the environment, such as point clouds and lidar data, from sensor data. However, geometric information often lacks semantic attributes and cannot provide detailed descriptions of object features and scene layout in the environment. Combining semantic information with map construction allows a robot to better understand its surroundings and to perform tasks such as obstacle recognition, target tracking, and path planning.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a dense map construction method based on semantic tags and sparse point clouds. The method combines semantic tag segmentation features with point cloud registration and interpolation of the sparse point cloud map, generates new key frames based on a pixel point reselection strategy over semantic planes and on motion consistency detection of semantic points, optimizes the data of the local map, and constructs the dense map through semantic plane fitting and pixel point depth estimation.
The specific scheme provided by the invention is as follows:
The invention provides a dense map construction method based on semantic tags and sparse point clouds, comprising the following steps:
Step 1: collecting data: collecting key frames and sparse point cloud data, and defining semantic tags for the pixel points of the key frame images, wherein the semantic tags comprise static semantic tags and dynamic semantic tags;
Step 2: performing semantic segmentation and point cloud registration: segmenting each key frame into region blocks according to the semantic tags by using a semantic segmentation model, and assigning each point in the sparse point cloud data to the corresponding semantic tag according to the region blocks, thereby establishing the correspondence between the sparse point cloud data and the pixel points,
and registering and aligning the sparse point cloud data from different coordinate systems by using a registration algorithm to obtain globally consistent point cloud information;
Step 3: performing joint optimization of the local map based on the point cloud information: detecting and deleting redundant key frames to obtain the latest key frame, and filtering out the pixel points with dynamic tags in the latest key frame,
constructing semantic planes by using a plane fitting algorithm: calculating the distance between pixel points with the same static semantic tag, and aggregating the tracked pixel points in the marginalized key frame into a plurality of pixel point groups according to the distance and the corresponding static semantic tag, wherein the pixel points in each pixel point group have the same static semantic tag and are close to one another, so that they form the semantic plane of that pixel point group, and filtering out pixel points inconsistent with the distribution of the semantic plane;
Step 4: fusing the local maps, and performing loop closure detection and optimization;
Step 5: constructing semantic planes from the fused map by using the plane fitting algorithm, and completing the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags.
Further, in the dense map construction method based on semantic tags and sparse point clouds, the static semantic tags in step 1 include road, sidewalk, building, wall, railing, pole, traffic signal light, traffic sign, vegetation, and terrain,
and the dynamic semantic tags include people, automobiles, bicycles, motorcycles, dogs, and cats.
Further, in the dense map construction method based on semantic tags and sparse point clouds, detecting and deleting redundant key frames in step 3 comprises: setting a redundancy amount, and deleting a key frame if more pixel points in that key frame than the redundancy amount are observed by at least 3 key frames.
Further, in step 3 of the dense map construction method based on semantic tags and sparse point clouds, the plane fitting algorithm calculates a combined distance from the Euclidean distance between pixel points in the key frame and the gray-level difference between the pixel points, the combined distance representing the distance between pixel points with the same static semantic tag,
the tracked pixel points with the same static semantic tag in the marginalized key frame are aggregated into point pairs according to the combined distance, the point pairs are aggregated into pixel point groups according to the Euclidean distance between the pixel points within each point pair and the average gray-level difference of the point pairs, and a distance threshold is set so that the pixel points in a pixel point group do not span the static semantic tag beyond the distance threshold,
and a semantic plane corresponding to each pixel point group is formed by the RANSAC method from the three-dimensional coordinates of the pixel points in the pixel point group.
Further, in step 3 of the dense map construction method based on semantic tags and sparse point clouds, the grid interval for sampling key frame pixel points is set to S, the pixel points in the key frame image are sampled, and the Euclidean distance disO between pixel points, the gray-level difference disG between pixel points, and the combined distance disC are expressed by the following formula:
[formula not reproduced in this text]
The combined distance disC is calculated by the above formula, where N_c represents the maximum color distance between pixel points and is set to a constant, N_s is equal to S and represents the maximum spatial distance between sampled pixel points within the grid interval, (x_i, y_i) and (x_j, y_j) represent the pixel coordinates, and [l_i, a_i, b_i] and [l_j, a_j, b_j] represent the colors of the pixel points in the CIELAB color space.
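The formula image itself does not survive in this text. One form that agrees with the symbol definitions above is the normalized color-plus-spatial distance used in SLIC superpixel clustering. The following is an assumed reconstruction offered for orientation only, not the patent's verbatim formula:

```latex
\begin{aligned}
\mathrm{disG} &= \sqrt{(l_j - l_i)^2 + (a_j - a_i)^2 + (b_j - b_i)^2},\\
\mathrm{disO} &= \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2},\\
\mathrm{disC} &= \sqrt{\left(\frac{\mathrm{disG}}{N_c}\right)^2 + \left(\frac{\mathrm{disO}}{N_s}\right)^2}.
\end{aligned}
```

Under this assumed form, dividing by N_c and N_s puts the color term and the spatial term on a comparable scale before they are combined.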
The invention also provides a dense map construction device based on semantic tags and sparse point clouds, comprising an acquisition module, a semantic segmentation and point cloud registration module, a joint optimization processing module, a fusion and detection module, and an updating module,
the acquisition module collects data: collecting key frames and sparse point cloud data, and defining semantic tags for the pixel points of the key frame images, wherein the semantic tags comprise static semantic tags and dynamic semantic tags;
the semantic segmentation and point cloud registration module performs semantic segmentation and point cloud registration: segmenting each key frame into region blocks according to the semantic tags by using a semantic segmentation model, and assigning each point in the sparse point cloud data to the corresponding semantic tag according to the region blocks, thereby establishing the correspondence between the sparse point cloud data and the pixel points,
and registering and aligning the sparse point cloud data from different coordinate systems by using a registration algorithm to obtain globally consistent point cloud information;
the joint optimization processing module performs joint optimization of the local map based on the point cloud information: detecting and deleting redundant key frames to obtain the latest key frame, and filtering out the pixel points with dynamic tags in the latest key frame,
constructing semantic planes by using a plane fitting algorithm: calculating the distance between pixel points with the same static semantic tag, and aggregating the tracked pixel points in the marginalized key frame into a plurality of pixel point groups according to the distance and the corresponding static semantic tag, wherein the pixel points in each pixel point group have the same static semantic tag and are close to one another, so that they form the semantic plane of that pixel point group, and filtering out pixel points inconsistent with the distribution of the semantic plane;
the fusion and detection module fuses the local maps and performs loop closure detection and optimization;
and the updating module constructs semantic planes from the fused map by using the plane fitting algorithm, and completes the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags.
Further, in the dense map construction device based on semantic tags and sparse point clouds, the static semantic tags defined by the acquisition module include road, sidewalk, building, wall, railing, pole, traffic signal light, traffic sign, vegetation, and terrain,
and the dynamic semantic tags include people, automobiles, bicycles, motorcycles, dogs, and cats.
Further, in the dense map construction device based on semantic tags and sparse point clouds, the joint optimization processing module detects and deletes redundant key frames by: setting a redundancy amount, and deleting a key frame if more pixel points in that key frame than the redundancy amount are observed by at least 3 key frames.
Further, in the dense map construction device based on semantic tags and sparse point clouds, the plane fitting algorithm calculates a combined distance from the Euclidean distance between pixel points in the key frame and the gray-level difference between the pixel points, the combined distance representing the distance between pixel points with the same static semantic tag,
the tracked pixel points with the same static semantic tag in the marginalized key frame are aggregated into point pairs according to the combined distance, the point pairs are aggregated into pixel point groups according to the Euclidean distance between the pixel points within each point pair and the average gray-level difference of the point pairs, and a distance threshold is set so that the pixel points in a pixel point group do not span the static semantic tag beyond the distance threshold,
and a semantic plane corresponding to each pixel point group is formed by the RANSAC method from the three-dimensional coordinates of the pixel points in the pixel point group.
Further, in the dense map construction device based on semantic tags and sparse point clouds, the grid interval for sampling key frame pixel points is set to S, the pixel points in the key frame image are sampled, and the Euclidean distance disO between pixel points, the gray-level difference disG between pixel points, and the combined distance disC are expressed by the following formula:
[formula not reproduced in this text]
The combined distance disC is calculated by the above formula, where N_c represents the maximum color distance between pixel points and is set to a constant, N_s is equal to S and represents the maximum spatial distance between sampled pixel points within the grid interval, (x_i, y_i) and (x_j, y_j) represent the pixel coordinates, and [l_i, a_i, b_i] and [l_j, a_j, b_j] represent the colors of the pixel points in the CIELAB color space.
The invention has the following advantages:
The invention provides a dense map construction method based on semantic tags and sparse point clouds, which combines semantic tag segmentation features with point cloud registration and interpolation of the sparse point cloud map, generates new key frames based on a pixel point reselection strategy over semantic planes, optimizes the data of the local map, and constructs the dense map rapidly and accurately through semantic plane fitting and pixel point depth estimation.
Drawings
FIG. 1 is a schematic flow chart of the method of the invention.
FIG. 2 is a schematic diagram of the layout of the application of the method of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the invention and practice it.
The invention provides a dense map construction method based on semantic tags and sparse point clouds, comprising the following steps:
Step 1: collecting data: collecting key frames and sparse point cloud data, and defining semantic tags for the pixel points of the key frame images, wherein the semantic tags comprise static semantic tags and dynamic semantic tags;
Step 2: performing semantic segmentation and point cloud registration: segmenting each key frame into region blocks according to the semantic tags by using a semantic segmentation model, and assigning each point in the sparse point cloud data to the corresponding semantic tag according to the region blocks,
and registering and aligning the sparse point cloud data from different coordinate systems by using a registration algorithm to obtain globally consistent point cloud information;
Step 3: performing joint optimization of the local map based on the point cloud information: detecting and deleting redundant key frames to obtain the latest key frame, and filtering out the pixel points with dynamic tags in the latest key frame,
constructing semantic planes by using a plane fitting algorithm: calculating the distance between pixel points with the same static semantic tag, and aggregating the tracked pixel points in the marginalized key frame into a plurality of pixel point groups according to the distance and the corresponding static semantic tag, wherein the pixel points in each pixel point group have the same static semantic tag and are close to one another, so that they form the semantic plane of that pixel point group, and filtering out pixel points inconsistent with the distribution of the semantic plane;
Step 4: fusing the local maps, and performing loop closure detection and optimization;
Step 5: constructing semantic planes from the fused map by using the plane fitting algorithm, and completing the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags.
The method enriches the semantic information available for map construction: a map construction method based on semantic information can assign a semantic tag, such as road, building, tree, or pedestrian, to each map point, which enriches the meaning and expressive power of the map.
It enhances environmental awareness: semantic information enables the robot to better understand and perceive its surroundings. Through semantic tags, the robot can recognize different objects and scenes and can therefore perform tasks such as localization, obstacle detection, and target tracking more accurately.
It further improves the efficiency of navigation and path planning: the robot can plan paths and navigate according to the semantic attributes of the target, and can reach the destination faster. For example, it can select a path suitable for pedestrians rather than a vehicle road, or pass through a pedestrian area while avoiding a vehicle area.
It optimizes decision-making and interaction capabilities: a map based on semantic information can provide the robot with more context and semantic information, helping it to make more accurate decisions. For example, the robot can distinguish between different types of obstacles based on semantic tags and select an appropriate detour strategy.
It provides better human-machine interaction and interpretability: semantic information supports intelligent decision-making by the robot and offers richer expressive power for human-machine interaction. Through the semantic map, people can understand the robot's environment perception and decision process more intuitively.
In specific applications, in some embodiments of the method of the present invention, when constructing a dense map based on semantic tags and sparse point clouds, the following procedure may be referred to:
step 1: collecting data: and acquiring key frames and sparse point cloud data by using sensors such as a laser radar and the like, and defining semantic tags of pixels of the key frame picture, wherein the semantic tags comprise static semantic tags and dynamic semantic tags. After data acquisition, an initialization process may be performed.
Further, the static semantic tags include road, sidewalk, building, wall, railing, pole, traffic signal light, traffic sign, vegetation, and terrain,
and the dynamic semantic tags include people, automobiles, bicycles, motorcycles, dogs, and cats. The invention uses the static semantic tags to reduce the computing resources consumed by the semantic segmentation network.
Step 2: performing semantic segmentation and point cloud registration: each key frame is segmented into region blocks according to the semantic tags by using a semantic segmentation model, and each point in the sparse point cloud data is assigned to the corresponding semantic tag according to the region blocks;
the sparse point cloud data from different coordinate systems are then registered and aligned by using a registration algorithm to obtain globally consistent point cloud information. An existing semantic segmentation model may be used to segment the key frame into region blocks and to assign the sparse points to semantic tags.
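The assignment of sparse points to semantic tags can be illustrated with a minimal sketch. It assumes the points are already expressed in the camera frame of the key frame, a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, and a per-pixel label image produced by some segmentation model. The function name, the label strings, and the camera model are illustrative assumptions and are not specified in the source.

```python
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]


def label_sparse_points(
    points_cam: Sequence[Point3D],
    label_image: Sequence[Sequence[str]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Optional[str]]:
    """Project camera-frame 3D points into the key frame and read their labels.

    Assumed pinhole model: u = fx * x / z + cx, v = fy * y / z + cy.
    Points behind the camera or outside the image get None.
    """
    height = len(label_image)
    width = len(label_image[0])
    labels: List[Optional[str]] = []
    for x, y, z in points_cam:
        if z <= 0.0:
            labels.append(None)  # behind the camera, no valid projection
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        inside = 0 <= u < width and 0 <= v < height
        labels.append(label_image[v][u] if inside else None)
    return labels
```

Points that receive `None` would simply stay unlabeled in this sketch; how the patented method treats such points is not stated in the text.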
The point clouds acquired at different positions in the sparse point cloud data are aligned by a registration algorithm such as Iterative Closest Point (ICP), so as to obtain globally consistent point cloud information in a single coordinate system.
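As a rough illustration of ICP registration, the sketch below alternates nearest-neighbor matching with a closed-form rigid alignment. To stay dependency-free it works on 2D points and uses a brute-force neighbor search; both are simplifying assumptions, since the method itself operates on 3D point clouds. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def best_rigid_transform(src: Sequence[Point2D], dst: Sequence[Point2D]):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation moves the rotated source centroid onto the target centroid.
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def apply_transform(points: Sequence[Point2D], theta: float, tx: float, ty: float) -> List[Point2D]:
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source: Sequence[Point2D], target: Sequence[Point2D], iterations: int = 20) -> List[Point2D]:
    """Point-to-point ICP: match each point to its nearest target, then realign."""
    current = list(source)
    for _ in range(iterations):
        matched = [
            min(target, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in current
        ]
        theta, tx, ty = best_rigid_transform(current, matched)
        current = apply_transform(current, theta, tx, ty)
    return current
```

This basic form only converges reliably when the two clouds start out roughly aligned. A practical system would normally rely on a tested library implementation with a spatial index for the neighbor search.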
Step 3: performing joint optimization of the local map based on the point cloud information: redundant key frames are detected and deleted. Detecting and deleting redundant key frames comprises: setting a redundancy amount, and deleting a key frame if more pixel points in that key frame than the redundancy amount are observed by at least 3 key frames. The redundancy amount is typically set to 90% of the pixel points. The key frames can be obtained by bundle adjustment; the spanning tree and the bag-of-words (BoW) model of the key frames are computed, and the different positions of the same pixel point in two key frames are tracked.
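The redundancy rule can be sketched as follows. The sketch assumes each key frame is described by the set of point ids it observes, and that "observed by at least 3 key frames" means at least 3 other key frames that are still kept. That reading, the data layout, and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Set


def cull_redundant_keyframes(
    observations: Dict[int, Set[int]],
    redundancy_ratio: float = 0.9,
    min_observers: int = 3,
) -> List[int]:
    """Return ids of key frames judged redundant, in the order they are removed.

    observations maps a key frame id to the ids of the points it observes.
    A key frame is removed when more than redundancy_ratio of its points are
    also observed by at least min_observers other key frames still kept.
    """
    kept = dict(observations)
    removed: List[int] = []
    for kf_id in sorted(observations):
        points = kept[kf_id]
        if not points:
            continue
        shared = 0
        for pt in points:
            observers = sum(
                1
                for other_id, other_pts in kept.items()
                if other_id != kf_id and pt in other_pts
            )
            if observers >= min_observers:
                shared += 1
        if shared > redundancy_ratio * len(points):
            removed.append(kf_id)
            del kept[kf_id]  # later checks only count surviving key frames
    return removed
```

Removing frames one at a time and recounting against the survivors keeps the check from deleting every frame in a group of mutually redundant key frames.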
The latest key frame is then obtained, and the pixel points with dynamic tags in it are filtered out. To this end, motion-consistent pixel points are detected: the tracked pixel points are divided into a plurality of pixel point groups using the semantic tags of the pixel points and their positions in the image; the pixel points with dynamic tags are filtered out based on the pixel point groups and the epipolar geometry constraint; and in the latest key frame, the pixel points in the regions corresponding to dynamic tags are marked as untracked, which reduces the influence of targets with dynamic semantic tags.
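A simplified per-point version of this filtering is sketched below. It drops tracks that carry a dynamic tag and tracks whose match lies too far from the epipolar line given by a fundamental matrix `F`. The label strings, the pixel threshold, and the per-point check (the text describes a per-group check) are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float]
Track = Tuple[str, Pixel, Pixel]  # (semantic label, pixel in frame 1, pixel in frame 2)

# Hypothetical label strings for the dynamic classes listed in the text.
DYNAMIC_LABELS = {"person", "car", "bicycle", "motorcycle", "dog", "cat"}


def epipolar_distance(F: Sequence[Sequence[float]], p1: Pixel, p2: Pixel) -> float:
    """Distance in pixels from p2 to the epipolar line F * [x1, y1, 1]^T."""
    x1, y1 = p1
    x2, y2 = p2
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line, treat as inconsistent
    return abs(a * x2 + b * y2 + c) / norm


def keep_static_consistent(
    tracks: Sequence[Track], F: Sequence[Sequence[float]], max_dist: float = 1.0
) -> List[Track]:
    """Keep tracks that have a static label and satisfy the epipolar constraint."""
    kept: List[Track] = []
    for label, p1, p2 in tracks:
        if label in DYNAMIC_LABELS:
            continue  # dynamic semantic tag: always filtered
        if epipolar_distance(F, p1, p2) > max_dist:
            continue  # moves inconsistently with the camera motion
        kept.append((label, p1, p2))
    return kept
```

The second check also catches objects that carry a static tag but are actually moving, which the tag test alone would miss.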
Semantic planes are constructed by using a plane fitting algorithm: the distance between pixel points with the same static semantic tag is calculated, and the tracked pixel points in the marginalized key frame are aggregated into a plurality of pixel point groups according to the distance and the corresponding static semantic tag. The pixel points in each pixel point group have the same static semantic tag and are close to one another, so that they form the semantic plane of that pixel point group; pixel points inconsistent with the distribution of the semantic plane are filtered out.
Further, in step 3, the plane fitting algorithm calculates a combined distance from the Euclidean distance between pixel points in the key frame and the gray-level difference between the pixel points, and this combined distance is used as the distance between pixel points with the same static semantic tag,
the tracked pixel points with the same static semantic tag in the marginalized key frame are aggregated into point pairs according to the combined distance, the point pairs are aggregated into pixel point groups according to the Euclidean distance between the pixel points within each point pair and the average gray-level difference of the point pairs, and a distance threshold is set so that the pixel points in a pixel point group do not span the static semantic tag beyond the distance threshold,
and a semantic plane corresponding to each pixel point group is formed by the RANSAC method from the three-dimensional coordinates of the pixel points in the pixel point group. When the plane fitting algorithm is executed, the grid interval for sampling key frame pixel points is first set to S, the pixel points in the key frame image are sampled, and the Euclidean distance disO between pixel points, the gray-level difference disG between pixel points, and the combined distance disC are expressed by the following formula:
[formula not reproduced in this text]
The combined distance disC is calculated by the above formula, where N_c represents the maximum color distance between pixel points and is set to a constant, N_s is equal to S and represents the maximum spatial distance between sampled pixel points within the grid interval, (x_i, y_i) and (x_j, y_j) represent the pixel coordinates, and [l_i, a_i, b_i] and [l_j, a_j, b_j] represent the colors of the pixel points in the CIELAB color space. The invention uses the gray-level difference between pixel points to distinguish points at different positions on the same object, such as the wall and the top surface of the same house. First, the tracked pixel points with the same semantic tag in the marginalized frame are aggregated into point pairs using the combined distance disC; then, using the Euclidean distance disO between the points within each point pair and the average gray-level difference disG of the point pairs, the nearest and next-nearest point pairs are aggregated into a pixel point group. A distance threshold disthp is then set to prevent a generated pixel point group from spanning the same static semantic tag beyond that threshold. After duplicate points are removed, each pixel point group can contain at least four points that are not on the same line; the points in a pixel point group belong to the same semantic tag and are close to one another, and the semantic plane corresponding to the pixel point group, such as a road plane or a building plane, is estimated by the RANSAC method from the three-dimensional coordinates of these points. Each semantic plane generated in this way has a corresponding identifier carrying position, gray-level, and semantic tag information.
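The two computations in this step can be sketched together. `combined_distance` assumes a SLIC-style normalized form, sqrt((disG/N_c)^2 + (disO/N_s)^2), which is inferred from the symbol definitions rather than taken from the formula image. `ransac_plane` is a textbook RANSAC plane fit. The inlier threshold, iteration count, and function names are illustrative choices.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # unit normal (a, b, c) and offset d


def combined_distance(p_i: Sequence[float], p_j: Sequence[float], n_c: float, n_s: float) -> float:
    """Assumed SLIC-style distance; each pixel is (x, y, l, a, b)."""
    dis_o = math.hypot(p_j[0] - p_i[0], p_j[1] - p_i[1])  # spatial distance
    dis_g = math.sqrt(sum((cj - ci) ** 2 for ci, cj in zip(p_i[2:], p_j[2:])))  # CIELAB distance
    return math.sqrt((dis_g / n_c) ** 2 + (dis_o / n_s) ** 2)


def plane_from_points(p: Point3D, q: Point3D, r: Point3D) -> Optional[Plane]:
    """Plane through three points, or None if they are (nearly) collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm < 1e-12:
        return None
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    return nx, ny, nz, -(nx * p[0] + ny * p[1] + nz * p[2])


def ransac_plane(
    points: Sequence[Point3D], threshold: float = 0.05, iterations: int = 200, seed: int = 0
) -> Tuple[Optional[Plane], List[Point3D]]:
    """Return the plane supported by the most inliers and those inliers."""
    rng = random.Random(seed)
    best_plane: Optional[Plane] = None
    best_inliers: List[Point3D] = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(list(points), 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c * p[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

Points of a group that do not end up in the inlier set are natural candidates for the pixel points "inconsistent with the distribution of the semantic plane" that the method filters out.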
Step 4: fusing the local maps, and performing loop closure detection and optimization. A plurality of dense point cloud maps are fused through loop closure detection, loop closure correction, and global pose optimization, and conflicts in the overlapping areas are eliminated to obtain a more complete map representation. The fused map is optimized with a graph optimization algorithm to eliminate noise and estimation errors; such algorithms include g2o, iSAM, HOG-Man, SPA2d, and the like. In the optimization graph, the vertices are generally defined as camera poses and the edges as the transformation relations between camera poses, which generally serve as the error terms. After the optimization graph is constructed, initial values are selected for iteration: the Jacobian matrix and the Hessian matrix corresponding to the current estimate are calculated, the sparse linear equation H_k Δx = -b_k is solved to obtain the gradient direction, and the iteration continues using the Gauss-Newton or Levenberg-Marquardt (L-M) method. When the iteration finishes, the optimized values are returned.
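The iteration described above can be illustrated on a deliberately tiny pose graph. The sketch assumes scalar (1D) poses, edges that measure the offset between two poses, and a dense Gaussian-elimination solver in place of the sparse solvers used by libraries such as g2o. All names are hypothetical, and real camera poses would live on SE(3).

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, float, float]  # (i, j, measured x_j - x_i, weight)


def solve_linear(A: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def optimize_pose_graph(poses: Sequence[float], edges: Sequence[Edge], iterations: int = 5) -> List[float]:
    """Gauss-Newton on a 1D pose graph: build H and b, solve H dx = -b, update."""
    x = list(poses)
    n = len(x)
    for _ in range(iterations):
        H = [[0.0] * n for _ in range(n)]
        b = [0.0] * n
        for i, j, z, w in edges:
            e = (x[j] - x[i]) - z  # error term of this edge
            # Jacobian of e is -1 with respect to x_i and +1 with respect to x_j.
            H[i][i] += w
            H[j][j] += w
            H[i][j] -= w
            H[j][i] -= w
            b[i] -= w * e
            b[j] += w * e
        H[0][0] += 1e6  # anchor the first pose so that H is invertible
        dx = solve_linear(H, [-v for v in b])
        x = [xi + di for xi, di in zip(x, dx)]
        if max(abs(d) for d in dx) < 1e-9:
            break
    return x
```

Because the error terms here are linear in the poses, the first Gauss-Newton step already reaches the optimum. With rotations in the state the problem becomes nonlinear, and repeated relinearization, possibly with L-M damping, becomes necessary.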
Step 5: constructing semantic planes from the fused map by using the plane fitting algorithm, and completing the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags. The dense map is represented as a three-dimensional grid. The tracked pixel points in the marginalized frame are divided into a plurality of pixel point groups using their positions in the image and the corresponding semantic tags; the points in each pixel point group have the same static semantic tag and are close to one another, so they can be regarded as lying on the same semantic plane. A semantic plane is fitted from the three-dimensional coordinates of the pixel points; point positions and static semantic tags are combined through interpolation algorithms such as nearest-neighbor interpolation and Gaussian process interpolation; and a dense semantic map of the urban road environment is recovered from the semantic planes, which achieves the goal of completing the dense map from the sparse point cloud. The method can be used in applications such as navigation and path planning, where the semantic information in the map is used to optimize the navigation and decision-making capabilities of the robot.
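One way to picture plane-based densification is to intersect the viewing ray of every pixel in a semantic region with the fitted plane, which yields a depth for pixels that have no sparse point. The sketch below assumes a pinhole camera and a plane `(a, b, c, d)` expressed in the camera frame. It stands in for the interpolation step and is not the specific interpolation algorithm of the patent.

```python
from typing import Iterable, List, Tuple

Plane = Tuple[float, float, float, float]  # a*x + b*y + c*z + d = 0 in the camera frame


def densify_region(
    pixels: Iterable[Tuple[float, float]],
    plane: Plane,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Tuple[float, float, float]]:
    """Back-project pixels of one semantic region onto its fitted plane.

    Each pixel (u, v) defines the ray t * ((u - cx) / fx, (v - cy) / fy, 1);
    t is the depth at which the ray meets the plane. Rays parallel to the
    plane or meeting it behind the camera are skipped.
    """
    a, b, c, d = plane
    dense: List[Tuple[float, float, float]] = []
    for u, v in pixels:
        rx = (u - cx) / fx
        ry = (v - cy) / fy
        denom = a * rx + b * ry + c
        if abs(denom) < 1e-9:
            continue  # ray parallel to the plane
        t = -d / denom
        if t <= 0.0:
            continue  # intersection behind the camera
        dense.append((rx * t, ry * t, t))
    return dense
```

For a road plane this fills in every road-tagged pixel below the horizon, while pixels whose rays never reach the plane in front of the camera are skipped.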
The invention also provides a dense map construction device based on semantic tags and sparse point clouds, which comprises an acquisition module, a semantic segmentation and point cloud registration module, a joint optimization processing module, a fusion and detection module, and an updating module, wherein:
the acquisition module acquires data: it collects key frames and sparse point cloud data, and defines semantic tags for the pixel points of the key frame pictures, the semantic tags comprising static semantic tags and dynamic semantic tags;
the semantic segmentation and point cloud registration module performs semantic segmentation and point cloud registration: it divides each key frame into area blocks according to the semantic tags by using a semantic segmentation model, maps each point in the sparse point cloud data to the corresponding semantic tag according to the area blocks,
and registers and aligns the sparse point cloud data in different coordinate systems by using a registration algorithm to obtain globally consistent point cloud information;
the joint optimization processing module performs joint optimization of the local map based on the point cloud information: it detects and deletes redundant key frames to obtain the latest key frames, filters out the pixel points with dynamic semantic tags in the latest key frames,
and constructs semantic planes by using a plane fitting algorithm: it calculates the distance between pixel points with the same static semantic tag, and aggregates the tracked pixel points in the marginalized key frame into a plurality of pixel point groups according to the distance and the corresponding static semantic tags, wherein the pixel points in each pixel point group have the same static semantic tag and are close to one another so as to form the semantic plane of that pixel point group, and it filters out pixel points inconsistent with the distribution of the semantic plane;
the fusion and detection module fuses the local map and performs closed-loop detection and optimization;
and the updating module constructs semantic planes from the fused map by using a plane fitting algorithm, and completes the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags.
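The plane fitting and filtering performed by the joint optimization processing module can be sketched as follows, assuming the RANSAC method named in the claims. This is an illustrative sketch only: the function name `ransac_plane`, the inlier threshold, the iteration count, and the fixed random seed are assumptions, not values given in the patent. Points outside the returned inlier mask correspond to the pixel points "inconsistent with the distribution of the semantic plane" that are filtered out.

```python
import numpy as np

def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Fit a plane n . p + d = 0 to an (N, 3) array with RANSAC.

    Returns (unit_normal, d, inlier_mask). threshold is the maximum
    point-to-plane distance for a point to count as an inlier.
    """
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iterations):
        # Three distinct points define a candidate plane.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # collinear sample, no unique plane
            continue
        normal = normal / norm
        d = -normal.dot(sample[0])
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask, best_model = mask, (normal, d)
    if best_model is None:
        raise ValueError("no non-degenerate sample found")
    return best_model[0], best_model[1], best_mask
```

In use, `points[inlier_mask]` would be kept as the pixel point group's plane members and the remaining points discarded before the local map is fused.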
The information exchanged between the modules of the device and the processes they execute are based on the same concept as the method embodiment of the present invention; for details, refer to the description of the method embodiment, which is not repeated here.
Likewise, the device of the invention combines semantic tag segmentation features, point cloud registration, and interpolation over the sparse point cloud map: it generates new key frames using a pixel point reselection strategy based on the semantic plane, optimizes the data of the local map, and constructs the dense map quickly and accurately through semantic plane fitting and pixel point depth estimation.
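One part of the local map optimization, the redundant key frame rule stated in the claims (a key frame is deleted when more than a set redundancy quantity of its pixel points are observed by at least 3 key frames), can be sketched as follows. The sketch assumes that "at least 3 key frames" means at least 3 other key frames, and that deleted key frames no longer count as observers; both are interpretations, and the names `is_redundant` and `cull_keyframes` and the `observations` data layout are hypothetical.

```python
def is_redundant(keyframe_id, observations, redundancy_count, min_observers=3):
    """Check one key frame against the redundancy rule.

    observations maps point_id -> set of key frame ids observing that point.
    """
    well_observed = 0
    for observers in observations.values():
        if keyframe_id not in observers:
            continue
        # Assumption: the key frame itself is not counted as an observer.
        if len(observers - {keyframe_id}) >= min_observers:
            well_observed += 1
    return well_observed > redundancy_count

def cull_keyframes(keyframe_ids, observations, redundancy_count):
    """Return the key frames kept after deleting redundant ones in order."""
    obs = {p: set(o) for p, o in observations.items()}  # work on a copy
    kept = []
    for kf in keyframe_ids:
        if is_redundant(kf, obs, redundancy_count):
            for observers in obs.values():
                observers.discard(kf)  # a deleted key frame stops observing
        else:
            kept.append(kf)
    return kept
```

Removing a deleted key frame from the observer sets keeps the rule from deleting every key frame in a densely overlapping window.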
It should be noted that not all steps and modules in the above processes and device structures are necessary; some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure: some modules may be implemented by the same physical entity, some modules may be implemented by multiple physical entities, and some may be implemented jointly by components in multiple independent devices.
The above-described embodiments are merely preferred embodiments given to fully explain the present invention, and the scope of protection of the present invention is not limited to them. Equivalent substitutions and modifications made by those skilled in the art on the basis of the present invention fall within the scope of protection of the present invention. The scope of protection of the invention is defined by the claims.

Claims (10)

1. A dense map construction method based on semantic tags and sparse point clouds, characterized by comprising the following steps:
step 1: collecting data: collecting key frames and sparse point cloud data, and defining semantic tags for the pixel points of the key frame pictures, wherein the semantic tags comprise static semantic tags and dynamic semantic tags;
step 2: performing semantic segmentation and point cloud registration: dividing each key frame into area blocks according to the semantic tags by using a semantic segmentation model, and mapping each point in the sparse point cloud data to the corresponding semantic tag according to the area blocks, thereby establishing the correspondence between the sparse point cloud data and the pixel points,
registering and aligning the sparse point cloud data in different coordinate systems by using a registration algorithm to obtain globally consistent point cloud information;
step 3: performing joint optimization of the local map based on the point cloud information: detecting and deleting redundant key frames to obtain the latest key frames, and filtering out the pixel points with dynamic semantic tags in the latest key frames,
constructing semantic planes by using a plane fitting algorithm: calculating the distance between pixel points with the same static semantic tag, and aggregating the tracked pixel points in the marginalized key frame into a plurality of pixel point groups according to the distance and the corresponding static semantic tags, wherein the pixel points in each pixel point group have the same static semantic tag and are close to one another so as to form the semantic plane of that pixel point group, and filtering out pixel points inconsistent with the distribution of the semantic plane;
step 4: fusing the local map, and performing closed-loop detection and optimization;
step 5: constructing semantic planes from the fused map by using a plane fitting algorithm, and completing the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags.
2. The dense map construction method based on semantic tags and sparse point clouds according to claim 1, wherein the static semantic tags in step 1 include road (Road), sidewalk (Sidewalk), building, wall, railing, post, traffic signal light, traffic sign (Traffic Sign), green plant (Vegetation), and topography (Terrain),
and the dynamic semantic tags include people, automobiles, bicycles, motorcycles, dogs, and cats.
3. The dense map construction method based on semantic tags and sparse point clouds according to claim 1, wherein detecting and deleting redundant key frames in step 3 comprises: setting a redundancy quantity, and deleting a key frame if more than the redundancy quantity of the pixel points in that key frame are observed by at least 3 key frames.
4. The dense map construction method based on semantic tags and sparse point clouds according to claim 1, wherein in step 3 the plane fitting algorithm calculates a combined distance from the Euclidean distance between pixel points in a key frame and the gray level difference between the pixel points, the combined distance representing the distance between pixel points with the same static semantic tag,
the tracked pixel points with the same static semantic tag in the marginalized key frame are aggregated into point pairs according to the combined distance, the point pairs are aggregated into pixel point groups according to the Euclidean distance between the pixel points in the point pairs and the average gray level difference of the point pairs, and a distance threshold is set to prevent the pixel points in a pixel point group from crossing static semantic tags beyond the distance threshold,
and a semantic plane corresponding to each pixel point group is formed from the three-dimensional coordinates of the pixel points in the pixel point group by using the RANSAC method.
5. The dense map construction method based on semantic tags and sparse point clouds as claimed in claim 4, wherein in step 3 the grid interval for collecting key frame pixel points is set to S, pixel points are collected in the key frame picture, and the Euclidean distance disO between pixel points, the gray level difference disG between pixel points, and the combined distance disC are expressed by the following formula:
the combined distance disC is calculated by the above formula, where N_c represents the maximum color distance between pixel points; a constant is set so that N_s is equal to S, representing the maximum spatial distance between the collected pixel points within the grid interval; (x_i, y_i) and (x_j, y_j) respectively represent the pixel coordinates; and [l_i, a_i, b_i] and [l_j, a_j, b_j] respectively represent the colors of the pixel points in the CIELAB color space.
6. A dense map construction device based on semantic tags and sparse point clouds, characterized by comprising an acquisition module, a semantic segmentation and point cloud registration module, a joint optimization processing module, a fusion and detection module, and an updating module, wherein:
the acquisition module acquires data: collecting key frames and sparse point cloud data, and defining semantic tags for the pixel points of the key frame pictures, wherein the semantic tags comprise static semantic tags and dynamic semantic tags;
the semantic segmentation and point cloud registration module performs semantic segmentation and point cloud registration: dividing each key frame into area blocks according to the semantic tags by using a semantic segmentation model, and mapping each point in the sparse point cloud data to the corresponding semantic tag according to the area blocks, thereby establishing the correspondence between the sparse point cloud data and the pixel points,
registering and aligning the sparse point cloud data in different coordinate systems by using a registration algorithm to obtain globally consistent point cloud information;
the joint optimization processing module performs joint optimization of the local map based on the point cloud information: detecting and deleting redundant key frames to obtain the latest key frames, and filtering out the pixel points with dynamic semantic tags in the latest key frames,
constructing semantic planes by using a plane fitting algorithm: calculating the distance between pixel points with the same static semantic tag, and aggregating the tracked pixel points in the marginalized key frame into a plurality of pixel point groups according to the distance and the corresponding static semantic tags, wherein the pixel points in each pixel point group have the same static semantic tag and are close to one another so as to form the semantic plane of that pixel point group, and filtering out pixel points inconsistent with the distribution of the semantic plane;
the fusion and detection module fuses the local map and performs closed-loop detection and optimization;
and the updating module constructs semantic planes from the fused map by using a plane fitting algorithm, and completes the dense map through an interpolation algorithm based on the semantic planes, in combination with the pixel point groups and the static semantic tags.
7. The dense map construction apparatus based on semantic tags and sparse point clouds according to claim 6, wherein the static semantic tags defined by the acquisition module include road (Road), sidewalk (Sidewalk), building, wall, railing, post, traffic signal light, traffic sign (Traffic Sign), green plant (Vegetation), and topography (Terrain),
and the dynamic semantic tags include people, automobiles, bicycles, motorcycles, dogs, and cats.
8. The dense map construction apparatus based on semantic tags and sparse point clouds according to claim 6, wherein the joint optimization processing module detects and deletes redundant key frames by: setting a redundancy quantity, and deleting a key frame if more than the redundancy quantity of the pixel points in that key frame are observed by at least 3 key frames.
9. The dense map construction apparatus based on semantic tags and sparse point clouds as claimed in claim 6, wherein the joint optimization processing module uses the plane fitting algorithm to calculate a combined distance from the Euclidean distance between pixel points in a key frame and the gray level difference between the pixel points, the combined distance representing the distance between pixel points with the same static semantic tag,
the tracked pixel points with the same static semantic tag in the marginalized key frame are aggregated into point pairs according to the combined distance, the point pairs are aggregated into pixel point groups according to the Euclidean distance between the pixel points in the point pairs and the average gray level difference of the point pairs, and a distance threshold is set to prevent the pixel points in a pixel point group from crossing static semantic tags beyond the distance threshold,
and a semantic plane corresponding to each pixel point group is formed from the three-dimensional coordinates of the pixel points in the pixel point group by using the RANSAC method.
10. The dense map construction apparatus based on semantic tags and sparse point clouds as claimed in claim 9, wherein in the joint optimization processing module the grid interval for collecting key frame pixel points is set to S, pixel points are collected in the key frame picture, and the Euclidean distance disO between pixel points, the gray level difference disG between pixel points, and the combined distance disC are expressed by the following formula:
the combined distance disC is calculated by the above formula, where N_c represents the maximum color distance between pixel points; a constant is set so that N_s is equal to S, representing the maximum spatial distance between the collected pixel points within the grid interval; (x_i, y_i) and (x_j, y_j) respectively represent the pixel coordinates; and [l_i, a_i, b_i] and [l_j, a_j, b_j] respectively represent the colors of the pixel points in the CIELAB color space.
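The formula referred to in claims 5 and 10 does not survive in this text. The surrounding definitions (a CIELAB color distance normalized by a maximum color distance, and a pixel distance normalized by the grid interval S) resemble the distance measure used in SLIC superpixel clustering, so the sketch below assumes that form: disC = sqrt((disG / N_c)^2 + (disO / N_s)^2), with disO the Euclidean distance between pixel coordinates and disG the Euclidean distance between CIELAB colors. This is an assumed reconstruction, not the formula from the patent; the function name `combined_distance` and its argument layout are hypothetical.

```python
import math

def combined_distance(pixel_i, pixel_j, n_c, n_s):
    """Return (disO, disG, disC) for two pixels under the assumed SLIC-style form.

    Each pixel is (x, y, (l, a, b)), with the color given in CIELAB.
    n_c is the maximum color distance and n_s the maximum spatial
    distance (equal to the grid interval S in the claims).
    """
    (xi, yi, lab_i), (xj, yj, lab_j) = pixel_i, pixel_j
    dis_o = math.hypot(xi - xj, yi - yj)   # spatial distance in pixels
    dis_g = math.dist(lab_i, lab_j)        # color distance in CIELAB
    # Assumed combination: each term normalized by its maximum.
    dis_c = math.sqrt((dis_g / n_c) ** 2 + (dis_o / n_s) ** 2)
    return dis_o, dis_g, dis_c
```

Under this assumption, a larger N_c gives spatial proximity more weight relative to color similarity when pixel points are aggregated into point pairs.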
CN202311064044.6A 2023-08-23 2023-08-23 Dense map construction method based on semantic tags and sparse point cloud Pending CN117252999A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311064044.6A CN117252999A (en) 2023-08-23 2023-08-23 Dense map construction method based on semantic tags and sparse point cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311064044.6A CN117252999A (en) 2023-08-23 2023-08-23 Dense map construction method based on semantic tags and sparse point cloud

Publications (1)

Publication Number Publication Date
CN117252999A true CN117252999A (en) 2023-12-19

Family

ID=89134037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311064044.6A Pending CN117252999A (en) 2023-08-23 2023-08-23 Dense map construction method based on semantic tags and sparse point cloud

Country Status (1)

Country Link
CN (1) CN117252999A (en)

Similar Documents

Publication Publication Date Title
CN111928862B (en) Method for on-line construction of semantic map by fusion of laser radar and visual sensor
Zai et al. 3-D road boundary extraction from mobile laser scanning data via supervoxels and graph cuts
CN110084272B (en) Cluster map creation method and repositioning method based on cluster map and position descriptor matching
CN108802785B (en) Vehicle self-positioning method based on high-precision vector map and monocular vision sensor
US10437252B1 (en) High-precision multi-layer visual and semantic map for autonomous driving
CN114842438B (en) Terrain detection method, system and readable storage medium for automatic driving automobile
US10794710B1 (en) High-precision multi-layer visual and semantic map by autonomous units
Liang et al. Video stabilization for a camcorder mounted on a moving vehicle
CN110648389A (en) 3D reconstruction method and system for city street view based on cooperation of unmanned aerial vehicle and edge vehicle
CN111652179A (en) Semantic high-precision map construction and positioning method based on dotted line feature fusion laser
CN114413881B (en) Construction method, device and storage medium of high-precision vector map
CN111006655A (en) Multi-scene autonomous navigation positioning method for airport inspection robot
Zou et al. Real-time full-stack traffic scene perception for autonomous driving with roadside cameras
CN114509065B (en) Map construction method, system, vehicle terminal, server and storage medium
CN113126115A (en) Semantic SLAM method and device based on point cloud, electronic equipment and storage medium
Jang et al. Road lane semantic segmentation for high definition map
CN114815810A (en) Unmanned aerial vehicle-cooperated overwater cleaning robot path planning method and equipment
CN114325634A (en) Method for extracting passable area in high-robustness field environment based on laser radar
CN114120283A (en) Method for distinguishing unknown obstacles in road scene three-dimensional semantic segmentation
Muresan et al. Real-time object detection using a sparse 4-layer LIDAR
CN112257668A (en) Main and auxiliary road judging method and device, electronic equipment and storage medium
CN116597122A (en) Data labeling method, device, electronic equipment and storage medium
CN117593685B (en) Method and device for constructing true value data and storage medium
CN113671522B (en) Dynamic environment laser SLAM method based on semantic constraint
Cheng et al. Semantic segmentation of road profiles for efficient sensing in autonomous driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination