CN116258826A - Semantic map construction and boundary real-time extraction method for open-pit mining area

Semantic map construction and boundary real-time extraction method for open-pit mining area

Info

Publication number
CN116258826A
CN116258826A (application CN202310013284.7A)
Authority
CN
China
Prior art keywords
point
semantic
map
frame
boundary
Prior art date
Legal status
Pending
Application number
CN202310013284.7A
Other languages
Chinese (zh)
Inventor
段星集
郭旭东
李华志
李德志
李晓根
Current Assignee
Beijing Tage Idriver Technology Co Ltd
Original Assignee
Beijing Tage Idriver Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Tage Idriver Technology Co Ltd
Priority to CN202310013284.7A
Publication of CN116258826A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/34 Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems


Abstract

The invention relates to a semantic map construction and boundary real-time extraction method for open-pit mining areas, belonging to the technical field of automatic driving. It aims to solve the problems in existing automatic driving technology that driving-area boundary extraction is inaccurate and slow, and that the driving-area boundary cannot be extracted while the image is semantically segmented. The method rasterizes the point cloud and extracts features from the cylindrical characteristics of the rasterized points; a semantic RGB map is obtained through downsampling and upsampling with added RGB branches. During map construction, the boundary extraction architecture based on the semantic RGB map analyzes the points and grid features within the grids divided by the semantic segmentation network, extracts the boundary from these features, and fits a smooth boundary with control points, thereby improving both the precision and the speed of unstructured road boundary extraction.

Description

Semantic map construction and boundary real-time extraction method for open-pit mining area
Technical Field
The invention belongs to the technical field of automatic driving and relates to a semantic map construction and boundary real-time extraction method for open-pit mining areas.
Background
In recent years, the rapid development of artificial intelligence and new-generation information technology has driven the key technologies of automatic driving toward maturity. Open-pit mines, with their closed roads and relatively simple environments, are among the scenarios where automatic driving can be deployed quickly.
The key technologies of automatic driving can be divided into environment sensing, planning and decision-making, and tracking control. For perception, the road boundary can narrow the effective ROI area; therefore, automatic and accurate extraction of the road boundary plays a significant role in the rapid deployment of automatic driving.
Chinese patent publication No. CN108320293A, entitled "Rapid point cloud boundary extraction method combining improved particle swarm optimization", discloses a method that selects all boundary feature points of an optimal model through a particle swarm algorithm based on the uniformity of point distribution in a K-neighborhood. However, the method searches for the boundary through a particle swarm algorithm that continuously seeks optimal boundary feature points by traversal, which is time-consuming and easily falls into local optima; in addition, an inertia weight must be set manually to keep the algorithm out of local optima, which reduces the degree of automation of the boundary extraction.
Chinese patent publication No. CN114120260A discloses a drivable-region identification method and its system, computer device and storage medium. The method mainly performs semantic segmentation for drivable-area identification but neither extracts nor optimizes the road boundary; a drivable area obtained from a two-dimensional image alone is unstable and prone to segmentation errors.
Disclosure of Invention
In view of the above analysis, the invention provides a semantic map construction and boundary real-time extraction method for open-pit mining areas, which solves the problems in existing automatic driving technology that driving-area boundary extraction is inaccurate and slow, and that the driving-area boundary cannot be extracted while the image is semantically segmented.
The invention discloses a semantic map construction and boundary real-time extraction method for open-pit mining areas, which comprises the following specific steps:
step 1, constructing a semantic segmentation model:
step 11, obtaining pictures and point cloud data at the same moment;
step 12, cylindrically rasterizing the point cloud data to obtain point cloud grid units and the cylindrical characteristics of the points, and obtaining tensors of the characteristics of each point cloud grid unit based on the cylindrical characteristics of the points and an MLP (multi-layer perceptron);
step 13, downsampling the points of the point cloud grid unit:
step 131, inputting tensors of the characteristics of each point cloud grid unit into an asymmetric residual block; outputting tensors of the sparse convolution characteristics;
step 132, projecting RGB pixel point information of the picture onto point cloud data in each point cloud grid unit to obtain RGB point cloud grid units; performing multi-mode focus convolution downsampling based on the RGB point cloud grid unit:
step 14, up-sampling the points of the point cloud grid unit:
step 141, performing multi-mode focus convolution up-sampling based on the RGB point cloud grid unit:
step 142, up-sampling the obtained up-sampling output grid characteristics to obtain tensors of down-sampling and up-sampling superposition characteristics of each point;
step 15, refining tensors of down-sampling and up-sampling point superposition characteristics of each point by using the MLP point characteristics of each point obtained in the step 12; acquiring the category probability of each refined point by using a loss function, and taking the highest category probability of each point as the point label of the point;
step 2, constructing a real-time semantic map:
acquiring real-time GPS-IMU data and real-time radar point cloud data of the unmanned vehicle;
using the semantic segmentation model constructed in the step 1 to label each point in each frame of the obtained real-time point cloud data, forming key-frame semantic point clouds;
superposing all the key frame semantic point clouds to obtain a semantic map, and generating an optimized single-frame semantic RGB map;
step 3, extracting boundaries in real time:
based on the semantic segmentation model in the step 1, a grid map is constructed when an optimized single-frame semantic RGB map is generated, the boundary of the grid map is extracted, and a control point is adopted to fit a smooth boundary to obtain a final boundary.
Optionally, the method further comprises a step of filtering dynamic obstacles before extracting the boundary in real time, and the specific steps are as follows:
judging the label of each point in the optimized single-frame semantic RGB map, and eliminating non-ground and dynamic obstacle point clouds in the optimized single-frame semantic RGB map;
superposing the optimized multi-frame semantic maps with weights to obtain a superimposed semantic RGB local map:
traversing the superimposed semantic RGB local map, and finally obtaining a corrected local map point cloud to form a corrected superimposed semantic map;
and when the boundary is extracted in real time, carrying out grid map construction on the corrected superposition semantic map, extracting the boundary of the grid map, and adopting control points to fit the smooth boundary to obtain a final boundary.
Optionally, the specific steps of obtaining the post-stack semantic RGB local map by stacking the weights of the optimized multi-frame semantic map are as follows: storing the vehicle GPS points of the current single-frame semantic RGB map, the current single-frame semantic RGB map and the point cloud data of the four subsequent single-frame semantic RGB maps;
taking a vehicle GPS point of a current single-frame semantic RGB map as an origin, converting point cloud data of the five-frame single-frame semantic RGB map into a world coordinate system for NDT matching and then overlapping; when the semantic RGB map point cloud data are overlapped, different weights are respectively given to the semantic RGB map point cloud data of the first frame to the semantic RGB map point cloud data of the fifth frame;
reserving overlapping point clouds of the point cloud data of the overlapping five-frame single-frame semantic RGB map within a threshold range in the Y-axis direction;
KNN voting is carried out on the i-th point in the superimposed point cloud: N points within a preset distance of the i-th point are searched through a Kd-tree, and the label counts Num_{label,n} of the N surrounding points are superimposed with weights w_n; if the weighted sum K_i exceeds the threshold τ, the label k_{i,label} of the current i-th point is modified to the label with the maximum weighted count max k_n:
k_n = w_n · Num_{label,n};
K_i = w_1 · Num_{label,1} + w_2 · Num_{label,2} + … + w_n · Num_{label,n} > τ;
k_{i,label} = max{w_1 · Num_{label,1}, w_2 · Num_{label,2}, …, w_n · Num_{label,n}};
And traversing each point in the superimposed point cloud, and carrying out KNN voting on each point to obtain a semantic RGB local map after superposition.
Optionally, when the point cloud data of the five-frame single-frame semantic RGB map are converted into the world coordinate system for NDT matching and then overlapped, the weight values of the point cloud data of the first frame to the point cloud data of the fifth frame are sequentially and linearly increased from the point cloud data of the first frame to the point cloud data of the fifth frame.
Optionally, the specific steps of extracting the boundary in real time are as follows:
based on the corrected superimposed semantic map, acquiring the vehicle GPS points of the current corrected single-frame semantic map and of the previous two corrected single-frame semantic maps, and converting the vehicle GPS points into the world coordinate system through a rotation-translation matrix; judging the road boundaries on the two opposite sides of the driving area through the slope of the vehicle GPS points of the three corrected single-frame semantic maps;
judging the point cloud grid positions of road labels and retaining wall labels in the three-frame correction single-frame semantic map, searching opposite side boundary points of a running area farthest from a GPS point of a current frame vehicle in a plane grid in the three-frame correction single-frame semantic map, and respectively storing the opposite side boundary points into a left boundary point memory and a right boundary point memory;
removing outliers from road boundary points in a plurality of correction single-frame semantic maps in a left boundary point memory and a right boundary point memory;
converting the point cloud in the current correction single-frame semantic map with the left and right road boundary information after discrete points are removed into a world coordinate system through a rotation translation matrix; and selecting boundary fitting control points at intervals of preset distances in a single-frame RGB map with discrete points removed under a world coordinate system to perform least square curve fitting, so as to obtain a final boundary.
Optionally, the specific steps of eliminating the discrete points are as follows:
setting an initial distance threshold D, and initializing a left boundary point memory and a right boundary point memory;
taking the vehicle GPS point of the first-frame corrected single-frame semantic map as the origin, search for the left nearest point Point_left and the right nearest point Point_right closest to the vehicle GPS point; store these two initial points into the left boundary point memory and the right boundary point memory respectively;
taking Point_left as the origin, search for the next left nearest point not exceeding the initial distance threshold D and store it into the left boundary point memory; taking Point_right as the origin, search for the next right nearest point not exceeding the initial distance threshold D and store it into the right boundary point memory; if the left nearest point Point_left or the right nearest point Point_right is an already traversed point or a point exceeding the threshold, that nearest point is a discrete point and is removed;
after traversing all road boundary points, finishing searching to obtain all left and right road boundary points;
and obtaining a final boundary based on obtaining all the left and right road boundary points.
Optionally, when the real-time semantic map is constructed, the calibrated GNSS/IMU integrated navigation positioning system and laser radar of the unmanned vehicle are used to acquire the real-time GPS-IMU data and real-time radar point cloud data of the unmanned vehicle.
Compared with the prior art, the invention has at least one of the following beneficial effects:
(1) The method performs semantic segmentation on the unstructured road while using the cylindrical point features ρ, θ and z′ as the input point cloud for feature extraction, improving the extraction precision of unstructured road boundaries;
(2) The semantic map obtained by the method enables driving-area boundary extraction to be performed at the same time as the semantic segmentation of the image;
(3) The method's boundary extraction framework is based on semantic segmentation: during map construction it analyzes the points and grid features within the grids divided by the semantic segmentation network, extracts the boundary from these features, and fits a smooth boundary with control points, thereby improving the boundary extraction speed.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention.
FIG. 1 is a flow chart of the semantic map construction and boundary real-time extraction method for open-pit mining areas of the present invention;
FIG. 2 is a schematic diagram of a cylindrical grid point cloud feature extraction of the present invention;
FIG. 3 is a flow chart of downsampling of the present invention;
FIG. 4 is an upsampling flowchart of the present invention;
FIG. 5 is an asymmetric residual flow chart of the present invention;
FIG. 6 is a flow chart of a multi-modal focus convolution of the present invention;
FIG. 7 is a single frame semantic point cloud of the present invention;
FIG. 8 is a semantic map of the present invention;
fig. 9 is a diagram showing the boundary extraction effect of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings, so as to make the objects, technical solutions and advantages of the present invention more clear and complete. It should be understood that the described embodiments are merely illustrative and are not intended to limit the invention.
As shown in FIGS. 1-9, the invention discloses a semantic map construction and boundary real-time extraction method for an open-pit mining area, which comprises the following specific steps:
step 1, constructing a semantic segmentation model;
step 11, making the output frequencies of the pictures and the point cloud data consistent, and obtaining pictures and point cloud data at the same moment, thereby time-synchronizing the pictures with the point cloud data;
preferably, the picture is obtained by a camera and the point cloud data is obtained by a lidar.
Step 12, cylindrically rasterizing the point cloud data to obtain point cloud grid units and the cylindrical characteristics of the points, and obtaining the characteristic tensor of each point cloud grid unit based on the cylindrical characteristics of the points and an MLP (multi-layer perceptron);
as shown in FIG. 2, the point cloud data are divided from the Cartesian coordinate system according to a cylindrical grid, and the points in the cylindrical grid point cloud are converted from Cartesian coordinates to cylindrical coordinates;
optionally, the point-wise features of the point cloud in the cylindrical grid are extracted through the MLP based on a point-wise target detection approach, with the following specific steps:
dividing the point cloud data from the Cartesian coordinate system according to a cylindrical grid to obtain point cloud grid units; acquiring the cylindrical features of the points in each point cloud grid unit, with the expressions:
ρ = √(x² + y²);
θ = arctan(y / x);
z′ = z.
wherein x is the X-axis coordinate of the point in the Cartesian coordinate system; y is the Y-axis coordinate of the point; z is the Z-axis coordinate of the point; ρ is the horizontal distance from the point to the Z-axis; θ is the azimuth angle of the point in the X-Y plane; z′ is the height difference between the point and the origin of the Cartesian coordinate system. All points are thereby mapped into the cylindrical grid.
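As an illustrative sketch of this conversion and grid assignment (a minimal NumPy version; the function names and the grid resolutions H = 480, W = 360, L = 32 are our assumptions, not values given in the patent):

```python
import numpy as np

def cartesian_to_cylindrical(points_xyz):
    """Convert an (N, 3) array of Cartesian points to (rho, theta, z')."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    rho = np.sqrt(x ** 2 + y ** 2)       # horizontal distance to the Z-axis
    theta = np.arctan2(y, x)             # azimuth angle in the X-Y plane
    return np.stack([rho, theta, z], axis=1)   # z' = z

def cylindrical_grid_index(cyl, rho_max, z_min, z_max, H=480, W=360, L=32):
    """Map each (rho, theta, z') point to an (H, W, L) cylindrical grid cell."""
    rho_idx = np.clip((cyl[:, 0] / rho_max * H).astype(int), 0, H - 1)
    theta_idx = ((cyl[:, 1] + np.pi) / (2 * np.pi) * W).astype(int) % W
    z_idx = np.clip(((cyl[:, 2] - z_min) / (z_max - z_min) * L).astype(int), 0, L - 1)
    return np.stack([rho_idx, theta_idx, z_idx], axis=1)
```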
Meanwhile, the cylindrical features ρ, θ and z′ of all points are input into the MLP (multi-layer perceptron) to perform point-based MLP feature extraction on the point cloud data, raising the 3-dimensional point features to C_in dimensions; keeping these point features, the points within each point cloud grid unit are max-pooled through a max-pooling layer to obtain C_in-dimensional MLP point features, and a feature tensor of C_in × H × W × L is output, where H, W and L are respectively the radius, azimuth and height dimensions of the point cloud grid units and C_in is the point feature dimension;
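A minimal PyTorch sketch of this per-point MLP followed by per-cell max pooling (the class name, layer widths and flat cell indexing are our assumptions):

```python
import torch
import torch.nn as nn

class PointMLP(nn.Module):
    """Per-point MLP lifting (rho, theta, z') to C_in dimensions,
    followed by max pooling over the points of each grid cell."""
    def __init__(self, c_in=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 32), nn.BatchNorm1d(32), nn.ReLU(),
            nn.Linear(32, c_in),
        )
        self.c_in = c_in

    def forward(self, cyl_feats, cell_ids, num_cells):
        # cyl_feats: (N, 3) cylindrical features; cell_ids: (N,) flat cell index
        point_feats = self.mlp(cyl_feats)                       # (N, C_in)
        pooled = torch.full((num_cells, self.c_in), float("-inf"))
        idx = cell_ids.unsqueeze(1).expand(-1, self.c_in)
        # per-cell max pooling; cells that receive no points stay at -inf
        pooled = pooled.scatter_reduce(0, idx, point_feats,
                                       reduce="amax", include_self=False)
        return point_feats, pooled
```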
step 13, downsampling the points of the point cloud grid unit;
step 131, the feature tensor of C_in × H × W × L is input into an asymmetric residual block, which applies horizontal and vertical three-dimensional convolutions to the tensor: the horizontal three-dimensional convolution applies a 3×1×3 convolution kernel followed by a 1×3×3 convolution kernel; the vertical three-dimensional convolution applies a 1×3×3 convolution kernel followed by a 3×1×3 convolution kernel. The input dimension C_in is meanwhile doubled, outputting a sparse convolution feature tensor of 2C_in × H × W × L; every point in every point cloud grid unit is traversed.
The asymmetric residual block is adopted to extract the features of the points within the cylindrical grid. This enhances the expressive power of the points in the grid and thus the robustness of the network, while the asymmetric residual feature extraction reduces the computational cost compared with conventional 3D convolution.
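A sketch of such an asymmetric residual block in PyTorch; for readability, dense Conv3d layers stand in for the sparse convolutions the patent operates on, and the channel widths are our assumptions:

```python
import torch
import torch.nn as nn

class AsymmetricResidualBlock(nn.Module):
    """Horizontal branch (3x1x3 then 1x3x3) and vertical branch (1x3x3 then
    3x1x3), summed as a residual; dense Conv3d stands in for sparse conv."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.horizontal = nn.Sequential(
            nn.Conv3d(c_in, c_out, (3, 1, 3), padding=(1, 0, 1)), nn.ReLU(),
            nn.Conv3d(c_out, c_out, (1, 3, 3), padding=(0, 1, 1)),
        )
        self.vertical = nn.Sequential(
            nn.Conv3d(c_in, c_out, (1, 3, 3), padding=(0, 1, 1)), nn.ReLU(),
            nn.Conv3d(c_out, c_out, (3, 1, 3), padding=(1, 0, 1)),
        )

    def forward(self, x):          # x: (B, C_in, H, W, L)
        return torch.relu(self.horizontal(x) + self.vertical(x))
```

Using c_out = 2 * c_in reproduces the channel doubling described in step 131.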
Step 132, multi-modal focus convolution downsampling:
sub-manifold hole convolution is performed on each cylindrical grid: the convolution is computed at a kernel center position p only when p coincides with an input point position P_in of a non-empty cylindrical grid. The sub-manifold hole convolution Y_p of each cylindrical grid is expressed as:
Y_p = Σ_{k ∈ K^d} w_k · x_{p+k},
where x_{p+k} is the input sparse convolution feature at position p + k around the kernel center p; w_k is the weight at position k in the convolution kernel; d is the spatial dimension, preferably 3; and K^d is the sub-manifold hole convolution kernel offset set, preferably with K = 3, so that K^d = 3³.
Applying the sub-manifold hole convolution at each cylindrical grid position, the output position set P̄_out of the cylindrical grid positions is:
P̄_out = ∪_{p ∈ P_in} P(p, K^d),
where P(p, K^d) = {p + k | k ∈ K^d}.
Selecting importance of adjacent points:
converting the picture into a coordinate system of point cloud data through a conversion matrix, projecting RGB pixel point information of the picture onto the point cloud data in each point cloud grid unit, and assigning RGB pixel point information to each point in each point cloud grid unit to obtain an RGB point cloud grid unit; thereby spatially synchronizing the image data and the point cloud data;
importance map I_p: sparse convolution feature extraction is performed on the points of the RGB point cloud grid units, and importance probabilities are computed through a sigmoid function, obtaining the cubic importance map I_p^k at position k in the convolution kernel and the center I_p^c of the cubic importance map; the sparse convolution kernel weights here have the same shape K^d as the sub-manifold hole convolution kernel weights.
The importance map I_p combines the input sparse convolution feature tensor of 2C_in × H × W × L at the kernel center position p with the RGB image information, thereby obtaining the importance of the surrounding candidate grid output features; the hole convolution can thus balance regular sparse convolution and sub-manifold sparse convolution, reducing the amount of computation while ensuring that connectivity information is not lost.
Selecting an important grid:
when the center I_p^c of the cubic importance map output by an RGB point cloud grid unit is greater than or equal to the importance threshold τ, the RGB point cloud grid unit is an important grid, and the convolution kernel positions of the important grids are obtained:
P_im = {p ∈ P_in | I_p^c ≥ τ},
where P_im is the set of convolution kernel positions of the important grids; the kernel output positions of the important grids and the dynamically generated convolution kernel weights follow from P_im.
Grid importance is divided into important, relatively important and unimportant grids: an unimportant grid is a point cloud grid unit containing point cloud data only; a relatively important grid is a point cloud grid unit whose extracted image RGB features correspond to background point cloud; an important grid is a point cloud grid unit whose extracted RGB features correspond to target point cloud.
The dynamic output shape of the convolution kernel generated by the selected important grids is decided by the cubic importance map I_p^k: only the output positions whose cubic importance reaches the threshold are retained.
Finally, the dynamic convolution output positions p_out of the convolution kernel are generated from the output shape of each cylindrical grid position and the convolution kernel positions of the important grids, and convolution with stride 2 is performed on p_out, obtaining an output feature tensor of 2C_in × H/2 × W/2 × L/2. Four such downsampling operations are performed, sequentially obtaining a first downsampled output feature tensor of 2C_in × H/2 × W/2 × L/2, a second of 4C_in × H/4 × W/4 × L/4, a third of 8C_in × H/8 × W/8 × L/8, and a fourth of 16C_in × H/16 × W/16 × L/16.
Step 14, up-sampling the points of the point cloud grid unit;
step 141, multi-modal focus convolution upsampling:
input fourth downsampled output feature 16×c in Tensors of H/16, W/16 and L/16, performing multi-mode focus convolution up-sampling, and performing sub-manifold hole convolution at first according to the specific process 132, and obtaining a final deconvolution shape P 'based on a three-time importance map obtained by the obtained RGB point cloud grid unit' out The method comprises the steps of carrying out a first treatment on the surface of the Based on the deconvolution shape P' out Deconvolution with step length of 2 is carried out, and the deconvoluted characteristic 8*C is output in ' H/8,W/8, tensor of L/8;
step 142, third one of the downsampled output features 8*C in H/8,W/8, tensor of L/8 and deconvoluted feature 8*C in ' H/8,W/8, tensors of L/8 are spliced to obtain splice characteristics 16X C in Tensors of H/8,W/8, L/8; splice feature 16 x c in The tensor of H/8,W/8 and L/8 is input into an asymmetric residual block; horizontal three-dimensional convolution and vertical three-dimensional convolution stitching feature 16×c in Tensors of H/8,W/8, L/8; for horizontal three-dimensional convolution, 3×1×3 convolution kernels and 1×3×3 convolution kernel feature extraction are sequentially performed; for vertical three-dimensional convolution, sequentially performing 1×3×3 convolution kernels and 3×1×3 convolution kernel feature extraction; while reducing the input dimension, output upsampling feature 8*C in Tensors of H/8,W/8 and L/8 to obtain up-sampling output grid characteristics;
the up-sampling output grid characteristics obtained above are up-sampled for three times to obtain down-sampling and up-sampling superposition characteristics C of each point in Tensors of H, W, L.
By stacking the downsampling blocks and the upsampling blocks, an asymmetric three-dimensional multi-modal focus convolution network is established.
The invention adopts asymmetric residual blocks to strengthen the horizontal and vertical kernels, thereby enhancing the robustness of the cylindrical grid.
Step 15, the downsampling-and-upsampling superimposed features of each point are refined using the MLP point features of each point obtained in step 12; the category probability of each refined point is acquired using the loss function, and the highest category probability of each point is taken as that point's label. The point-wise refinement module reclassifies the point labels, alleviating the information loss caused by label division errors when points are divided into cylindrical grids.
Specifically, the spot labels include obstacle labels, ground labels, and retaining wall labels.
Step 2, calibrating a GNSS/IMU integrated navigation positioning system and a laser radar of the unmanned vehicle;
an initial rotation-translation matrix between the GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle is acquired; the rotation-translation matrix consists of the initial XYZ deviations and initial attitude angles between the laser radar coordinate system and the GPS vehicle-body coordinate system, where the attitude angles comprise yaw, roll and pitch;
the laser radar is arranged at the front and/or the tail of the automatic driving vehicle, the GNSS/IMU integrated navigation positioning system is arranged on the vehicle body, and the XYZ measurement deviations between the laser radar position and the center of the vehicle body are measured. A checkerboard calibration plate is placed vertically with its center 10 meters ahead on the center line of the vehicle body's advancing direction; the point features of the checkerboard corner points are extracted, the plane of the checkerboard is fitted from these corner-point features, the measured attitude angles between the GPS vehicle-body coordinate system and the laser radar coordinate system are acquired, and the rotation-translation conversion relation between the GPS vehicle-body coordinate system and the laser radar coordinate system of the unmanned vehicle is determined from the XYZ measurement deviations of their positions and the measured attitude angles;
the coordinate systems of the GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle are unified based on the rotation-translation conversion relation obtained through calibration.
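For the plane-fitting step, a minimal NumPy sketch (function name is ours) of fitting the checkerboard plane to the extracted corner points by SVD:

```python
import numpy as np

def fit_plane_svd(corner_points):
    """Least-squares plane through the extracted checkerboard corner points
    (an (N, 3) array); returns the unit normal and the plane centroid."""
    centroid = corner_points.mean(axis=0)
    # the right singular vector of the smallest singular value is the normal
    _, _, vt = np.linalg.svd(corner_points - centroid)
    normal = vt[-1]
    return normal / np.linalg.norm(normal), centroid
```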
Step 3, constructing a real-time semantic map
The calibrated GNSS/IMU integrated navigation positioning system and laser radar of the unmanned vehicle are used to acquire real-time GPS-IMU data and real-time radar point cloud data of the unmanned vehicle respectively;
edge features and plane features of each frame of input real-time radar point cloud data are extracted according to roughness, and the edge features and plane features of each frame of real-time radar point cloud data are saved.
Optionally, feature extraction is performed on each frame of real-time radar point cloud data using a radar odometer.
Using the semantic segmentation model constructed in step 1, each point in each frame of the obtained real-time point cloud data is labeled to form key-frame semantic point clouds; other data between the key-frame semantic point clouds are deleted, and estimation is performed using the key-frame semantic point clouds only;
the real-time position and pose of the unmanned vehicle are acquired from the real-time GPS-IMU data, and all key-frame semantic point clouds are converted into the world coordinate system using the initial rotation-translation matrix. Whether to optimize the initial rotation-translation matrix parameters is judged from the NDT feature-matching effect: if the NDT matching score between two frames of semantic point clouds is smaller than a threshold, several key-frame semantic point clouds are matched using the edge features and plane features, thereby optimizing the initial rotation-translation matrix parameters.
And superposing all the key frame semantic point clouds to obtain a semantic map, and downsampling the semantic map to generate an optimized single-frame semantic RGB map.
Step 4, filtering dynamic barriers:
and step 41, judging the label of each point in the optimized single-frame semantic RGB map, and eliminating non-ground and dynamic obstacle point clouds in the optimized single-frame semantic RGB map.
Step 42, multi-frame semantic map weight superposition:
storing the vehicle GPS point of the current single-frame semantic RGB map, together with the point cloud data of the current single-frame semantic RGB map and of the four subsequent single-frame semantic RGB maps;
taking the vehicle GPS point of the current single-frame semantic RGB map as the origin, the five frames of point cloud data are converted into the world coordinate system for NDT matching and then superimposed; when superimposing, different weights are assigned to the point cloud data of the first through fifth frames, the five weight values increasing linearly from the first frame to the fifth frame;
the superimposed point cloud of the five superimposed frames is retained within a threshold range in the Y-axis direction. For example, if the radar is arranged on the vehicle roof, the point cloud immediately around the vehicle body contains only a few noise points; with an effective single-frame detection distance of 50 m and a key-frame spacing of 2 m, the point cloud data within the near band in the Y-axis direction (the noise points) and beyond the far band (the sparse point cloud) are deleted, reducing the influence of abnormal semantic segmentation;
KNN (k-nearest neighbors) voting is carried out on the i-th point in the superimposed point cloud: N points within a preset distance of the i-th point are searched through a Kd-tree, and the label counts Num_{label,n} of the N surrounding points are superimposed with weights w_n; if the weighted sum K_i exceeds the threshold τ, the label k_{i,label} of the current i-th point is modified to the label with the maximum weighted count max k_n:
k_n = w_n · Num_{label,n};
K_i = w_1 · Num_{label,1} + w_2 · Num_{label,2} + … + w_n · Num_{label,n} > τ;
k_{i,label} = max{w_1 · Num_{label,1}, w_2 · Num_{label,2}, …, w_n · Num_{label,n}};
each point in the superimposed point cloud is traversed and KNN-voted in this way, obtaining the superimposed semantic RGB local map.
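A minimal sketch of this weighted KNN re-voting using SciPy's cKDTree; the radius, threshold and label count are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_label_vote(points, labels, frame_weights, radius=0.5, tau=3.0, num_labels=3):
    """Weighted KNN re-voting of point labels in the superimposed cloud.

    points:        (N, 3) superimposed point cloud
    labels:        (N,) integer labels in [0, num_labels)
    frame_weights: (N,) weight w_n of each point's source frame
    """
    tree = cKDTree(points)
    new_labels = labels.copy()
    for i in range(len(points)):
        neighbours = tree.query_ball_point(points[i], r=radius)
        votes = np.zeros(num_labels)
        for n in neighbours:
            votes[labels[n]] += frame_weights[n]   # w_n-weighted label count
        if votes.sum() > tau:                      # K_i exceeds threshold tau
            new_labels[i] = int(votes.argmax())    # label with max weighted vote
    return new_labels
```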
And 43, traversing the overlapped semantic RGB local map, and finally obtaining a corrected local map point cloud to form a corrected overlapped semantic map.
Step 5, extracting the boundary of the driving area:
based on the semantic segmentation model in the step 1, a grid map is constructed when a corrected superposition semantic map is constructed, the inner boundary of the grid is extracted, a control point is adopted to fit a smooth boundary, and the boundary extraction speed is improved;
1) Based on the corrected superimposed semantic map, the vehicle GPS points of the current corrected single-frame semantic map and of the previous two corrected single-frame semantic maps are acquired and converted into the world coordinate system through the rotation-translation matrix; the road boundaries on the two opposite sides of the driving area are judged from the slope of the three frames of vehicle GPS points, the slope being that of the tangent line at the vehicle GPS points in the three corrected single-frame semantic maps.
2) Judging the point cloud grid positions of road labels and retaining wall labels in the three-frame correction single-frame semantic map, searching opposite side boundary points of a running area farthest from a GPS point of a current frame vehicle in a plane grid in the three-frame correction single-frame semantic map, and respectively storing the opposite side boundary points into a left point container and a right point container;
3) Outlier elimination is carried out on road boundary points in a plurality of correction single-frame semantic maps in the left container and the right container to obtain left and right road boundary points;
the specific steps of eliminating discrete points are as follows:
a: setting an initial distance threshold D, and initializing a left boundary point container and a right boundary point container;
b: taking the vehicle GPS point of the first frame as the origin, search for the left nearest point Point_left and the right nearest point Point_right closest to the vehicle GPS point, and store these two initial points into the left and right point containers;
c: taking Point_left as the origin, search for the next left nearest point not exceeding the initial distance threshold D and store it into the left boundary point container; taking Point_right as the origin, search for the next right nearest point not exceeding the initial distance threshold D and store it into the right boundary point container; if the left nearest point or the right nearest point is an already traversed point or a point exceeding the threshold, that nearest point is a discrete point and is removed;
d: and after traversing all the road boundary points, finishing searching to obtain all the left and right road boundary points.
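A minimal sketch of the nearest-point chaining with discrete-point removal described in steps a-d (one call per side; the function name and candidate-query size are our assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree

def chain_boundary(candidates, start, d_max=2.0):
    """Grow one boundary (left or right) from `start` by repeatedly taking the
    nearest unvisited candidate within d_max; candidates that cannot be reached
    this way are the discrete points and are simply never added."""
    tree = cKDTree(candidates)
    visited = np.zeros(len(candidates), dtype=bool)
    # if `start` coincides with a candidate, mark it so it is not re-added
    d0, j0 = tree.query(start, k=1)
    if d0 == 0.0:
        visited[j0] = True
    boundary, current = [], np.asarray(start, dtype=float)
    while True:
        k = min(8, len(candidates))               # small local candidate pool
        dists, idxs = tree.query(current, k=k)
        nxt = next((j for d, j in zip(np.atleast_1d(dists), np.atleast_1d(idxs))
                    if d <= d_max and not visited[j]), None)
        if nxt is None:                           # traversed or beyond threshold D
            break
        visited[nxt] = True
        boundary.append(candidates[nxt])
        current = candidates[nxt]
    return np.array(boundary)
```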
4) In the single-frame RGB map with discrete points removed under the world coordinate system, a boundary-fitting control point is selected every 1 m and least-squares curve fitting is performed to obtain the final boundary; the final boundary information is converted back to the WGS84 coordinate system (the original GPS, or earth, coordinate system) and uploaded to a cloud database for storage.
It can be understood that if 5 points are selected as control points from the single-frame point cloud boundary, a least-squares boundary fit is performed on the five points:
world coordinates of the five points: (x_1, y_1), (x_2, y_2), (x_3, y_3), (x_4, y_4), (x_5, y_5);
the polynomial coefficients θ_i are obtained by polynomial fitting, thereby determining the road boundary h_θ(x):
h_θ(x) = θ_0 + θ_1·x + θ_2·x² + θ_3·x³ + θ_4·x⁴ + θ_5·x⁵,
with the coefficients chosen to minimize the least-squares objective Σ_{i=1}^{5} (h_θ(x_i) − y_i)²,
where x_i and y_i are the coordinates of the control points in the world coordinate system.
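A minimal NumPy sketch of this control-point fit; note that the patent's formula lists terms up to θ_5·x⁵, while five control points uniquely determine a least-squares fit only up to degree 4, which is what we fit here:

```python
import numpy as np

def fit_boundary(control_points, degree=4):
    """Least-squares polynomial boundary fit h_theta(x) through control points;
    np.polyfit minimises sum((h_theta(x_i) - y_i)^2) over the coefficients."""
    x, y = control_points[:, 0], control_points[:, 1]
    theta = np.polyfit(x, y, deg=degree)   # coefficients theta_degree ... theta_0
    return np.poly1d(theta)                # callable boundary h_theta

# Example: five control points selected every 1 m along the extracted boundary.
pts = np.array([[0.0, 1.2], [1.0, 1.5], [2.0, 1.4], [3.0, 1.8], [4.0, 2.1]])
h = fit_boundary(pts)
print(h(2.5))                              # boundary y-coordinate at x = 2.5 m
```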
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (7)

1. A semantic map construction and boundary real-time extraction method for open-pit mining areas, characterized by comprising the following specific steps:
step 1, constructing a semantic segmentation model:
step 11, obtaining pictures and point cloud data at the same moment;
step 12, cylindrically rasterizing the point cloud data to obtain point cloud grid units and the cylindrical characteristics of the points, and obtaining tensors of the characteristics of each point cloud grid unit based on the cylindrical characteristics of the points and an MLP (multi-layer perceptron);
step 13, downsampling the points of the point cloud grid unit:
step 131, inputting tensors of the characteristics of each point cloud grid unit into an asymmetric residual block; outputting tensors of the sparse convolution characteristics;
step 132, projecting RGB pixel point information of the picture onto point cloud data in each point cloud grid unit to obtain RGB point cloud grid units; performing multi-mode focus convolution downsampling based on the RGB point cloud grid unit:
step 14, up-sampling the points of the point cloud grid unit:
step 141, performing multi-mode focus convolution up-sampling based on the RGB point cloud grid unit:
step 142, up-sampling the obtained up-sampling output grid characteristics to obtain tensors of down-sampling and up-sampling superposition characteristics of each point;
step 15, refining tensors of down-sampling and up-sampling point superposition characteristics of each point by using the MLP point characteristics of each point obtained in the step 12; acquiring the category probability of each refined point by using a loss function, and taking the highest category probability of each point as the point label of the point;
step 2, constructing a real-time semantic map:
acquiring real-time GPS-IMU data and real-time radar point cloud data of the unmanned vehicle;
using the semantic segmentation model constructed in the step 1 to label each point in each frame of the obtained real-time point cloud data, forming key-frame semantic point clouds;
superposing all the key frame semantic point clouds to obtain a semantic map, and generating an optimized single-frame semantic RGB map;
step 3, extracting boundaries in real time:
based on the semantic segmentation model in the step 1, a grid map is constructed when an optimized single-frame semantic RGB map is generated, the boundary of the grid map is extracted, and a control point is adopted to fit a smooth boundary to obtain a final boundary.
2. The method for constructing a semantic map and extracting boundaries in real time according to claim 1, further comprising the step of filtering dynamic obstacles before extracting boundaries in real time, wherein the method comprises the following specific steps:
judging the label of each point in the optimized single-frame semantic RGB map, and eliminating non-ground and dynamic obstacle point clouds in the optimized single-frame semantic RGB map;
superposing the optimized multi-frame semantic maps with weights to obtain a superimposed semantic RGB local map:
traversing the superimposed semantic RGB local map, and finally obtaining a corrected local map point cloud to form a corrected superimposed semantic map;
and when the boundary is extracted in real time, carrying out grid map construction on the corrected superposition semantic map, extracting the boundary of the grid map, and adopting control points to fit the smooth boundary to obtain a final boundary.
3. The method for constructing and extracting boundaries in real time according to claim 2, wherein the specific steps of obtaining the post-stack semantic RGB partial map by stacking the weights of the optimized multi-frame semantic map are as follows: storing the vehicle GPS points of the current single-frame semantic RGB map, the current single-frame semantic RGB map and the point cloud data of the four subsequent single-frame semantic RGB maps;
taking a vehicle GPS point of a current single-frame semantic RGB map as an origin, converting point cloud data of the five-frame single-frame semantic RGB map into a world coordinate system for NDT matching and then overlapping; when the semantic RGB map point cloud data are overlapped, different weights are respectively given to the semantic RGB map point cloud data of the first frame to the semantic RGB map point cloud data of the fifth frame;
reserving overlapping point clouds of the point cloud data of the overlapping five-frame single-frame semantic RGB map within a threshold range in the Y-axis direction;
KNN voting is carried out on the i-th point in the superimposed point cloud: N points within a preset distance of the i-th point are searched through a Kd-tree, and the label counts Num_{label,n} of the N surrounding points are superimposed with weights w_n; if the weighted sum K_i exceeds the threshold τ, the label k_{i,label} of the current i-th point is modified to the label with the maximum weighted count max k_n:
k_n = w_n · Num_{label,n};
K_i = w_1 · Num_{label,1} + w_2 · Num_{label,2} + … + w_n · Num_{label,n} > τ;
k_{i,label} = max{w_1 · Num_{label,1}, w_2 · Num_{label,2}, …, w_n · Num_{label,n}};
And traversing each point in the superimposed point cloud, and carrying out KNN voting on each point to obtain a semantic RGB local map after superposition.
4. The method for constructing a semantic map and extracting boundaries in real time according to claim 3, wherein when the point cloud data of the five-frame single-frame semantic RGB map are converted into the world coordinate system for NDT matching and then superimposed, the weight values of the first frame point cloud data to the fifth frame point cloud data are sequentially linearly increased from the first frame point cloud data to the fifth frame point cloud data.
5. The method for constructing a semantic map and extracting boundaries in real time according to any one of claims 2 to 4, wherein the specific steps of extracting boundaries in real time are as follows:
based on the corrected superimposed semantic map, acquiring the vehicle GPS points of the current corrected single-frame semantic map and of the previous two corrected single-frame semantic maps, and converting the vehicle GPS points into the world coordinate system through a rotation-translation matrix; judging the road boundaries on the two opposite sides of the driving area through the slope of the vehicle GPS points of the three corrected single-frame semantic maps;
judging the point cloud grid positions of road labels and retaining wall labels in the three-frame correction single-frame semantic map, searching opposite side boundary points of a running area farthest from a GPS point of a current frame vehicle in a plane grid in the three-frame correction single-frame semantic map, and respectively storing the opposite side boundary points into a left boundary point memory and a right boundary point memory;
removing outliers from road boundary points in a plurality of correction single-frame semantic maps in a left boundary point memory and a right boundary point memory;
converting the point cloud in the current correction single-frame semantic map with the left and right road boundary information after discrete points are removed into a world coordinate system through a rotation translation matrix; and selecting boundary fitting control points at intervals of preset distances in a single-frame RGB map with discrete points removed under a world coordinate system to perform least square curve fitting, so as to obtain a final boundary.
6. The method for constructing a semantic map and extracting boundaries in real time according to claim 5, wherein the specific steps of eliminating discrete points are as follows:
setting an initial distance threshold D, and initializing a left boundary point memory and a right boundary point memory;
taking the vehicle GPS point of the first-frame corrected single-frame semantic map as the origin, search for the left nearest point Point_left and the right nearest point Point_right closest to the vehicle GPS point; store these two initial points into the left boundary point memory and the right boundary point memory respectively;
taking Point_left as the origin, search for the next left nearest point not exceeding the initial distance threshold D and store it into the left boundary point memory; taking Point_right as the origin, search for the next right nearest point not exceeding the initial distance threshold D and store it into the right boundary point memory; if the left nearest point Point_left or the right nearest point Point_right is an already traversed point or a point exceeding the threshold, that nearest point is a discrete point and is removed;
after traversing all road boundary points, finishing searching to obtain all left and right road boundary points;
and obtaining a final boundary based on obtaining all the left and right road boundary points.
7. The semantic map construction and boundary real-time extraction method according to claim 1, wherein, when constructing the real-time semantic map, the real-time GPS-IMU data and real-time radar point cloud data of the unmanned vehicle are acquired using the calibrated GNSS/IMU integrated navigation positioning system and laser radar of the unmanned vehicle.
CN202310013284.7A 2023-01-05 2023-01-05 Semantic map construction and boundary real-time extraction method for open-pit mining area

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310013284.7A CN116258826A (en) 2023-01-05 2023-01-05 Semantic map construction and boundary real-time extraction method for open-pit mining area

Publications (1)

Publication Number Publication Date
CN116258826A true CN116258826A (en) 2023-06-13

Family

ID=86678597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310013284.7A Pending CN116258826A (en) 2023-01-05 Semantic map construction and boundary real-time extraction method for open-pit mining area

Country Status (1)

Country Link
CN (1) CN116258826A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117073664A (en) * 2023-10-18 2023-11-17 中国科学院自动化研究所 Surface mine road topography mapping method
CN117073664B (en) * 2023-10-18 2024-02-20 中国科学院自动化研究所 Surface mine road topography mapping method
CN117437654A (en) * 2023-12-19 2024-01-23 中国地质大学(武汉) Semantic recognition-based grid map analysis method, device and medium
CN117437654B (en) * 2023-12-19 2024-03-08 中国地质大学(武汉) Semantic recognition-based grid map analysis method, device and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 303, Zone D, Main Building of Beihang Hefei Science City Innovation Research Institute, No. 999 Weiwu Road, Xinzhan District, Hefei City, Anhui Province, 230012

Applicant after: Tage Zhixing Technology Co.,Ltd.

Address before: 100176 901, 9th floor, building 2, yard 10, KEGU 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant before: BEIJING TAGE IDRIVER TECHNOLOGY CO.,LTD.

Country or region before: China