CN113011430B - Large-scale point cloud semantic segmentation method and system - Google Patents

Large-scale point cloud semantic segmentation method and system

Info

Publication number
CN113011430B
CN113011430B (application CN202110309423.1A)
Authority
CN
China
Prior art keywords
point
point cloud
spatial
sampling points
learned
Prior art date
Legal status
Active
Application number
CN202110309423.1A
Other languages
Chinese (zh)
Other versions
CN113011430A
Inventor
Zhu Fenghua (朱凤华)
Dong Qiulei (董秋雷)
Fan Siqi (范嗣祺)
Ye Peijun (叶佩军)
Lyu Yisheng (吕宜生)
Tian Bin (田滨)
Wang Feiyue (王飞跃)
Current Assignee
Institute of Automation, Chinese Academy of Sciences
Original Assignee
Institute of Automation, Chinese Academy of Sciences
Priority date
Filing date
Publication date
Application filed by Institute of Automation, Chinese Academy of Sciences
Priority to CN202110309423.1A
Publication of CN113011430A
Application granted
Publication of CN113011430B
Status: Active

Classifications

    • G06V10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V10/267 — Segmentation by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06F18/22 — Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06F18/253 — Pattern recognition; analysing; fusion techniques of extracted features
    • G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a large-scale point cloud semantic segmentation method and system. The semantic segmentation method comprises the following steps: extracting point-by-point features of a point cloud to be identified, the point cloud being composed of a plurality of points to be identified; progressively encoding each point-by-point feature based on the point cloud spatial information of each point to obtain corresponding point cloud features; progressively decoding each point cloud feature to obtain corresponding decoded features; and determining a semantic segmentation prediction result for the 3D point cloud to be identified from the decoded features using a semantic segmentation network model. By extracting point-by-point features and encoding them with the spatial information of each point, the method extracts more effective spatial features from large-scale point cloud data; the point cloud features are then decoded, and the semantic segmentation prediction is determined from the decoded features, yielding semantic information about the surrounding spatial environment and thereby improving semantic segmentation precision.

Description

Large-scale point cloud semantic segmentation method and system
Technical Field
The invention relates to the technical field of computer vision, in particular to a large-scale point cloud semantic segmentation method and system based on spatial context feature learning.
Background
In a mobile robot's environment perception system, semantic segmentation of the surroundings is an important component: it provides the robot's decision and control system with a semantic understanding of the environment the robot occupies. Compared with 2D image sensors, 3D sensors (such as LiDAR) provide richer spatial geometric information and are more helpful for a mobile robot to understand the three-dimensional space it operates in. With the rapid development of 3D sensors, semantic segmentation of 3D point clouds has therefore recently attracted attention from both academia and industry; in particular, the semantic segmentation of large-scale point clouds, which carry enormous amounts of data yet describe the spatial environment in detail, is a computer vision problem of growing interest to researchers.
Because 3D point cloud data is unstructured and unordered, semantic segmentation of point clouds is a challenging task, especially for large-scale point clouds. In recent years, a number of deep neural network (DNN) based methods have been applied to point cloud semantic segmentation. Existing methods fall mainly into three categories: spatial-projection-based methods, spatial-discretization-based methods, and point-based methods. Spatial-projection-based methods first project the 3D point cloud onto a 2D plane, then apply a 2D semantic segmentation method, and finally back-project the 2D segmentation result into 3D space. Information is inevitably lost during projection, and the loss of critical detail hinders the perception system's accurate understanding of the environment. Spatial-discretization-based methods first discretize the 3D point cloud into voxels and then perform semantic segmentation on the voxels; they suffer from discretization error, and the final segmentation precision and environment-understanding accuracy depend on the degree of discretization. Moreover, both of these approaches require additional, complex point cloud processing steps, such as projection and discretization, whose high computational cost makes them impractical for large-scale point clouds. How to extract more effective features from large-scale point clouds while maintaining efficiency is therefore the key problem limiting segmentation precision.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, to improve the semantic segmentation precision, the present invention aims to provide a large-scale point cloud semantic segmentation method and system.
In order to solve the technical problem, the invention provides the following scheme:
a large-scale point cloud semantic segmentation method, which comprises the following steps:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Optionally, based on the point cloud spatial information of each point to be identified, the point-by-point features are encoded step by step to obtain corresponding point cloud features, which specifically includes:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial features according to point cloud spatial information of the down-sampling point and corresponding learned features;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
and determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics.
Optionally, the determining, according to the point cloud spatial information of the down-sampling point and the corresponding learned feature, the corresponding local spatial feature specifically includes:
according to the point cloud space information of the down sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on the shared parameter multi-layer perceptron MLP according to the learned features;
and obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map.
Optionally, the learning of the local spatial context feature is performed according to the point cloud spatial information of the down-sampling point and the corresponding learned feature, so as to obtain the local spatial context feature, and the learning of the local spatial context feature specifically includes:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud spatial information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local spatial context information of the down-sampling points;
and obtaining local spatial context characteristics by a double-distance-based neighborhood point characteristic self-adaptive fusion method according to the polar coordinates, the geometric distance and the learned characteristics.
Optionally, the determining the polar coordinates of the down-sampling points according to the point cloud spatial information of the down-sampling points specifically includes:

obtaining the initial polar coordinates $(\alpha_i^k, \beta_i^k, r_i^k)$ of the down-sampling point according to the following formulas:

$$r_i^k = \sqrt{(\Delta x_i^k)^2 + (\Delta y_i^k)^2 + (\Delta z_i^k)^2}$$

$$\alpha_i^k = \arctan\left(\Delta y_i^k / \Delta x_i^k\right)$$

$$\beta_i^k = \arcsin\left(\Delta z_i^k / r_i^k\right)$$

wherein a K-nearest-neighbor (KNN) search based on Euclidean distance yields the K neighbors of the down-sampling point $p_i$, comprising the neighbor points $p_i^k$; $(\Delta x_i^k, \Delta y_i^k, \Delta z_i^k)$ are the relative position coordinates of the k-th neighbor $p_i^k$ of $p_i$ in a rectangular spatial coordinate system; $i$ denotes the down-sampling point $p_i$, $k$ denotes the neighbor point $p_i^k$, $k = 1, 2, \ldots, K$, and $K$ denotes the number of neighbor points;

determining the polar angles $\alpha_i$ and $\beta_i$ of the local spatial direction; the local spatial direction points from the down-sampling point $p_i$ to the centroid $\bar{p}_i$ of its K neighbors;

updating the polar coordinates of the down-sampling point to $(\hat{\alpha}_i^k, \hat{\beta}_i^k, r_i^k)$ according to the following formulas; the updated polar representation has local rotational invariance:

$$\hat{\alpha}_i^k = \alpha_i^k - \alpha_i, \qquad \hat{\beta}_i^k = \beta_i^k - \beta_i$$
optionally, the obtaining of the local spatial context features from the polar coordinates, the geometric distance, and the learned features by the dual-distance-based neighborhood point feature adaptive fusion method specifically includes:

determining a feature distance and geometric features according to the polar coordinates and the learned features of the down-sampling points;

determining the weighted fusion parameters $w_i^k$ according to the following formulas:

$$w_i^k = \mathrm{softmax}\big(\mathrm{MLP}(\hat{f}_i^k)\big)$$

$$\hat{f}_i^k = D_i^k \oplus \hat{g}_i^k$$

$$D_i^k = d_i^k + \lambda\, h_i^k$$

$$h_i^k = \mathrm{mean}\big(\lvert g_i - g_i^k \rvert\big)$$

wherein softmax() denotes the normalized exponential function, MLP() denotes a multi-layer perceptron function, $\hat{f}_i^k$ denotes the connected feature, $D_i^k$ denotes the dual-distance feature, and $\oplus$ denotes the connection operator; $d_i^k$ is the geometric distance of the neighbor point $p_i^k$; $h_i^k$ is the average L1 feature distance of the neighbor point $p_i^k$, determined from the learned features $g_i$ and $g_i^k$; $\lambda$ is the weight adjusting the feature distance term, and mean() denotes the averaging function; $\hat{g}_i^k$ denotes the feature of the neighbor point $p_i^k$ determined from the geometric features and the learned features;

fusing the neighborhood point features with the weighted fusion parameters $w_i^k$ to obtain the local spatial context feature $f_{iL}$:

$$f_{iL} = \sum_{k=1}^{K} w_i^k \cdot \hat{g}_i^k$$

wherein $\cdot$ denotes the dot product operator, $i$ denotes the down-sampling point $p_i$, $k$ denotes the neighbor point $p_i^k$, $k = 1, 2, \ldots, K$, and $K$ denotes the number of neighbor points.
Optionally, the global spatial context feature $f_{iG}$ of the down-sampling point $p_i$ is determined according to the following formulas:

$$f_{iG} = (x_i, y_i, z_i) \oplus r_i, \qquad r_i = v_i / v_g$$

wherein $\oplus$ denotes the connection operator, $(x_i, y_i, z_i)$ are the spatial coordinates of the down-sampling point $p_i$ in a rectangular spatial coordinate system, $r_i$ is a volume ratio, $v_i$ is the volume of the minimum circumscribed sphere of the neighborhood of $p_i$, and $v_g$ is the volume of the minimum circumscribed sphere of the point cloud to be identified.
In order to solve the technical problems, the invention also provides the following scheme:
a large-scale point cloud semantic segmentation system, the semantic segmentation system comprising:
the device comprises an extraction unit, a recognition unit and a processing unit, wherein the extraction unit is used for extracting point-by-point characteristics of a point cloud to be recognized, and the point cloud to be recognized is composed of a plurality of points to be recognized;
the encoding unit is used for gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
the decoding unit is used for gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and the prediction unit is used for determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
In order to solve the technical problem, the invention also provides the following scheme:
a large-scale point cloud semantic segmentation system comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
In order to solve the technical problems, the invention also provides the following scheme:
a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
According to the embodiment of the invention, the invention discloses the following technical effects:
the method extracts point-by-point characteristics of the point cloud to be identified, extracts more effective spatial characteristics from large-scale point cloud information, gradually encodes the point-by-point characteristics based on the point cloud spatial information of each point to be identified to obtain point cloud characteristics, further decodes the point cloud characteristics to obtain decoding characteristics, and determines the semantic segmentation prediction result of the 3D point cloud to be identified according to the decoding characteristics to obtain the semantic information of the surrounding spatial environment, so that the semantic segmentation precision is improved.
Drawings
FIG. 1 is a flow chart of a large-scale point cloud semantic segmentation method of the present invention;
FIG. 2 is a flow chart of polar coordinate determination of a local spatial context representation in the present invention;
FIG. 3 is a diagram of the process of angle update in polar coordinates of a local spatial context representation in the present invention;
FIG. 4 is a flow chart of a method for adaptive fusion of neighborhood point features based on dual distance in the present invention;
FIG. 5 is a detailed flowchart of a neighborhood point feature adaptive fusion method based on dual distance according to the present invention;
FIG. 6 is a point cloud distribution plot;
FIG. 7 is a flow chart of spatial context feature determination in the present invention;
FIG. 8 is a schematic diagram of a modular structure of the large-scale point cloud semantic segmentation system according to the present invention.
Description of the symbols:
an extraction unit-1, an encoding unit-2, a decoding unit-3, and a prediction unit-4.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
The invention aims to provide a large-scale point cloud semantic segmentation method, which comprises the steps of extracting point-by-point characteristics of a point cloud to be identified, extracting more effective spatial characteristics from large-scale point cloud information, gradually coding the point-by-point characteristics based on the point cloud spatial information of each point to be identified to obtain point cloud characteristics, further decoding the point cloud characteristics to obtain decoding characteristics, and determining a semantic segmentation prediction result of the 3D point cloud to be identified according to the decoding characteristics to obtain semantic information of a surrounding spatial environment so as to improve semantic segmentation precision.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in FIG. 1, the large-scale point cloud semantic segmentation method of the invention comprises the following steps:
step 100: and extracting point-by-point characteristics of the point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified.
Point-by-point features of the points to be identified are extracted from the point cloud data through a fully connected layer.

The point cloud data is an N×d array of point cloud information, where N is the number of points in the cloud and d is the dimension of the input point information. In some preferred embodiments, d = 6, comprising three dimensions of position information and three dimensions of color information.
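As a concrete illustration, the following is a minimal PyTorch sketch of this per-point feature extraction, assuming d = 6 inputs (position plus color) lifted to 8 dimensions, the output width named later in the description; the class and variable names are ours, not the patent's.

```python
import torch
import torch.nn as nn

class PointwiseFeatureExtractor(nn.Module):
    def __init__(self, d_in: int = 6, d_out: int = 8):
        super().__init__()
        # A Linear layer on the last axis acts as a shared fully connected
        # layer: the same weights are applied to every one of the N points.
        self.fc = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        return self.fc(points)               # (B, N, d_in) -> (B, N, d_out)

feats = PointwiseFeatureExtractor()(torch.rand(2, 4096, 6))  # (2, 4096, 8)
```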
Step 200: and gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain the corresponding point cloud feature.
Step 300: and gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics.
Step 400: and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
In step 200, four feature encoding layers are used to encode step by step, obtaining the corresponding point cloud features.
Specifically, the step of gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature includes:
step 210: and carrying out point cloud downsampling processing on each point to be identified to obtain a plurality of downsampling points.
Preferably, a point cloud random down-sampling algorithm is adopted for down-sampling processing.
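Random down-sampling can be sketched as follows, assuming batched tensors; the helper name and layout are illustrative. It keeps the sampled coordinates and their per-point features aligned.

```python
import torch

def random_downsample(xyz: torch.Tensor, feats: torch.Tensor, m: int):
    # xyz: (B, N, 3), feats: (B, N, C) -> (B, m, 3), (B, m, C)
    B, N, _ = xyz.shape
    idx = torch.stack([torch.randperm(N)[:m] for _ in range(B)])  # (B, m)
    batch = torch.arange(B).unsqueeze(1)                          # (B, 1)
    return xyz[batch, idx], feats[batch, idx]
```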
Step 220: and screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics.
Step 230: and aiming at each down-sampling point, determining corresponding local spatial characteristics according to the point cloud spatial information of the down-sampling point and the corresponding learned characteristics.
Step 240: and determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points.
Step 250: and determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics.
As shown in fig. 7, in step 230, determining a corresponding local spatial feature according to the point cloud spatial information of the down-sampling point and the corresponding learned feature specifically includes:
step 231: and according to the point cloud space information of the down-sampling points and the corresponding learned characteristics, learning the local space context characteristics to obtain the local space context characteristics.
Step 232: and obtaining a feature map based on the shared parameter multi-layer perceptron MLP according to the learned features.
Step 233: and obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map.
In step 231, the local spatial context feature learning is performed according to the point cloud spatial information of the down-sampling point and the corresponding learned features, so as to obtain the local spatial context feature, which specifically includes:
step 2311: and determining the Polar coordinate Representation (LPR) and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the Polar coordinate Representation of the down-sampling points is used for representing the Local space context information of the down-sampling points.
Determining the polar coordinates of the down-sampling points according to the point cloud spatial information of the down-sampling points specifically includes (as shown in FIGS. 2 and 3):

Step A1: obtaining the initial polar coordinates $(\alpha_i^k, \beta_i^k, r_i^k)$ of the down-sampling point according to the following formulas:

$$r_i^k = \sqrt{(\Delta x_i^k)^2 + (\Delta y_i^k)^2 + (\Delta z_i^k)^2}$$

$$\alpha_i^k = \arctan\left(\Delta y_i^k / \Delta x_i^k\right)$$

$$\beta_i^k = \arcsin\left(\Delta z_i^k / r_i^k\right)$$

wherein a K-Nearest-Neighbor (KNN) search based on Euclidean distance yields the K neighbors of the down-sampling point $p_i$, comprising the neighbor points $p_i^k$; $(\Delta x_i^k, \Delta y_i^k, \Delta z_i^k)$ are the relative position coordinates of the k-th neighbor $p_i^k$ in a rectangular spatial coordinate system; $i$ denotes the down-sampling point $p_i$, $k$ denotes the neighbor point $p_i^k$, $k = 1, 2, \ldots, K$, and $K$ denotes the number of neighbor points. In this embodiment, K is 16.
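A minimal PyTorch sketch of Step A1 follows, assuming batched point clouds; the function name and tensor layout are ours. `torch.cdist` supplies the pairwise Euclidean distances used by the KNN search.

```python
import torch

def initial_polar_coordinates(xyz: torch.Tensor, k: int = 16):
    # xyz: (B, N, 3). Returns alpha, beta, r (each (B, N, K)) and the
    # neighbor indices, following the formulas above.
    dists = torch.cdist(xyz, xyz)                      # (B, N, N)
    idx = dists.topk(k, largest=False).indices         # K nearest neighbors
    batch = torch.arange(xyz.size(0)).view(-1, 1, 1)
    rel = xyz[batch, idx] - xyz.unsqueeze(2)           # (Δx, Δy, Δz)
    r = rel.norm(dim=-1).clamp_min(1e-9)               # geometric distance
    alpha = torch.atan2(rel[..., 1], rel[..., 0])      # azimuth angle
    beta = torch.asin((rel[..., 2] / r).clamp(-1, 1))  # elevation angle
    return alpha, beta, r, idx
```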
Step A2: determining the polar angles $\alpha_i$ and $\beta_i$ of the local spatial direction; the local spatial direction points from the down-sampling point $p_i$ to the centroid $\bar{p}_i$ of its K nearest neighbors.

Determining the local spatial direction as pointing from $p_i$ to the neighborhood centroid $\bar{p}_i$ has two main advantages: (1) the centroid effectively reflects the overall profile of the local neighborhood; (2) the averaging operation in the centroid calculation effectively attenuates the random factors introduced by random down-sampling.
Step A3: updating the polar coordinates of the down-sampling point to $(\hat{\alpha}_i^k, \hat{\beta}_i^k, r_i^k)$ according to the following formulas; the updated polar representation has local rotational invariance (as shown in FIGS. 3(a)-(c)):

$$\hat{\alpha}_i^k = \alpha_i^k - \alpha_i, \qquad \hat{\beta}_i^k = \beta_i^k - \beta_i$$

The polar coordinates characterize spatial context information with local rotational invariance. In most practical scenes, objects of the same semantic category appear in different pose orientations, such as seats facing different directions in a conference room; features learned directly from raw points are therefore orientation-sensitive, which can degrade point cloud semantic segmentation in such situations. The invention represents the local spatial context information of points in a spherical polar coordinate system: compared with a rectangular coordinate system, only the angles are orientation-sensitive in polar coordinates. The updated $\hat{\alpha}_i^k$ and $\hat{\beta}_i^k$ are angles relative to the local spatial direction, so their values remain unchanged as the pose orientation changes, and the resulting local spatial context representation is locally rotation-invariant.
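The angle update of Steps A2 and A3 can be sketched in the same assumed PyTorch setting; `rotation_invariant_angles` is a hypothetical helper that reuses the outputs of `initial_polar_coordinates` above.

```python
import torch

def rotation_invariant_angles(xyz, alpha, beta, idx):
    # Compute the angles of the local direction (point -> neighborhood
    # centroid) and subtract them from each neighbor's polar angles.
    batch = torch.arange(xyz.size(0)).view(-1, 1, 1)
    centroid = xyz[batch, idx].mean(dim=2)             # (B, N, 3)
    d = centroid - xyz                                 # local spatial direction
    r_dir = d.norm(dim=-1).clamp_min(1e-9)
    alpha_i = torch.atan2(d[..., 1], d[..., 0])        # (B, N)
    beta_i = torch.asin((d[..., 2] / r_dir).clamp(-1, 1))
    return alpha - alpha_i.unsqueeze(-1), beta - beta_i.unsqueeze(-1)
```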
Step 2312: and obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics.
The Dual-Distance Attentive Pooling (DDAP) method adaptively learns local spatial context features from neighborhood point features. Distance is an important measure of the correlation between points: the smaller the distance, the stronger the correlation. Here the dual distance comprises the geometric distance in physical space and the feature distance in feature space.
As shown in fig. 4 and 5, step 2312 specifically includes:
step B1: and determining the characteristic distance and the geometric characteristic according to the polar coordinates and the learned characteristics of the down-sampling points.
The geometric features are obtained by connecting the polar coordinates and the absolute coordinates of the down-sampling points and then processing the result with a shared-parameter Multi-Layer Perceptron (MLP).
Step B2: determining the weighted fusion parameters $w_i^k$ according to the following formulas (the invention uses a shared-parameter MLP and softmax to adaptively learn the weighted fusion parameters of the neighborhood point features):

$$w_i^k = \mathrm{softmax}\big(\mathrm{MLP}(\hat{f}_i^k)\big)$$

$$\hat{f}_i^k = D_i^k \oplus \hat{g}_i^k$$

$$D_i^k = d_i^k + \lambda\, h_i^k$$

$$h_i^k = \mathrm{mean}\big(\lvert g_i - g_i^k \rvert\big)$$

wherein softmax() denotes the normalized exponential function, MLP() denotes a multi-layer perceptron function, $\hat{f}_i^k$ denotes the connected feature, $D_i^k$ denotes the dual-distance feature, and $\oplus$ denotes the connection operator; $d_i^k$ is the geometric distance of the neighbor point $p_i^k$; $h_i^k$ is the average L1 feature distance of the neighbor point $p_i^k$, determined from the learned features $g_i$ and $g_i^k$; $\lambda$ is the weight adjusting the feature distance term, and mean() denotes the averaging function; $\hat{g}_i^k$ denotes the feature of the neighbor point $p_i^k$ determined from the geometric features and the learned features. In this embodiment, $\lambda$ is 0.1.
Step B3: fusing the neighborhood point features with the weighted fusion parameters $w_i^k$ to obtain the local spatial context feature $f_{iL}$:

$$f_{iL} = \sum_{k=1}^{K} w_i^k \cdot \hat{g}_i^k$$

wherein $\cdot$ denotes the dot product operator, $i$ denotes the down-sampling point $p_i$, $k$ denotes the neighbor point $p_i^k$, $k = 1, 2, \ldots, K$, and $K$ denotes the number of neighbor points.
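Under the same assumptions as the sketches above, the dual-distance weighting and fusion of Steps B1-B3 might look as follows; the module name `DDAP` and the choice of a single linear layer for the shared-parameter MLP are illustrative, and the dual distance is formed as $d_i^k + \lambda h_i^k$ per the formulas above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DDAP(nn.Module):
    def __init__(self, c: int, lam: float = 0.1):
        super().__init__()
        self.lam = lam                       # lambda = 0.1 in this embodiment
        self.score = nn.Linear(c + 1, c)     # shared-parameter MLP

    def forward(self, g_hat, g_center, g_neighbors, d_geo):
        # g_hat: (B, N, K, C) neighbor features built from geometric and
        # learned features; g_center: (B, N, C) learned feature g_i;
        # g_neighbors: (B, N, K, C) learned features g_i^k of the neighbors;
        # d_geo: (B, N, K) geometric distances d_i^k.
        h = (g_center.unsqueeze(2) - g_neighbors).abs().mean(-1)  # L1 mean
        dual = d_geo + self.lam * h                 # D = d + lambda * h
        f_hat = torch.cat([dual.unsqueeze(-1), g_hat], dim=-1)    # D ⊕ g_hat
        w = F.softmax(self.score(f_hat), dim=2)     # weights over the K axis
        return (w * g_hat).sum(dim=2)               # f_iL: (B, N, C)
```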
As shown in FIG. 6, in step 240, the global spatial context feature $f_{iG}$ of the down-sampling point $p_i$ is determined according to the following formulas:

$$f_{iG} = (x_i, y_i, z_i) \oplus r_i, \qquad r_i = v_i / v_g$$

wherein $\oplus$ denotes the connection operator, $(x_i, y_i, z_i)$ are the spatial coordinates of the down-sampling point $p_i$ in a rectangular spatial coordinate system, $r_i$ is a volume ratio, $v_i$ is the volume of the minimum circumscribed sphere of the neighborhood of $p_i$, and $v_g$ is the volume of the minimum circumscribed sphere of the point cloud to be identified.
The local spatial context features effectively describe the contextual information between points within a neighborhood. To obtain more discriminative spatial context features, the invention additionally performs Global Contextual Feature (GCF) learning to learn the global spatial context feature of each point.
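A hedged sketch of the global spatial context feature follows. Computing an exact minimum circumscribed sphere is nontrivial, so this sketch approximates both sphere radii (neighborhood and whole cloud) by maximum distances, which is our assumption rather than the patent's construction; for spheres, the volume ratio then reduces to a cubed radius ratio.

```python
import torch

def global_context_feature(xyz: torch.Tensor,
                           r_neighborhood: torch.Tensor) -> torch.Tensor:
    # xyz: (B, N, 3); r_neighborhood: (B, N) neighborhood sphere radius,
    # e.g. each point's largest KNN distance from the sketches above.
    center = xyz.mean(dim=1, keepdim=True)                     # (B, 1, 3)
    r_global = (xyz - center).norm(dim=-1).max(dim=1).values   # (B,)
    v_ratio = (r_neighborhood / r_global.unsqueeze(1)) ** 3    # v_i / v_g
    return torch.cat([xyz, v_ratio.unsqueeze(-1)], dim=-1)     # (B, N, 4)
```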
Further, in the spatial context feature learning, local spatial context feature learning is performed twice in succession to enlarge the local receptive field; the resulting local spatial context feature is added to a feature map learned directly from the point features with an MLP to obtain the local feature, and the local feature is then connected with the global spatial context feature to obtain the final spatial context feature (as shown in FIG. 7), as sketched below.
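The assembly just described might be wired up as follows; the module name and widths are illustrative, and the two local-context modules are assumed to be DDAP-style blocks mapping (B, N, C) to (B, N, C), as sketched earlier.

```python
import torch
import torch.nn as nn

class SpatialContextFeature(nn.Module):
    def __init__(self, local_ctx1: nn.Module, local_ctx2: nn.Module,
                 c_in: int, c_local: int):
        super().__init__()
        self.local1, self.local2 = local_ctx1, local_ctx2
        self.mlp = nn.Linear(c_in, c_local)   # point-wise feature map

    def forward(self, point_feats: torch.Tensor,
                f_global: torch.Tensor) -> torch.Tensor:
        # Two successive local-context learnings enlarge the receptive field.
        f_local = self.local2(self.local1(point_feats))
        f = f_local + self.mlp(point_feats)      # add the MLP feature map
        return torch.cat([f, f_global], dim=-1)  # append global context
```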
Further, in step 300, four feature decoding layers decode step by step to obtain the corresponding decoded features.
Specifically, step 300 includes:
step 310: and performing up-sampling on the down-sampling points corresponding to the cloud features of each point to obtain a plurality of up-sampling points and corresponding point cloud features.
In the present embodiment, the up-sampling process is performed using a nearest neighbor interpolation algorithm.
Step 320: determining the corresponding decoded features from each pair of up-sampled points and point cloud features using a shared-parameter MLP.
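A minimal sketch of the nearest-neighbor interpolation used in Step 310, under the same assumed tensor layout: each up-sampled point copies the feature of its nearest down-sampled point.

```python
import torch

def nearest_neighbor_upsample(xyz_up: torch.Tensor, xyz_down: torch.Tensor,
                              feats_down: torch.Tensor) -> torch.Tensor:
    # xyz_up: (B, M, 3), xyz_down: (B, m, 3), feats_down: (B, m, C)
    nearest = torch.cdist(xyz_up, xyz_down).argmin(dim=-1)  # (B, M)
    batch = torch.arange(xyz_up.size(0)).unsqueeze(1)       # (B, 1)
    return feats_down[batch, nearest]                       # (B, M, C)
```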
In step 400, a semantic segmentation prediction result of the 3D point cloud to be identified is determined according to the decoding characteristics through the three full-connected layers and the semantic segmentation network model.
The large-scale point cloud semantic segmentation system first trains the semantic segmentation network using a cross-entropy loss function, iteratively optimizing the learned network parameters with an Adam optimizer. In this embodiment, the initial learning rate is set to $10^{-2}$, and after each iteration the learning rate is reduced to 95% of its previous value. The trained model is then used to perform semantic segmentation on large-scale point clouds.
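The training procedure could be sketched as follows; `model` and `loader` are assumed to exist, `num_epochs` is illustrative, and "iteration" is read here as one training epoch (our assumption). `ExponentialLR` with `gamma=0.95` implements the decay to 95%.

```python
import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
criterion = nn.CrossEntropyLoss()

num_epochs = 100                           # illustrative value
for epoch in range(num_epochs):
    for points, labels in loader:          # points: (B, N, d), labels: (B, N)
        logits = model(points)             # (B, N, c) per-point class scores
        loss = criterion(logits.reshape(-1, logits.size(-1)),
                         labels.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                       # lr <- 0.95 * lr after the epoch
```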
Through the fully connected layer that extracts the point-by-point features, the feature dimension is changed from d to 8. Through the four feature encoding layers, the number of points involved in the computation is progressively reduced from the initial N while the feature dimension is progressively increased. Through the four feature decoding layers and three fully connected layers, the information scale becomes N×c, where c is the number of semantic segmentation categories, yielding the N×c semantic segmentation prediction.
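As an illustration of the final stage, the three fully connected layers mapping decoded per-point features to the N×c prediction might look like this; the hidden widths (64, 32) and the class count are placeholders, not values from the patent.

```python
import torch
import torch.nn as nn

class SegmentationHead(nn.Module):
    def __init__(self, c_in: int = 32, num_classes: int = 13):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(c_in, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, num_classes))      # per-point class scores

    def forward(self, decoded: torch.Tensor) -> torch.Tensor:
        return self.head(decoded)            # (B, N, c_in) -> (B, N, c)
```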
In addition, the invention also provides a large-scale point cloud semantic segmentation system which can improve the semantic segmentation precision.
As shown in FIG. 8, the large-scale point cloud semantic segmentation system of the present invention includes an extraction unit 1, an encoding unit 2, a decoding unit 3, and a prediction unit 4.
Specifically: the extraction unit 1 is used for extracting point-by-point features of the point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
the encoding unit 2 is used for gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
the decoding unit 3 is used for gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
the prediction unit 4 is configured to determine a semantic segmentation prediction result of the to-be-identified 3D point cloud based on a semantic segmentation network model according to each decoding feature.
In addition, the invention also provides the following scheme:
a large-scale point cloud semantic segmentation system, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Further, the invention also provides the following scheme:
a computer readable storage medium storing one or more programs that, when executed by an electronic device that includes a plurality of application programs, cause the electronic device to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Compared with the prior art, the large-scale point cloud semantic segmentation system and the computer-readable storage medium have the same beneficial effects as the large-scale point cloud semantic segmentation method, and are not repeated herein.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (7)

1. A large-scale point cloud semantic segmentation method is characterized by comprising the following steps:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified consists of a plurality of points to be identified;
based on the point cloud space information of each point to be identified, gradually encoding each point-by-point feature to obtain a corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the characteristics corresponding to the down-sampling points from the point-by-point characteristics, wherein the screened characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial features according to point cloud spatial information of the down-sampling point and corresponding learned features;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
determining corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features, specifically comprising:
according to the point cloud space information of the down-sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on a shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
according to the point cloud spatial information of the down-sampling points and the corresponding learned characteristics, local spatial context characteristics are learned to obtain local spatial context characteristics, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local space context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
2. The large-scale point cloud semantic segmentation method according to claim 1, wherein the determining the polar coordinates of the downsampling points according to the point cloud space information of the downsampling points specifically comprises:
obtaining the initial polar coordinates $(\alpha_i^k, \beta_i^k, r_i^k)$ of the down-sampling point according to the following formulas:

$$r_i^k = \sqrt{(\Delta x_i^k)^2 + (\Delta y_i^k)^2 + (\Delta z_i^k)^2}$$

$$\alpha_i^k = \arctan\left(\Delta y_i^k / \Delta x_i^k\right)$$

$$\beta_i^k = \arcsin\left(\Delta z_i^k / r_i^k\right)$$

wherein a K-nearest-neighbor (KNN) search based on Euclidean distance yields the K neighbors of the down-sampling point $p_i$, comprising the neighbor points $p_i^k$; $(\Delta x_i^k, \Delta y_i^k, \Delta z_i^k)$ are the relative position coordinates of the k-th neighbor $p_i^k$ in a rectangular spatial coordinate system; $i$ denotes the down-sampling point $p_i$, $k$ denotes the neighbor point $p_i^k$, $k = 1, 2, \ldots, K$, and $K$ denotes the number of neighbor points;

determining the polar angles $\alpha_i$ and $\beta_i$ of the local spatial direction; the local spatial direction points from the down-sampling point $p_i$ to the centroid $\bar{p}_i$ of its K nearest neighbors;

updating the polar coordinates of the down-sampling point to $(\hat{\alpha}_i^k, \hat{\beta}_i^k, r_i^k)$ according to the following formulas, the updated polar representation having local rotational invariance:

$$\hat{\alpha}_i^k = \alpha_i^k - \alpha_i, \qquad \hat{\beta}_i^k = \beta_i^k - \beta_i$$
3. the large-scale point cloud semantic segmentation method according to claim 1, wherein the local spatial context features are obtained by a double-distance-based neighborhood point feature adaptive fusion method according to polar coordinates, geometric distances and learned features, and specifically comprises:
determining a characteristic distance and a geometric characteristic according to the polar coordinates and the learned characteristics of the down-sampling points;
determining the weighted fusion parameters $w_i^k$ according to the following formulas:

$$w_i^k = \mathrm{softmax}\big(\mathrm{MLP}(\hat{f}_i^k)\big)$$

$$\hat{f}_i^k = D_i^k \oplus \hat{g}_i^k$$

$$D_i^k = d_i^k + \lambda\, h_i^k$$

$$h_i^k = \mathrm{mean}\big(\lvert g_i - g_i^k \rvert\big)$$

wherein softmax() denotes the normalized exponential function, MLP() denotes a multi-layer perceptron function, $\hat{f}_i^k$ denotes the connected feature, $D_i^k$ denotes the dual-distance feature, and $\oplus$ denotes the connection operator; $d_i^k$ is the geometric distance of the neighbor point $p_i^k$; $h_i^k$ is the average L1 feature distance of the neighbor point $p_i^k$, determined from the learned features $g_i$ and $g_i^k$; $\lambda$ is the weight adjusting the feature distance term, and mean() denotes the averaging function; $\hat{g}_i^k$ denotes the feature of the neighbor point $p_i^k$ determined from the geometric features and the learned features;

fusing the neighborhood point features with the weighted fusion parameters $w_i^k$ to obtain the local spatial context feature $f_{iL}$:

$$f_{iL} = \sum_{k=1}^{K} w_i^k \cdot \hat{g}_i^k$$

wherein $\cdot$ denotes the dot product operator, $i$ denotes the down-sampling point $p_i$, $k$ denotes the neighbor point $p_i^k$, $k = 1, 2, \ldots, K$, and $K$ denotes the number of neighbor points.
4. The large-scale point cloud semantic segmentation method according to claim 1, wherein the global spatial context feature $f_{iG}$ of the down-sampling point $p_i$ is determined according to the following formulas:

$$f_{iG} = (x_i, y_i, z_i) \oplus r_i, \qquad r_i = v_i / v_g$$

wherein $\oplus$ denotes the connection operator, $(x_i, y_i, z_i)$ are the spatial coordinates of the down-sampling point $p_i$ in a rectangular spatial coordinate system, $r_i$ is a volume ratio, $v_i$ is the volume of the minimum circumscribed sphere of the neighborhood of $p_i$, and $v_g$ is the volume of the minimum circumscribed sphere of the point cloud to be identified.
5. A large-scale point cloud semantic segmentation system, comprising:
the device comprises an extraction unit, a recognition unit and a processing unit, wherein the extraction unit is used for extracting point-by-point characteristics of a point cloud to be recognized, and the point cloud to be recognized is composed of a plurality of points to be recognized;
the coding unit is used for gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain the corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial features according to point cloud spatial information of the down-sampling point and corresponding learned features;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
the method for determining the corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features specifically comprises the following steps:
according to the point cloud space information of the down sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on a shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
according to the point cloud spatial information of the down-sampling points and the corresponding learned characteristics, local spatial context characteristics are learned to obtain local spatial context characteristics, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local space context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
the decoding unit is used for gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and the prediction unit is used for determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
6. A large-scale point cloud semantic segmentation system comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
based on the point cloud space information of each point to be identified, gradually encoding each point-by-point feature to obtain a corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial characteristics according to the point cloud spatial information of the down-sampling point and the corresponding learned characteristics;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
determining corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features, specifically comprising:
according to the point cloud space information of the down sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on a shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
the local spatial context feature learning is carried out according to the point cloud spatial information of the down-sampling points and the corresponding learned features to obtain the local spatial context features, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local space context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
7. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified consists of a plurality of points to be identified;
based on the point cloud space information of each point to be identified, gradually encoding each point-by-point feature to obtain a corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the characteristics corresponding to the down-sampling points from the point-by-point characteristics, wherein the screened characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial characteristics according to the point cloud spatial information of the down-sampling point and the corresponding learned characteristics;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
determining corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features, specifically comprising:
according to the point cloud space information of the down-sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on the shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
the local spatial context feature learning is carried out according to the point cloud spatial information of the down-sampling points and the corresponding learned features to obtain the local spatial context features, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud spatial information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local spatial context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
gradually decoding the cloud characteristics of each point to obtain corresponding decoding characteristics;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
CN202110309423.1A 2021-03-23 2021-03-23 Large-scale point cloud semantic segmentation method and system Active CN113011430B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110309423.1A CN113011430B (en) 2021-03-23 2021-03-23 Large-scale point cloud semantic segmentation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110309423.1A CN113011430B (en) 2021-03-23 2021-03-23 Large-scale point cloud semantic segmentation method and system

Publications (2)

Publication Number Publication Date
CN113011430A CN113011430A (en) 2021-06-22
CN113011430B true CN113011430B (en) 2023-01-20

Family

ID=76405543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110309423.1A Active CN113011430B (en) 2021-03-23 2021-03-23 Large-scale point cloud semantic segmentation method and system

Country Status (1)

Country Link
CN (1) CN113011430B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114782684B (en) * 2022-03-08 2023-04-07 中国科学院半导体研究所 Point cloud semantic segmentation method and device, electronic equipment and storage medium
CN115169556B (en) * 2022-07-25 2023-08-04 美的集团(上海)有限公司 Model pruning method and device

Citations (3)

Publication number Priority date Publication date Assignee Title
CN111489358A (en) * 2020-03-18 2020-08-04 华中科技大学 Three-dimensional point cloud semantic segmentation method based on deep learning
CN111860138A (en) * 2020-06-09 2020-10-30 中南民族大学 Three-dimensional point cloud semantic segmentation method and system based on full-fusion network
CN112396137A (en) * 2020-12-14 2021-02-23 南京信息工程大学 Point cloud semantic segmentation method fusing context semantics

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN108229479B (en) * 2017-08-01 2019-12-31 北京市商汤科技开发有限公司 Training method and device of semantic segmentation model, electronic equipment and storage medium
US11004202B2 (en) * 2017-10-09 2021-05-11 The Board Of Trustees Of The Leland Stanford Junior University Systems and methods for semantic segmentation of 3D point clouds
CN112149677A (en) * 2020-09-14 2020-12-29 上海眼控科技股份有限公司 Point cloud semantic segmentation method, device and equipment

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN111489358A (en) * 2020-03-18 2020-08-04 华中科技大学 Three-dimensional point cloud semantic segmentation method based on deep learning
CN111860138A (en) * 2020-06-09 2020-10-30 中南民族大学 Three-dimensional point cloud semantic segmentation method and system based on full-fusion network
CN112396137A (en) * 2020-12-14 2021-02-23 南京信息工程大学 Point cloud semantic segmentation method fusing context semantics

Non-Patent Citations (1)

Title
SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation; Siqi Fan et al.; 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2021-11-02; full text *

Also Published As

Publication number Publication date
CN113011430A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
CN111798475B (en) Indoor environment 3D semantic map construction method based on point cloud deep learning
CN109685152B (en) Image target detection method based on DC-SPP-YOLO
CN111191566B (en) Optical remote sensing image multi-target detection method based on pixel classification
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
CN110738697A (en) Monocular depth estimation method based on deep learning
CN113011430B (en) Large-scale point cloud semantic segmentation method and system
CN111862289B (en) Point cloud up-sampling method based on GAN network
CN113469094A (en) Multi-mode remote sensing data depth fusion-based earth surface coverage classification method
CN111368769B (en) Ship multi-target detection method based on improved anchor point frame generation model
CN108765475B (en) Building three-dimensional point cloud registration method based on deep learning
CN113139470B (en) Glass identification method based on Transformer
CN111667535B (en) Six-degree-of-freedom pose estimation method for occlusion scene
CN113191387A (en) Cultural relic fragment point cloud classification method combining unsupervised learning and data self-enhancement
CN111291622A (en) Method and device for detecting building change in remote sensing image
CN112819080B (en) High-precision universal three-dimensional point cloud identification method
CN109242019A (en) A kind of water surface optics Small object quickly detects and tracking
CN114723764A (en) Parameterized edge curve extraction method for point cloud object
CN116563682A (en) Attention scheme and strip convolution semantic line detection method based on depth Hough network
CN113420590A (en) Robot positioning method, device, equipment and medium in weak texture environment
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
CN116721206A (en) Real-time indoor scene vision synchronous positioning and mapping method
CN117078956A (en) Point cloud classification segmentation network based on point cloud multi-scale parallel feature extraction and attention mechanism
CN117036425A (en) Point cloud hierarchical decision registration method, system, equipment and medium
CN115944868A (en) Control method for ship-borne fire water monitor
CN114323055B (en) Robot weak rejection area path planning method based on improved genetic algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant