CN115965936A - Edge position marking method and equipment - Google Patents

Edge position marking method and equipment

Info

Publication number
CN115965936A
Authority
CN
China
Prior art keywords
obstacle
rough
cloud data
dimensional
point cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211677152.6A
Other languages
Chinese (zh)
Inventor
袁艺天
张新雨
高立帅
张勃
揭泽群
马林
初祥祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202211677152.6A priority Critical patent/CN115965936A/en
Publication of CN115965936A publication Critical patent/CN115965936A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The application discloses an edge position labeling method and device, belonging to the technical field of computers. The method comprises the following steps: displaying three-dimensional point cloud data of an obstacle detected by a target object; in response to a labeling operation on the three-dimensional point cloud data, acquiring a rough marking frame marked on the obstacle, where the edge of the obstacle lies within the coverage of the rough marking frame; and detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle.

Description

Edge position marking method and equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for edge position labeling.
Background
With the continuous development of autonomous driving technology, a terminal can process the collected three-dimensional point cloud data of the surrounding environment so as to detect obstacles in that environment.
Specifically, the terminal calls an obstacle detection model to detect the obstacles included in the three-dimensional point cloud data, thereby obtaining the obstacles in the surrounding environment. Before the obstacle detection model can be used, it needs to be trained, and the sample data used for training requires a developer to manually label the edge positions of obstacles in the three-dimensional point cloud data, so that the obstacle detection model can be trained on three-dimensional point cloud data with labeled edge positions.
However, when labeling the edge position of an obstacle, the developer must make the annotation fit the obstacle's edge exactly, so the operation efficiency is low.
Disclosure of Invention
The embodiments of the present application provide an edge position labeling method and device, which remove the limitation that a user must manually mark the exact edge position of an obstacle, thereby improving operation efficiency. The technical solution is as follows:
in one aspect, an edge position labeling method is provided, and the method includes:
displaying three-dimensional point cloud data of the obstacle detected by the target object;
responding to the marking operation of the three-dimensional point cloud data, and acquiring a rough marking frame marked on the obstacle, wherein the edge of the obstacle is positioned in the coverage range of the rough marking frame;
and detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle.
In one aspect, an edge position labeling apparatus is provided, the apparatus including:
the display module is used for displaying the three-dimensional point cloud data of the obstacle detected by the target object;
the acquisition module is used for responding to the marking operation of the three-dimensional point cloud data and acquiring a rough marking frame marked on the obstacle, wherein the edge of the obstacle is positioned in the coverage range of the rough marking frame;
and the detection module is used for detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle.
In one possible implementation, the apparatus further includes:
the generating module is used for generating training data based on the edge position of the obstacle and the three-dimensional point cloud data, the training data is used for training an obstacle detection model, and the obstacle detection model is used for obstacle detection.
In one possible implementation manner, the detection module includes:
the determining unit is used for determining a plurality of key points corresponding to the rough mark frame based on the position of each point in the three-dimensional point cloud data and the rough mark frame;
the fusion unit is used for fusing the key point features of the key points corresponding to the rough marking frame based on the obtained key point features of each key point and the position of the rough marking frame to obtain the marking frame features of the rough marking frame;
and the detection unit is used for detecting the characteristics of the mark frame to obtain the edge position of the obstacle corresponding to the rough mark frame.
In a possible implementation manner, the determining unit is configured to:
determining a center of gravity corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data, wherein the center of gravity is any point corresponding to the rough marking frame;
determining the point with the largest distance from the center of gravity, among the plurality of points corresponding to the rough marking frame, as a key point;
for each of the remaining points, respectively acquiring the distance from the point to the center of gravity and to each determined key point, and determining the minimum of these distances as the distance corresponding to the point;
and determining the remaining point with the largest corresponding distance as the next key point, until a first preset number of key points is obtained.
In a possible implementation manner, the determining unit is configured to:
acquiring an average value of the plurality of points corresponding to the rough marking frame, and determining the point corresponding to the average value as the center of gravity;
or,
determining any one of the plurality of points corresponding to the rough marking frame as the center of gravity.
In one possible implementation, the apparatus further includes:
the dividing module is used for dividing the three-dimensional point cloud data into a second preset number of two-dimensional regions from a two-dimensional perspective;
the feature extraction module is used for extracting features of each two-dimensional region in the second preset number of two-dimensional regions to obtain a first feature matrix, and each element in the first feature matrix indicates a first feature of the corresponding two-dimensional region;
the feature extraction module is further configured to perform feature extraction on the first feature matrix to obtain a second feature matrix, where each element in the second feature matrix indicates a second feature of the corresponding two-dimensional region;
and the concatenation module is used for concatenating, for each key point, the first feature corresponding to the two-dimensional region to which the key point belongs in the first feature matrix, the second feature corresponding to that two-dimensional region in the second feature matrix, and the original feature of the key point, to obtain the key point feature of the key point.
In a possible implementation manner, the detection module is further configured to detect the obstacle based on the three-dimensional point cloud data and the rough mark frame, so as to obtain a category of the obstacle.
In a possible implementation manner, the detection module is further configured to detect the obstacle based on the three-dimensional point cloud data and the rough mark frame to obtain a movement direction angle of the obstacle, where the movement direction angle is an included angle between a movement direction of the obstacle and a preset direction.
In a possible implementation manner, the step of detecting the obstacle based on the three-dimensional point cloud data and the rough mark frame to obtain the edge position of the obstacle is implemented based on an edge detection model, and the obtaining module is further configured to obtain a sample edge position of the obstacle corresponding to the rough mark frame in the sample three-dimensional point cloud data;
the detection module is further used for detecting an obstacle based on the edge detection model, the sample three-dimensional point cloud data and the rough mark frame to obtain a predicted edge position of the obstacle corresponding to the rough mark frame;
the device further comprises: a training module for training the edge detection model based on the predicted edge position and the sample edge position.
In one aspect, a terminal is provided and includes one or more processors and one or more memories, where at least one program code is stored in the one or more memories and loaded by the one or more processors and executed to implement the operations performed by the edge position labeling method according to any of the above possible implementations.
In one aspect, a computer-readable storage medium is provided, in which at least one program code is stored, and the at least one program code is loaded by a processor and executed to implement the operations performed by the edge position labeling method according to any one of the above possible implementation manners.
In one aspect, there is provided a computer program or computer program product comprising: computer program code which, when executed by a terminal, causes the terminal to perform the operations performed by the edge location labeling method according to any one of the possible implementations described above.
The application provides an edge position labeling method: a user specifies the position of an obstacle simply by marking a rough marking frame around the obstacle in the three-dimensional point cloud data, and the edge position of the obstacle is then obtained by detection based on the rough marking frame.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
fig. 2 is a flowchart of an edge position labeling method according to an embodiment of the present application;
FIG. 3 is a flowchart of an edge position labeling method according to an embodiment of the present disclosure;
FIG. 4 is a flowchart of another edge position labeling method provided in the embodiments of the present application;
FIG. 5 is a schematic diagram illustrating edge positions and categories of detected obstacles according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an edge position labeling apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of another edge position marking apparatus provided in the embodiments of the present application;
fig. 8 is a schematic structural diagram of a terminal provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It will be understood that the terms "first," "second," and the like as used herein may be used herein to describe various concepts, which are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, the first feature matrix may be referred to as a second feature matrix, and the second feature matrix may be referred to as the first feature matrix, without departing from the scope of the present application.
As used herein, "at least one" includes one, two, or more; "a plurality" includes two or more; "each" refers to every one of the corresponding plurality; and "any" refers to any one of the plurality. For example, if the plurality of key points includes 3 key points, "each" refers to every one of the 3 key points, and "any" refers to any one of them, which may be the first, the second, or the third.
It should be noted that the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, displayed data, etc.), and signals referred to in this application are all authorized by the user or fully authorized by the relevant parties, and the collection, use, and processing of the relevant data comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the positioning information referred to in this application is acquired under sufficient authorization. Moreover, after being processed, the information and data are used in big-data application scenarios and can neither identify any natural person nor be specifically associated with one.
In some embodiments, the edge position labeling method provided in the embodiments of the present application is executed by a terminal.
In other embodiments, the edge position labeling method provided by the embodiments of the present application is executed jointly by a terminal and a server. The server may be a single server, a server cluster composed of a plurality of servers, or a cloud computing service center.
It should be noted that, in the embodiment of the present application, an execution subject of the edge position labeling method is not limited.
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application, and as shown in fig. 1, the implementation environment includes a terminal 101 and a server 102, where the terminal 101 and the server 102 are connected through a wireless or wired network.
The server 102 is a server that provides the terminal 101 with services related to autonomous driving.
In some embodiments, the terminal 101 displays three-dimensional point cloud data of an obstacle detected by a target object. If a labeling operation on the displayed three-dimensional point cloud data is detected, the terminal responds to it by obtaining the rough marking frame marked on the obstacle, and then sends the three-dimensional point cloud data and the rough marking frame to the server. The server detects the obstacle to obtain its edge position, so that an obstacle detection model can be trained based on the edge position of the obstacle and the three-dimensional point cloud data.
Fig. 2 is a flowchart of an edge position labeling method according to an embodiment of the present application. This embodiment is described by taking the terminal as the execution subject, and includes:
201. The terminal displays the three-dimensional point cloud data of the obstacle detected by the target object.
The target object is any object having a moving function; for example, the target object may be a bicycle, an automobile, an unmanned aerial vehicle, or another object, which is not limited in the embodiments of the present application. An obstacle is an object that blocks the movement of the target object. The obstacle may be in a stationary state, in a moving state, or of another type, which is not limited in the embodiments of the present application. For example, the obstacle may be a stationary bicycle, a moving bicycle, a stationary automobile, a moving automobile, or another object, which is not limited in the embodiments of the present application.
The target object can shoot the obstacle in the moving process to obtain three-dimensional point cloud data of the obstacle, and the three-dimensional point cloud data is uploaded to the terminal, so that the three-dimensional point cloud data of the obstacle detected by the target object can be displayed by the terminal.
202. In response to the labeling operation on the three-dimensional point cloud data, the terminal acquires the rough marking frame marked on the obstacle, where the edge of the obstacle is located within the coverage of the rough marking frame.
In this embodiment of the application, if the three-dimensional point cloud data includes an obstacle, the user marks the obstacle by viewing the three-dimensional point cloud data, so that the marked rough marking frame contains the obstacle. Because the coverage of the rough marking frame is larger than the obstacle, the rough marking frame can cover the edge of the obstacle.
203. And the terminal detects the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle.
In the embodiment of the application, the terminal acquires the three-dimensional point cloud data and the rough marking frame, namely the terminal determines the rough position of the obstacle, and then the obstacle is detected based on the three-dimensional point cloud data and the rough marking frame of the obstacle, so that the edge position of the obstacle can be obtained.
The application provides an edge position labeling method: a user specifies the position of an obstacle simply by marking a rough marking frame around the obstacle in the three-dimensional point cloud data, and the edge position of the obstacle is then obtained by detection based on the rough marking frame.
Fig. 3 is a flowchart of an edge position labeling method according to an embodiment of the present application. This embodiment is described by taking the terminal as the execution subject, and includes:
301. The terminal displays the three-dimensional point cloud data of the obstacle detected by the target object.
In some embodiments, the target object is equipped with a radar by which the surrounding environment is detected in order to obtain three-dimensional point cloud data of obstacles in the surrounding environment. Optionally, the terminal is installed with a model training application, and three-dimensional point cloud data of the obstacle detected by the target object is displayed through a labeling interface in the model training application.
302. In response to the labeling operation on the three-dimensional point cloud data, the terminal acquires the rough marking frame marked on the obstacle, where the edge of the obstacle is located within the coverage of the rough marking frame.
In this embodiment of the application, the terminal displays the three-dimensional point cloud data, which includes the shape of the obstacle. The user can therefore perform a framing operation on the three-dimensional point cloud data according to the shape of the obstacle (this can also be understood as the labeling operation on the three-dimensional point cloud data), so that the framed rough marking frame contains the edge of the obstacle and the terminal can acquire the rough marking frame marked by the user.
In some embodiments, when displaying the three-dimensional point cloud data, the terminal may display a top view of it, which can also be understood as displaying the two-dimensional point cloud data of the three-dimensional point cloud data. When performing the labeling operation, the user then only needs to pay attention to the two displayed dimensions and can ignore the third dimension.
For example, the three-dimensional point cloud data includes the three dimensional parameters X, Y, and Z. When the terminal displays a top view of the three-dimensional point cloud data, X and Y are displayed and the Z dimension is ignored, so the user only needs to consider the two dimensions X and Y when labeling.
Optionally, when the terminal displays the top view of the three-dimensional point cloud data, the obstacle is also displayed in top view, so that the user can label the obstacle according to the displayed top view and the marked rough marking frame includes the edge of the obstacle.
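As a minimal sketch of the top-view display described above (assuming the point cloud is held as an (N, 3) NumPy array, which the patent does not specify):

```python
import numpy as np

def top_view(points: np.ndarray) -> np.ndarray:
    """Project an (N, 3) array of XYZ points onto the XY plane.

    The Z coordinate is simply dropped, so the user only needs to
    consider the X and Y dimensions when drawing the rough marking
    frame on the displayed top view.
    """
    return points[:, :2]  # keep X and Y; ignore Z
```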
303. And the terminal determines a plurality of key points corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data and the rough marking frame.
In this embodiment of the application, the three-dimensional point cloud data includes a plurality of points, each with its own position, and some of these points are located in or near the rough marking frame; therefore, the plurality of key points corresponding to the rough marking frame can be determined based on the positions of the points in the three-dimensional point cloud data and the rough marking frame.
In some embodiments, the method for determining the plurality of key points corresponding to the rough marking frame includes: determining the center of gravity corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data, where the center of gravity is any point corresponding to the rough marking frame; determining the point with the largest distance from the center of gravity, among the plurality of points corresponding to the rough marking frame, as a key point; for each of the remaining points, respectively acquiring the distance from the point to the center of gravity and to each determined key point, and taking the minimum of these distances as the distance corresponding to the point; and determining the remaining point with the largest corresponding distance as the next key point, until a first preset number of key points is obtained.
In this embodiment of the application, the three-dimensional point cloud data includes a plurality of points, and since the rough marking frame is located within the three-dimensional point cloud data, it also corresponds to a plurality of points. The center of gravity of the rough marking frame is determined from the positions of the points corresponding to it, and the key points are then determined from the center of gravity. Specifically, the distance from each of the points corresponding to the rough marking frame to the center of gravity is obtained, and the point with the largest distance is determined as the first key point. That key point is then excluded from the candidates, and for each remaining point the minimum of its distances to the center of gravity and to each determined key point is taken; the remaining point whose minimum distance is largest becomes the next key point.
For example, suppose the points corresponding to the rough marking frame are A, B, C, and D, and the preset number of key points to obtain is 2. If the distance from point A to the center of gravity is 1, from point B is 2, from point C is 3, and from point D is 4, then point D is determined as the first key point. When determining the next key point: the distance from point A to the center of gravity is 1 and from point A to point D is 3, so the distance corresponding to point A is 1; the distance from point B to the center of gravity is 2 and from point B to point D is 2, so the distance corresponding to point B is 2; the distance from point C to the center of gravity is 3 and from point C to point D is 1, so the distance corresponding to point C is 1. Point B is thus determined as the second key point.
It should be noted that the center of gravity determined in this embodiment of the application may itself be regarded as a key point corresponding to the rough marking frame; that is, the center of gravity determined by the terminal is the first key point, and the second, third, and subsequent key points are determined in turn until the first preset number of key points is reached.
In some embodiments, the method for determining the key points in the embodiments of the present application may be referred to as a farthest point sampling algorithm; that is, the present application may sample the first preset number of key points using farthest point sampling. Optionally, a partitioned farthest point sampling algorithm is used to sample the first preset number of key points.
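For illustration, a minimal NumPy sketch of this sampling procedure is given below. The array representation, the function name, and the use of the mean position as the center of gravity are assumptions made for the example; as described above, any point corresponding to the rough marking frame may also serve as the center of gravity.

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Sample k key points from the (N, 3) points of a rough marking frame.

    The center of gravity (here, the mean position) is the first reference;
    each iteration then selects the point whose minimum distance to the
    center of gravity and to all previously selected key points is largest.
    """
    center = points.mean(axis=0)                     # center of gravity (assumed: mean)
    # distance from every point to its nearest already-selected reference
    min_dist = np.linalg.norm(points - center, axis=1)
    chosen = []
    for _ in range(k):
        idx = int(np.argmax(min_dist))               # farthest remaining point
        chosen.append(idx)
        d = np.linalg.norm(points - points[idx], axis=1)
        min_dist = np.minimum(min_dist, d)           # update nearest-reference distances
    return points[chosen]
```

On the four points of the example above (distances 1, 2, 3, and 4 from the center of gravity), this procedure selects point D first and then point B, matching the walkthrough.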
Next, how to determine the gravity center of the rough mark box is explained:
in some embodiments, the rough mark frame corresponds to a plurality of points, and an average value of the plurality of points corresponding to the rough mark frame may be obtained, and a point corresponding to the average value may be determined as the center of gravity. In this embodiment, each of the plurality of points corresponding to the rough mark frame corresponds to a position, because an average value of the positions of the plurality of points corresponding to the rough mark frame can be obtained, a point corresponding to the average value is determined, and the point is determined as the center of gravity corresponding to the rough mark frame.
In some embodiments, the terminal determines any one of a plurality of points corresponding to the coarse mark box as the center of gravity. For example, if the plurality of points corresponding to the coarse marker box include A, B, C, D, any point in A, B, C, D can be determined as the center of gravity. For example, a is determined as the center of gravity, or B is determined as the center of gravity, or C is determined as the center of gravity, or D is determined as the center of gravity.
In the scheme provided by this embodiment of the application, the center of gravity is determined either as the average of the plurality of points corresponding to the rough marking frame or as any one of those points, which broadens the ways of determining the center of gravity and improves flexibility.
In some embodiments, the terminal expands the coarse mark frame by a preset multiple, and determines a point included in the expanded coarse mark frame as a point corresponding to the coarse mark frame. The preset multiple is 1.5 times, 1.7 times or other values, and the embodiments of the present application are not limited. For example, the terminal expands the rough mark frame by 1.5 times, so that the expanded rough mark frame may include more points, the points included in the expanded rough mark frame are determined as the points corresponding to the rough mark frame, and further, the characteristics of the rough mark frame may be determined with reference to more points, thereby improving the accuracy of subsequent detection of the obstacle.
In some embodiments, the terminal determines the point included in the rough mark box as the point corresponding to the rough mark box. For example, if the coarse mark frame includes A, B, C three points, A, B, C three points are determined as the points corresponding to the coarse mark frame.
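A sketch of this point-selection step is shown below. It assumes an axis-aligned rough marking frame described by its center and size; the patent does not fix the frame's parameterization, so this is only one possible reading.

```python
import numpy as np

def points_in_expanded_frame(points: np.ndarray,
                             center: np.ndarray,
                             size: np.ndarray,
                             scale: float = 1.5) -> np.ndarray:
    """Return the points that fall inside the rough marking frame after
    it has been enlarged by `scale` (e.g. 1.5x) around its own center."""
    half = size * scale / 2.0                        # expanded half-extents
    inside = np.all(np.abs(points - center) <= half, axis=1)
    return points[inside]
```

With scale = 1.0 this reduces to the simple inclusion test of the second variant.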
It should be noted that, in the embodiment of the present application, step 303 is performed by a keypoint sampling module, that is, the keypoint sampling module may determine a plurality of keypoints corresponding to the rough mark box.
304. And the terminal fuses the key point features of the key points corresponding to the rough mark frame based on the obtained key point features of each key point and the position of the rough mark frame to obtain the mark frame features of the rough mark frame.
Each of the plurality of key points corresponding to the rough marking frame has a key point feature; therefore, after the key point features of these key points are fused, the resulting feature can represent the marking frame feature of the rough marking frame.
In some embodiments, the terminal adopts a region pooling algorithm to fuse a plurality of key point features corresponding to the rough mark frame, so as to obtain the mark frame feature of the rough mark frame. Optionally, the region pooling algorithm is a RoI pooling algorithm, or other pooling algorithms, and the embodiments of the present application are not limited thereto.
It should be noted that step 304 in this embodiment of the present application is performed by a coarse frame feature extraction module, that is, the coarse frame feature extraction module may determine a mark frame feature of the coarse mark frame.
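A simplified sketch of this fusion step follows. The patent specifies a region pooling algorithm such as RoI pooling; the offset encoding and the max-pool over key points used here are assumed stand-ins, not the exact method.

```python
import numpy as np

def frame_feature(keypoint_feats: np.ndarray,   # (K, D) key point features
                  keypoints: np.ndarray,        # (K, 3) key point positions
                  frame_center: np.ndarray) -> np.ndarray:
    """Fuse the key point features of one rough marking frame into a
    single marking frame feature: append each key point's offset from
    the frame center, then pool over the K key points."""
    offsets = keypoints - frame_center            # positions relative to the frame
    fused = np.concatenate([keypoint_feats, offsets], axis=1)
    return fused.max(axis=0)                      # one (D + 3)-dim vector per frame
```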
305. And the terminal detects the characteristics of the marking frame to obtain the edge position of the obstacle corresponding to the rough marking frame.
In the embodiment of the present application, the mark frame feature indicates a feature of a corresponding rough mark frame, and since the rough mark frame includes an obstacle, an edge position of the obstacle corresponding to the rough mark frame can be obtained by detecting the mark frame feature.
The embodiments of the present application are described taking detection of the edge position of an obstacle as an example. In another embodiment, the size of the obstacle can also be detected; that is, the terminal detects the marking frame feature to obtain both the edge position and the size of the obstacle.
Alternatively, the size of the obstacle may be determined based on the edge position of the obstacle. For example, if the edge position of the obstacle is rectangular, the length and width of the obstacle may be determined from the edge position of the obstacle, and the size of the obstacle may be determined by multiplying the length and width. For example, if the edge position of the obstacle is a circle, the diameter of the obstacle may be determined from the edge position of the obstacle, and the size of the obstacle may be determined based on the determined diameter. Alternatively, the obstacle may also have other shapes, and the embodiments of the present application are not described in detail.
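As a worked illustration of the size computations above (the shape encoding is an assumption made for the example; the patent only states that the size is determined from the length and width, or from the diameter):

```python
import math

def obstacle_size(edge) -> float:
    """Derive the obstacle's size from its detected edge position.

    `edge` is assumed to be ("rect", length, width) or ("circle", diameter),
    the two shapes described above.
    """
    kind, *params = edge
    if kind == "rect":
        length, width = params
        return length * width                     # length multiplied by width
    if kind == "circle":
        (diameter,) = params
        return math.pi * (diameter / 2.0) ** 2    # area determined from the diameter
    raise ValueError(f"unsupported edge shape: {kind}")
```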
In some embodiments, the terminal detects the obstacle based on the three-dimensional point cloud data and the rough marking frame, and not only can obtain the edge position of the obstacle, but also can obtain the type of the obstacle. The type of the obstacle is an automobile, a bicycle, a tree, or other types, and the embodiments of the present application are not limited.
In some embodiments, the terminal detects the obstacle based on the three-dimensional point cloud data and the rough mark frame, and not only can obtain the edge position of the obstacle, but also can obtain the moving direction angle of the obstacle. The terminal detects the characteristics of the mark frame to obtain a movement direction angle of the obstacle corresponding to the rough mark frame, wherein the movement direction angle refers to an included angle between the movement direction of the obstacle and a preset direction.
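The movement direction angle can be computed as sketched below; treating both directions as 2D vectors on the ground plane and taking the positive X axis as the preset direction are assumptions made for illustration.

```python
import math

def movement_direction_angle(move_dir, preset_dir=(1.0, 0.0)) -> float:
    """Included angle, in degrees, between the obstacle's movement
    direction and a preset direction, both given as 2D vectors."""
    mx, my = move_dir
    px, py = preset_dir
    angle = math.atan2(my, mx) - math.atan2(py, px)
    return math.degrees(angle) % 360.0
```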
It should be noted that steps 303 to 305 in the embodiment of the present application are one possible implementation manner of step 203 described above. In some embodiments, the step 203 may also be implemented based on a model, that is, the step of detecting the obstacle by the terminal based on the three-dimensional point cloud data and the rough mark frame to obtain the edge position of the obstacle is implemented based on an edge detection model, and the following step of training the edge detection model is described:
the method comprises the steps of obtaining sample edge positions of obstacles corresponding to rough mark frames in sample three-dimensional point cloud data, detecting the obstacles based on an edge detection model, the sample three-dimensional point cloud data and the rough mark frames to obtain predicted edge positions of the obstacles corresponding to the rough mark frames, and training an edge detection model based on the predicted edge positions and the sample edge positions.
In the embodiment of the application, after the sample three-dimensional point cloud data is obtained, the sample three-dimensional point cloud data can be marked to obtain a rough mark frame and a sample edge position in the sample three-dimensional point cloud data, an edge detection model is called to detect the sample three-dimensional point cloud data and the rough mark frame to obtain a predicted edge position of an obstacle corresponding to the rough mark frame, and the edge detection model is trained based on the difference between the sample edge position and the predicted edge position to enable the edge detection model to have the capability of detecting the edge position of the obstacle based on the three-dimensional point cloud data and the rough mark frame.
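A minimal sketch of one training step is given below. The use of PyTorch, the model's call signature, and the smooth L1 regression loss are all assumptions; the patent only states that the model is trained on the difference between the sample edge position and the predicted edge position.

```python
import torch
import torch.nn.functional as F

def train_step(edge_model, optimizer, sample_points, rough_frames, sample_edges):
    """One optimization step: predict edge positions from the sample
    point cloud and rough marking frames, then regress them toward
    the labeled sample edge positions."""
    pred_edges = edge_model(sample_points, rough_frames)       # predicted edges
    loss = F.smooth_l1_loss(pred_edges, sample_edges)          # difference to labels
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```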
In some embodiments, the edge detection model also has the ability to detect the category of an obstacle. In this case, the sample edge position and the sample category of the obstacle corresponding to the rough marking frame in the sample three-dimensional point cloud data are obtained; the obstacle is detected based on the edge detection model, the sample three-dimensional point cloud data, and the rough marking frame to obtain the predicted edge position and the predicted category of the obstacle corresponding to the rough marking frame; and the edge detection model is trained based on the predicted edge positions, the predicted categories, the sample edge positions, and the sample categories.
In this embodiment of the application, after the sample three-dimensional point cloud data is obtained, it can be labeled to obtain the rough marking frame, the sample edge position, and the sample category of the obstacle. The edge detection model is called to detect the sample three-dimensional point cloud data and the rough marking frame to obtain the predicted edge position and predicted category of the obstacle corresponding to the rough marking frame, and the edge detection model is trained based on the difference between the sample edge position and the predicted edge position and the difference between the sample category and the predicted category, so that it acquires the ability to detect the edge position and category of an obstacle based on three-dimensional point cloud data and a rough marking frame.
In some embodiments, the edge detection model further has a capability of detecting a moving direction of an obstacle, and in the process of training the edge detection model, a sample moving direction of the obstacle in the sample three-dimensional point cloud data needs to be acquired, the obstacle is detected based on the edge detection model, the sample three-dimensional point cloud data and the rough mark frame, a predicted moving direction of the obstacle corresponding to the rough mark frame is obtained, and the edge detection model is trained based on the sample moving direction and the predicted moving direction.
The above embodiments directly take the key point features as given. In still other embodiments, the key point feature acquisition process includes: dividing the three-dimensional point cloud data, from a two-dimensional perspective, into a second preset number of two-dimensional regions; performing feature extraction on each of these two-dimensional regions to obtain a first feature matrix, in which each element indicates the first feature of the corresponding two-dimensional region; performing feature extraction on the first feature matrix to obtain a second feature matrix, in which each element indicates the second feature of the corresponding two-dimensional region; and, for each key point, concatenating the first feature of the two-dimensional region to which the key point belongs in the first feature matrix, the second feature of that region in the second feature matrix, and the original feature of the key point, to obtain the key point feature of the key point.
In some embodiments, the raw features of the keypoint include features in four dimensions, X-axis, Y-axis, Z-axis, and intensity. The intensity refers to a feature of one dimension included in the three-dimensional point cloud data.
In some embodiments, for each two-dimensional region in the second preset number of two-dimensional regions, a PointNet algorithm is used for feature extraction, so as to obtain first features corresponding to the second preset number of two-dimensional regions.
In some embodiments, a 2D full convolution backbone network is used to perform feature extraction on the first feature matrix to obtain a second feature matrix, and since each element in the first feature matrix corresponds to a first feature of the two-dimensional region, after the feature extraction is performed on the first feature matrix, each element in the obtained second feature matrix corresponds to a second feature of the two-dimensional region.
For example, the three-dimensional point cloud data is divided into M × N two-dimensional regions on the XY plane; the regions are not divided along the Z axis, that is, each two-dimensional region includes all points along its corresponding Z axis. The PointNet algorithm is used to extract a D1-dimensional feature for each two-dimensional region, forming an M × N × D1 first feature matrix over the M × N two-dimensional regions. The first feature matrix is input to a 2D fully convolutional backbone network to obtain an M × N × D2 second feature matrix. Then, for each key point, the first feature of the two-dimensional region it belongs to in the first feature matrix, the corresponding second feature in the second feature matrix, and the original feature of the key point are concatenated to obtain a key point feature of dimension D1 + D2 + 4.
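The concatenation in the example above can be sketched as follows. The grid parameterization (origin and cell size) is an assumption, since the patent fixes only the M × N division of the XY plane.

```python
import numpy as np

def keypoint_features(keypoints_xyzi: np.ndarray,   # (K, 4): X, Y, Z, intensity
                      first_feats: np.ndarray,      # (M, N, D1) region features
                      second_feats: np.ndarray,     # (M, N, D2) backbone output
                      x_min: float, y_min: float,
                      cell_x: float, cell_y: float) -> np.ndarray:
    """For each key point, concatenate the D1 feature of the 2D region
    it falls in, that region's D2 backbone feature, and the key point's
    own raw 4-dimensional feature, giving D1 + D2 + 4 dimensions."""
    rows = ((keypoints_xyzi[:, 0] - x_min) / cell_x).astype(int)  # region row index
    cols = ((keypoints_xyzi[:, 1] - y_min) / cell_y).astype(int)  # region column index
    return np.concatenate(
        [first_feats[rows, cols], second_feats[rows, cols], keypoints_xyzi],
        axis=1,
    )  # shape (K, D1 + D2 + 4)
```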
It should be noted that, in the embodiment of the present application, a two-dimensional region is taken as an example for description, and in another embodiment, a preset number of voxels are obtained after the three-dimensional point cloud data is divided, and then feature extraction is performed on each voxel, so as to determine the key point features of the key points based on the extracted features.
Optionally, the step of extracting the features of a preset number of voxels is performed by a voxel feature extraction module, that is, the voxel feature extraction module may extract the features of the voxels. Optionally, the step of extracting the feature of the key point is performed by a key point feature extraction module, that is, the key point feature extraction module may extract the feature of the key point. Alternatively, the step 305 is performed by a detection head module, that is, the detection head module may detect the edge position of the obstacle corresponding to the rough mark frame.
306. The terminal generates training data based on the edge position of the obstacle and the three-dimensional point cloud data, the training data is used for training an obstacle detection model, and the obstacle detection model is used for detecting the obstacle.
In the embodiment of the application, the obstacle detection model needs to be trained so that the obstacle detection model can have the capability of detecting the obstacle by using the three-dimensional point cloud data, therefore, training data is generated based on the edge position of the obstacle and the three-dimensional point cloud data, and the obstacle detection model is trained by using the training data.
In some embodiments, the training data serves as sample data; that is, the edge position of the obstacle included in the training data is the label. In the process of training the obstacle detection model, the model is called to detect the three-dimensional point cloud data in the training data to obtain a predicted edge position of the obstacle, and the model is then trained based on the edge position included in the training data and the predicted edge position, yielding an obstacle detection model with obstacle detection capability.
In some embodiments, the terminal may further detect the obstacle based on the three-dimensional point cloud data and the rough mark frame to obtain a type of the obstacle, so that the terminal generates training data based on the edge position of the obstacle, the type of the obstacle, and the three-dimensional point cloud data, and the training data is used for training the obstacle detection model.
In some embodiments, the training data serves as sample data; that is, both the edge position and the category of the obstacle included in the training data are labels. In the process of training the obstacle detection model, the model is called to detect the three-dimensional point cloud data in the training data to obtain the predicted edge position and predicted category of the obstacle, and the model is then trained based on the edge position and category included in the training data together with the predicted edge position and predicted category, yielding an obstacle detection model with obstacle detection capability.
The scheme of the present application is illustrated with reference to Fig. 4. The terminal obtains the three-dimensional point cloud data and the rough marking frame. Based on these, key point sampling is performed to obtain a first preset number of key points. The three-dimensional point cloud data is also divided into a plurality of voxels, and voxel feature extraction is performed on them to obtain the first feature matrix, after which backbone network feature extraction is performed on the first feature matrix to obtain the second feature matrix. Key point feature extraction is then performed based on the obtained key points, the first feature matrix, and the second feature matrix to obtain the key point features; the marking frame feature of the rough marking frame is extracted based on the key point features; and the marking frame feature is detected to obtain the edge position and the category of the obstacle. Referring to Fig. 5, the dashed box in Fig. 5 is the rough marking frame of the obstacle marked by the user, and the solid box is the edge position of the obstacle detected by the method provided in the present application; the category of the obstacle is also displayed.
According to the edge position labeling method provided by the embodiments of the present application, a user specifies the position of an obstacle by marking a rough marking frame around it in the three-dimensional point cloud data. The marking frame feature of the rough marking frame is then determined based on the key point features of the key points in the rough marking frame; since this feature represents the obstacle contained in the rough marking frame, detecting it yields the edge position of the obstacle, which can subsequently be used to train the obstacle detection model.
In addition, in the embodiments of the present application, the three-dimensional point cloud data is divided into two-dimensional regions and the key point features are extracted based on these regions, which ensures that the obtained key point features match the features of the obstacle included in the three-dimensional point cloud data, improves the accuracy of the key point features, and thereby improves the accuracy of obstacle detection.
In addition, in the scheme provided by this embodiment of the application, the center of gravity is determined either as the average of the plurality of points corresponding to the rough marking frame or as any one of those points, which broadens the ways of determining the center of gravity and improves flexibility.
Fig. 6 is a schematic structural diagram of an edge position labeling device provided in an embodiment of the present application, and referring to fig. 6, the edge position labeling device includes:
a display module 601, configured to display three-dimensional point cloud data of an obstacle detected by a target object;
an obtaining module 602, configured to obtain, in response to a labeling operation on the three-dimensional point cloud data, a rough label frame labeled on the obstacle, where an edge of the obstacle is located in a coverage area of the rough label frame;
a detecting module 603, configured to detect the obstacle based on the three-dimensional point cloud data and the rough mark frame, to obtain an edge position of the obstacle.
In one possible implementation, referring to fig. 7, the apparatus further includes:
a generating module 604, configured to generate training data based on the edge position of the obstacle and the three-dimensional point cloud data, where the training data is used to train an obstacle detection model, and the obstacle detection model is used to perform obstacle detection.
In a possible implementation manner, the detecting module 603 includes:
a determining unit 6031, configured to determine, based on the position of each point in the three-dimensional point cloud data and the rough label frame, a plurality of key points corresponding to the rough label frame;
a fusion unit 6032, configured to fuse, based on the obtained key point feature of each key point and the position of the rough tag frame, the key point features of the key points corresponding to the rough tag frame to obtain a tag frame feature of the rough tag frame;
a detecting unit 6033, configured to detect the feature of the mark frame, and obtain an edge position of the obstacle corresponding to the rough mark frame.
In one possible implementation, the determining unit 6031 is configured to:
determining a center of gravity corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data, wherein the center of gravity is any point corresponding to the rough marking frame;
determining the point with the largest distance from the center of gravity, among the plurality of points corresponding to the rough marking frame, as a key point;
for each of the remaining points, respectively acquiring the distance from the point to the center of gravity and to each determined key point, and determining the minimum of these distances as the distance corresponding to the point;
and determining the remaining point with the largest corresponding distance as the next key point, until a first preset number of key points is obtained.
In one possible implementation, the determining unit 6031 is configured to:
acquiring an average value of the plurality of points corresponding to the rough marking frame, and determining the point corresponding to the average value as the center of gravity;
or,
determining any one of the plurality of points corresponding to the rough marking frame as the center of gravity.
In one possible implementation, referring to fig. 7, the apparatus further includes:
a dividing module 605, configured to divide the three-dimensional point cloud data into a second preset number of two-dimensional regions from a two-dimensional perspective;
a feature extraction module 606, configured to perform feature extraction on each two-dimensional region in the second preset number of two-dimensional regions to obtain a first feature matrix, where each element in the first feature matrix indicates a first feature of a corresponding two-dimensional region;
the feature extraction module 606 is further configured to perform feature extraction on the first feature matrix to obtain a second feature matrix, where each element in the second feature matrix indicates a second feature of the corresponding two-dimensional region;
a concatenation module 607, configured to concatenate, for each of the plurality of key points, the first feature corresponding to the two-dimensional region to which the key point belongs in the first feature matrix, the second feature corresponding to that two-dimensional region in the second feature matrix, and the original feature of the key point, to obtain the key point feature of the key point.
In a possible implementation manner, the detecting module 603 is further configured to detect the obstacle based on the three-dimensional point cloud data and the rough mark frame, so as to obtain a category of the obstacle.
In a possible implementation manner, the detecting module 603 is further configured to detect the obstacle based on the three-dimensional point cloud data and the rough mark frame to obtain a moving direction angle of the obstacle, where the moving direction angle is an included angle between a moving direction of the obstacle and a preset direction.
In a possible implementation manner, the step of detecting the obstacle based on the three-dimensional point cloud data and the rough mark frame to obtain the edge position of the obstacle is implemented based on an edge detection model, and the obtaining module 602 is further configured to obtain a sample edge position of the obstacle corresponding to the rough mark frame in the sample three-dimensional point cloud data;
the detection module 603 is further configured to detect an obstacle based on the edge detection model, the sample three-dimensional point cloud data, and the rough marker frame, so as to obtain a predicted edge position of the obstacle corresponding to the rough marker frame;
the device further comprises: a training module 608, configured to train the edge detection model based on the predicted edge position and the sample edge position.
It should be noted that: the edge position labeling apparatus provided in the foregoing embodiment is only illustrated by dividing the functional modules when generating the training data, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the terminal is divided into different functional modules to complete all or part of the functions described above. In addition, the edge position labeling device provided by the above embodiment and the edge position labeling method embodiment belong to the same concept, and specific implementation processes thereof are described in the method embodiment and are not described herein again.
Fig. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal 800 includes: a processor 801 and a memory 802.
Processor 801 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so forth. The processor 801 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 801 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 801 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 801 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 802 may include one or more computer-readable storage media, which may be non-transitory. Memory 802 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 802 is used for storing at least one program code for execution by the processor 801 to implement the edge location labeling method provided by the method embodiments in the present application.
In some embodiments, the terminal 800 may further include: a peripheral interface 803 and at least one peripheral. The processor 801, memory 802 and peripheral interface 803 may be connected by bus or signal lines. Various peripheral devices may be connected to the peripheral interface 803 by a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 804, a display 805, a camera 806, an audio circuit 807, a positioning component 808, and a power supply 809.
The peripheral interface 803 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 801 and the memory 802. In some embodiments, the processor 801, memory 802, and peripheral interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral interface 803 may be implemented on separate chips or circuit boards, which are not limited by this embodiment.
The Radio Frequency circuit 804 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 804 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 804 converts an electrical signal into an electromagnetic signal to be transmitted, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 804 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 804 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 804 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 805 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display 805 is a touch display, it also has the ability to capture touch signals on or above its surface; such a touch signal may be input to the processor 801 as a control signal for processing. In this case, the display 805 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 805, disposed on the front panel of the terminal 800; in other embodiments, there may be at least two displays 805, respectively disposed on different surfaces of the terminal 800 or in a foldable design; in still other embodiments, the display 805 may be a flexible display disposed on a curved or folded surface of the terminal 800. The display 805 may even be arranged in a non-rectangular, irregular shape, i.e., an irregularly shaped screen. The display 805 may be an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or the like.
The camera assembly 806 is used to capture images or video. Optionally, the camera assembly 806 includes a front camera and a rear camera, with the front camera disposed on the front panel of the terminal and the rear camera on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 806 may also include a flash, which may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuitry 807 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs them to the processor 801 for processing or to the radio frequency circuit 804 for voice communication. For stereo sound collection or noise reduction, multiple microphones may be provided at different parts of the terminal 800; the microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker converts electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves, and may be a traditional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert an electrical signal not only into sound waves audible to humans but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuitry 807 may also include a headphone jack.
The positioning component 808 is used to locate the current geographic location of the terminal 800 for navigation or LBS (Location Based Service). The positioning component 808 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 809 is used to supply power to the various components in the terminal 800. The power supply 809 may be an alternating current source, a direct current source, a disposable battery, or a rechargeable battery. When the power supply 809 includes a rechargeable battery, the battery may support wired or wireless charging, and may also support fast-charging technology.
In some embodiments, the terminal 800 further includes one or more sensors 810, including but not limited to: an acceleration sensor 811, a gyro sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
The acceleration sensor 811 may detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with respect to the terminal 800. For example, the acceleration sensor 811 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 801 may control the display 805 to display the user interface in a landscape or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 811. The acceleration sensor 811 may also be used to collect motion data of a game or of the user.
The gyro sensor 812 may detect the body direction and rotation angle of the terminal 800, and may cooperate with the acceleration sensor 811 to capture the user's 3D actions on the terminal 800. From the data collected by the gyro sensor 812, the processor 801 may implement functions such as motion sensing (for example, changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 813 may be disposed on the side frame of the terminal 800 and/or beneath the display 805. When disposed on the side frame, it can detect the user's holding signal on the terminal 800, and the processor 801 performs left/right-hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 813. When disposed beneath the display screen 805, the processor 801 controls an operability control on the UI according to the user's pressure operation on the display screen 805. The operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 814 is used to collect the user's fingerprint, and the processor 801 identifies the user according to the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 itself identifies the user according to the collected fingerprint. Upon identifying the user as trusted, the processor 801 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 814 may be disposed on the front, back, or side of the terminal 800. When a physical button or a vendor logo is provided on the terminal 800, the fingerprint sensor 814 may be integrated with the physical button or the vendor logo.
The optical sensor 815 is used to collect the ambient light intensity. In one embodiment, the processor 801 may control the display brightness of the display 805 based on the ambient light intensity collected by the optical sensor 815: when the ambient light intensity is high, the display brightness of the display 805 is turned up; when it is low, the display brightness is turned down. In another embodiment, the processor 801 may also dynamically adjust the shooting parameters of the camera assembly 806 based on the ambient light intensity collected by the optical sensor 815.
The proximity sensor 816, also called a distance sensor, is disposed on the front panel of the terminal 800 and is used to collect the distance between the user and the front of the terminal 800. In one embodiment, when the proximity sensor 816 detects that this distance gradually decreases, the processor 801 controls the display 805 to switch from the bright-screen state to the off-screen state; when the proximity sensor 816 detects that the distance gradually increases, the processor 801 controls the display 805 to switch from the off-screen state to the bright-screen state.
Those skilled in the art will appreciate that the structure shown in Fig. 8 does not constitute a limitation on the terminal 800, which may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
Fig. 9 is a schematic structural diagram of a server according to an embodiment of the present application. The server 900 may vary greatly in configuration or performance, and may include one or more processors (CPUs) 901 and one or more memories 902, where the memory 902 stores at least one program code that is loaded and executed by the processor 901 to implement the methods provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described herein again.
The server 900 is configured to perform the steps performed by the server in the above method embodiments.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including program code executable by a processor in a computer device to perform the edge position labeling method in the above embodiments. For example, the computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program or computer program product is also provided, including computer program code which, when executed by a computer, causes the computer to implement the edge position labeling method in the above embodiments.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disk, or the like.
The above description is only exemplary of the present application and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall be included in the protection scope of the present application.

Claims (10)

1. An edge position labeling method, characterized by comprising:
displaying three-dimensional point cloud data of the obstacle detected by the target object;
responding to the marking operation of the three-dimensional point cloud data, and acquiring a rough marking frame marked on the obstacle, wherein the edge of the obstacle is positioned in the coverage range of the rough marking frame;
and detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle.
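For illustration only (not part of the claimed subject matter), the flow of claim 1 can be sketched in Python. The axis-aligned box representation and the min/max refinement below are assumptions made for the sketch; in the claimed method the refinement is performed by a learned detector rather than a simple crop:

```python
import numpy as np

def refine_rough_box(points: np.ndarray, rough_min: np.ndarray,
                     rough_max: np.ndarray) -> tuple:
    """points: (N, 3) three-dimensional point cloud; [rough_min, rough_max]:
    an axis-aligned rough marking frame that loosely covers the obstacle.
    Returns a tight box spanned by the points inside the rough frame,
    standing in for the detected edge position."""
    mask = np.all((points >= rough_min) & (points <= rough_max), axis=1)
    obstacle = points[mask]                       # points covered by the rough frame
    if obstacle.size == 0:
        return rough_min, rough_max               # nothing inside: keep the rough frame
    return obstacle.min(axis=0), obstacle.max(axis=0)
```

Even in this toy version the point of the scheme is visible: the annotator only needs the rough frame to cover the obstacle, and the tight edge position is derived automatically.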
2. The method of claim 1, further comprising:
and generating training data based on the edge position of the obstacle and the three-dimensional point cloud data, wherein the training data is used for training an obstacle detection model, and the obstacle detection model is used for obstacle detection.
3. The method of claim 1, wherein detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle comprises:
determining a plurality of key points corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data and the rough marking frame;
fusing the key point features of the key points corresponding to the rough marking frame based on the obtained key point feature of each key point and the position of the rough marking frame, to obtain a marking frame feature of the rough marking frame;
and detecting the marking frame feature to obtain the edge position of the obstacle corresponding to the rough marking frame.
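A minimal sketch of the fusion step of claim 3, assuming max-pooling as the fusion operator and the box center as the "position of the rough marking frame" (both are assumptions; the claim fixes neither choice):

```python
import numpy as np

def marking_frame_feature(keypoint_feats: np.ndarray,
                          box_center: np.ndarray) -> np.ndarray:
    """keypoint_feats: (K, D) features of the K key points of one rough
    marking frame; box_center: (3,) position of the frame. Returns one
    fused feature vector per rough marking frame."""
    pooled = keypoint_feats.max(axis=0)           # aggregate over the K key points
    return np.concatenate([pooled, box_center])   # fuse with the frame position
```

The fused vector is what a detection head would consume to regress the edge position of the obstacle corresponding to the frame.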
4. The method of claim 3, wherein determining a plurality of key points corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data and the rough marking frame comprises:
determining a center of gravity corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data, wherein the center of gravity is any point corresponding to the rough marking frame;
determining, among a plurality of points corresponding to the rough marking frame, the point with the largest distance from the center of gravity as a key point;
for each remaining point, respectively obtaining the distance between the point and the center of gravity and the distance between the point and each determined key point, and determining the minimum of these distances as the distance corresponding to the point;
and determining, among the remaining points, the point with the largest corresponding distance as a key point, until a first preset number of key points is obtained.
5. The method of claim 4, wherein determining the center of gravity corresponding to the rough marking frame based on the position of each point in the three-dimensional point cloud data comprises:
acquiring an average value of a plurality of points corresponding to the rough marking frame, and determining the point corresponding to the average value as the center of gravity;
or,
determining any one of the plurality of points corresponding to the rough marking frame as the center of gravity.
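Taken together, claims 4 and 5 describe farthest point sampling seeded at the center of gravity. The following is a minimal sketch in Python, offered for illustration only and assuming the center of gravity is the mean of the points corresponding to the rough marking frame (the first option of claim 5):

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Claims 4-5 as code: points is an (N, 3) array of the points
    corresponding to the rough marking frame; returns the indices of
    k key points chosen by farthest point sampling."""
    centroid = points.mean(axis=0)                 # center of gravity (claim 5, mean option)
    # Each point's minimum distance to the current sample set (initially just the centroid).
    dists = np.linalg.norm(points - centroid, axis=1)
    chosen = []
    for _ in range(k):
        idx = int(np.argmax(dists))                # farthest remaining point becomes a key point
        chosen.append(idx)
        d_new = np.linalg.norm(points - points[idx], axis=1)
        dists = np.minimum(dists, d_new)           # keep the minimum distance per point (claim 4)
    return np.array(chosen)
```

A chosen point's running distance drops to zero, so it is never selected twice; the loop stops once the first preset number k of key points has been obtained.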
6. The method of claim 3, further comprising:
dividing the three-dimensional point cloud data into a second preset number of two-dimensional regions from a two-dimensional view;
performing feature extraction on each of the second preset number of two-dimensional regions to obtain a first feature matrix, wherein each element in the first feature matrix indicates a first feature of the corresponding two-dimensional region;
performing feature extraction on the first feature matrix to obtain a second feature matrix, wherein each element in the second feature matrix indicates a second feature of the corresponding two-dimensional region;
and for each of the plurality of key points, concatenating the first feature corresponding to the two-dimensional region to which the key point belongs in the first feature matrix, the second feature corresponding to that two-dimensional region in the second feature matrix, and the original feature of the key point, to obtain the key point feature of the key point.
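As one illustrative reading of claim 6 (every concrete choice below is an assumption, not a limitation of the claim): the two-dimensional regions are taken as a bird's-eye-view grid over the point cloud, the first feature of a region as the mean height of its points, and the second feature extraction as a 3x3 mean filter standing in for a learned convolution over the first feature matrix:

```python
import numpy as np

def keypoint_features(points, keypoint_idx, grid=(16, 16),
                      x_rng=(-40.0, 40.0), y_rng=(-40.0, 40.0)):
    """points: (N, 3) point cloud; keypoint_idx: indices of the key points.
    Returns per-key-point features: [first feature, second feature,
    original coordinates], as in claim 6."""
    gx, gy = grid
    ix = np.clip(((points[:, 0] - x_rng[0]) / (x_rng[1] - x_rng[0]) * gx).astype(int), 0, gx - 1)
    iy = np.clip(((points[:, 1] - y_rng[0]) / (y_rng[1] - y_rng[0]) * gy).astype(int), 0, gy - 1)

    # First feature matrix: mean height (z) per two-dimensional region.
    first = np.zeros(grid)
    count = np.zeros(grid)
    np.add.at(first, (ix, iy), points[:, 2])
    np.add.at(count, (ix, iy), 1)
    first = np.divide(first, count, out=np.zeros_like(first), where=count > 0)

    # Second feature matrix: features extracted from the first feature matrix.
    pad = np.pad(first, 1)
    second = sum(pad[dx:dx + gx, dy:dy + gy] for dx in range(3) for dy in range(3)) / 9.0

    feats = []
    for i in keypoint_idx:
        cell = (ix[i], iy[i])
        # Concatenate first feature, second feature, and the original feature.
        feats.append(np.concatenate([[first[cell]], [second[cell]], points[i]]))
    return np.stack(feats)
```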
7. The method of claim 1, further comprising:
and detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the category of the obstacle.
8. The method of claim 1, further comprising:
and detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain a moving direction angle of the obstacle, wherein the moving direction angle is an included angle between the moving direction of the obstacle and a preset direction.
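The moving direction angle of claim 8 is plain geometry; a sketch assuming the preset direction is the x-axis (the claim leaves the preset direction open):

```python
import numpy as np

def moving_direction_angle(velocity_xy: np.ndarray,
                           preset_dir=np.array([1.0, 0.0])) -> float:
    """Angle (in radians) between the obstacle's moving direction and a
    preset direction."""
    v = velocity_xy / np.linalg.norm(velocity_xy)
    d = preset_dir / np.linalg.norm(preset_dir)
    return float(np.arccos(np.clip(np.dot(v, d), -1.0, 1.0)))
```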
9. The method according to any one of claims 1 to 8, wherein the step of detecting the obstacle based on the three-dimensional point cloud data and the rough marking frame to obtain the edge position of the obstacle is implemented based on an edge detection model, and the method further comprises:
acquiring a sample edge position of an obstacle corresponding to a rough marking frame in sample three-dimensional point cloud data;
detecting an obstacle based on the edge detection model, the sample three-dimensional point cloud data and the rough marking frame to obtain a predicted edge position of the obstacle corresponding to the rough marking frame;
training the edge detection model based on the predicted edge positions and the sample edge positions.
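A sketch of the training scheme of claim 9 in Python with PyTorch, assuming a hypothetical model object that maps (point cloud, rough marking frame) to a predicted edge position; the smooth L1 loss and the Adam optimizer are assumptions, since the claim only requires training on predicted versus sample edge positions:

```python
import torch

def train_edge_detection_model(model, loader, epochs=10, lr=1e-3):
    """loader yields (points, rough_box, sample_edge) triples built from
    sample three-dimensional point cloud data."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.SmoothL1Loss()
    for _ in range(epochs):
        for points, rough_box, sample_edge in loader:
            pred_edge = model(points, rough_box)    # predicted edge position
            loss = loss_fn(pred_edge, sample_edge)  # compare with the sample edge position
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```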
10. A terminal, characterized in that the terminal comprises one or more processors and one or more memories, in which at least one program code is stored, which is loaded and executed by the one or more processors to implement the operations performed by the edge position labeling method according to any one of claims 1 to 9.
CN202211677152.6A 2022-12-26 2022-12-26 Edge position marking method and equipment Pending CN115965936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211677152.6A CN115965936A (en) 2022-12-26 2022-12-26 Edge position marking method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211677152.6A CN115965936A (en) 2022-12-26 2022-12-26 Edge position marking method and equipment

Publications (1)

Publication Number Publication Date
CN115965936A true CN115965936A (en) 2023-04-14

Family

ID=87361051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211677152.6A Pending CN115965936A (en) 2022-12-26 2022-12-26 Edge position marking method and equipment

Country Status (1)

Country Link
CN (1) CN115965936A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116912950A (en) * 2023-09-12 2023-10-20 湖北星纪魅族科技有限公司 Identification method, head-mounted device and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination