CN117197419A - Radar point cloud labeling method and device, electronic equipment and storage medium - Google Patents

Radar point cloud labeling method and device, electronic equipment and storage medium

Info

Publication number
CN117197419A
CN117197419A (application CN202311163344.XA)
Authority
CN
China
Prior art keywords
numerical value, dimensional, radar, frame, coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311163344.XA
Other languages
Chinese (zh)
Inventor
马骏 (Ma Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202311163344.XA priority Critical patent/CN117197419A/en
Publication of CN117197419A publication Critical patent/CN117197419A/en
Pending legal-status Critical Current

Landscapes

  • Radar Systems Or Details Thereof (AREA)

Abstract

The embodiment of the application provides a labeling method and device for a radar point cloud, electronic equipment and a computer-readable storage medium, and relates to the field of computer vision. According to the method, a reference point set is determined from the radar point cloud according to the coordinates of the radar point cloud and of a two-dimensional annotation frame in a world coordinate system; for each coordinate axis, the numerical intervals in which the components of the radar points in the reference point set lie are determined, and the number of radar points in each numerical interval is counted; the numerical intervals of each coordinate axis are then sorted by value and traversed from the two ends of the sequence toward the middle, and when each end reaches a target interval whose radar point count is greater than a preset threshold, the edges of the three-dimensional bounding box of the target object on that coordinate axis are determined from the coordinate values corresponding to the two target intervals. The embodiment of the application saves a great amount of manual labeling cost and improves labeling efficiency in automatic driving scenes by 5-10 times.

Description

Radar point cloud labeling method and device, electronic equipment and storage medium
Technical Field
The application relates to the field of computer vision, in particular to a labeling method and device of radar point cloud, electronic equipment, a computer readable storage medium and a computer program product.
Background
In the automatic driving field, in order to train automatic driving algorithm models, companies need to use vehicle-mounted radars to sample the real world while driving. The acquired laser radar data is called "point cloud data" or a "radar point cloud", and each radar point contains information such as coordinates and reflection intensity.
Before the radar point cloud is fed to the algorithm model, the data needs to be cleaned and labeled: objects are accurately marked in the point cloud image so that the resulting structured data can be fed to the algorithm and achieve a good model training effect.
At present, the main labeling method for point cloud data is still fully manual labeling, that is, the 8 vertex coordinates of the three-dimensional bounding box are adjusted by hand so that the edges of the box fit the point cloud to be labeled. This method has extremely low labeling efficiency: labeling a single frame takes about 20 seconds, and even a skilled worker needs about 10 seconds.
Disclosure of Invention
The embodiment of the application provides a labeling method, a labeling device, electronic equipment, a computer-readable storage medium and a computer program product for a radar point cloud, which can solve the above problems in the prior art. The technical scheme is as follows:
according to an aspect of the embodiment of the application, there is provided a method for labeling a radar point cloud, the method comprising:
acquiring coordinates of a radar point cloud and of a two-dimensional labeling frame in a world coordinate system, wherein the two-dimensional labeling frame is obtained by labeling a target object in the radar point cloud in a top view of the radar point cloud;
determining a reference point set from the radar point cloud according to the coordinates of the radar point cloud and of the two-dimensional annotation frame in the world coordinate system, wherein the vertical projections of the radar points in the reference point set are located in the two-dimensional annotation frame;
for each coordinate axis of the world coordinate system, determining the numerical interval in which the component along that axis of each radar point in the reference point set lies, and counting the number of radar points in each numerical interval;
and for each coordinate axis, sorting the corresponding numerical intervals by value, traversing the numerical intervals from the two ends of the sequence toward the middle, and, when each end reaches a target numerical interval whose radar point count is greater than a preset threshold, determining the edges of the three-dimensional bounding box of the target object on that coordinate axis from the coordinate values corresponding to the two target numerical intervals.
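The reference-point selection in the steps above can be illustrated with a small sketch. This is not the patent's implementation, just a minimal Python interpretation in which the reference point set is selected by checking whether each point's top-down projection falls inside the 2D annotation box (function and variable names are hypothetical):

```python
import numpy as np

def select_reference_points(points, box_min, box_max):
    """Keep radar points whose vertical (top-down) projection falls
    inside the 2D annotation box.

    points:  (N, 3) array of world-coordinate radar points (x, y, z)
    box_min: (xmin, ymin) of the 2D box in the horizontal plane
    box_max: (xmax, ymax) of the 2D box in the horizontal plane
    """
    xy = points[:, :2]  # drop the vertical component (project top-down)
    inside = np.all((xy >= box_min) & (xy <= box_max), axis=1)
    return points[inside]

pts = np.array([[1.0, 1.0, 0.5],
                [5.0, 5.0, 0.2],   # outside the box horizontally
                [2.0, 1.5, 1.1]])
ref = select_reference_points(pts, (0.0, 0.0), (3.0, 3.0))
# ref keeps the two points whose (x, y) lie inside the box
```

The vertical component is simply ignored when testing membership, which matches the idea that the top-view frame constrains only the horizontal extent.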
According to another aspect of the embodiment of the present application, there is provided a labeling device for a radar point cloud, the device including:
the labeling frame determining module is used for acquiring coordinates of a radar point cloud and of a two-dimensional labeling frame in a world coordinate system, wherein the two-dimensional labeling frame is obtained by labeling a target object in the radar point cloud in a top view of the radar point cloud;
the reference point set determining module is used for determining a reference point set from the radar point cloud according to the coordinates of the radar point cloud and of the two-dimensional annotation frame in the world coordinate system, wherein the vertical projections of the radar points in the reference point set are located in the two-dimensional annotation frame;
the statistics module is used for determining, for each coordinate axis of the world coordinate system, the numerical interval in which the component along that axis of each radar point in the reference point set lies, and counting the number of radar points in each numerical interval;
and the traversing module is used for sorting, for each coordinate axis, the corresponding numerical intervals by value, traversing the numerical intervals from the two ends of the sequence toward the middle, and, when each end reaches a target numerical interval whose radar point count is greater than a preset threshold, determining the edges of the three-dimensional bounding box of the target object on that coordinate axis from the coordinate values corresponding to the two target numerical intervals.
As an optional implementation manner, the statistics module determining the numerical interval in which the component of each radar point in the reference point set lies, and counting the number of radar points in each numerical interval, includes:
for each radar point in the reference point set, determining the numerical interval in which the component of the radar point along the coordinate axis lies;
if a key-value pair whose key is that numerical interval is determined to exist, adding a preset value to the value of the key-value pair;
if it is determined that no key-value pair whose key is that numerical interval exists, creating a new key-value pair with the numerical interval as the key, and setting the value of the new key-value pair to the preset value.
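The key-value counting described above amounts to building a histogram keyed by interval. A minimal sketch in Python, assuming fixed-width intervals and a preset value of 1 (the function name and bin width are illustrative, not from the patent):

```python
from collections import defaultdict

def count_intervals(components, bin_size=0.1):
    """Histogram the components of the reference points along one axis.

    Each component is mapped to the numerical interval (bin) that
    contains it; the dict plays the role of the key-value pairs in the
    text: key = interval index, value = radar point count.
    """
    counts = defaultdict(int)
    for c in components:
        key = int(c // bin_size)   # interval containing the component
        counts[key] += 1           # "add a preset numerical value" (here 1)
    return dict(counts)

hist = count_intervals([0.05, 0.07, 0.31, 0.33, 0.34], bin_size=0.1)
# → {0: 2, 3: 3}
```

Using `defaultdict` collapses the patent's two branches (key exists / key missing) into a single increment, since a missing key is created with value 0 on first access.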
As an optional implementation manner, the traversing module sorting the corresponding numerical intervals by value and traversing the numerical intervals from the two ends of the sequence toward the middle includes:
sorting the numerical intervals by value to obtain an ordered array;
traversing from the two ends of the ordered array toward the middle respectively, wherein for the currently traversed numerical interval, the key-value pair whose key is the currently traversed numerical interval is queried as the reference key-value pair;
if the value of the reference key-value pair is greater than the preset threshold, determining the currently traversed numerical interval as the extreme value of the edge on the corresponding coordinate axis, and stopping the traversal from the currently traversed numerical interval toward the middle;
and if the value of the reference key-value pair is not greater than the preset threshold, continuing to traverse from the currently traversed numerical interval toward the middle.
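The two-end traversal can be sketched as follows. This is an interpretation of the claim, not the patent's code, with the counted intervals represented as a dict mapping interval index to radar point count (names are hypothetical):

```python
def find_edges(counts, threshold):
    """Traverse sorted intervals from both ends toward the middle and
    stop at the first interval on each side whose point count exceeds
    the threshold; those intervals give the box edges on this axis.

    counts: dict mapping interval index -> number of radar points
    """
    ordered = sorted(counts)               # interval keys, ascending
    low = high = None
    for k in ordered:                      # left end -> middle
        if counts[k] > threshold:
            low = k
            break
    for k in reversed(ordered):            # right end -> middle
        if counts[k] > threshold:
            high = k
            break
    return low, high

# Stray points (count 1) at the extremes are skipped as noise:
hist = {0: 1, 1: 6, 2: 9, 3: 7, 4: 1}
lo_edge, hi_edge = find_edges(hist, threshold=2)
# → (1, 3)
```

The effect is outlier rejection: isolated radar points beyond the object's true extent fall into sparsely populated intervals and never trigger the threshold, so the box edges snap to the dense part of the point cloud.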
As an optional implementation manner, before the statistics module determines the numerical interval in which the component of each radar point in the reference point set lies, the device further performs:
obtaining a three-dimensional labeling frame corresponding to the two-dimensional labeling frame, wherein the boundary of the three-dimensional labeling frame in the horizontal direction is given by the coordinates of the two-dimensional labeling frame in the world coordinate system, and the boundary of the three-dimensional labeling frame in the vertical direction is a preset coordinate;
the statistics module determining the numerical interval in which the component of each radar point in the reference point set lies includes:
determining, from the reference point set, a reference point subset located outside the boundary of the three-dimensional annotation frame;
and for the coordinate axis in the vertical direction, determining the numerical interval in which the component along that axis of each radar point in the reference point subset lies.
As an optional implementation manner, the labeling frame determining module acquiring the coordinates of the two-dimensional labeling frame in the world coordinate system includes:
displaying a top view of the radar point cloud;
generating a two-dimensional annotation frame in response to an annotation operation on the target object;
determining the coordinates of the two-dimensional annotation frame in a screen coordinate system;
and converting the coordinates of the two-dimensional annotation frame in the screen coordinate system into coordinates in the world coordinate system.
As an optional implementation manner, the label frame determining module converts the coordinates of the two-dimensional label frame in the screen coordinate system into the coordinates of the world coordinate system, including:
normalizing the coordinates of the two-dimensional annotation frame in a screen coordinate system to obtain normalized screen coordinates;
normalizing the distance vector to obtain a normalized distance vector, wherein the distance vector is used for representing the distance between the normalized screen coordinates and the preset camera position;
determining the ratio of the vertical component of the camera position to the vertical component of the distance vector, and obtaining the distance between the two-dimensional annotation frame and the camera position according to the normalized distance vector and the ratio;
and obtaining the coordinates of the two-dimensional annotation frame in a world coordinate system according to the camera position and the distance between the two-dimensional annotation frame and the camera position.
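One plausible reading of this conversion is a ray cast from the camera through the normalized screen point onto the ground plane. The sketch below is an interpretation under stated assumptions, not the patent's implementation: it assumes z is the vertical axis and takes an engine-specific `unproject` function (normalized device coordinates to a world point on the near plane) as given; all names are illustrative.

```python
import numpy as np

def screen_to_ground(screen_xy, screen_size, camera_pos, unproject):
    """Map a 2D annotation-frame corner on screen to world coordinates:
    normalize the screen coordinates, build a normalized direction
    (distance) vector from the camera, and scale it so the ray hits
    the ground plane (vertical coordinate 0)."""
    # 1) normalize screen coordinates to [-1, 1] (origin at center)
    sx, sy = screen_xy
    w, h = screen_size
    ndc = np.array([2.0 * sx / w - 1.0, 1.0 - 2.0 * sy / h])

    # 2) direction vector from the camera through the screen point
    direction = unproject(ndc) - camera_pos
    direction = direction / np.linalg.norm(direction)  # normalized distance vector

    # 3) ratio of the camera's vertical coordinate to the vertical
    #    component of the direction gives the distance at which the
    #    ray reaches the ground plane
    distance = camera_pos[2] / -direction[2]

    # 4) world coordinates of the annotation-frame corner
    return camera_pos + distance * direction

# toy setup: camera 10 units above the origin looking straight down,
# near plane 1 unit below the camera spanning 5x5 world units
cam = np.array([0.0, 0.0, 10.0])
unproject = lambda ndc: cam + np.array([5 * ndc[0], 5 * ndc[1], -1.0])
p = screen_to_ground((400, 300), (800, 600), cam, unproject)
# → the screen center maps to the world origin (0, 0, 0)
```

The algebra matches the claim: the normalized distance vector is scaled by the ratio of vertical components, and adding the scaled vector to the camera position yields the world coordinates.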
As an alternative embodiment, the magnitude of the preset threshold is proportional to the number of radar points in the reference point set.
According to another aspect of an embodiment of the present application, there is provided an electronic device including a memory, a processor, and a computer program stored on the memory, the processor executing the computer program to implement the steps of the labeling method of Lei Dadian cloud described above.
According to still another aspect of the embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the labeling method of Lei Dadian cloud described above.
According to an aspect of an embodiment of the present application, there is provided a computer program product, including a computer program, which when executed by a processor implements the steps of the labeling method of Lei Dadian cloud described above.
The technical scheme provided by the embodiment of the application has the beneficial effects that:
the method comprises the steps of obtaining a two-dimensional annotation frame by annotating target objects in a top view of a radar point cloud, wherein the situation that the objects overlap in the vertical direction does not exist in the Lei Dadian cloud in practical application, so that the two-dimensional annotation frame can be annotated in the top view to ensure that the two-dimensional annotation frame also only comprises one object, namely the target object, in a world coordinate system, compared with the prior art, the three-dimensional enclosure box can be finally obtained by annotating 3 coordinate axes respectively, the annotation efficiency can be greatly improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.
Fig. 1a is a schematic diagram of a system architecture for implementing labeling of a radar point cloud according to an embodiment of the present application;
fig. 1b is a schematic view of a scene of a labeling method of a radar point cloud according to an embodiment of the present application;
fig. 2a is a schematic flow chart of a labeling method of a radar point cloud according to an embodiment of the present application;
FIG. 2b is a schematic diagram of a two-dimensional labeling frame labeled in a top view of a radar point cloud according to an embodiment of the present application;
FIG. 2c is a schematic flow chart of obtaining a two-dimensional annotation frame according to an embodiment of the present application;
fig. 2d is a schematic diagram of a positional relationship between a radar point and a two-dimensional labeling frame according to an embodiment of the present application;
FIG. 2e is a schematic diagram of traversing a sequence according to an embodiment of the present application;
FIG. 2f is a schematic diagram of different view angles of a three-dimensional bounding box according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of determining edges of a three-dimensional bounding box of a target object on corresponding coordinate axes according to an embodiment of the present application;
FIG. 4a is a schematic diagram of a three-dimensional label frame corresponding to a two-dimensional label frame according to an embodiment of the present application;
FIG. 4b is a schematic diagram of a relative position of a three-dimensional labeling frame in a spatial coordinate system according to an embodiment of the present application;
fig. 5a is a schematic diagram of an operation interface for labeling a radar point cloud according to an embodiment of the present application;
fig. 5b is a schematic diagram of an operation interface for displaying a point cloud of a target object from a top view according to an embodiment of the present application;
FIG. 5c is a schematic diagram of an operation interface for displaying a two-dimensional annotation frame according to an embodiment of the present application;
fig. 5d is a schematic diagram of an operation interface of a three-dimensional bounding box for displaying a target object according to an embodiment of the present application;
fig. 6 is a schematic flow chart of a labeling method of a radar point cloud according to an embodiment of the present application;
fig. 7 is a schematic view of a scenario for automatic driving algorithm training according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a labeling device for a radar point cloud according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the drawings in the present application. It should be understood that the embodiments described below with reference to the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of the present application, and the technical solutions of the embodiments of the present application are not limited.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and "comprising," when used in this specification, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof, all of which may be included in the present specification. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates that at least one of the items defined by the term, e.g., "a and/or B" may be implemented as "a", or as "B", or as "a and B".
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
First, several terms related to the present application are described and explained:
data tagging refers to tagging or annotating specific information or features in the original data so that the computer program can understand and process the data. Data annotation is commonly used in the fields of machine learning and artificial intelligence to help algorithms better understand and learn data.
A point cloud is a three-dimensional data representation consisting of a large number of points, each having its coordinates in three-dimensional space and possibly other attributes, such as color, normal vector, intensity, etc. The point cloud is usually collected by a laser radar, a camera or other sensors, and can be used for building a three-dimensional model, terrain analysis, object identification, robot navigation and other applications. The point cloud data may be stored as a text file, binary file, or record in a database, which may be processed, visualized, and analyzed using various software tools. The point cloud technology is widely applied in the fields of computer vision, machine learning, artificial intelligence and the like, and becomes an indispensable part in the digital age.
Three-dimensional bounding boxes, also known as 3D boxes, are a form of representation of three-dimensional objects commonly used for object detection and recognition tasks in computer vision and machine learning. It is a cuboid consisting of six planes, each plane consisting of four vertices, which can be used to represent the position, size and direction of objects in three-dimensional space. In an object detection task, a three-dimensional bounding box may be used to label objects in an image or point cloud so that an algorithm can identify and locate the objects. In machine learning, the three-dimensional bounding box can be used as a label of training data to help an algorithm learn the shape and position information of an object.
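As an illustration of this representation (not the patent's own data structure), a 3D box is commonly parameterized by center, size and yaw (rotation about the vertical axis), from which the 8 vertices can be recovered; all names below are illustrative:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Box3D:
    """A 3D bounding box parameterized by center, size and yaw."""
    center: np.ndarray   # (3,) x, y, z of the box center
    size: np.ndarray     # (3,) length, width, height
    yaw: float           # heading angle in radians, about the vertical axis

    def corners(self):
        """Return the 8 vertices of the cuboid in world coordinates."""
        l, w, h = self.size
        # local corner offsets before rotation
        x = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * l / 2
        y = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2
        z = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * h / 2
        # rotate about the vertical (z) axis, then translate
        c, s = np.cos(self.yaw), np.sin(self.yaw)
        rx = c * x - s * y
        ry = s * x + c * y
        return np.stack([rx, ry, z], axis=1) + self.center

box = Box3D(center=np.array([0.0, 0.0, 1.0]),
            size=np.array([4.0, 2.0, 1.5]), yaw=0.0)
pts = box.corners()   # (8, 3) array of vertex coordinates
```

This center/size/yaw form is equivalent to storing the 8 vertices directly but has far fewer degrees of freedom, which is why labeling tools usually adjust it rather than individual vertex coordinates.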
In the automatic driving field, in order to train automatic driving algorithm models, companies need to use vehicle-mounted radars to sample the real world while driving. The acquired laser radar data is called "point cloud data" or a "radar point cloud", and each radar point contains information such as coordinates and reflection intensity.
Before the radar point cloud is fed to the algorithm model, the data needs to be cleaned and labeled: objects are accurately marked in the point cloud image so that the resulting structured data can be fed to the algorithm and achieve a good model training effect.
At present, the main labeling method for point cloud data is still fully manual labeling, that is, the 8 vertex coordinates of the three-dimensional bounding box are adjusted by hand so that the edges of the box fit the point cloud to be labeled. This method has extremely low labeling efficiency: labeling a single frame takes about 20 seconds, and even a skilled worker needs about 10 seconds.
The application provides a radar point cloud labeling method and device, an electronic device, a computer-readable storage medium and a computer program product, and aims to solve the above technical problems in the prior art.
The application relates in particular to Computer Vision (CV) technology in artificial intelligence. Computer vision is a science that studies how to make machines "see"; more specifically, cameras and computers are used in place of human eyes to identify, track and measure targets, and further perform graphics processing so that the processed images are more suitable for human eyes to observe or for transmission to instruments for detection. As a scientific discipline, computer vision research into related theory and technology attempts to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition techniques such as face recognition and fingerprint recognition.
The present application mainly relates to identifying the position of a target object in three-dimensional point cloud data through a computer vision technology, and particularly, the description of the following embodiments can be referred to.
The technical solutions of the embodiments of the present application and technical effects produced by the technical solutions of the present application are described below by describing several exemplary embodiments. It should be noted that the following embodiments may be referred to, or combined with each other, and the description will not be repeated for the same terms, similar features, similar implementation steps, and the like in different embodiments.
Fig. 1a is a schematic diagram of a system architecture for implementing labeling of a radar point cloud according to an embodiment of the present application, where the system architecture may include a server 200 and a lidar cluster, and the lidar cluster may include one or more lidars, and the number of lidars is not limited herein. As shown in fig. 1a, the plurality of lidars may specifically include lidar 100a, lidar 101a, lidars 102a, …, lidar 103a; as shown in fig. 1a, each of lidar 100a, lidar 101a, lidars 102a, …, and lidar 103a may be networked with server 200, so that each lidar may interact with server 200 via a network connection.
The server 200 shown in fig. 1a may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligent platforms, and the like. The terminal device may be: intelligent terminals such as smart phones, tablet computers, notebook computers, desktop computers, intelligent televisions and the like. The following describes embodiments of the present application in detail by taking the communication between the lidar 100a and the server 200 as an example.
Referring to fig. 1b, fig. 1b is a schematic view of a labeling method of a radar point cloud according to an embodiment of the present application. The laser radar 100a may be an in-vehicle device mounted on a traveling vehicle, and the laser radar 100a may perform laser scanning on a scene on a traveling road to obtain a point cloud data frame (i.e., a radar point cloud) during traveling of the vehicle. The point cloud data frames are three-dimensional images, and have continuity.
Accordingly, after a plurality of consecutive point cloud data frames are acquired, the lidar 100a may transmit the plurality of consecutive point cloud data frames to the server 200. After the server 200 obtains the plurality of continuous point cloud data frames, the plurality of continuous point cloud data frames may be labeled with the target object.
The point cloud data frame contains a target object, namely the object to be detected in the point cloud data frame. Labeling the target object therefore means labeling the 3D circumscribed box (three-dimensional bounding box) in which the target object is located in the point cloud data frame.
The embodiment of the application provides a method for labeling a radar point cloud, which includes the following steps:
S2101: acquiring coordinates of the radar point cloud and of the two-dimensional annotation frame in a world coordinate system.
Specifically, the execution body in the embodiment of the present application may be any one computer device or a computer cluster formed by a plurality of computer devices. The computer device may be a terminal device or a server, and may specifically be determined according to an actual application scenario. Here, the execution body in the embodiment of the present application will be described by taking a server as an example.
In the embodiment of the application, the radar point cloud is displayed on the terminal, a labeling person then draws a two-dimensional labeling frame in the top view of the radar point cloud, and the target object of the radar point cloud is enclosed in the two-dimensional labeling frame.
Referring to fig. 2b, a schematic diagram of a two-dimensional labeling frame labeled in a top view of a radar point cloud is shown in an exemplary embodiment of the present application. As shown in the figure, the radar point cloud 2201 is the radar point cloud of an object; in the top view of the radar point cloud 2201, a worker may manually draw the two-dimensional labeling frame 2202. Since the two-dimensional labeling frame does not require its edges to fit the radar points of the point cloud, it is only necessary to ensure that the point cloud of the object is enclosed in the frame; the worker's labeling speed is therefore extremely fast, and labeling one two-dimensional labeling frame generally takes only 1-2 seconds.
In some embodiments, the laser radar in fig. 1b is further configured with an image acquisition device, and when the laser radar performs laser scanning, a plurality of continuous image frames are obtained through shooting by the image acquisition device at the same time, that is, each point cloud data frame has a corresponding image frame, and the image frames are planar images (two-dimensional images).
Referring to fig. 2c, a schematic flow chart of obtaining a two-dimensional labeling frame is shown in an exemplary embodiment of the present application. As shown in fig. 2c, the objects in the image frame may first be classified and identified by a pre-trained image recognition model to determine the target object and the first coordinates of the target object in the image physical coordinate system; the point cloud data frame corresponding to the image frame is projected to the image physical coordinate system to obtain second coordinates of the point cloud data frame; the point cloud subset whose second coordinates equal the first coordinates is selected from the point cloud data frame, so that the object in the two-dimensional image and the object in the three-dimensional point cloud data are initially associated; the distances between all radar points in the point cloud data frame and the X-Y plane may then be determined to form a clustering result.
In the embodiment of the application, the two-dimensional annotation frame is obtained from the top view of the radar point cloud, because in the horizontal direction there are usually other objects near an object; if the two-dimensional annotation frame were obtained from a side view, edge detection in the horizontal direction could not be completed on the basis of a single two-dimensional annotation frame. In addition, the radar points in the vertical direction can also be used for ground detection, so that the three-dimensional bounding box fits the ground instead of floating in the air.
It should be noted that the coordinate system of the radar point cloud is the world coordinate system, the absolute coordinate system of the system, defined as follows: the origin Ow is the center of the small circle, the Xw axis points horizontally to the right, the Yw axis points downward, and the Zw axis is determined by the right-hand rule. The coordinate system of the two-dimensional labeling frame is the screen coordinate system, which is defined in pixels: the lower left corner of the screen is the origin (0, 0) and the upper right corner is (screen.width, screen.height), where screen.width and screen.height are the width and height of the screen; the Z value is the camera's world coordinate, measured in the camera's world units. Since the two-dimensional labeling frame is drawn on the screen of the terminal, its initial coordinates are coordinates in the screen coordinate system. In another common convention, the screen coordinate system takes the upper left of the display interface as the origin, with screen coordinates formed from the row and column numbers of pixels.
Since the radar point cloud and the two-dimensional labeling frame are not in the same coordinate system at the beginning, step S2101 further includes unifying the radar point cloud and the two-dimensional labeling frame to the same coordinate system, that is, the world coordinate system, so as to obtain the coordinates of each of the radar point cloud and the two-dimensional labeling frame in the world coordinate system.
According to the method, the annotator only needs to label one two-dimensional labeling frame on the top view, instead of labeling 3 two-dimensional labeling frames in 3 planes respectively as in the related art; this greatly reduces the amount of manual labeling and, by also eliminating the operation time of view switching, greatly improves labeling efficiency.
S2102, determining a reference point set from the radar point cloud according to the coordinates of the radar point cloud and a two-dimensional labeling frame in the world coordinate system, wherein radar points in the reference point set are projected in the vertical direction and are positioned in the two-dimensional labeling frame;
in order to obtain the three-dimensional bounding box of the target object, the embodiment of the application needs to find the radar points whose projections in the vertical direction are located in the two-dimensional labeling frame. Any radar point whose vertical projection falls inside the two-dimensional labeling frame cannot belong to another object, because the annotator controls the extent of the frame when labeling the two-dimensional labeling frame in the top view, and one two-dimensional labeling frame does not contain multiple objects. The radar points lying at the edges can subsequently be selected from the reference point set when determining the three-dimensional bounding box of the target object.
Referring to fig. 2d, which exemplarily shows a schematic diagram of the positional relationship between radar points and the two-dimensional labeling frame provided by the embodiment of the application: it can be seen from the figure that the projection of radar point E in the vertical direction is located inside the two-dimensional labeling frame, while the projection of radar point F in the vertical direction is located outside it. Determining whether a point lies inside a rectangle amounts to determining whether the point lies between the upper and lower sides and between the left and right sides of the rectangle; and whether a point lies on a given side of a line segment can be determined from the sign of a cross product, that is, by exploiting the directionality of cross multiplication.
Taking the radar point E in fig. 2d as an example, the radar point E necessarily satisfies: (AB×AE) × (CD×CE) > 0, and (DA×DE) × (BC×BE) > 0.
In the embodiment of the application, the method for determining the reference point set comprises the following steps:
for any one radar point, determining coordinates of a projection point of the radar point in the vertical direction, namely, coordinate components of the radar point in an X axis and a Y axis (the X axis and the Y axis are coordinate axes perpendicular to each other in the horizontal direction), and determining coordinate components of four vertexes of the two-dimensional labeling frame in the X axis and the Y axis, wherein the four vertexes are respectively defined as a first vertex, a second vertex, a third vertex and a fourth vertex along the clockwise direction (or along the anticlockwise direction).
Determining a first cross multiplication result of a first vector and a second vector, wherein the first vector is a vector from a first vertex to a second vertex, and the second vector is a vector from the first vertex to a projection point;
determining a second cross multiplication result of a third vector and a fourth vector, wherein the third vector is a vector from the third vertex to the fourth vertex, and the fourth vector is a vector from the third vertex to the projection point;
determining a third cross multiplication result of a fifth vector and a sixth vector, wherein the fifth vector is a vector from the fourth vertex to the first vertex, and the sixth vector is a vector from the fourth vertex to the projection point;
determining a fourth cross multiplication result of a seventh vector and an eighth vector, wherein the seventh vector is a vector from the second vertex to the third vertex, and the eighth vector is a vector from the second vertex to the projection point;
and if the product of the first cross multiplication result and the second cross multiplication result is larger than 0 and the product of the third cross multiplication result and the fourth cross multiplication result is larger than 0, determining that the projection of the radar point in the vertical direction is in the two-dimensional labeling frame.
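As a minimal illustrative sketch (not part of the patent text), the point-in-frame test described in the steps above can be written as follows; the 2D "cross multiplication" is the z-component of the 3D cross product, and the function names and test vertices are chosen for illustration only:

```python
def cross2d(u, v):
    # z-component of the cross product of two 2D vectors
    return u[0] * v[1] - u[1] * v[0]

def vec(p, q):
    # vector from point p to point q
    return (q[0] - p[0], q[1] - p[1])

def projection_in_frame(e, a, b, c, d):
    """True if projection point e lies inside the frame with vertices
    a, b, c, d listed in clockwise or counterclockwise order."""
    first = cross2d(vec(a, b), vec(a, e)) * cross2d(vec(c, d), vec(c, e))
    second = cross2d(vec(d, a), vec(d, e)) * cross2d(vec(b, c), vec(b, e))
    return first > 0 and second > 0
```

With the frame A(0,0), B(2,0), C(2,1), D(0,1), a point such as (1, 0.5) satisfies both sign conditions, while (3, 0.5) fails the second one, matching points E and F in fig. 2d.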
S2103, for each coordinate axis of the world coordinate system, determining a numerical interval in which components of each radar point in the reference point set are located, and counting the number of radar points in each numerical interval.
According to the embodiment of the application, a plurality of numerical intervals are divided on each coordinate axis. In practical applications, the coordinate values of the point cloud are usually accurate to seven or more digits after the decimal point, and the original coordinate value can be reduced to two-digit precision; that is, the interval is obtained by keeping two digits after the decimal point. For example, when intervals are divided in units of 0.01, if the coordinates of 4 radar points are -3.92323924, -3.93316546, -3.91954651 and -3.90123234, the 4 radar points belong to the intervals -3.92, -3.93, -3.92 and -3.90 respectively; it is thus determined that the interval -3.92 contains two radar points, the interval -3.93 contains one radar point, and the interval -3.90 contains one radar point.
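A minimal sketch of the interval assignment, assuming intervals are named by rounding to two decimal places (an assumption consistent with the example values above; Python's built-in round is used):

```python
points = [-3.92323924, -3.93316546, -3.91954651, -3.90123234]

# keep two digits after the decimal point to name each 0.01-wide interval
intervals = [round(p, 2) for p in points]
```

Note that -3.90 appears as -3.9 in Python's float form; the value is the same.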
The number of radar points in an interval effectively reflects whether those points belong to the target object; therefore, in the embodiment of the application, counting the radar points in each interval of each coordinate axis lays the foundation for judging whether an interval is a reliable edge.
S2104, sequencing the corresponding numerical value intervals according to the numerical value for each coordinate axis, traversing each numerical value interval from two ends of the sequence to the middle, and determining the edge of the three-dimensional bounding box of the target object on the corresponding coordinate axis according to the coordinate values corresponding to the two target numerical value intervals when the two ends respectively traverse the target numerical value intervals of which the number of radar points is greater than a preset threshold value.
Because the edge on each coordinate axis includes two extremes, a maximum value and a minimum value, for each coordinate axis the embodiment of the application sorts the numerical intervals by their corresponding values and traverses them from the two ends of the sequence toward the middle. Referring to fig. 2e, which exemplarily shows a schematic diagram of traversing the sequence: the numerical intervals increase from left to right, and the traversal proceeds from both ends toward the middle. Taking the left end (lowAxis) as an example, it is first determined whether the number of radar points in the -4.06 interval is greater than a preset threshold; if not, it is then determined whether the number of radar points in the -4.05 interval is greater than the preset threshold, and if so, -4.05 is determined to be an extremum of the three-dimensional bounding box of the target object on the corresponding coordinate axis. For each end, the first numerical interval whose number of radar points is greater than the preset threshold yields one point of the edge; when both ends have reached such a target numerical interval, the line between the coordinates corresponding to the two target numerical intervals forms the edge of the three-dimensional bounding box of the target object on the corresponding coordinate axis. The three-dimensional bounding box of the target object is obtained by obtaining the edges on the 3 coordinate axes.
Referring to fig. 2f, which exemplarily shows a schematic diagram of different views of the three-dimensional bounding box provided by an embodiment of the present application: as shown in the figure, radar points inside the boundary of the three-dimensional bounding box are represented by hollow points, and radar points outside it by solid points, which is most easily understood through the side view and the top view. The side view shows the edges of the three-dimensional bounding box in the vertical direction, including an upper edge and a lower edge, and it can be seen that the upper and lower edges keep a certain distance from the uppermost and lowermost extreme points of the radar point cloud. The top view shows the 4 edges of the three-dimensional bounding box on the X axis and the Y axis.
According to the embodiment of the application, the target object is labeled in the top view of the radar point cloud to obtain the two-dimensional labeling frame. Because objects in a radar point cloud do not overlap in the vertical direction in practical applications, labeling the two-dimensional frame in the top view ensures that it contains only one object, namely the target object. Compared with the related art, in which two-dimensional frames must be labeled from 3 coordinate axes respectively before the three-dimensional bounding box can finally be obtained, the labeling efficiency can be greatly improved.
On the basis of the foregoing embodiments, as an optional embodiment, determining a numerical interval in which each radar point in the reference point set is located in a component of the coordinate axis, and counting the number of radar points in each numerical interval includes:
for each radar point in the reference point set, determining a numerical value interval in which a component of the radar point in the coordinate axis is located;
if a key-value pair whose key is the numerical interval exists, adding a preset numerical value to the value of that key-value pair;
if no key-value pair whose key is the numerical interval exists, creating a new key-value pair whose key is the numerical interval, and setting the value of the new key-value pair to the preset numerical value.
The component of a radar point on each coordinate axis of the three-dimensional coordinate system falls in a numerical interval of the corresponding axis; that is, the three components of one radar point generally lie in different numerical intervals on their respective coordinate axes. For example, if the coordinates of one radar point are (1.23123, 4.79809, 2.46971) and the precision of the numerical intervals is two decimal places, the three components of the radar point are located in the numerical intervals 1.23, 4.80 and 2.47, respectively.
After the numerical interval in which the radar point's component lies is determined, it is further judged whether a key-value pair whose key is that numerical interval exists; if so, a preset numerical value, which may be 1, is added to the value of the key-value pair, and if not, a new key-value pair is created and its value is set to the preset numerical value.
The embodiment of the application can traverse the components of all radar points in the reference point set in the coordinate axis in order from large to small or from small to large for each coordinate axis so as to create all key value pairs corresponding to the coordinate axis.
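The interval-counting steps above can be sketched as follows, with a plain Python dict playing the role of the key-value store (the function name, the two-decimal precision and the preset value of 1 are illustrative assumptions):

```python
def count_by_interval(components, precision=2, preset=1):
    """Count how many axis components fall into each numerical interval."""
    counts = {}  # key: numerical interval, value: number of radar points
    for c in components:
        key = round(c, precision)   # the interval holding this component
        if key in counts:
            counts[key] += preset   # key exists: add the preset value
        else:
            counts[key] = preset    # otherwise create a new key-value pair
    return counts
```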
In some embodiments, embodiments of the present application may utilize HashMap to store key-value pairs. HashMap is a key value pair storage structure in Java collection framework for realizing Map interface. The method uses the hash table to store data and decides the storage position according to the hash value of the key, thereby realizing rapid insertion, deletion and searching operations.
The keys and values in HashMap may be any type of object, but it is required that the keys be unique and the values be repeatable. HashMap allows use of null as a key and value, and allows storage of the value corresponding to the null key.
The internal implementation of HashMap is based on a combined structure of an array and linked lists (or red-black trees); each array element is called a bucket, and each bucket stores a linked list (or red-black tree) of key-value pairs. When storing and retrieving data, the position in the array is calculated by a hash function from the hash value of the key, and the operation then proceeds in the corresponding linked list (or red-black tree). HashMap provides efficient insert, delete and find operations and has a fast access speed.
On the basis of the foregoing embodiments, as an alternative embodiment, sorting the corresponding numerical intervals according to the numerical sizes, traversing each numerical interval from two ends to the middle of the sequence includes:
sequencing each numerical value interval according to the size to obtain an ordered array;
traversing from the two ends of the ordered array toward the middle respectively, wherein for the currently traversed numerical interval, the key-value pair whose key is that interval is queried as the reference key-value pair;
if the value of the reference key value pair is larger than the preset threshold value, determining the currently traversed numerical value interval as an extreme value of the edge of the corresponding coordinate axis, and stopping traversing from the currently traversed numerical value interval to the middle;
And if the value of the reference key value pair is not greater than the preset threshold value, continuing traversing from the currently traversed numerical value interval to the middle.
According to the embodiment of the application, the numerical intervals obtained for each coordinate axis are sorted in ascending or descending order to obtain an ordered array. Each element of the ordered array is a numerical interval, and the array is traversed from its two ends toward the middle to obtain the maximum and minimum values of the edge respectively. Two threads may be provided for each ordered array, each traversing from one end of the ordered array toward the middle. For the numerical interval currently traversed by each thread, the thread queries the key-value pair whose key is that interval, i.e. the reference key-value pair; if the value of the reference key-value pair is greater than the preset threshold, the currently traversed numerical interval is determined to be an extremum of the edge on the corresponding coordinate axis, the traversal toward the middle is stopped, and the process ends; if the value of the reference key-value pair is not greater than the preset threshold, the traversal continues from the currently traversed numerical interval toward the middle.
Referring to fig. 3, which exemplarily shows a schematic flow chart of determining an edge of the three-dimensional bounding box of the target object on a corresponding coordinate axis according to an embodiment of the present application; the flow may include two parts: the first part counts the number of radar points in each numerical interval, and the second part traverses the sequence:
For the first part, the component of each radar point on the coordinate axis is first reduced to a preset precision, and the result serves as the numerical interval in which that component lies; for example, if the component of one radar point on the Z axis is 1.231234 and 2 digits after the decimal point are kept, the numerical interval in which that component lies is determined to be 1.23. It is then judged whether a key-value pair whose key is that numerical interval exists in the HashMap: if so, a preset numerical value is added to the value of the key-value pair; if it is determined that no key-value pair whose key is the numerical interval exists, a new key-value pair whose key is the numerical interval is created, and its value is set to the preset numerical value.
For the second part, the obtained numerical intervals are first sorted by size to form an ordered array, and the ordered array is then traversed with two pointers (the Two Pointers technique), in which two pointers instead of a single pointer are used to access elements during the traversal. If the two pointers move in opposite directions, they are called "collision pointers"; if they move in the same direction, they are called "fast and slow pointers". The embodiment of the application adopts collision pointers: two pointers, left and right, point to the first and last elements of the sequence respectively, then left is continuously increased and right continuously decreased until the two pointers collide (i.e. left == right) or some other condition is met.
The present application does not require the two pointers to collide; rather, each pointer stops when a certain condition is met, namely that the value of the reference key-value pair is greater than the preset threshold. For example, if the value of the traversed reference key-value pair is greater than 3, the pointer is no longer moved and that numerical interval is determined to be an extremum of the edge; if the value of the reference key-value pair is not greater than the preset threshold, the traversal continues from the currently traversed numerical interval toward the middle.
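The collision-pointer traversal above can be sketched as follows, under the assumptions of an ascending ordered array and a threshold of 3 (the names `axis_edges`, `counts` and the sample intervals are illustrative):

```python
def axis_edges(counts, threshold=3):
    """Scan the ordered numerical intervals from both ends toward the middle
    and return, for each end, the first interval whose count exceeds the
    threshold; these are the two extremes of the edge on this axis."""
    ordered = sorted(counts)               # ascending ordered array of intervals
    left, right = 0, len(ordered) - 1
    while left <= right and counts[ordered[left]] <= threshold:
        left += 1                          # low end: keep moving inward
    while right >= left and counts[ordered[right]] <= threshold:
        right -= 1                         # high end: keep moving inward
    if left > right:
        return None                        # no interval is dense enough
    return ordered[left], ordered[right]
```

For example, with counts {-4.06: 1, -4.05: 5, -4.00: 9, -3.90: 4, -3.89: 2} and threshold 3, the left pointer skips -4.06 and stops at -4.05, while the right pointer skips -3.89 and stops at -3.90.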
On the basis of the above embodiments, as an optional embodiment, determining a numerical interval in which each radar point in the reference point set is located in a component of the coordinate axis further includes:
and obtaining a three-dimensional labeling frame corresponding to the two-dimensional labeling frame, wherein the boundary of the three-dimensional labeling frame in the horizontal direction is the coordinate of the two-dimensional labeling frame in the world coordinate system, and the boundary of the three-dimensional labeling frame in the vertical direction is a preset coordinate.
In the embodiment of the present application, according to the position of the two-dimensional labeling frame in the world coordinate system, a three-dimensional labeling frame corresponding to the two-dimensional labeling frame is further created. Referring to fig. 4a, which exemplarily shows a schematic diagram of the three-dimensional labeling frame corresponding to the two-dimensional labeling frame provided in the embodiment of the present application: as shown in the figure, the two-dimensional labeling frame 4101 is labeled by the annotator in the top view of the radar point cloud, and in general the projections of all radar points of the target object in the vertical direction are located in the two-dimensional labeling frame.
It should be noted that in the point cloud labeling scene of automatic driving, the point clouds of all vehicles are generally on the same horizontal plane; that is, on the one hand two vehicles do not overlap vertically, and on the other hand the heights of a large number of radar points are relatively concentrated. Therefore, the embodiment of the application can set the height of the three-dimensional labeling frame, i.e. its length in the vertical direction in the world coordinate system, by counting the typical heights of objects; for example, the height of a typical automobile is about 1.5 meters.
Determining a numerical value interval of each radar point in the reference point set in the component of the coordinate axis comprises the following steps:
determining a subset of reference points located outside the boundary of the three-dimensional annotation frame from the set of reference points;
and for the coordinate axis in the vertical direction, determining a numerical value interval of each radar point in the reference point subset in the component of the coordinate axis.
Referring to fig. 4b, which exemplarily shows a schematic diagram of the relative position of the three-dimensional labeling frame in the space coordinate system: it can be clearly seen from the figure that the projections in the vertical direction of all coordinates in the space 4201 are located in the two-dimensional labeling frame 4202, and the three-dimensional labeling frame 4203 is located in the middle of the space 4201; the upper edge of the three-dimensional bounding box of the target object in the vertical direction lies above the three-dimensional labeling frame 4203, and the lower edge lies below it. When calculating the Z-axis boundary, radar points inside the three-dimensional labeling frame can be excluded so that only a few radar points need to be traversed; and since the three-dimensional labeling frame is located in the middle of the space, this does not affect the determination of the edges in the vertical direction.
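This optimization can be sketched as a simple filter, assuming reference points are (x, y, z) tuples and the three-dimensional labeling frame's preset vertical extent is [z_min, z_max] (names are illustrative):

```python
def outside_vertical_extent(ref_points, z_min, z_max):
    """Keep only reference points outside the three-dimensional labeling
    frame's preset vertical extent; only these few points need to be
    traversed when searching for the Z-axis edges."""
    return [p for p in ref_points if p[2] < z_min or p[2] > z_max]
```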
On the basis of the above embodiments, as an alternative embodiment, the magnitude of the preset threshold value of the embodiment of the present application is proportional to the number of radar points in the reference point set. That is, the preset threshold may be set smaller when the number of radar points in the reference point set is small, and larger if the number of radar points in the reference point set is large. By adjusting the value of the preset threshold under different scenes, the edge detection can be more accurate.
On the basis of the above embodiments, as an alternative embodiment, obtaining coordinates of the two-dimensional labeling frame in the world coordinate system includes:
displaying a top view of the radar point cloud;
generating a two-dimensional annotation frame in response to the annotation operation of the target object;
determining the coordinates of the two-dimensional annotation frame in a screen coordinate system;
and converting the coordinates of the two-dimensional annotation frame in the screen coordinate system into the coordinates of the world coordinate system.
In the embodiment of the application, the annotator starts a labeling program and imports the radar point cloud to be labeled into it, and an operation interface for labeling the radar point cloud, shown in fig. 5a, is displayed. The operation interface includes the radar point cloud 5101, a first control 5102 for adjusting the viewing angle, and a second control 5103 for adjusting the scale of the radar point cloud. The annotator can use the first control 5102 to switch to the view of the radar point cloud observed from a corresponding coordinate axis, and can use the second control 5103 to enlarge the scale, so that the radar point cloud is displayed larger and in more detail, or to reduce it; through the first and second controls, the annotator can quickly locate the target object to be labeled. When the annotator has adjusted the radar point cloud to the top view through the first control and adjusted the scale to a suitable distance through the second control, the operation interface showing the point cloud of the target object from the top view, as in fig. 5b, is displayed; the operation interface further includes a third control 5104 for labeling. In response to the annotator clicking the third control 5104, the operation interface displays a two-dimensional labeling frame 5105 of adjustable size, and the annotator surrounds the target object with the two-dimensional labeling frame 5105 by adjusting its size, as shown in the operation interface of fig. 5c. It is understood that the size of the two-dimensional labeling frame 5105 is not strictly limited, as long as it contains all radar points of the target object that are visible to the naked eye. After the annotator further clicks a fourth control to indicate completion, the operation interface displays the three-dimensional bounding box 5106 of the target object, as shown in fig. 5d; it can be seen from the figure that the three-dimensional bounding box is generated automatically from the manually labeled two-dimensional frame, and the annotator can continue to inspect it through the controls at different viewing angles and scales.
When the embodiment of the application generates the two-dimensional labeling frame, the coordinates of the two-dimensional labeling frame in the screen coordinate system are first determined; since the final three-dimensional bounding box is located in the world coordinate system, the embodiment of the application needs to convert the coordinates of the two-dimensional labeling frame from the screen coordinate system into world coordinates. Specifically, the conversion process can be divided into the following steps:
first, the coordinates (referred to as screen coordinates) of each vertex of the two-dimensional label frame in the screen coordinate system, which is one two-dimensional coordinate (x, y), are acquired.
The screen coordinates are then converted to normalized screen coordinates by dividing each screen coordinate by the width or height of the container, multiplying by 2 (with the sign of the y component flipped) and shifting by 1. This process can be expressed by the following formulas:
vec.x = 2 × (clientPos.x / container.width) - 1
vec.y = -2 × (clientPos.y / container.height) + 1
vec.z = 0.5;
wherein vec.x, vec.y and vec.z represent the components of the normalized screen coordinates in the x, y and z axis directions respectively (it should be noted that the z axis is perpendicular to the screen), clientPos.x and clientPos.y represent the components of the screen coordinates in the X and Y axis directions respectively, and container denotes the display area of the display screen, container.width and container.height being its width and height.
It should be noted that the divisions clientPos.x / container.width and clientPos.y / container.height normalize the screen coordinates into the range [0, 1]; for example, if a vertex is located at the middle of the screen, both ratios equal 0.5.
Multiplying (clientPos.x / container.width) by 2 maps the screen coordinate into the range [0, 2]; if the result obtained in the previous step is 0.5, the result of this step is 1.
The expressions 2 × (clientPos.x / container.width) - 1 and -2 × (clientPos.y / container.height) + 1 map the screen coordinates into the range [-1, 1], that is, they normalize the screen coordinates; if the result obtained in the previous step is 1, the result of this step is 0. The sign of the y component is flipped because the y axis of the screen coordinate system points downward while the y axis of the normalized coordinate system points upward.
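The two normalization formulas can be sketched as one helper function, assuming a top-left pixel origin (hence the flipped y sign); the function and parameter names are illustrative:

```python
def screen_to_ndc(client_x, client_y, width, height):
    """Map pixel coordinates to normalized screen coordinates in [-1, 1]."""
    vec_x = 2 * (client_x / width) - 1    # [0, w] -> [-1, 1]
    vec_y = -2 * (client_y / height) + 1  # [0, h] -> [1, -1], y flipped
    vec_z = 0.5                           # fixed depth along the screen normal
    return vec_x, vec_y, vec_z
```

For example, the center pixel of an 800×600 container maps to (0, 0, 0.5), and the top-left corner maps to (-1, 1, 0.5).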
Through the above normalization processing, the screen coordinates are converted into the normalized coordinate system used by the rendering engine, in which subsequent calculations and operations can be performed.
The normalized device coordinates are then converted to world coordinates.
This is achieved by using an unproject method, which maps normalized screen coordinates in the range (-1, 1) into camera space. In computer graphics, the world coordinate system and camera space (or view space) are two different coordinate systems used to describe the position and orientation of objects in three-dimensional space. Camera space is a coordinate system relative to the observer (or camera): in camera space, the camera is located at the origin and oriented along the -z axis, with the y axis generally defined as the "up" direction. All objects are positioned and oriented relative to the camera; in other words, the position and orientation of an object are determined with respect to the position and orientation of the camera, which makes it easier to handle visual effects such as perspective and view-frustum clipping.
The embodiment of the application regards a camera for collecting radar point clouds as a perspective camera.
Firstly, the difference (also called the distance vector) between the normalized screen coordinates and the camera position is normalized; then the distance from the camera position to the plane is calculated; finally, the distance is multiplied by the normalized vector and added to the camera position to obtain the world coordinates. This process can be expressed by the following formulas:
Nvec=normalize(vec-camera.position)
distance=-camera.position.z/Nvec.z
worldPos=camera.position+Nvec×distance
in the above formulas, camera.position is the position of the camera, and the distance vector Nvec is a direction vector; by adding Nvec × distance (a scalar multiple of the direction) to the camera position, a new position worldPos is obtained, which also lies in the world coordinate system. The z-axis component of camera.position is divided by the z-axis component of Nvec because, in the top view, the camera looks perpendicularly at the X-Y plane, so this quotient (negated) gives the distance along the ray at which the z coordinate reaches 0.
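The three formulas can be sketched for a top-down perspective camera as a ray intersection with the z = 0 ground plane; the vector math is written out with the standard library, and the function name and sample positions are illustrative assumptions:

```python
import math

def unproject_to_ground(vec, cam_pos):
    """Cast a ray from cam_pos through vec and return where it hits z = 0."""
    diff = [v - c for v, c in zip(vec, cam_pos)]
    norm = math.sqrt(sum(d * d for d in diff))
    nvec = [d / norm for d in diff]    # Nvec = normalize(vec - camera.position)
    distance = -cam_pos[2] / nvec[2]   # distance = -camera.position.z / Nvec.z
    # worldPos = camera.position + Nvec * distance
    return tuple(c + n * distance for c, n in zip(cam_pos, nvec))
```

For a camera at height 10 looking straight down, a point directly below it unprojects to the origin of the ground plane.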
Referring to fig. 6, a flow chart of a labeling method of a radar point cloud according to an embodiment of the present application is shown, and as shown in the drawing, the method includes:
S601, displaying a top view of the radar point cloud, and generating a two-dimensional annotation frame in response to the annotation operation on the target object;
S602, determining the coordinates of the two-dimensional annotation frame in a screen coordinate system, and converting them into coordinates in the world coordinate system;
S603, determining a reference point set from the radar point cloud according to the coordinates of the radar point cloud and the two-dimensional annotation frame in the world coordinate system;
S604, for each radar point in the reference point set, determining the numerical interval in which the component of the radar point on the coordinate axis lies;
S605, if a key-value pair whose key is that numerical interval already exists, adding a preset value to its value; otherwise, creating a new key-value pair whose key is the numerical interval and setting its value to the preset value;
S606, sorting the numerical intervals by size to obtain an ordered array;
S607, traversing from the two ends of the ordered array toward the middle respectively, wherein for the currently traversed numerical interval, the reference key-value pair whose key is that interval is queried;
S608, if the value of the reference key-value pair is greater than the preset threshold, determining the currently traversed numerical interval as the extreme value of the edge on the corresponding coordinate axis, and stopping traversing toward the middle from that end; otherwise, continuing to traverse toward the middle.
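The histogram-and-scan of steps S604-S608 can be sketched for one coordinate axis as follows. The interval width `bin_size` and the `threshold` are illustrative assumptions (the patent leaves both unspecified), and the sketch assumes at least one interval on each side exceeds the threshold:

```python
from collections import defaultdict

def axis_extent(points, bin_size=1.0, threshold=2):
    """Estimate the two bounding-box edges along one coordinate axis.

    `points` are the 1-D components of the reference points on this axis.
    Sparse intervals (stray/noise points) at either end are skipped.
    """
    # S604-S605: count points per numerical interval (key = interval index).
    counts = defaultdict(int)
    for v in points:
        counts[int(v // bin_size)] += 1
    # S606: sort the intervals by size to obtain an ordered array.
    bins = sorted(counts)
    # S607-S608: scan from both ends toward the middle, stopping at the first
    # interval on each side whose point count exceeds the threshold.
    lo = next(b for b in bins if counts[b] > threshold)
    hi = next(b for b in reversed(bins) if counts[b] > threshold)
    # Convert the two target intervals back to coordinate values (the edges).
    return lo * bin_size, (hi + 1) * bin_size
```

Running this per axis yields the extent of the three-dimensional bounding box on that axis while discarding isolated outlier points beyond the dense intervals.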
Referring to fig. 7, a schematic view of a scenario in which an embodiment of the present application is applied to automatic driving algorithm training is shown. The scenario includes a vehicle 702 on which a laser radar 701 is deployed, a terminal 703, a labeling server 704, and a model training server 705. While driving on a road, the vehicle 702 scans the surrounding environment with the laser radar to obtain a radar point cloud and sends it to the terminal 703. The terminal 703 displays a top view of the radar point cloud, in which an annotator labels a two-dimensional annotation frame for at least one target object, and then sends the coordinates of the two-dimensional annotation frame in the screen coordinate system and the coordinates of the radar point cloud in the world coordinate system to the labeling server 704. The labeling server 704 converts the coordinates of the two-dimensional annotation frame in the screen coordinate system into coordinates in the world coordinate system, and determines a reference point set from the radar point cloud according to the coordinates of the radar point cloud and the two-dimensional annotation frame in the world coordinate system; the radar points in the reference point set, when projected in the vertical direction, fall within the two-dimensional annotation frame. For each coordinate axis of the world coordinate system, the labeling server determines the numerical interval in which the component of each radar point in the reference point set lies, and counts the number of radar points in each numerical interval. For each coordinate axis, it then sorts the corresponding numerical intervals by size, traverses the intervals from the two ends of the sequence toward the middle, and, when each end reaches a target numerical interval in which the number of radar points is greater than a preset threshold, determines the edge of the three-dimensional bounding box of the target object on the corresponding coordinate axis from the coordinate values corresponding to the two target intervals. The labeling server 704 sends the labeled radar point cloud to the model training server 705, and the model training server 705 uses it to understand the state of the vehicle 702, which may specifically include: judging the condition of the vehicle according to information on target objects such as lane lines, obstacles, and traffic lights; perceiving the influence of the external environment, such as the priorities of target objects, on the vehicle; judging the trajectory or intention of an obstacle from the obstacle information; and predicting the longer-term trajectory of the obstacle from its short-term trajectory and intention.
The embodiment of the application provides a labeling device for a radar point cloud. As shown in fig. 8, the labeling device may comprise: a labeling frame determining module 801, a reference point determining module 802, a statistics module 803, and a traversing module 804, wherein
the labeling frame determining module 801 is configured to obtain coordinates of a radar point cloud and a two-dimensional labeling frame in a world coordinate system, where the two-dimensional labeling frame is obtained by labeling a target object in the radar point cloud in a top view of the radar point cloud;
a reference point determining module 802, configured to determine a reference point set from the radar point cloud according to the coordinates of the radar point cloud and the two-dimensional labeling frame in the world coordinate system, where the radar points in the reference point set, when projected in the vertical direction, are located in the two-dimensional labeling frame;
a statistics module 803, configured to determine, for each coordinate axis of the world coordinate system, a numerical interval in which a component of each radar point in the reference point set is located in the coordinate axis, and count the number of radar points in each numerical interval;
the traversing module 804 is configured to sort, for each coordinate axis, the corresponding numerical intervals according to the numerical values, traverse each numerical interval from two ends of the sequence to the middle, and determine, when the two ends respectively traverse the target numerical intervals with the number of radar points greater than the preset threshold, an edge of the three-dimensional bounding box of the target object on the corresponding coordinate axis according to coordinate values corresponding to the two target numerical intervals.
As an optional implementation manner, the statistics module determining the numerical interval in which the component of each radar point in the reference point set lies on the coordinate axis, and counting the number of radar points in each numerical interval, includes:
for each radar point in the reference point set, determining a numerical value interval in which a component of the radar point in the coordinate axis is located;
if the key is determined to be a key value pair of the numerical value interval, adding a preset numerical value to the value of the key value pair;
if it is determined that no key value pair with the key being the numerical value interval exists, a new key value pair with the key being the numerical value interval is created, and the value of the new key value pair is set as the preset numerical value.
As an optional implementation manner, the traversing module sorts the corresponding numerical intervals according to the numerical sizes, and traverses each numerical interval from two ends to the middle of the sequence, including:
sequencing each numerical value interval according to the size to obtain an ordered array;
traversing from two ends of the ordered array to the middle respectively, wherein for a currently traversed numerical value interval, a query key is a reference key value pair of the currently traversed numerical value interval;
if the value of the reference key value pair is larger than the preset threshold value, determining the currently traversed numerical value interval as an extreme value of the edge of the corresponding coordinate axis, and stopping traversing from the currently traversed numerical value interval to the middle;
And if the value of the reference key value pair is not greater than the preset threshold value, continuing traversing from the currently traversed numerical value interval to the middle.
As an optional implementation manner, before determining the numerical interval in which the component of each radar point in the reference point set lies on the coordinate axis, the statistics module is further configured to:
obtaining a three-dimensional labeling frame corresponding to the two-dimensional labeling frame, wherein the coordinates of the two-dimensional labeling frame in the world coordinate system are at the boundary of the three-dimensional labeling frame in the horizontal direction, and the boundary of the three-dimensional labeling frame in the vertical direction is a preset coordinate;
the statistics module determining the numerical interval in which the component of each radar point in the reference point set lies includes the following steps:
determining a subset of reference points located outside the boundary of the three-dimensional annotation frame from the set of reference points;
and for the coordinate axis in the vertical direction, determining a numerical value interval of each radar point in the reference point subset in the component of the coordinate axis.
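The subset selection in the two steps above can be sketched as follows. This is a hypothetical helper; `z_min` and `z_max` stand for the preset vertical boundary of the initial three-dimensional frame, which the patent does not specify numerically:

```python
def vertical_refinement_subset(points, z_min, z_max):
    """Select reference points whose vertical (z) component falls outside the
    preset vertical boundary of the initial three-dimensional labeling frame.

    `points` is an iterable of (x, y, z) tuples; only the returned subset
    needs to be re-binned along the vertical axis.
    """
    return [p for p in points if p[2] < z_min or p[2] > z_max]
```

Points already inside the preset vertical boundary are known to belong to the object, so only the remaining points need their vertical intervals counted.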
As an optional implementation manner, the labeling frame determining module obtains coordinates of the two-dimensional labeling frame in a world coordinate system, including:
displaying a top view of the radar point cloud;
generating a two-dimensional annotation frame in response to the annotation operation of the target object;
Determining the coordinates of the two-dimensional annotation frame in a screen coordinate system;
and converting the coordinates of the two-dimensional annotation frame in the screen coordinate system into the coordinates of the world coordinate system.
As an optional implementation manner, the label frame determining module converts the coordinates of the two-dimensional label frame in the screen coordinate system into the coordinates of the world coordinate system, including:
normalizing the coordinates of the two-dimensional annotation frame in a screen coordinate system to obtain normalized screen coordinates;
normalizing the distance vector to obtain a normalized distance vector, wherein the distance vector is used for representing the distance between the normalized screen coordinates and the preset camera position;
determining the ratio of the camera position to the component of the distance vector in the vertical direction, and obtaining the distance between the two-dimensional annotation frame and the camera position according to the normalized distance vector and the ratio;
and obtaining the coordinates of the two-dimensional annotation frame in a world coordinate system according to the camera position and the distance between the two-dimensional annotation frame and the camera position.
As an alternative embodiment, the magnitude of the preset threshold is proportional to the number of radar points in the reference point set.
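A proportional threshold of this kind might look like the following sketch; the 1% ratio and the floor of 1 are illustrative assumptions, not values from the patent:

```python
def preset_threshold(num_reference_points: int, ratio: float = 0.01) -> int:
    """Noise threshold that scales with the size of the reference point set.

    The floor of 1 ensures that even for small point sets an isolated stray
    point is still treated as noise rather than as a bounding-box edge.
    """
    return max(1, int(num_reference_points * ratio))
```

Denser point clouds thus require proportionally more points in an interval before it is accepted as a bounding-box edge.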
The device of the embodiment of the present application may execute the labeling method of the radar point cloud provided by the embodiment of the present application, and its implementation principle is similar, and actions executed by each module in the device of each embodiment of the present application correspond to steps in the method of each embodiment of the present application, and detailed functional descriptions of each module of the device may be referred to in the corresponding method shown in the foregoing, which is not repeated herein.
The embodiment of the application provides an electronic device comprising a memory, a processor, and a computer program stored on the memory, where the processor executes the computer program to implement the steps of the above labeling method for a radar point cloud. Compared with the related art, the following can be achieved: the target object is labeled in a top view of the radar point cloud to obtain a two-dimensional labeling frame; since in practice objects in a radar point cloud do not overlap in the vertical direction, labeling in the top view ensures that the two-dimensional labeling frame contains only one object, namely the target object. Compared with the prior art, in which the three-dimensional bounding box must be labeled separately along the three coordinate axes, the three-dimensional bounding box is finally obtained from a single two-dimensional annotation, which greatly improves labeling efficiency.
In an alternative embodiment, an electronic device is provided. As shown in fig. 9, the electronic device 4000 includes a processor 4001 and a memory 4003, where the processor 4001 is coupled to the memory 4003, for example via a bus 4002. Optionally, the electronic device 4000 may further comprise a transceiver 4004, which may be used for data interaction between this electronic device and other electronic devices, such as transmitting and/or receiving data. It should be noted that, in practical applications, the number of transceivers 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.
The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 4001 may also be a combination that implements computing functionality, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 4002 may include a path for transferring information between the aforementioned components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, and can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 9, but this does not mean there is only one bus or only one type of bus.
Memory 4003 may be, but is not limited to, a ROM (Read-Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store a computer program and that can be read by a computer.
The memory 4003 is used for storing a computer program for executing an embodiment of the present application, and is controlled to be executed by the processor 4001. The processor 4001 is configured to execute a computer program stored in the memory 4003 to realize the steps shown in the foregoing method embodiment.
Embodiments of the present application provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the foregoing method embodiments and corresponding content.
The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program can realize the steps and corresponding contents of the embodiment of the method when being executed by a processor.
The terms "first," "second," "third," "fourth," "1," "2," and the like in the description and in the claims and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments of the application described herein may be implemented in other sequences than those illustrated or otherwise described.
It should be understood that, although various operation steps are indicated by arrows in the flowcharts of the embodiments of the present application, the order in which these steps are implemented is not limited to the order indicated by the arrows. In some implementations of embodiments of the application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages can be flexibly configured according to the requirement, which is not limited by the embodiment of the present application.
The foregoing is merely an optional implementation manner of some of the implementation scenarios of the present application, and it should be noted that, for those skilled in the art, other similar implementation manners based on the technical ideas of the present application are adopted without departing from the technical ideas of the scheme of the present application, and the implementation manner is also within the protection scope of the embodiments of the present application.

Claims (10)

1. The labeling method of the radar point cloud is characterized by comprising the following steps of:
acquiring coordinates of a radar point cloud and a two-dimensional labeling frame under a world coordinate system, wherein the two-dimensional labeling frame is obtained by labeling a target object in the radar point cloud in a top view of the radar point cloud;
determining a reference point set from the radar point cloud according to the coordinates of the radar point cloud and the two-dimensional annotation frame in the world coordinate system, wherein radar points in the reference point set, when projected in the vertical direction, are located in the two-dimensional annotation frame;
for each coordinate axis of the world coordinate system, determining a numerical interval in which components of each radar point in the reference point set are located, and counting the number of radar points in each numerical interval;
and sequencing the corresponding numerical value intervals according to the numerical value size for each coordinate axis, traversing each numerical value interval from two ends of the sequence to the middle, and determining the edge of the three-dimensional bounding box of the target object on the corresponding coordinate axis according to the coordinate values corresponding to the two target numerical value intervals when the two ends respectively traverse the target numerical value intervals of which the number of the radar points is greater than a preset threshold value.
2. The method of claim 1, wherein the determining the numerical intervals in which the components of the radar points in the reference point set lie on the coordinate axis, and counting the number of radar points in each numerical interval, comprises:
For each radar point in the reference point set, determining a numerical value interval in which a component of the radar point in the coordinate axis is located;
if the key is determined to be a key value pair of the numerical value interval, adding a preset numerical value to the value of the key value pair;
if it is determined that no key value pair with the key being the numerical value interval exists, a new key value pair with the key being the numerical value interval is created, and the value of the new key value pair is set as the preset numerical value.
3. The method of claim 2, wherein the sorting the corresponding value intervals by value size, traversing each value interval from both ends of the sequence to the middle, comprises:
sequencing each numerical value interval according to the size to obtain an ordered array;
traversing from two ends of the ordered array to the middle respectively, wherein for a currently traversed numerical value interval, a query key is a reference key value pair of the currently traversed numerical value interval;
if the value of the reference key value pair is larger than the preset threshold value, determining the currently traversed numerical value interval as an extreme value of the edge of the corresponding coordinate axis, and stopping traversing from the currently traversed numerical value interval to the middle;
and if the value of the reference key value pair is not greater than the preset threshold value, continuing traversing from the currently traversed numerical value interval to the middle.
4. A method according to any one of claims 1-3, wherein before said determining a numerical interval in which the component of each radar point in said reference point set lies on said coordinate axis, the method further comprises:
obtaining a three-dimensional labeling frame corresponding to the two-dimensional labeling frame, wherein the coordinates of the two-dimensional labeling frame in the world coordinate system are at the boundary of the three-dimensional labeling frame in the horizontal direction, and the boundary of the three-dimensional labeling frame in the vertical direction is a preset coordinate;
determining a numerical value interval of each radar point in the reference point set in the component of the coordinate axis comprises the following steps:
determining a subset of reference points located outside the boundary of the three-dimensional annotation frame from the set of reference points;
and for the coordinate axis in the vertical direction, determining a numerical value interval of each radar point in the reference point subset in the component of the coordinate axis.
5. The method of claim 1, wherein obtaining coordinates of the two-dimensional annotation frame in a world coordinate system comprises:
displaying a top view of the radar point cloud;
generating a two-dimensional annotation frame in response to the annotation operation of the target object;
determining the coordinates of the two-dimensional annotation frame in a screen coordinate system;
And converting the coordinates of the two-dimensional annotation frame in the screen coordinate system into the coordinates of the world coordinate system.
6. The method of claim 5, wherein converting the coordinates of the two-dimensional annotation frame in the screen coordinate system to coordinates in the world coordinate system comprises:
normalizing the coordinates of the two-dimensional annotation frame in a screen coordinate system to obtain normalized screen coordinates;
normalizing the distance vector to obtain a normalized distance vector, wherein the distance vector is used for representing the distance between the normalized screen coordinates and the preset camera position;
determining the ratio of the camera position to the component of the distance vector in the vertical direction, and obtaining the distance between the two-dimensional annotation frame and the camera position according to the normalized distance vector and the ratio;
and obtaining the coordinates of the two-dimensional annotation frame in a world coordinate system according to the camera position and the distance between the two-dimensional annotation frame and the camera position.
7. The method of claim 1, wherein the magnitude of the preset threshold is proportional to the number of radar points in the set of reference points.
8. A radar point cloud labeling device, comprising:
the labeling frame determining module is used for obtaining coordinates of a radar point cloud and a two-dimensional labeling frame under a world coordinate system, wherein the two-dimensional labeling frame is obtained by labeling a target object in the radar point cloud in a top view of the radar point cloud;
The reference point determining module is used for determining a reference point set from the radar point cloud according to the coordinates of the radar point cloud and the two-dimensional annotation frame in the world coordinate system, wherein radar points in the reference point set, when projected in the vertical direction, are located in the two-dimensional annotation frame;
the statistics module is used for determining a numerical value interval in which components of all radar points in the reference point set are located in the coordinate axes for each coordinate axis of the world coordinate system, and counting the number of the radar points in each numerical value interval;
and the traversing module is used for sequencing the corresponding numerical value intervals according to the numerical value for each coordinate axis, traversing each numerical value interval from two ends of the sequence to the middle, and determining the edge of the three-dimensional bounding box of the target object on the corresponding coordinate axis according to the coordinate values corresponding to the two target numerical value intervals when the two ends respectively traverse the target numerical value intervals with the number of the radar points being greater than the preset threshold value.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to carry out the steps of the method of labeling a radar point cloud according to any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of labeling a radar point cloud according to any of claims 1-7.
CN202311163344.XA 2023-09-08 2023-09-08 Radar point cloud labeling method and device, electronic equipment and storage medium Pending CN117197419A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311163344.XA CN117197419A (en) 2023-09-08 2023-09-08 Radar point cloud labeling method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311163344.XA CN117197419A (en) 2023-09-08 2023-09-08 Radar point cloud labeling method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117197419A true CN117197419A (en) 2023-12-08

Family

ID=88983003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311163344.XA Pending CN117197419A (en) 2023-09-08 2023-09-08 Radar point cloud labeling method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117197419A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118041918A (en) * 2024-04-12 2024-05-14 天城智创(天津)科技有限公司 Distributed informationized data transmission method for digital visual platform

Similar Documents

Publication Publication Date Title
CN109461211B (en) Semantic vector map construction method and device based on visual point cloud and electronic equipment
CN111563442B (en) Slam method and system for fusing point cloud and camera image data based on laser radar
KR102125959B1 (en) Method and apparatus for determining a matching relationship between point cloud data
Schauer et al. The peopleremover—removing dynamic objects from 3-d point cloud data by traversing a voxel occupancy grid
CN108898676B (en) Method and system for detecting collision and shielding between virtual and real objects
CN109034077B (en) Three-dimensional point cloud marking method and device based on multi-scale feature learning
CN111665842B (en) Indoor SLAM mapping method and system based on semantic information fusion
CN110176078B (en) Method and device for labeling training set data
Hoppe et al. Incremental Surface Extraction from Sparse Structure-from-Motion Point Clouds.
KR102195164B1 (en) System and method for multiple object detection using multi-LiDAR
CN112097732A (en) Binocular camera-based three-dimensional distance measurement method, system, equipment and readable storage medium
CN111402414A (en) Point cloud map construction method, device, equipment and storage medium
CN117197419A (en) Lei Dadian cloud labeling method and device, electronic equipment and storage medium
CN112257668A (en) Main and auxiliary road judging method and device, electronic equipment and storage medium
Singer et al. Dales objects: A large scale benchmark dataset for instance segmentation in aerial lidar
Carrillo et al. Urbannet: Leveraging urban maps for long range 3d object detection
CN112381873B (en) Data labeling method and device
US20230169680A1 (en) Beijing *** netcom science technology co., ltd.
CN115683109B (en) Visual dynamic obstacle detection method based on CUDA and three-dimensional grid map
CN116642490A (en) Visual positioning navigation method based on hybrid map, robot and storage medium
CN114627365B (en) Scene re-recognition method and device, electronic equipment and storage medium
WO2023283929A1 (en) Method and apparatus for calibrating external parameters of binocular camera
CN114565906A (en) Obstacle detection method, obstacle detection device, electronic device, and storage medium
Ruf et al. Towards real-time change detection in videos based on existing 3D models
CN113901903A (en) Road identification method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication