CN113808077A - Target detection method, device, equipment and storage medium - Google Patents

Target detection method, device, equipment and storage medium

Info

Publication number
CN113808077A
CN113808077A (application CN202110896460.7A)
Authority
CN
China
Prior art keywords
attention mechanism
feature map
channel
target
weighting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110896460.7A
Other languages
Chinese (zh)
Inventor
聂泳忠
杨素伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiren Ma Diyan Beijing Technology Co ltd
Original Assignee
Xiren Ma Diyan Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiren Ma Diyan Beijing Technology Co ltd filed Critical Xiren Ma Diyan Beijing Technology Co ltd
Priority to CN202110896460.7A priority Critical patent/CN113808077A/en
Publication of CN113808077A publication Critical patent/CN113808077A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10028: Range image; Depth image; 3D point clouds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Traffic Control Systems (AREA)

Abstract

The embodiment of the application discloses a target detection method, device, equipment and storage medium. After a target graph is obtained based on point cloud data, graph convolution is applied to the target graph, which preserves the diversity of the point cloud data features to the greatest extent. On this basis, the feature map obtained by the graph convolution is weighted with an attention mechanism, which enhances the proportion of the region where the target is located.

Description

Target detection method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of target detection, in particular to a target detection method, a target detection device, target detection equipment and a storage medium.
Background
With the development of artificial intelligence technology, automatic detection of environmental targets has become a common means of environment perception.
When detecting an environmental target, a common method is to collect point cloud data of the surrounding environment by using a laser radar, and perform target detection according to the point cloud data.
When traditional methods are used to detect targets based on point cloud data, the accuracy is poor.
Summary of the application
The embodiment of the application provides a target detection method, device, equipment and storage medium, which can improve the accuracy of the detection result.
In a first aspect, an embodiment of the present application provides a target detection method, including:
generating a target graph according to the projection points of the point cloud data in the preset direction, wherein the target graph is a graph formed by the projection points and connecting lines between the projection points;
performing a convolution operation on the nodes and the edges of the target graph, and extracting the node features of the nodes and the edge features of the edges to obtain a first feature map;
weighting the first feature map according to an attention mechanism to obtain a second feature map corresponding to the first feature map;
and detecting the target according to the second feature map.
In a second aspect, an embodiment of the present application provides an object detection apparatus, including:
the target graph generation module is used for generating a target graph according to the projection points of the point cloud data in the preset direction, wherein the target graph is a graph formed by the projection points and the connecting lines between the projection points;
the convolution module is used for performing a convolution operation on the nodes and the edges of the target graph, and extracting the node features of the nodes and the edge features of the edges to obtain a first feature map;
the weighting module is used for weighting the first feature map according to the attention mechanism to obtain a second feature map corresponding to the first feature map;
and the detection module is used for carrying out target detection according to the second feature map.
In a third aspect, an embodiment of the present application provides an electronic device, including:
the laser radar is used for acquiring point cloud data;
a processor;
a memory for storing computer program instructions;
the computer program instructions, when executed by a processor, implement the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method according to the first aspect.
According to the target detection method, device, equipment and storage medium of the embodiments, after the target graph is obtained based on the point cloud data, graph convolution is applied to the target graph, which preserves the diversity of the point cloud data features to the greatest extent. On this basis, the feature map obtained by the graph convolution is weighted with an attention mechanism, which enhances the proportion of the region where the target is located. In other words, during target detection the diversity of the point cloud data features is increased while the proportion of the region where the target is located is enhanced, thereby improving the accuracy of the target detection result.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; for those skilled in the art, other drawings can also be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of a target detection method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of projection points according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a target graph provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a candidate graph according to an embodiment of the present application;
FIG. 5 is a flow chart of another method for detecting an object according to an embodiment of the present disclosure;
FIG. 6 is a flow chart of another method for detecting an object according to an embodiment of the present disclosure;
FIG. 7 is a schematic view of a cylinder according to an embodiment of the present disclosure;
fig. 8 is a structural diagram of an object detection apparatus according to an embodiment of the present application;
fig. 9 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Laser radar is an important environmental perception sensor, and is widely applied to relevant fields such as automatic driving, intelligent robots, man-machine interaction, behavior recognition and the like at present.
For example, the automatic driving vehicle can sense the surrounding environment according to the point cloud data of the surrounding environment acquired by the laser radar in the process of traveling, for example, surrounding obstacles can be sensed, so that the obstacles are avoided in time, and the driving safety is ensured.
When detecting targets such as obstacles, a common method is to input the point cloud data into a deep convolutional neural network, which outputs the detection result. Such a deep convolutional neural network includes a pooling layer whose pooling type is maximum pooling, that is, only the feature point with the maximum value in each local area is selected, which causes excessive loss of features and reduces the accuracy of the detection result.
Therefore, the embodiment of the application provides a target detection method, which can improve the accuracy of a detection result. The method can be applied to the scene of sensing the surrounding environment by utilizing the laser radar. In addition, in the target detection method provided in the embodiment of the present application, the execution subject may be a target detection apparatus, or a processing module in the target detection apparatus for executing the target detection method. The target detection device can be integrated in intelligent equipment such as an automatic driving vehicle, a mobile robot and the like.
The embodiment of the present application takes an example in which a target detection device integrated in an autonomous vehicle executes a target detection method, and describes a target detection method provided in the embodiment of the present application.
Fig. 1 is a flowchart of a target detection method according to an embodiment of the present application.
As shown in fig. 1, the target detection method may include S110 to S140, which are specifically as follows:
and S110, generating a target map according to the projection points of the point cloud data in the preset direction.
The target map is a map formed by projected points and connecting lines between the projected points. The point cloud data can be collected by a laser radar, and the structure of the laser radar is not particularly limited in the embodiment of the application. In the running process of the vehicle, the laser radar acquires the point cloud data of the surrounding environment in real time, and a basis is provided for the vehicle to sense the surrounding environment.
The original point cloud data collected by the laser radar generally comprises four dimensions (x, y, z, r), where x, y and z are the coordinates of the point along the x, y and z axes of the spatial coordinate system, and r is the reflectivity of the object to the laser beam emitted by the laser radar.
To increase the diversity of point cloud data features, in one embodiment, the dimensions of the original point cloud data may be expanded, for example, to seven dimensions (x, y, z, r, x_v, y_v, z_v), where the subscript v denotes the offset of the point cloud data relative to the centroid of all the point cloud data: x_v is the offset of the x coordinate of the point relative to the x coordinate of the centroid, y_v is the offset of the y coordinate relative to the y coordinate of the centroid, and z_v is the offset of the z coordinate relative to the z coordinate of the centroid.
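A minimal sketch of this dimension expansion is shown below; it assumes NumPy and illustrative function and variable names, and simply appends the per-point offsets from the centroid to the original (x, y, z, r) columns.

```python
import numpy as np

def expand_point_features(points_xyzr: np.ndarray) -> np.ndarray:
    """points_xyzr: (M, 4) array of (x, y, z, r). Returns an (M, 7) array."""
    xyz = points_xyzr[:, :3]
    centroid = xyz.mean(axis=0, keepdims=True)   # centroid of all points
    offsets = xyz - centroid                     # (x_v, y_v, z_v) per point
    return np.concatenate([points_xyzr, offsets], axis=1)

# Example: 5 random points with reflectivity in [0, 1]
pts = np.random.rand(5, 4)
print(expand_point_features(pts).shape)  # (5, 7)
```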
Considering that point cloud data is unordered, in order to improve the accuracy of the target detection result, in an embodiment the unordered point cloud data may be converted into an ordered two-dimensional matrix; for example, the point cloud data may be projected in a preset direction to obtain projection points. The two-dimensional matrix may also be referred to herein as a bird's eye view.
The preset direction is a preset projection direction of the point cloud data, and can be selected according to needs in practical application, for example, the direction in which the z axis is located can be used as the preset direction, that is, the point cloud data is projected to the z axis.
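A minimal sketch of the projection step, assuming the preset direction is the z axis, is shown below; dropping the z coordinate turns each three-dimensional point into a two-dimensional projection point for the bird's eye view.

```python
import numpy as np

cloud = np.random.rand(1000, 7)      # expanded (x, y, z, r, x_v, y_v, z_v) points
projection_points = cloud[:, :2]     # project along the z axis: keep (x, y) only
print(projection_points.shape)       # (1000, 2)
```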
The target graph is formed from the projection points and the connecting lines between them. The projection points may correspond to all of the point cloud data or only to part of it; selecting only part of the projection points saves computation cost.
In one embodiment, a relatively dense set of projection points may be selected, and the partial graph generated by these projection points and the connecting lines between them may be designated as the target graph.
In an embodiment, a plurality of receptive field point clusters can also be obtained by random path walking; the expected value and variance of the differences between the distances of the projection points in each receptive field point cluster in the vector space and in the three-dimensional space are then compared, and the receptive field point cluster with the minimum expected value and variance is taken as the target graph.
A receptive field point cluster is a point cluster formed by projection points obtained by the path random walk method. The path random walk method walks among the projection points according to a preset random walk length, and the walk path is the path formed by connecting the projection points in the walk order. The random walk length is the number of projection points involved in the random walk; for example, a random walk length of 4 means that the random walk involves 4 projection points.
Illustratively, if the walk visits projection points A, B, C and D in the order A, C, D, B, then the walk path is A-C-D-B.
The expected value represents the degree of similarity of the projection points in the receptive field point cluster: the smaller the expected value, the more similar the projection points in the cluster. The variance represents the stability of the projection points in the receptive field point cluster: the smaller the variance, the more stable the projection points in the cluster.
For example, fig. 2 shows the projection points obtained by projecting the point cloud data in the preset direction; for ease of viewing, fig. 2 represents each projection point as a circle, and in practical applications the number of projection points is much larger. Taking the path random walk method as an example, the receptive field point cluster obtained with the minimum expected value and variance is the target graph, as shown in fig. 3; in this case, the projection points included in the target graph are only some of the projection points.
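The selection rule can be sketched as follows: for each receptive field point cluster, the pairwise distances of its points in the two-dimensional projection and in the original three-dimensional space are compared, and the cluster whose distance differences have the smallest expected value and variance is kept. Interpreting the "vector space" as the projection plane and combining the expected value and variance into a single mean-plus-variance score are assumptions made for illustration only.

```python
import numpy as np
from itertools import combinations

def cluster_score(points_3d: np.ndarray) -> float:
    """points_3d: (k, 3) points of one cluster; the projection drops the z axis."""
    points_2d = points_3d[:, :2]
    diffs = []
    for i, j in combinations(range(len(points_3d)), 2):
        d3 = np.linalg.norm(points_3d[i] - points_3d[j])   # 3-D distance
        d2 = np.linalg.norm(points_2d[i] - points_2d[j])   # projected distance
        diffs.append(d3 - d2)
    diffs = np.asarray(diffs)
    return diffs.mean() + diffs.var()   # assumed combination of E and Var

def pick_target_cluster(clusters):
    """clusters: list of (k_i, 3) arrays; returns the cluster used as target graph."""
    return min(clusters, key=cluster_score)

clusters = [np.random.rand(4, 3) for _ in range(5)]
target_cluster = pick_target_cluster(clusters)
```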
And S120, performing convolution operation on the nodes and the edges of the target graph, and extracting the node characteristics of the nodes and the edge characteristics of the edges to obtain a first characteristic graph.
The nodes of the target graph are the projection points; for example, each projection point in fig. 3 may be referred to as a node. An edge of the target graph is a connecting line between two nodes; for example, each connecting line in fig. 3 may be referred to as an edge, and the connecting lines between the projection points may be determined according to the walk path.
Convolution can be used to extract features: convolving the nodes of the target graph extracts the node features, and convolving the edges of the target graph extracts the edge features.
The node characteristics may reflect characteristics of individual nodes and may include, for example, the number, height, reflectivity, etc. of the nodes. The edge feature may reflect a characteristic between two nodes, and may include, for example, a distance between two nodes.
In one embodiment, a convolutional neural network may be used to perform a convolution operation on nodes and edges of the target graph, and extract node features and edge features to obtain a first feature graph.
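Since the embodiment does not fix a particular graph convolution architecture, the following sketch only illustrates the idea: node features are updated by aggregating messages from neighbouring nodes, and edge features are computed from the node pairs that form each edge. The EdgeConv-style update, layer sizes and activation are assumptions.

```python
import torch
import torch.nn as nn

class NodeEdgeConv(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.node_mlp = nn.Linear(2 * in_dim, out_dim)  # input: [x_i, x_j - x_i]
        self.edge_mlp = nn.Linear(2 * in_dim, out_dim)  # input: [x_i, x_j]

    def forward(self, x: torch.Tensor, edges: torch.Tensor):
        """x: (N, in_dim) node features; edges: (E, 2) index pairs (i, j)."""
        xi, xj = x[edges[:, 0]], x[edges[:, 1]]
        edge_feat = torch.relu(self.edge_mlp(torch.cat([xi, xj], dim=-1)))
        msg = torch.relu(self.node_mlp(torch.cat([xi, xj - xi], dim=-1)))
        # sum incoming messages for each source node
        node_feat = torch.zeros(x.size(0), msg.size(1)).index_add_(0, edges[:, 0], msg)
        return node_feat, edge_feat

# Usage on a small target graph like fig. 3 (4 nodes, 3 edges, 7-D inputs)
layer = NodeEdgeConv(7, 16)
nodes = torch.randn(4, 7)
edges = torch.tensor([[0, 1], [1, 2], [2, 3]])
node_feat, edge_feat = layer(nodes, edges)
```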
Compared with the traditional scheme, in which maximum pooling is applied to the point cloud data so that only the point with the maximum value in each local receptive field is selected and target detection is performed on the selected points, convolving the nodes and edges of the target graph separately preserves the diversity of the point cloud data features to the greatest extent, and thus improves the accuracy of the subsequent target detection result.
And S130, weighting the first feature map according to the attention mechanism to obtain a second feature map corresponding to the first feature map.
The attention mechanism is similar in nature to the human selective visual attention mechanism; its core idea is to select, from a large amount of information, the information that is key to the current task. By utilizing the attention mechanism, the embodiment of the application can increase the proportion of the foreground object region and weaken the proportion of the background region, thereby improving the accuracy of the detection result. The foreground object region is the region where the target to be detected is located.
Specifically, the attention mechanism may be utilized to determine the weight of the foreground object, i.e., the target, and the weight of the background, then the foreground object and the background are weighted respectively, and the weighted sum is used as the second feature map, where the weight of the foreground object is greater than the weight of the background.
And S140, detecting the target according to the second feature map.
In one embodiment, a two-dimensional image corresponding to the point cloud data may be generated based on the second feature map, and target detection may be performed on the two-dimensional image. Based on this, S140 may include the following steps:
generating a two-dimensional image corresponding to the point cloud data according to the second feature map;
performing multi-convolution kernel feature extraction operation on the two-dimensional image to obtain image features of the two-dimensional image under each convolution kernel;
superposing the image characteristics corresponding to each convolution kernel to obtain superposed characteristics;
and detecting the target according to the superposition characteristics.
The multiple convolution kernels have different sizes, and when they are used to extract image features from the two-dimensional image, different convolution kernels yield different image features.
The specific size of the convolution kernel is not limited in the embodiment of the present application, for example, an initial convolution kernel may be selected in advance, and on the basis, offset is performed to obtain a plurality of different convolution kernels.
Illustratively, the initial convolution kernel is 3 × 3, the offsets are (-1, -1), (-1, 0), (1, 1), respectively, and the corresponding convolution kernels are 2 × 2, 2 × 3, 4 × 4, respectively.
Considering that the images output by different convolution kernels differ in size, in order to effectively superimpose the image features obtained by the convolution kernels, the sizes of the images obtained by the convolution kernels can first be unified, and the image features are superimposed only after the sizes are unified. This ensures the validity of the superposition, improves the diversity and quality of the image features, and further improves the accuracy of the detection result.
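A sketch of this multi-kernel extraction and superposition is given below; the 2 × 2, 2 × 3 and 4 × 4 kernel sizes follow the example above, while the channel counts and the bilinear resizing used to unify the output sizes before summation are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiKernelExtractor(nn.Module):
    def __init__(self, in_ch: int, out_ch: int,
                 kernel_sizes=((2, 2), (2, 3), (4, 4))):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, k) for k in kernel_sizes])

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        """image: (B, in_ch, H, W) two-dimensional image built from the second feature map."""
        target_size = image.shape[-2:]
        # extract features with each kernel, then resize to a common size
        feats = [F.interpolate(branch(image), size=target_size,
                               mode="bilinear", align_corners=False)
                 for branch in self.branches]
        return torch.stack(feats, dim=0).sum(dim=0)   # superposed features

extractor = MultiKernelExtractor(in_ch=8, out_ch=16)
fused = extractor(torch.randn(1, 8, 64, 64))          # (1, 16, 64, 64)
```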
Of course, other methods may be used to perform target detection, and the embodiment of the present application is not particularly limited.
Therefore, after the target graph is obtained based on the point cloud data, graph convolution is applied to the target graph, which preserves the diversity of the point cloud data features to the greatest extent. On this basis, the feature map obtained by the graph convolution is weighted with an attention mechanism, which enhances the proportion of the region where the target is located. In other words, during target detection the diversity of the point cloud data features is increased while the proportion of the region where the target is located is enhanced, thereby improving the accuracy of the target detection result.
Taking the example of determining the target graph in a way of random walk of the path, in one embodiment, S110 may include the following steps:
determining a central point and a nearest neighbor point of the central point from the projection points;
determining a candidate nearest neighbor point corresponding to the central point according to a path random walk method;
generating a candidate graph according to the central point, the candidate nearest neighbor point and a connecting line between the central point and the candidate nearest neighbor point;
and determining the target graph from the candidate graphs.
The central point and the nearest neighbor points are both among the projection points. In one embodiment, the projection points may be clustered by a clustering algorithm, and the central point is determined according to the clustering result; the clustering algorithm may be selected as needed, for example, K-means++.
The nearest neighbor points are the points nearest to the central point. In one embodiment, the K points nearest to the central point may be determined by a K-Nearest Neighbor (KNN) algorithm.
The candidate nearest neighbor is the nearest neighbor corresponding to the path random walk, and in one embodiment, the random walk length, that is, the number of candidate nearest neighbors, may be preset. It should be noted that the path random walk is started from the center point.
And connecting the central point and the candidate nearest neighbor points according to the walk path to obtain the connecting lines between the central point and the candidate nearest neighbor points, and further obtain the candidate graph.
Illustratively, referring to fig. 4, the circle numbered 1 represents the center point, the remaining circles are candidate nearest neighbors involved in the random walk of the path, the walk path is 1-2-3-1-4, and the candidate graph shown in fig. 4 can be obtained by connecting the projection points according to the walk path.
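Construction of one candidate graph can be sketched as follows: the projection points are clustered (K-means++ seeding), a cluster centre is mapped to its closest projection point as the central point, the K nearest neighbours of the central point are found, and a random walk of a preset length starting from the central point yields the nodes and edges of the candidate graph. The number of clusters, K, the walk length and the helper names are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def build_candidate_graph(proj_pts: np.ndarray, k_neighbors=8, walk_len=4, seed=0):
    """proj_pts: (M, 2) projection points; returns (node index path, edge list)."""
    rng = np.random.default_rng(seed)
    centers = KMeans(n_clusters=4, init="k-means++", n_init=10,
                     random_state=seed).fit(proj_pts).cluster_centers_
    # use the projection point closest to the first cluster centre as the central point
    center_idx = int(np.linalg.norm(proj_pts - centers[0], axis=1).argmin())
    nbrs = NearestNeighbors(n_neighbors=k_neighbors + 1).fit(proj_pts)
    _, idx = nbrs.kneighbors(proj_pts[center_idx:center_idx + 1])
    candidates = [int(i) for i in idx[0]]          # central point and its K neighbours
    # random walk of walk_len points starting from the central point
    path, current = [center_idx], center_idx
    for _ in range(walk_len - 1):
        current = int(rng.choice([i for i in candidates if i != current]))
        path.append(current)
    edges = list(zip(path[:-1], path[1:]))
    return path, edges

pts = np.random.rand(100, 2)
nodes, edges = build_candidate_graph(pts)
```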
In one embodiment, expected values and variances of distance differences of the candidate graphs in vector space and three-dimensional space can be determined, and the candidate graph with the minimum expected value and variance is determined as a target graph, so that the selected projection points can be ensured to be similar and stable.
In performing target detection, taking the determination of the type and location of the target as an example, in one embodiment, the attention mechanism may include a channel attention mechanism and a spatial attention mechanism.
The channel attention mechanism is used for paying attention to the channel information of the first feature map, namely paying attention to which features are meaningful, and the number of channels of the first feature map can be set according to actual needs.
The spatial attention mechanism is used to focus on the spatial information of the first feature map, i.e. where the features are of interest.
The channel attention mechanism and the spatial attention mechanism may be performed in parallel or in series.
In an embodiment, as shown in fig. 5, the target detection method provided by the embodiment of the present application may include steps S510-S560 shown below, in which the channel attention mechanism and the spatial attention mechanism are executed in parallel:
and S510, generating a target map according to the projection points of the point cloud data in the preset direction.
S520, performing a convolution operation on the nodes and the edges of the target graph, and extracting node features of the nodes and edge features of the edges to obtain a first feature map.
S530, weighting each channel of the first feature map according to a channel attention mechanism to obtain a channel feature map.
Specifically, the weights of the first feature map corresponding to different channels may be determined according to a channel attention mechanism, the corresponding channels are weighted according to the weights, and the weighting results of the channels are superimposed to obtain the channel feature map. The sum of the weights of the channels is 1.
The weight of each channel can be obtained through encoder training, and the embodiment of the application does not limit the specific training process.
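A minimal channel attention sketch is given below; the squeeze-and-excitation style encoder and the reduction ratio are assumptions, the text only requiring that per-channel weights are learned and sum to 1.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # small encoder producing one score per channel (assumed structure)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """x: (B, C, H, W) first feature map; returns the channel feature map."""
        squeezed = x.mean(dim=(2, 3))                       # global average pool per channel
        weights = torch.softmax(self.fc(squeezed), dim=1)   # channel weights sum to 1
        return x * weights[:, :, None, None]

cam = ChannelAttention(channels=8)
channel_map = cam(torch.randn(2, 8, 16, 16))
```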
And S540, weighting each pixel point in the first feature map according to a spatial attention mechanism to obtain a spatial feature map.
Specifically, the weights of the pixels in the first feature map may be determined according to a spatial attention mechanism, the pixels are weighted according to the weights, and the weighting results of the pixels are superimposed to obtain the spatial feature map. The sum of the weights of the pixel points is 1.
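A corresponding spatial attention sketch is given below; the 1 × 1 convolution used to score each pixel is an assumption, the per-pixel weights being normalised with a softmax over all spatial positions so that they sum to 1.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)   # one score per pixel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """x: (B, C, H, W) first feature map; returns the spatial feature map."""
        b, c, h, w = x.shape
        logits = self.score(x).view(b, -1)
        weights = torch.softmax(logits, dim=1).view(b, 1, h, w)  # pixel weights sum to 1
        return x * weights

sam = SpatialAttention(channels=8)
spatial_map = sam(torch.randn(2, 8, 16, 16))
```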
And S550, superimposing the channel feature map and the spatial feature map to obtain a second feature map.
It should be noted that, if the sizes of the channel feature map and the spatial feature map are different, the sizes of the two feature maps need to be unified before the channel feature map and the spatial feature map are superimposed, and the channel feature map and the spatial feature map are superimposed under the condition that the sizes of the channel feature map and the spatial feature map are the same, so that the validity of the superimposition is ensured.
It should be further noted that the execution order of S530 and S540 is not limited in the embodiment of the present application, that is, S530 may be executed first, and then S540 may be executed; or may execute S540 first and then execute S530; s530 and S540 may also be performed simultaneously.
And S560, carrying out target detection according to the second feature map.
Therefore, the embodiment of the application combines graph convolution with the attention mechanism: graph convolution retains the diversity of the point cloud data features to the greatest extent, while the channel attention mechanism and the spatial attention mechanism enhance the proportion of the foreground object, thereby improving the accuracy of target detection.
For example, when the channel attention mechanism and the spatial attention mechanism are implemented in series, in one embodiment, as shown in fig. 6, the target detection method provided by the embodiment of the present application may include steps S610-S650 as follows:
S610, generating a target graph according to the projection points of the point cloud data in the preset direction.
S620, performing a convolution operation on the nodes and the edges of the target graph, and extracting node features of the nodes and edge features of the edges to obtain a first feature map.
S630, weighting each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map.
And S640, weighting each pixel point in the channel feature map according to a spatial attention mechanism to obtain a second feature map.
In this embodiment, on the basis of obtaining the channel feature map based on the channel attention mechanism, each pixel point in the channel feature map is weighted by using the spatial attention mechanism, and the process is similar to the weighting of the first feature map by using the spatial attention mechanism, and is not described herein again for brevity.
Compared with a parallel mode, the serial mode can simplify operation and improve efficiency.
And S650, carrying out target detection according to the second feature map.
Therefore, the embodiment of the application combines graph convolution with the attention mechanism: graph convolution retains the diversity of the point cloud data features to the greatest extent, while the channel attention mechanism and the spatial attention mechanism enhance the proportion of the foreground object, and the accuracy of target detection is improved.
In view of the large data volume of the point cloud data, in order to improve the detection efficiency, in one embodiment, the point cloud data acquired by the laser radar may be subjected to cylinder division, and the target detection method is performed for each cylinder.
For example, the x-y plane may be divided into a plurality of grids, the size of each grid may be the same or different, and the size of each grid may be set according to actual needs, for example, may be set to H × W.
For each grid cell, the height of the cell in space can be determined according to the position coordinates of the point cloud data, and a cylinder is formed based on the cell and the height. The point cloud data can thus be divided into cylinders, each containing a plurality of points; fig. 7 exemplarily shows a schematic diagram of one such cylinder.
It should be understood that different cylinders may contain different amounts of point cloud data; for example, some cylinders may contain more than N points and some may contain fewer than N, where N can be set according to actual needs.
In one embodiment, for a cylinder with point cloud data exceeding N, N point cloud data can be randomly selected from the cylinder for subsequent target detection; for the cylinder with the point cloud data less than N, the cylinder can be filled with 0, so that P non-empty cylinders can be obtained, and each cylinder contains N point cloud data.
Taking the example that the two-dimensional image corresponding to the point cloud data includes D channels, the point cloud data can be expressed by tensor (D, P, N).
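The cylinder division and the (D, P, N) tensor can be sketched as follows; the grid resolution, N and the feature dimension D are illustrative, and non-empty cylinders are either randomly down-sampled to N points or zero-padded.

```python
import numpy as np

def build_cylinders(points: np.ndarray, grid=(16, 16), n_per_cyl=32):
    """points: (M, D) with columns starting (x, y, ...); returns a (D, P, N) tensor."""
    d = points.shape[1]
    x_edges = np.linspace(points[:, 0].min(), points[:, 0].max(), grid[0] + 1)
    y_edges = np.linspace(points[:, 1].min(), points[:, 1].max(), grid[1] + 1)
    ix = np.clip(np.digitize(points[:, 0], x_edges) - 1, 0, grid[0] - 1)
    iy = np.clip(np.digitize(points[:, 1], y_edges) - 1, 0, grid[1] - 1)
    cell_id = ix * grid[1] + iy
    cylinders = []
    for cell in np.unique(cell_id):               # only non-empty cylinders
        pts = points[cell_id == cell]
        if len(pts) >= n_per_cyl:                 # randomly select N points
            pts = pts[np.random.choice(len(pts), n_per_cyl, replace=False)]
        else:                                     # fill with 0 up to N points
            pts = np.vstack([pts, np.zeros((n_per_cyl - len(pts), d))])
        cylinders.append(pts.T)                   # (D, N) per cylinder
    return np.stack(cylinders, axis=1)            # (D, P, N)

tensor = build_cylinders(np.random.rand(1000, 7))
print(tensor.shape)   # (7, P, 32)
```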
For each cylinder, in one embodiment, the point cloud data within the cylinder may be projected in the preset direction (the z axis) to obtain projection points.
For the process of generating the target graph based on the projection points, obtaining the first feature map according to the target graph, and obtaining the second feature map based on the first feature map, reference may be made to the above embodiments; details are not repeated here for brevity.
In one embodiment, the obtained second feature map may be scattered back to the original cylinder position, so that a two-dimensional image corresponding to the whole point cloud data can be obtained, which may also be referred to as a pseudo-image.
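Scattering the per-cylinder features back to their original grid positions can be sketched as follows; the cell index bookkeeping and the channel count are assumptions made for illustration.

```python
import numpy as np

def scatter_to_pseudo_image(cyl_feats: np.ndarray, cell_indices, grid=(16, 16)):
    """cyl_feats: (C, P) one feature vector per cylinder; returns a (C, H, W) pseudo-image."""
    c = cyl_feats.shape[0]
    canvas = np.zeros((c, grid[0], grid[1]), dtype=cyl_feats.dtype)
    for p, (row, col) in enumerate(cell_indices):   # (row, col) of each cylinder
        canvas[:, row, col] = cyl_feats[:, p]
    return canvas

feats = np.random.rand(64, 5)                       # 5 non-empty cylinders
cells = [(0, 0), (3, 7), (8, 8), (10, 2), (15, 15)]
pseudo_image = scatter_to_pseudo_image(feats, cells)
```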
For a process of performing target detection based on the two-dimensional image, reference may be made to the above-mentioned embodiment, and details are not repeated here for brevity.
Therefore, the point cloud data is subjected to cylinder division, and similar operations are executed in parallel for each cylinder, so that the calculation amount can be reduced, and the detection time can be saved.
Based on the same inventive concept, the embodiment of the present application further provides an object detection apparatus, which may be integrated in an intelligent device capable of sensing the surrounding environment, for example, an autonomous vehicle or a mobile robot. The object detection device provided in the embodiment of the present application is described in detail below with reference to fig. 8.
Fig. 8 is a structural diagram of an object detection apparatus according to an embodiment of the present application.
As shown in fig. 8, the object detecting device may include:
the target graph generation module 81 is configured to generate a target graph according to the projection points of the point cloud data in the preset direction, where the target graph is a graph formed by the projection points and the connecting lines between the projection points;
the convolution module 82 is used for performing a convolution operation on the nodes and the edges of the target graph, and extracting the node features of the nodes and the edge features of the edges to obtain a first feature map;
the weighting module 83 is configured to weight the first feature map according to the attention mechanism to obtain a second feature map corresponding to the first feature map;
and a detection module 84, configured to perform target detection according to the second feature map.
The above-described object detection device is described in detail below, specifically as follows:
in one embodiment, the attention mechanism includes a channel attention mechanism and a spatial attention mechanism, and the first feature map includes a plurality of channels;
the weighting module 83 is specifically configured to:
weight each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map;
weight each pixel point in the first feature map according to the spatial attention mechanism to obtain a spatial feature map;
and fuse the channel feature map and the spatial feature map to obtain the second feature map.
In one embodiment, the attention mechanism includes a channel attention mechanism and a spatial attention mechanism, and the first feature map includes a plurality of channels;
the weighting module 83 is specifically configured to:
weight each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map;
and weight each pixel point in the channel feature map according to the spatial attention mechanism to obtain the second feature map.
In one embodiment, the detection module 84 is specifically configured to:
generating a two-dimensional image corresponding to the point cloud data according to the second feature map;
performing multi-convolution kernel feature extraction operation on the two-dimensional image to obtain image features of the two-dimensional image under each convolution kernel;
superposing the image characteristics corresponding to each convolution kernel to obtain superposed characteristics;
and detecting the target according to the superposition characteristics.
In an embodiment, the target graph generation module 81 is specifically configured to:
determining a central point and a nearest neighbor point of the central point from the projection points;
determining a candidate nearest neighbor point corresponding to the central point according to a path random walk method;
generating a candidate graph according to the central point, the candidate nearest neighbor point and a connecting line between the central point and the candidate nearest neighbor point;
and determining the target graph from the candidate graphs.
Therefore, after the target graph is obtained based on the point cloud data, graph convolution is applied to the target graph, which preserves the diversity of the point cloud data features to the greatest extent. On this basis, the feature map obtained by the graph convolution is weighted with an attention mechanism, which enhances the proportion of the region where the target is located. In other words, during target detection the diversity of the point cloud data features is increased while the proportion of the region where the target is located is enhanced, thereby improving the accuracy of the target detection result.
Each module in the apparatus shown in fig. 8 has a function of implementing each step in fig. 1, fig. 5, and fig. 6 and can achieve a corresponding technical effect, and for brevity, is not described again here.
Based on the same inventive concept, the embodiment of the application also provides an electronic device, which can be an intelligent device capable of sensing the surrounding environment. The electronic device provided by the embodiment of the present application is described in detail below with reference to fig. 9.
As shown in fig. 9, the electronic device may include a lidar 91, a processor 92, and a memory 93 for storing computer program instructions.
The laser radar 91 is used to collect point cloud data of the surrounding environment.
The processor 92 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing embodiments of the present application.
The memory 93 may include mass storage for data or instructions. By way of example and not limitation, the memory 93 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. In one example, the memory 93 may include removable or non-removable (or fixed) media, or the memory 93 may be non-volatile solid-state memory. In one example, the memory 93 may be a read-only memory (ROM). In one example, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 92 reads and executes the computer program instructions stored in the memory 93 to implement the method in the embodiment shown in fig. 1, 5, and 6, and achieve the corresponding technical effect achieved by the embodiment shown in fig. 1, 5, and 6 executing the method, which is not described herein again for brevity.
In one example, the electronic device may also include a communication interface 94 and a bus 95. As shown in fig. 9, the laser radar 91, the processor 92, the memory 93, and the communication interface 94 are connected by a bus 95 to complete mutual communication.
The communication interface 94 is mainly used for implementing communication between modules, apparatuses and/or devices in the embodiments of the present invention.
The bus 95 includes hardware, software, or both to couple the various components of the electronic device to one another. By way of example and not limitation, the bus 95 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or another suitable bus, or a combination of two or more of these. The bus 95 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may execute the target detection method in the embodiment of the present application based on the target map generated from the projection points of the point cloud data in the preset direction, thereby implementing the target detection method described in conjunction with fig. 1, 5, and 6 and the target detection apparatus described in fig. 8.
In addition, in combination with the target detection method in the foregoing embodiments, the embodiments of the present application may provide a computer storage medium for implementation. The computer storage medium has computer program instructions stored thereon; when executed by a processor, the computer program instructions implement any of the target detection methods in the above embodiments.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic Circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Aspects of embodiments of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (10)

1. A method of object detection, comprising:
generating a target graph according to the projection points of the point cloud data in the preset direction, wherein the target graph is a graph formed by connecting lines between the projection points and the projection points;
performing a convolution operation on the nodes and the edges of the target graph, and extracting the node features of the nodes and the edge features of the edges to obtain a first feature map;
weighting the first feature map according to an attention mechanism to obtain a second feature map corresponding to the first feature map;
and detecting the target according to the second feature map.
2. The method of claim 1, wherein the attention mechanism comprises a channel attention mechanism and a spatial attention mechanism, and the first feature map comprises a plurality of channels;
the weighting the first feature map according to the attention mechanism to obtain a second feature map corresponding to the first feature map comprises:
weighting each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map;
weighting each pixel point in the first feature map according to the spatial attention mechanism to obtain a spatial feature map;
and superimposing the channel feature map and the spatial feature map to obtain the second feature map.
3. The method of claim 1, wherein the attention mechanism comprises a channel attention mechanism and a spatial attention mechanism, and the first feature map comprises a plurality of channels;
the weighting the first feature map according to the attention mechanism to obtain a second feature map corresponding to the first feature map comprises:
weighting each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map;
and weighting each pixel point in the channel feature map according to the spatial attention mechanism to obtain the second feature map.
4. The method of claim 1, wherein the performing target detection according to the second feature map comprises:
generating a two-dimensional image corresponding to the point cloud data according to the second feature map;
performing feature extraction operation of multiple convolution kernels on the two-dimensional image to obtain image features of the two-dimensional image under each convolution kernel;
superposing the image characteristics corresponding to the convolution kernels to obtain superposed characteristics;
and detecting the target according to the superposition characteristics.
5. The method of claim 1, wherein the generating a target graph according to the projection points of the point cloud data in the preset direction comprises:
determining a central point and nearest neighbors of the central point from the projection points;
determining candidate nearest neighbor points corresponding to the central point according to a path random walk method;
generating a candidate graph according to the central point and the candidate nearest neighbor point and a connecting line between the central point and the candidate nearest neighbor point;
determining the target graph from the candidate graphs.
6. An object detection device, comprising:
the target graph generation module is used for generating a target graph according to the projection points of the point cloud data in the preset direction, wherein the target graph is a graph formed by the projection points and the connecting lines between the projection points;
the convolution module is used for performing a convolution operation on the nodes and the edges of the target graph, and extracting the node features of the nodes and the edge features of the edges to obtain a first feature map;
the weighting module is used for weighting the first feature map according to an attention mechanism to obtain a second feature map corresponding to the first feature map;
and the detection module is used for carrying out target detection according to the second characteristic diagram.
7. The apparatus of claim 6, wherein the attention mechanism comprises a channel attention mechanism and a spatial attention mechanism, and the first feature map comprises a plurality of channels;
the weighting module is specifically configured to:
weight each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map;
weight each pixel point in the first feature map according to the spatial attention mechanism to obtain a spatial feature map;
and fuse the channel feature map and the spatial feature map to obtain the second feature map.
8. The apparatus of claim 6, wherein the attention mechanism comprises a channel attention mechanism and a spatial attention mechanism, and the first feature map comprises a plurality of channels;
the weighting module is specifically configured to:
weight each channel of the first feature map according to the channel attention mechanism to obtain a channel feature map;
and weight each pixel point in the channel feature map according to the spatial attention mechanism to obtain the second feature map.
9. An electronic device, comprising:
the laser radar is used for acquiring point cloud data;
a processor;
a memory for storing computer program instructions;
the computer program instructions, when executed by the processor, implement the method of any of claims 1-5.
10. A computer-readable storage medium having computer program instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1-5.
CN202110896460.7A 2021-08-05 2021-08-05 Target detection method, device, equipment and storage medium Pending CN113808077A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110896460.7A CN113808077A (en) 2021-08-05 2021-08-05 Target detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110896460.7A CN113808077A (en) 2021-08-05 2021-08-05 Target detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113808077A true CN113808077A (en) 2021-12-17

Family

ID=78893419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110896460.7A Pending CN113808077A (en) 2021-08-05 2021-08-05 Target detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113808077A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205717A (en) * 2022-09-14 2022-10-18 广东汇天航空航天科技有限公司 Obstacle point cloud data processing method and flight equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination