WO2022206517A1 - Target detection method and apparatus


Info

Publication number
WO2022206517A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
target
category
size
image
Application number
PCT/CN2022/082553
Other languages
French (fr)
Chinese (zh)
Inventor
Zhou Wei (周伟)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2022206517A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/0002: Inspection of images, e.g. flaw detection
    • G06T7/10: Segmentation; Edge detection
    • G06T7/11: Region-based segmentation
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10032: Satellite or aerial image; Remote sensing
    • G06T2207/10044: Radar image

Definitions

  • the present application relates to the technical field of sensor processing, and in particular, to a target detection method and device.
  • Object detection is an indispensable key technology in autonomous driving.
  • target detection processing for targets such as vehicles and pedestrians can provide reference information for path planning, lane selection, human-vehicle tracking, and behavior prediction, which is of great significance in autonomous driving.
  • target detection can be performed by a combination of radar detection and image detection.
  • the radar points detected by the radar can be mapped to the image coordinate system, a predefined anchor frame can be generated for each mapped radar point, and then, combined with the generated anchor frames, a deep-learning-based convolutional neural network can be used for target detection.
  • however, a single object may reflect back multiple radar points. If an anchor frame is generated in the image for every radar point, the result is highly redundant: the amount of data to process is large and requires substantial processing time, resulting in low target detection efficiency.
  • the present application provides a target detection method, which is used to realize target detection by combining point cloud data and images, and improve the efficiency of target detection.
  • the present application provides a target detection method. The method includes: acquiring first point cloud data collected by a radar sensor and a first image collected by a corresponding camera sensor, where the first point cloud data includes multiple point clouds; mapping the first point cloud data to the image plane of the first image to obtain second point cloud data, where the second point cloud data includes multiple point clouds; dividing the second point cloud data into grids; determining, according to the feature data of the point clouds in the second point cloud data, multiple target point clouds after grid division of the second point cloud data, where any target point cloud corresponds to at least one point cloud in the second point cloud data; generating, in the first image, at least one anchor frame corresponding to each of the multiple target point clouds; and performing target detection according to the generated at least one anchor frame corresponding to each target point cloud, to determine the position of at least one target object to be detected.
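As a concrete illustration of the mapping step (not part of the patent text), projecting a radar point that is already expressed in the camera coordinate frame onto the image plane can be sketched with a pinhole camera model; the focal lengths and principal point below are made-up values:

```python
def project_to_image(point, fx, fy, cx, cy):
    """Project a 3-D point (x, y, z) in the camera frame to a pixel (u, v)
    using a pinhole model. fx, fy are focal lengths in pixels and (cx, cy)
    is the principal point; all numeric values here are illustrative."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

# A radar point 10 m ahead, 2 m to the right and 0.5 m down
u, v = project_to_image((2.0, 0.5, 10.0), 1000.0, 1000.0, 640.0, 360.0)
# u = 840.0, v = 410.0: right of and slightly below the image centre
```

In practice the radar-to-camera extrinsic transform and the lens distortion model obtained from calibration would be applied before this projection.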
  • in this method, the grid-divided target point clouds are determined according to the feature data of the point clouds, and anchor frames are then generated in the image according to the target point clouds. This greatly reduces the number of point clouds for which anchor frames must be generated, and thus the number of anchor frames generated, while still integrating the features of each point cloud to ensure the accuracy of the generated anchor frames. The method therefore avoids the network redundancy and time cost of generating an anchor frame for every acquired point cloud, which speeds up detection and improves target detection efficiency.
  • in addition, because this method determines the target point clouds by synthesizing the features of the point clouds within each grid, even if the point cloud data collected by the radar sensor contains radar noise, it can reduce the impact of local noise density and mitigate problems such as slow processing caused by noise.
  • the feature data is used to represent the radar echo intensity of the point cloud or the distribution feature of the radar echo intensity of the point cloud or the polarization feature of the point cloud.
  • in this way, the target point clouds corresponding to the point cloud data can be determined according to the feature data characterizing the radar echo intensity of the point clouds, so that the feature information of the point clouds in the point cloud data is fully utilized, improving the accuracy of object detection based on the point cloud data.
  • any anchor box corresponding to any target point cloud contains the target point cloud.
  • when the target point cloud is mapped to the image, it is mostly located in the image area where a target object to be detected is located. Therefore, by generating an anchor frame containing the target point cloud, the generated anchor frames can be used to accurately determine the area of the image where the target object corresponding to the target point cloud is located, thereby achieving target detection.
  • dividing the second point cloud data into grids includes: dividing the second point cloud data into multiple grids according to a set grid size; and dividing the point clouds contained in each grid into multiple point cloud sets according to the distance parameter of the point clouds in each grid, where the distance parameter represents the horizontal distance from a point cloud to the radar sensor.
  • on the basis of grid division, the point clouds are further divided with reference to their distance to the radar sensor, so that the coordinate position information of each dimension of the point clouds can be fully utilized in the detection process, thereby improving detection accuracy.
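A minimal sketch (not from the patent; the cell size and range-bin width are arbitrary assumptions) of bucketing the mapped points by image-grid cell and then splitting each cell by horizontal distance to the radar:

```python
from collections import defaultdict

def grid_and_range_partition(points, cell_px=32, range_bin_m=5.0):
    """Bucket projected points by image-grid cell, then sub-divide each
    cell by the point's horizontal distance to the radar.
    Each point is (u, v, range_m, rcs); cell and bin sizes are assumptions."""
    buckets = defaultdict(list)
    for u, v, rng, rcs in points:
        cell = (int(u // cell_px), int(v // cell_px))
        rbin = int(rng // range_bin_m)
        buckets[(cell, rbin)].append((u, v, rng, rcs))
    return buckets

pts = [(10, 10, 12.0, 3.0), (12, 11, 12.5, 3.1), (14, 12, 40.0, 9.0)]
sets = grid_and_range_partition(pts)
# the first two points share cell (0, 0) and range bin 2; the third falls
# in range bin 8, so it forms a separate point cloud set
```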
  • determining, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds after grid division of the second point cloud data includes: in each grid, determining the target point cloud of each point cloud set according to the feature data of the point clouds in that set; taking the target point clouds of the multiple point cloud sets contained in each grid as the target point clouds of that grid; and taking the target point clouds of the multiple grids contained in the second point cloud data as the target point clouds after grid division of the second point cloud data, thereby obtaining the multiple target point clouds.
  • the target point cloud of each grid is determined in combination with the feature data of the point clouds, which fully considers the distribution characteristics of the point clouds in the small area of space (the grid) to simplify the point cloud. While reducing the number of anchor boxes generated in the image and improving detection efficiency, this also ensures the accuracy of the generated anchor boxes and improves detection accuracy.
  • the feature data includes the radar cross-section of the point cloud. In each grid, determining the target point cloud of each point cloud set according to the feature data of the point clouds in that set includes: taking the centroid of at least one point cloud in each point cloud set as the target point cloud of that set, where, among the at least one point cloud, the difference between the radar cross-sections of any two point clouds is less than a set threshold.
  • that is, the centroid of the point clouds in a set whose radar cross-section differences are less than the set threshold is determined and used as the target point cloud of that set. This reduces the number of point clouds while allowing the target point cloud to aggregate the feature information of the point clouds in the set.
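The centroid selection could be sketched as follows. This is an illustrative simplification, not the patent's implementation: it keeps the points whose radar cross-section (RCS) lies within a threshold of the strongest return, rather than checking the claim's pairwise condition, and the threshold value is made up:

```python
def target_point(cloud_set, rcs_threshold=1.0):
    """Pick the points in a set whose RCS is within rcs_threshold of the
    set's strongest return, and use their centroid as the target point
    cloud. Each point is (u, v, rcs); threshold value is an assumption."""
    max_rcs = max(rcs for _, _, rcs in cloud_set)
    kept = [(u, v) for u, v, rcs in cloud_set if max_rcs - rcs < rcs_threshold]
    cu = sum(u for u, _ in kept) / len(kept)
    cv = sum(v for _, v in kept) / len(kept)
    return (cu, cv)

# Three returns; the weak 1.0-RCS point is excluded from the centroid
print(target_point([(10, 10, 5.0), (14, 12, 4.5), (30, 30, 1.0)]))
# → (12.0, 11.0)
```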
  • generating at least one anchor frame corresponding to each of the multiple target point clouds includes: acquiring at least one category identifier, where different category identifiers are used to represent the categories of different objects; and determining, according to the at least one category identifier, at least one anchor box of each target point cloud corresponding to the at least one category identifier.
  • the actual sizes of different objects differ. By generating, for the target point cloud, anchor boxes corresponding to the categories of certain objects, the diversity of the generated anchor boxes is improved, so that a relatively suitable anchor box can be selected from among anchor boxes of multiple categories, further improving the accuracy of target detection.
  • the at least one category identifier includes a set category identifier and/or a category identifier determined after performing target detection on a reference image, where the reference image is a frame on which target detection was performed before the first image.
  • that is, anchor frames can be generated for the target point clouds in the current frame with reference to the categories of the objects detected in the previous frame. Owing to the similarity between consecutive frames, this improves the accuracy of the anchor frames generated in the current frame and thereby improves detection accuracy.
  • the confidence level of the category identification determined after the target detection is performed on the reference image is greater than a set threshold.
  • the higher the confidence of a determined category identifier, the higher its accuracy. Using the category identifiers from the previous frame whose confidence exceeds a set threshold as a priori reference information further improves the accuracy of the anchor boxes generated in the image, and thus further improves detection accuracy.
  • determining, according to the at least one category identifier, at least one anchor box of each target point cloud corresponding to the at least one category identifier includes: determining at least one anchor box size, where the at least one anchor box size includes at least one anchor box size corresponding to each category identifier in the at least one category identifier; and determining at least one anchor box of each target point cloud that conforms to the at least one anchor box size.
  • the size of the anchor frame generated for the target point cloud in the image can be easily and quickly determined, thereby improving the detection speed.
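A sketch of generating anchor boxes per category identifier from fixed, pre-set anchor sizes; the categories and pixel sizes below are invented for illustration and are not from the patent:

```python
# Illustrative per-category anchor sizes in pixels (width, height); real
# sizes and category identifiers would come from configuration or from
# the previous frame's detections, not from this sketch.
ANCHOR_SIZES = {
    "pedestrian": [(20, 50)],
    "vehicle": [(60, 40), (120, 80)],
}

def anchors_for(point, category_ids):
    """Centre one anchor of each configured size on the target point.
    Returns (left, top, right, bottom) boxes in pixels."""
    u, v = point
    boxes = []
    for cat in category_ids:
        for w, h in ANCHOR_SIZES[cat]:
            boxes.append((u - w / 2, v - h / 2, u + w / 2, v + h / 2))
    return boxes

boxes = anchors_for((100, 100), ["pedestrian", "vehicle"])
# 1 pedestrian size + 2 vehicle sizes → 3 anchor boxes for this point
```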
  • determining, according to the at least one category identifier, at least one anchor box of each target point cloud corresponding to the at least one category identifier includes: determining at least one object size, where the at least one object size includes at least one object size corresponding to each category identifier, and the object size corresponding to any category identifier represents the size of objects belonging to that category; determining at least one mapping size according to the at least one object size, where the object sizes correspond one-to-one to the mapping sizes, and the mapping size corresponding to any object size is the size of an object belonging to the target category identifier after it is mapped to the first image, the target category identifier being the category identifier corresponding to that object size; and determining at least one anchor box of each target point cloud that conforms to the at least one mapping size.
  • in this way, the size of the anchor box corresponding to a category identifier is determined from the size of objects of that category, which improves the accuracy of the generated anchor box sizes and thereby improves the accuracy of target detection.
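The mapping-size idea can be illustrated with a pinhole model: a category's physical object size is scaled by focal length over range to get its projected size in pixels. The focal lengths and the vehicle dimensions below are assumptions, not values from the patent:

```python
def mapped_size(object_wh_m, range_m, fx=1000.0, fy=1000.0):
    """Map a physical object size (width, height in metres) to its
    projected size in pixels at the given range, using a pinhole camera
    model with illustrative focal lengths."""
    w_m, h_m = object_wh_m
    return (fx * w_m / range_m, fy * h_m / range_m)

# A 1.8 m wide, 1.5 m tall vehicle at 30 m projects to a 60 x 50 px anchor
print(mapped_size((1.8, 1.5), 30.0))
# → (60.0, 50.0)
```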
  • in any anchor frame of any target point cloud, the target point cloud may be located at any position in the area enclosed by the anchor frame, for example at the center of the anchor frame, or at any position along any side of the anchor frame, for example at the midpoint of any side of the anchor frame.
  • the target point cloud may map to any position within the area where the target object is located in the image. Therefore, for the same anchor frame size, generating anchor boxes in different directions around the target point cloud provides more, and more accurate, choices, ensuring that wherever the target point cloud falls within the target object's area, an anchor box that encloses as much of the target object as possible can be selected.
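One way to realize the placements described above (a hypothetical sketch, not the patent's implementation): generate five boxes of the same size, with the target point at the box centre or at the midpoint of each of the four sides:

```python
def offset_anchors(point, w, h):
    """Generate one anchor per placement of the target point: at the box
    centre and at the midpoint of each of the four sides, all with the
    same width w and height h. Returns (left, top, right, bottom) boxes."""
    u, v = point
    placements = [(0.5, 0.5),              # point at the box centre
                  (0.5, 0.0), (0.5, 1.0),  # point at top / bottom mid-edge
                  (0.0, 0.5), (1.0, 0.5)]  # point at left / right mid-edge
    return [(u - rx * w, v - ry * h, u + (1 - rx) * w, v + (1 - ry) * h)
            for rx, ry in placements]

boxes = offset_anchors((100, 100), 60, 40)
# 5 equally sized boxes, each containing the point (100, 100)
```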
  • performing target detection according to the generated at least one anchor frame corresponding to each target point cloud and determining the position of at least one target object to be detected includes: identifying the target category of each target object contained in the first image and determining the confidence of the target category; determining, among the at least one anchor frame of each target point cloud, the target anchor frame where each target object is located; and outputting a detection result, the detection result including: the target category of each target object, the category identifier of the target category of each target object, the confidence of the target category of each target object, and the target anchor frame where each target object is located.
  • the target point clouds are point clouds simplified from the original point cloud data. Therefore, when target detection is performed based on the small number of anchor boxes generated from the target point clouds, the anchor box marking the position of a target object is selected from only a few candidates, so detection is fast. At the same time, the target point clouds incorporate the spatial distribution characteristics of the point clouds in the original point cloud data, which ensures detection accuracy.
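The detection result described above can be modelled as a simple record; the field names below are invented for illustration and the patent only specifies the four pieces of information carried:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Detection:
    """One entry of the detection result: the target category, its
    identifier, the confidence of the category, and the target anchor
    frame where the object is located."""
    category: str
    category_id: int
    confidence: float
    box: Tuple[float, float, float, float]  # (left, top, right, bottom) px

result = [Detection("vehicle", 1, 0.92, (70.0, 80.0, 130.0, 120.0))]
```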
  • the present application provides a target detection device, which includes a data acquisition unit and a processing unit. The data acquisition unit is used to acquire first point cloud data collected by a radar sensor and a first image collected by a corresponding camera sensor, where the first point cloud data includes multiple point clouds. The processing unit is configured to: map the first point cloud data to the image plane of the first image to obtain second point cloud data, where the second point cloud data includes multiple point clouds; divide the second point cloud data into grids; determine, according to the feature data of the point clouds in the second point cloud data, multiple target point clouds after grid division of the second point cloud data, where any target point cloud corresponds to at least one point cloud in the second point cloud data; generate, in the first image, at least one anchor frame corresponding to each of the multiple target point clouds; and perform target detection according to the generated at least one anchor frame corresponding to each target point cloud, to determine the position of at least one target object to be detected.
  • the feature data is used to represent the radar echo intensity of the point cloud or the distribution feature of the radar echo intensity of the point cloud or the polarization feature of the point cloud.
  • any anchor box corresponding to any target point cloud contains the target point cloud.
  • when the processing unit divides the second point cloud data into grids, it is specifically configured to: divide the second point cloud data into multiple grids according to a set grid size; and divide the point clouds contained in each grid into multiple point cloud sets according to the distance parameter of the point clouds in each grid, where the distance parameter indicates the horizontal distance from a point cloud to the radar sensor.
  • when the processing unit determines, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds after grid division of the second point cloud data, it is specifically configured to: in each grid, determine the target point cloud of each point cloud set according to the feature data of the point clouds in that set; take the target point clouds of the multiple point cloud sets contained in each grid as the target point clouds of that grid; and take the target point clouds of the multiple grids contained in the second point cloud data as the target point clouds after grid division of the second point cloud data, to obtain the multiple target point clouds.
  • when generating, in the first image, at least one anchor frame corresponding to each of the multiple target point clouds, the processing unit is specifically configured to: acquire at least one category identifier, where different category identifiers are used to represent the categories of different objects; and determine, according to the at least one category identifier, at least one anchor box of each target point cloud corresponding to the at least one category identifier.
  • the at least one category identification includes a set category identification and/or a category identification determined after performing target detection on a reference image, wherein the reference image is before the first image A frame of image for object detection.
  • the confidence level of the category identification determined after the target detection is performed on the reference image is greater than a set threshold.
  • when the processing unit determines, according to the at least one category identifier, at least one anchor box of each target point cloud corresponding to the at least one category identifier, it is specifically configured to: determine at least one anchor frame size, where the at least one anchor frame size includes at least one anchor frame size corresponding to each category identifier in the at least one category identifier; and determine at least one anchor box of each target point cloud that conforms to the at least one anchor frame size.
  • when the processing unit determines, according to the at least one category identifier, at least one anchor box of each target point cloud corresponding to the at least one category identifier, it is specifically configured to: determine at least one object size, where the at least one object size includes at least one object size corresponding to each category identifier, and the object size corresponding to any category identifier indicates the size of objects belonging to that category; determine at least one mapping size according to the at least one object size, where the object sizes correspond one-to-one to the mapping sizes, and the mapping size corresponding to any object size is the size of an object belonging to the target category identifier after it is mapped to the first image, the target category identifier being the category identifier corresponding to that object size; and determine at least one anchor frame of each target point cloud that conforms to the at least one mapping size.
  • in any anchor frame of any target point cloud, the target point cloud may be located at any position in the area enclosed by the anchor frame, for example at the center of the anchor frame, or at any position along any side of the anchor frame, for example at the midpoint of any side of the anchor frame.
  • when the processing unit performs target detection according to the generated at least one anchor frame corresponding to each target point cloud and determines the position of at least one target object to be detected, it is specifically configured to: identify the target category of each target object contained in the first image and determine the confidence of the target category; determine, among the at least one anchor frame of each target point cloud, the target anchor frame where each target object is located; and output a detection result, the detection result including: the target category of each target object, the category identifier of the target category of each target object, the confidence of the target category of each target object, and the target anchor frame where each target object is located.
  • the present application provides a target detection device, which includes a memory and a processor. The memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory, so as to implement the method described in the first aspect or any possible design of the first aspect.
  • the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium. When the computer program is executed on a target detection device, the target detection device is caused to perform the method described in the first aspect or any possible design of the first aspect.
  • the present application provides a computer program product, which includes a computer program or instructions. When the computer program or instructions are run on a target detection device, the target detection device is caused to perform the method described in the first aspect or any possible design of the first aspect.
  • the present application provides a sensor or fusion device, where the sensor can be a detection sensor such as a radar sensor or a camera sensor.
  • the sensor or fusion device includes the target detection device described in the second aspect or the third aspect.
  • the present application provides a terminal, where the terminal includes the target detection device described in the second aspect or the third aspect, or includes the sensor or fusion device described in the sixth aspect.
  • the terminal is any one of the following: intelligent transportation equipment, smart home equipment, intelligent manufacturing equipment, and robots.
  • the intelligent transportation device is any one of the following: a vehicle, an unmanned aerial vehicle, an automatic guided transportation vehicle, and an unmanned transportation vehicle.
  • the present application provides a system, which includes a radar sensor, a corresponding camera sensor, and the target detection device described in the second or third aspect above.
  • FIG. 1 is a schematic diagram of the architecture of a possible application system of the target detection method provided by the embodiment of the present application;
  • FIG. 2 is a schematic diagram of a target detection method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a target detection apparatus provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a target detection apparatus provided by an embodiment of the present application.
  • Target detection refers to locating multiple target objects according to the collected image, including determining the category and position of each target object; the position is generally marked in the image with an anchor box (bounding box).
  • Object classification refers to judging the category of the target object determined in the image.
  • Point cloud data: the set of point data on the surface of an object scanned by a 3D scanning device can be called point cloud data.
  • Point cloud data is a collection of vectors in a three-dimensional coordinate system. These vectors are usually expressed in three-dimensional coordinates, and are generally used primarily to represent the shape of an object's outer surface.
  • the point cloud can also represent the RGB (red, green, blue) color, gray value, depth, and reflective-surface intensity of a point on the object.
  • the point cloud coordinate system involved in the embodiments of the present application is the three-dimensional coordinate system where the point cloud in the point cloud data is located.
  • Radar cross-section (RCS) refers to the effective reflection cross-sectional area of a target as seen by the radar.
  • the principle of radar detection is to transmit electromagnetic waves toward the surface of an object, receive the electromagnetic wave signal reflected back by the object, and detect the object according to the received signal. After the electromagnetic waves emitted by the radar strike the surface of the object, the fewer reflected electromagnetic waves are received, the smaller the radar cross-section, the weaker the radar's ability to recognize features of the object, and the shorter the radar's detection range.
  • Image (plane) coordinate system: the image coordinate system, also called the pixel coordinate system, is usually a two-dimensional coordinate system established with the feature point at the upper-left corner of the image as the origin, in units of pixels.
  • "At least one" refers to one or more, and "a plurality" refers to two or more.
  • "And/or" describes an association relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone, where A and B may be singular or plural.
  • the character “/” generally indicates that the associated objects are an “or” relationship.
  • "At least one (item) of the following" or a similar expression refers to any combination of the listed items, including a single item or any combination of multiple items.
  • For example, "at least one of a, b, or c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be singular or plural.
  • target detection algorithms based on convolutional neural networks have been widely adopted because their detection results are more accurate and efficient. Therefore, the current mainstream (two-dimensional) target detection algorithms are mostly based on convolutional neural networks. These methods can be grouped into two categories: one-stage algorithms and two-stage algorithms. The one-stage algorithm treats object detection as a regression problem and directly learns the classification probabilities and anchor boxes of target objects from the input image.
  • the two-stage algorithm performs target detection in two stages: a region proposal network (RPN) first generates regions of interest in the first stage, and these region proposals are then used in the second stage for target object classification and anchor box regression.
  • the two methods have their own advantages in comparison.
  • the one-stage algorithm is faster than the two-stage algorithm, but the accuracy of the two-stage algorithm is better.
  • the radar detection points (point cloud) detected by the radar can be mapped to the image coordinate system of the image captured by the camera, and a predefined anchor box can be generated for each mapped radar detection point. These anchor boxes are used as object proposals, and the locations of objects in the image are then detected based on them.
  • this method involves a large amount of data processing and a long processing time, so its target detection efficiency is very low.
  • the embodiments of the present application provide a target detection method for performing fast and accurate target detection, thereby improving the efficiency of target detection.
  • the target detection method provided by the embodiments of the present application may combine point cloud data collected by a radar sensor and images collected by a corresponding camera sensor to perform target detection.
  • the method can be applied to a target detection device with data processing capability.
  • the target detection device may be a vehicle with a data processing function, an in-vehicle device with a data processing function in the vehicle, or a device set in a sensor that collects and processes point cloud data and image data.
  • the in-vehicle equipment may include, but is not limited to, in-vehicle terminals, in-vehicle controllers, in-vehicle modules, in-vehicle components, in-vehicle chips, electronic control units (ECUs), domain controllers (DCs), and other devices.
  • the target detection device may also be another electronic device with a data processing function, including but not limited to smart home devices (such as TVs), smart robots, mobile terminals (such as mobile phones and tablet computers), wearable devices (such as smart watches), and other smart devices.
  • the target detection device may also be a controller, a chip, or other devices in the smart device.
  • FIG. 1 is a schematic structural diagram of a possible application system of the target detection method provided by the embodiment of the present application.
  • the system at least includes a point cloud acquisition module 101 , a point cloud data processing module 102 , an image acquisition module 103 and a target detection module 104 .
  • the point cloud acquisition module 101 and the image acquisition module 103 are respectively used to acquire point cloud data and corresponding images.
  • the point cloud acquisition module 101 and the image acquisition module 103 are respectively configured to acquire the first point cloud data and the first image corresponding to the same scene at the same moment.
  • the point cloud acquisition module 101 and the image acquisition module 103 may be set at the same position in the autonomous driving vehicle; the point cloud acquisition module 101 collects point cloud data of the surrounding environment of the vehicle and sends it to the point cloud data processing module 102, while the image acquisition module 103 collects images of the surrounding environment and sends them to the target detection module 104.
  • the point cloud collection module 101 may be a radar sensor, such as a millimeter wave radar, or any other device capable of collecting point cloud data, which is not specifically limited in this embodiment of the present application.
  • the image acquisition module 103 may be a camera sensor, such as a camera, a video camera, a monitor, etc., or any other device capable of capturing images, which is not specifically limited in this embodiment of the present application.
  • the point cloud data processing module 102 is configured to process the first point cloud data from the point cloud acquisition module 101 and the first image from the image acquisition module 103, including: mapping the first point cloud data to the image plane of the first image to obtain second point cloud data, wherein the second point cloud data includes multiple point clouds; performing grid division on the second point cloud data; and determining, according to the feature data of the point clouds in the second point cloud data, a plurality of target point clouds after grid division of the second point cloud data, wherein any target point cloud corresponds to at least one point cloud in the second point cloud data.
  • the point cloud data processing module 102 can map the point cloud data collected by the radar to the image plane of the image collected by the camera according to the camera calibration parameters.
  • the image plane is divided into grids, and the point clouds within the same grid are further divided according to the distance from the mapped point cloud to the radar, so as to realize a grid division of the three-dimensional space composed of the image plane where the mapped point cloud is located and the bird's eye view (BEV) plane of the image plane.
  • the point cloud data processing module 102 divides the mapped point cloud into grids and, combining the RCS features of the point clouds, calculates the statistical information of point clouds with similar feature information, further generating target point clouds with different point cloud feature information as points of interest (POI) for generating anchor boxes.
  • the point cloud data processing module 102 may acquire the first image from the image acquisition module 103 or the target detection module 104. After determining multiple target point clouds, the point cloud data processing module 102 notifies the target detection module 104 of the multiple target point clouds.
  • the point cloud data processing module 102 may include a mapping module 105 and a grid division module 106, wherein the mapping module 105 is configured to map the first point cloud data to the image plane of the first image to obtain the second point cloud data.
  • the grid division module 106 is configured to perform grid division on the second point cloud data, and to determine, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds after grid division of the second point cloud data.
  • the mapping module 105 may first perform point cloud mapping, and the grid division module 106 may then perform grid division; that is, the mapping module 105 first maps the first point cloud data to the image plane of the first image to obtain the second point cloud data, and then the grid division module 106 performs grid division on the second point cloud data.
  • alternatively, the grid division module 106 may first perform grid division, and the mapping module 105 may then perform point cloud mapping; that is, the grid division module 106 first performs grid division processing on the acquired first point cloud data and determines a plurality of target point clouds after grid division, and then the mapping module 105 maps the multiple target point clouds determined by the grid division module 106 to the image as interest points for generating anchor boxes.
  • the target detection module 104 is configured to generate, in the first image, at least one anchor frame corresponding to each target point cloud in the plurality of target point clouds; each anchor box is used for target detection to determine the position of at least one target object to be detected.
  • the target detection module 104 may include a feature extraction module 107 and a detection module 108 .
  • the feature extraction module 107 is configured to extract image features from the first image, so that the detection module 108 can perform target detection according to the extracted image features and the generated anchor frame.
  • the feature extraction module 107 may use a visual geometry group (VGG) network model or other deep learning network models to extract image features, for example, a VGG16 model may be used.
  • the above-mentioned visual geometry group network model is only a specific example of a network model usable by the feature extraction module 107; the network models usable by the feature extraction module 107 in the embodiments of the present application are not limited thereto. For example, a residual neural network (ResNet) model may also be used.
  • the detection module 108 is configured to generate at least one anchor frame corresponding to each target point cloud in the plurality of target point clouds, perform operations on the image features extracted by the feature extraction module 107 according to the area marked by each anchor frame, complete the classification of the targets in the image and the regression analysis of the corresponding anchor boxes, and finally realize the target detection function.
  • a deep learning network model can be used to complete the classification of the target in the image and the regression analysis of the corresponding anchor frame.
  • the output target detection result includes a category identifier of at least one target object detected in the first image, a confidence level corresponding to each category identifier, and an anchor frame corresponding to each target object.
  • the function realized by the target detection module 104 can be realized by one deep learning network model, or can be realized by the cooperation of multiple deep learning network models.
  • when multiple deep learning network models cooperate, the models respectively implement different functions in the target detection module 104.
  • the functions implemented by the feature extraction module 107 and the detection module 108 can be implemented by different deep learning network models.
  • the system may further include a category identifier management module 109, which is configured to provide at least one category identifier for the target detection module 104, so that the target detection module 104 generates anchor frames according to the target point clouds and the at least one category identifier, and then performs target detection.
  • the at least one category identifier provided by the category identifier management module 109 to the target detection module 104 includes a set category identifier and/or a category identifier determined after target detection is performed on a reference image, wherein the reference image is a frame of image on which target detection was performed before the first image.
  • the category identifier management module 109 stores at least one set category identifier, and can acquire the target detection result of a frame of image on which target detection was performed before the first image, the target detection result including at least the categories of the target objects detected in that previous frame of image and their confidence levels. For each category identifier detected in the previous frame of image, if the category identifier management module 109 determines that the confidence level corresponding to the category identifier is greater than a set threshold, the category identifier is used as a priori reference information and is input to the detection module 108 together with the set category identifiers, so that the detection module 108 generates anchor boxes of specific sizes corresponding to these category identifiers.
  • for example, the category identifier management module 109 inputs to the target detection module 104, as category identifier reference information, the category identifiers detected in the previous frame of image whose confidence levels are greater than the set threshold, together with the set category identifiers of cars and people.
  • the functions of the point cloud data processing module 102 and the target detection module 104 can be realized by one network model. The input of the network model can be the first point cloud data and the corresponding first image, and the output is the anchor frame corresponding to each target object detected in the first image, the category identifier of each target object, and the confidence level corresponding to each category identifier. Alternatively, the input of the network model may be the first point cloud data, the corresponding first image, and the target detection result of a frame of image before the first image, and the output is the anchor frame corresponding to each target object detected in the first image, the category identifier of each target object, and the confidence level corresponding to each category identifier.
  • the structure of the system shown in FIG. 1 does not constitute a specific limitation on the system to which the target detection method provided by the embodiment of the present application is applied.
  • the system to which the target detection method may be applied may include more or fewer modules than those shown in FIG. 1, or combine some modules, or split some modules, or arrange the modules differently.
  • the devices, modules, functions, etc. included in the system architecture shown in FIG. 1 may all be integrated into one device for implementation, or may be distributed in different devices for implementation.
  • the system shown in Figure 1 can all be included in an autonomous vehicle.
  • the point cloud acquisition module 101 and the image acquisition module 103 shown in FIG. 1 may each be independent devices, and the functions of the modules other than the point cloud acquisition module 101 and the image acquisition module 103 may be integrated into one processing device, or implemented in a server or cloud.
  • FIG. 1 is only an example, and the system applied by the embodiments of the present application is not limited thereto.
  • FIG. 2 is a schematic diagram of a target detection method provided by an embodiment of the present application.
  • the target detection device may be, but is not limited to, the device with data processing capability provided in the embodiment of the present application, for example, may be the above-mentioned vehicle or vehicle-mounted device, or a server, a cloud server, or the like.
  • the target detection method provided by this application includes:
  • the target detection apparatus acquires first point cloud data collected by a radar sensor and a first image collected by a corresponding camera sensor, where the first point cloud data includes multiple point clouds.
  • the target detection apparatus may respectively receive the first point cloud data sent by the radar sensor and the first image sent by the camera sensor, and perform target detection based on the first point cloud data and the first image.
  • the method for the target detection apparatus to acquire the first point cloud data and the first image may also be to receive the first point cloud data and the first image input by the user, or the target detection apparatus may directly collect the first point cloud data and the first image.
  • the first point cloud data and the first image are data corresponding to the same scene acquired by the radar sensor and the camera sensor at the same time.
  • the radar sensor described in step S201 may serve as the point cloud collection module 101 shown in FIG. 1, and the camera sensor may serve as the image acquisition module 103 shown in FIG. 1.
  • the target detection apparatus maps the first point cloud data to the image plane of the first image to obtain second point cloud data, where the second point cloud data includes multiple point clouds.
  • after acquiring the first point cloud data and the corresponding first image, the target detection device projects the first point cloud data onto the first image to obtain the second point cloud data.
  • the target detection device may convert the coordinates of the point clouds in the first point cloud data according to the conversion relationship between the coordinate system in which the first point cloud data is located and the image plane coordinate system of the first image, to obtain the position coordinates in the image plane after the point clouds in the first point cloud data are mapped to the image plane of the first image.
  • the multiple point clouds (which may also be referred to as projection points) included in the second point cloud data correspond to the multiple point clouds included in the first point cloud data one-to-one.
  • Any point cloud in the second point cloud data is a feature point after the point cloud in the corresponding first point cloud data is mapped to the image plane of the first image.
  • the target detection device can map the first point cloud data to the image plane of the first image according to the calibration parameters of the millimeter-wave radar and the camera to obtain the second point cloud data .
  • the mapping relationship between a point cloud in the first point cloud data and the corresponding point cloud in the second point cloud data conforms to the following formula:
  • Z_c · p = K [R | t] P
  • where P = [X, Y, Z, 1]^T represents the homogeneous coordinates of the point cloud in the first point cloud data in the point cloud coordinate system, p = [x, y, 1]^T represents the homogeneous coordinates of the corresponding point cloud in the second point cloud data in the image plane coordinate system of the first image, K is the camera intrinsic matrix, [R | t] is the extrinsic matrix given by the calibration parameters, and Z_c is the depth of the point in the camera coordinate system.
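The mapping described above can be sketched as follows (an illustrative Python sketch assuming a pinhole camera model with intrinsic matrix K and extrinsics [R | t] from calibration; the matrix values below are hypothetical, not from the embodiment):

```python
import numpy as np

def project_point(P_radar, K, R, t):
    """Map a homogeneous radar point P = [X, Y, Z, 1] to image-plane
    coordinates p = [x, y, 1] using the calibration parameters."""
    P = np.asarray(P_radar, dtype=float)                      # [X, Y, Z, 1]
    Rt = np.hstack([R, np.asarray(t, float).reshape(3, 1)])   # 3x4 extrinsic matrix [R | t]
    p_cam = Rt @ P                                            # point in camera coordinates
    p_img = K @ p_cam                                         # apply the intrinsic matrix
    return p_img / p_img[2]                                   # normalize so p = [x, y, 1]

# Illustrative intrinsics and identity extrinsics.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
p = project_point([1.0, 0.5, 4.0, 1.0], K, R, t)  # -> [520., 340., 1.]
```

In practice K, R, and t come from joint radar-camera calibration, and the projection is applied to every point in the first point cloud data.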
  • the parameters of a point cloud in the second point cloud data include at least the coordinates of the point cloud in the image plane of the first image, the distance parameter of the point cloud, and the RCS of the point cloud, wherein the distance parameter is used to represent the horizontal distance from the point cloud to the radar sensor.
  • for example, the position information of a point cloud A can be expressed as (x1, y1, d1), where x1 and y1 are the coordinates of the point cloud A in the image plane coordinate system of the first image, and d1 is the distance parameter of the point cloud A, indicating, under the BEV perspective, the distance from the point cloud A to the vertical plane where the camera is located.
  • the target detection apparatus may divide the second point cloud data into grids according to the position information of each point cloud in the second point cloud data.
  • when the method is applied to the system shown in FIG. 1, the method described in step S202 may be executed by the mapping module 105 shown in FIG. 1.
  • S203 The target detection apparatus performs grid division on the second point cloud data.
  • the second point cloud data is divided into a plurality of grids, and then, according to the distance parameters of the point clouds in each grid, the point clouds included in each grid are divided into a plurality of point cloud sets, completing the grid division of the second point cloud data, wherein the distance parameter is used to represent the horizontal distance from a point cloud to the radar sensor.
  • when the target detection apparatus performs grid division on the second point cloud data, first, according to a set grid size, the image plane is divided into multiple grids, wherein the image size is M × N pixels, the size of each grid obtained after division is m × n pixels, and M, N, m, and n are all positive numbers. The coordinates, in the image plane coordinate system, of the point clouds contained in each grid are then determined.
  • the target detection device then, in the BEV viewing plane relative to the vertical plane where the camera is located, performs vertical segmentation according to the distance parameters of the point clouds, from the point cloud closest to the camera to the point cloud farthest from the camera, at a set distance interval (such as a unit distance), so that the point clouds within each set distance interval corresponding to each grid form a point cloud set.
  • this finally divides the point clouds in the second point cloud data into multiple grids of size m × n × L, where m × n is the set grid size and L is the set distance size.
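The m × n × L grid division can be sketched as follows (a minimal Python sketch; the grid size 32 × 32 pixels, distance interval 5.0, and the sample points are hypothetical values chosen for illustration):

```python
def grid_cell(x, y, d, m=32, n=32, L=5.0):
    """Assign a mapped point to an m x n x L grid cell: (x, y) are
    image-plane pixel coordinates, d is the BEV distance parameter."""
    return (int(x // m), int(y // n), int(d // L))

def divide_into_grids(points, m=32, n=32, L=5.0):
    """Group points (x, y, d, rcs) into one point cloud set per grid cell."""
    cells = {}
    for x, y, d, rcs in points:
        cells.setdefault(grid_cell(x, y, d, m, n, L), []).append((x, y, d, rcs))
    return cells

points = [(100, 50, 12.0, 3.1),   # these two fall in the same cell
          (105, 55, 13.0, 3.0),
          (400, 300, 40.0, 9.5)]  # a different, farther cell
cells = divide_into_grids(points)  # -> 2 cells
```

Each dictionary entry is one point cloud set: the points share an image-plane grid and lie within the same distance interval.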
  • when the method is applied to the system shown in FIG. 1, the method described in step S203 may be executed by the grid division module 106 shown in FIG. 1.
  • the target detection device determines, according to the feature data of the point cloud in the second point cloud data, a plurality of target point clouds after grid division of the second point cloud data, wherein any target point cloud corresponds to the at least one point cloud in the second point cloud data.
  • after the target detection device divides the second point cloud data into grids, for each grid, the target point cloud of each point cloud set is determined according to the feature data of the point clouds in the point cloud sets contained in the grid, and the target point clouds of the multiple point cloud sets contained in the grid are used as the target point clouds of the grid.
  • the feature data of the point cloud is used to represent the radar echo intensity corresponding to the point cloud, and the feature data can be RCS specifically;
  • the target point cloud of each point cloud set is used to represent the spatial distribution characteristics of the point clouds in the point cloud set; each grid corresponds to at least one target point cloud.
  • the target detection device may calculate the centroid point of the point cloud included in each point cloud set, and use the calculated centroid point as the target point cloud of the point cloud set.
  • the difference between the RCSs of any two point clouds in each point cloud set is less than a set threshold; or, among the point clouds included in each point cloud set, only the part of the point clouds for which the RCS difference between any two point clouds is less than the set threshold is retained.
  • the target detection device can select, according to the similarity of the RCSs of the point clouds in the same grid, at least one point cloud whose RCS differences are less than a set threshold from the point clouds contained in the grid, calculate its centroid point in the grid space, and use the calculated centroid point as the target point cloud of the grid for the subsequent generation of anchor frames.
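The RCS-based selection and centroid computation can be sketched as follows (an illustrative Python sketch; as a simplification, RCS differences are measured against the first point in the set rather than pairwise, and the threshold and sample values are hypothetical):

```python
def target_point_cloud(cell_points, rcs_threshold=1.0):
    """From points (x, y, d, rcs) in one grid cell, keep the points whose
    RCS is close to a reference point's RCS, then return the centroid of
    the kept points as the cell's target point cloud (POI)."""
    ref = cell_points[0]
    kept = [p for p in cell_points if abs(p[3] - ref[3]) < rcs_threshold]
    n = len(kept)
    cx = sum(p[0] for p in kept) / n   # centroid x in the image plane
    cy = sum(p[1] for p in kept) / n   # centroid y in the image plane
    cd = sum(p[2] for p in kept) / n   # centroid distance parameter
    return (cx, cy, cd)

# The third point's RCS (9.9) differs too much and is excluded.
poi = target_point_cloud([(100, 50, 12.0, 3.1),
                          (104, 54, 13.0, 3.0),
                          (102, 52, 12.5, 9.9)])  # -> (102.0, 52.0, 12.5)
```

The resulting centroid carries both an image-plane position and a distance parameter, which the later steps use when sizing anchor frames.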
  • after determining the target point cloud of each grid in the second point cloud data, the target detection device uses the target point clouds of the multiple grids in the second point cloud data as the target point clouds after grid division of the second point cloud data, thereby obtaining the multiple target point clouds.
  • when the method is applied to the system shown in FIG. 1, the method described in step S204 may be executed by the grid division module 106 shown in FIG. 1.
  • the target detection apparatus generates, in the first image, at least one anchor frame corresponding to each target point cloud in the multiple target point clouds.
  • any anchor box of any target point cloud includes the target point cloud.
  • the target detection device uses the plurality of target point clouds as interest points for the subsequent generation of anchor frames and generates, for each target point cloud of the plurality of target point clouds, at least one anchor frame containing that target point cloud, to obtain multiple anchor frames, so as to determine the position of the target object to be detected according to the multiple anchor frames.
  • the target detection device may perform the following two steps respectively:
  • Step 1 The target detection apparatus acquires at least one category identifier, wherein different category identifiers are respectively used to identify categories of different objects.
  • the category identifier may specifically be the name, code, etc. of the category to which the object belongs, or any other information used to represent the category to which the object belongs.
  • the at least one category identifier may include a set category identifier and/or a category identifier determined after performing target detection on a reference image.
  • the set category identifier may be set as an identifier of a category of an object that appears frequently in a target detection scene.
  • the set category identifier may be input by the user.
  • for example, the set category identifiers may be set to the category identifiers of objects that often appear in the vehicle driving scene, such as the category identifiers of vehicles and pedestrians, which are used as basic category identifiers. Generating anchor boxes corresponding to these category identifiers can improve the simplicity and accuracy of target detection.
  • the reference image is a frame of image on which target detection was performed before the first image.
  • the content displayed in consecutive multi-frame images (such as consecutive frames in a video stream) is generally close, and the categories of the objects contained in adjacent images may be relatively close. Therefore, when performing target detection on consecutive multi-frame images, the categories of the objects contained in the frame preceding the current frame can be referred to in order to determine for which category identifiers anchor frames need to be generated in the current frame image, so that the category identifiers determined for the current image are closer to the category identifiers of the objects actually contained in the current image, further improving the accuracy of target detection.
  • the confidence level of the category identifier determined after the target detection is performed on the reference image is greater than a set threshold.
  • after the target detection device performs target detection on the reference image using the network model, the detection result output by the network model can be obtained; the detection result includes the category identifiers corresponding to the categories of the different objects detected in the reference image and the confidence level corresponding to each category identifier. The target detection device selects, from these category identifiers, those whose corresponding confidence level is greater than the set threshold as the category identifiers determined after target detection is performed on the reference image, which ensures the reliability and accuracy of the category identifiers determined according to the reference image.
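The combination of set category identifiers with high-confidence identifiers from the reference frame can be sketched as follows (an illustrative Python sketch; the identifier names, base categories, and threshold value are hypothetical):

```python
def category_ids_for_frame(prev_detections, base_ids=("car", "person"),
                           conf_threshold=0.5):
    """Combine the set (base) category identifiers with the identifiers
    detected in the reference (previous) frame whose confidence exceeds
    the threshold."""
    from_prev = {cid for cid, conf in prev_detections if conf > conf_threshold}
    return set(base_ids) | from_prev

# "bicycle" is dropped because its confidence is below the threshold.
ids = category_ids_for_frame([("truck", 0.9), ("bicycle", 0.3), ("car", 0.8)])
# -> {"car", "person", "truck"}
```

The resulting identifier set then drives which anchor frame sizes are generated for each target point cloud.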
  • Step 2 The target detection device generates at least one anchor frame including the target point cloud for each target point cloud in the plurality of target point clouds according to the at least one category identifier.
  • the target detection device may generate the at least one anchor frame of each target point cloud in any of the following ways:
  • Mode 1: determine at least one anchor frame size, wherein the at least one anchor frame size includes at least one anchor frame size corresponding to each category identifier in the at least one category identifier; then determine, for each target point cloud, at least one anchor box conforming to the at least one anchor frame size.
  • the target detection apparatus may store the correspondence between different category identifiers and anchor frame sizes.
  • one category identifier may correspond to multiple different anchor frame sizes, and one anchor frame size may also correspond to multiple different category identifiers.
  • the correspondence between the different category identifiers and the anchor frame size may be input by the user.
  • the sizes of different anchor boxes can be set by the user, or obtained by classifying or applying machine learning to the ground-truth values of datasets containing the actual sizes of different objects, or determined according to the common sizes and height-width ratios of different object instances in practice.
  • the size of an anchor frame includes an area parameter and an aspect ratio parameter, where the area parameter is the area of the anchor frame, used to represent the size of the anchor frame, and the aspect ratio parameter is the ratio between the lengths of two adjacent sides of the anchor frame. Different anchor box sizes correspond to different area parameters and/or aspect ratio parameters.
  • each category identifier may correspond to multiple different area parameters, and each area parameter may correspond to multiple different aspect ratio parameters; or each category identifier may correspond to multiple different aspect ratio parameters, and each aspect ratio parameter may correspond to multiple different area parameters.
  • each category identifier can correspond to 4 area parameters (128, 256, 512, 1024 pixels).
  • when the target object faces the camera or shows its side to the camera, its size in the image is different, so aspect ratio parameters for these two angles can be set for the frontal and side states of the target object; that is, each category identifier corresponds to the aspect ratio parameters of two different angles, as shown in Table 1 below:
  • each category identifier corresponds to 2 aspect ratio parameters, and after combination with the above 4 area parameters, 8 anchor frame sizes of different sizes can be obtained. Therefore, based on the above area parameters and aspect ratio parameters, when the target detection device generates anchor boxes, it can generate 8 anchor boxes of different sizes for each target point cloud.
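The combination of area parameters and aspect ratio parameters can be sketched as follows (an illustrative Python sketch; the ratios 0.5 and 2.0 stand in for the two viewing-angle ratios of Table 1 and are hypothetical values):

```python
import math

def anchor_sizes(areas=(128, 256, 512, 1024), ratios=(0.5, 2.0)):
    """Combine area parameters with aspect ratio parameters to produce
    (width, height) anchor sizes with width/height = ratio and
    width * height = area."""
    sizes = []
    for a in areas:
        for r in ratios:
            w = math.sqrt(a * r)
            h = math.sqrt(a / r)
            sizes.append((w, h))
    return sizes

sizes = anchor_sizes()  # 4 areas x 2 ratios -> 8 anchor sizes
```

Each (width, height) pair preserves its area parameter exactly, so varying the ratio changes the shape but not the size of the anchor.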
  • Mode 2: determine at least one object size, wherein the at least one object size includes at least one object size corresponding to each category identifier in the at least one category identifier, and the object size corresponding to any category identifier is used to represent the size of the object to which that category identifier belongs; then, according to the at least one object size, determine at least one mapping size, wherein the at least one object size is in one-to-one correspondence with the at least one mapping size, and the mapping size corresponding to any object size is the size obtained after the object to which the target category identifier belongs is mapped to the first image, the target category identifier being the category identifier corresponding to that object size; then determine, for each target point cloud, at least one anchor frame conforming to the at least one mapping size.
  • the relationship between an object size and the corresponding mapping size conforms to the following formula:
  • s · [u, v, 1]^T = [[f/dx, 0, u_0], [0, f/dy, v_0], [0, 0, 1]] · [R t] · [X_w, Y_w, Z_w, 1]^T
  • where s represents the scaling factor, u and v represent the mapping size, dx and dy represent the size (length and height) of a pixel, u_0 and v_0 represent the translation of the origin, f is the focal length of the camera, R is the rotation matrix, t is the translation vector, and X_w, Y_w, and Z_w represent the object size in the world coordinate system.
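A simplified version of this mapping, ignoring rotation and translation, reduces to the pinhole relation pixel_size = f · world_size / (pixel_pitch · distance). A minimal sketch (the focal length, pixel pitch, object size, and distance below are hypothetical illustrative values):

```python
def mapped_size(object_w, object_h, distance, f=0.008, dx=4e-6, dy=4e-6):
    """Map a physical object size (metres) at a given distance to its
    size on the image plane (pixels) with the pinhole model."""
    u = f * object_w / (dx * distance)  # width in pixels
    v = f * object_h / (dy * distance)  # height in pixels
    return u, v

# A 1.8 m x 1.5 m car-sized object 20 m away.
u, v = mapped_size(1.8, 1.5, 20.0)  # -> (180.0, 150.0)
```

The distance used here is exactly the distance parameter carried by the target point cloud, which is why the radar measurement lets Mode 2 size anchors to the expected on-image extent of each category.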
  • the object detection apparatus may store object sizes corresponding to different category identifiers.
  • the correspondence between the different category identifiers and the object size may be input by the user.
  • the different object sizes may be set by the user, or obtained by classifying or applying machine learning to the ground-truth values of datasets containing the actual sizes of different objects, or determined according to the common sizes of different object instances in practice.
  • the anchor frame size includes the side lengths of two adjacent sides of the anchor frame; this parameter can be determined from the category feedback information and the calibration parameters.
  • according to the distance from the target point cloud to the vertical plane where the camera is located (i.e., the distance parameter of the target point cloud), the common size of object instances of the category in the world coordinate system can be mapped to the image plane of the first image to obtain the size of the anchor box corresponding to the category identifier, and anchor boxes of a specific category and a specific size can then be obtained.
  • the anchor frame is generated to mark the target object in the image; therefore, the size of the anchor frame needs to correspond to the actual size of the target object. Moreover, when the distance of the target object from the image acquisition device differs, the size displayed in the image also differs. Therefore, in the above method, the size of the anchor frame corresponding to a category identifier is determined according to the actual size of the object corresponding to the category identifier and the distance parameter of the point cloud detected by the radar, combined with the pinhole imaging principle of the camera, which can improve the accuracy of the generated anchor frame size and thereby improve the efficiency of target detection.
  • in any anchor frame of any target point cloud, the target point cloud is located at any position in the area surrounded by the anchor frame, for example, at the center of the anchor frame, or at any position along any side of the anchor frame, for example, at the midpoint of any side of the anchor frame.
  • some anchor frames corresponding to the target point cloud may be generated by taking the target point cloud as the center point.
  • the target detection device can translate the position of a generated anchor frame of the target point cloud so that the target point cloud becomes a point (such as the midpoint) on any side of the anchor frame, so as to obtain relatively more anchor boxes.
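The translation of anchor frames around a point of interest can be sketched as follows (an illustrative Python sketch; boxes are represented as (x_min, y_min, x_max, y_max), and the coordinates and size below are hypothetical):

```python
def anchor_variants(poi_x, poi_y, w, h):
    """Generate anchor boxes of size w x h that place the target point
    cloud at the box centre and at the midpoint of each of the four
    sides, by translating the box centre around the point of interest."""
    offsets = [
        (0.0, 0.0),        # point cloud at the anchor centre
        (0.0, -h / 2),     # point cloud at the midpoint of one horizontal side
        (0.0, h / 2),      # ... and of the other horizontal side
        (-w / 2, 0.0),     # point cloud at the midpoint of one vertical side
        (w / 2, 0.0),      # ... and of the other vertical side
    ]
    boxes = []
    for ox, oy in offsets:
        cx, cy = poi_x + ox, poi_y + oy
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

boxes = anchor_variants(100, 100, 40, 20)  # 5 placements of one anchor size
```

Combining these placements with the per-category anchor sizes multiplies the number of candidate boxes per target point cloud, as the embodiment describes.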
  • the information such as the size, position and quantity of the anchor frame can be flexibly set and adjusted in combination with the actual application scenario.
  • when the method is applied to the system shown in FIG. 1, the method described in step S205 may be executed by the detection module 108 and the category identifier management module 109 shown in FIG. 1. For example, the above-described determination of the at least one category identifier referred to when generating the anchor frames corresponding to the target point clouds can be performed by the category identifier management module 109, and the above-described generation of the anchor frames corresponding to the target point clouds according to the at least one category identifier can be performed by the detection module 108.
  • the target detection apparatus performs target detection according to at least one anchor frame corresponding to each generated target point cloud, and determines the position of at least one target object to be detected.
  • after the target detection device generates at least one corresponding anchor box for target detection for a target point cloud, it can compensate the size of the generated anchor box according to the distance parameter of the target point cloud by the following formula:
  • s is the scaling factor corresponding to the target point cloud
  • ⁇ and ⁇ are the scaling factors for adjusting the size of the anchor frame, which can be obtained through model training
  • d is the distance parameter of the target point cloud.
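The compensation formula itself is not reproduced in this text, so the sketch below only illustrates the idea under an assumed linear form s = α + β·d; the actual functional form and the trained values of α and β would come from the model training mentioned above:

```python
def compensate_anchor(w, h, d, alpha=1.0, beta=-0.005):
    """Scale an anchor box by a distance-dependent factor s = alpha + beta * d
    (assumed linear form; alpha and beta stand in for trained constants,
    d is the target point cloud's distance parameter)."""
    s = alpha + beta * d
    return w * s, h * s

# A 150 x 50 px anchor for a point cloud 30 m away shrinks slightly:
w, h = compensate_anchor(150.0, 50.0, 30.0)
print(round(w, 1), round(h, 1))  # 127.5 42.5
```

The key point, independent of the exact formula, is that the scaling factor is a learned function of the distance parameter, so far-away targets get proportionally adjusted anchors.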
  • the target detection device can recognize, based on a convolutional neural network model, the first image including the generated anchor boxes. Target object detection and classification can be carried out in combination with the image features obtained by image feature extraction, to identify the target category of each target object contained in the first image and determine the confidence of the target category, and, in the at least one anchor box of each target point cloud, to determine the target anchor box where each target object is located.
  • after the target detection device performs target detection by combining the point cloud data and the image, the target category of each target object contained in the first image, the category identifier of the target category of each target object, and the confidence of the target category of each target object can be obtained.
  • the method described in step S206 may be executed by the feature extraction module 107 and the detection module 108 shown in FIG. 1 .
  • the step numbers in the various embodiments described in the embodiments of the present application are only an example of the execution flow and do not constitute a limitation on the execution order of the steps. In the embodiments of the present application, there is no strict execution order between steps that have no timing dependency on each other.
  • by mapping the first point cloud data to the image plane of the first image, the second point cloud data is obtained, and the fusion of the point cloud data and the image is realized.
  • the second point cloud data is divided into grids, and the features of point clouds with similar RCS information in each grid are combined to determine the target point cloud corresponding to the grid, which greatly reduces the number of required point clouds and correspondingly reduces the number of generated anchor boxes.
  • the features of each point cloud are also synthesized, which solves the problem of network redundancy caused by generating anchor boxes based on every point cloud, and speeds up target detection.
  • the above scheme simplifies the number of point clouds by synthesizing point cloud features within a small space (grid). Even in the presence of radar noise, it can reduce the impact of locally dense noise and alleviate problems such as processing sluggishness caused by the noise.
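A toy sketch of this grid-based simplification (the grid cell size, distance bin, and RCS tolerance are invented values, and using the first point's RCS as the cluster reference is a simplification of the pairwise similarity rule described elsewhere in the text):

```python
from collections import defaultdict

def target_point_clouds(points, cell=32, dist_bin=5.0, rcs_tol=1.0):
    """points: list of (u, v, d, rcs) -- pixel coordinates after mapping to
    the image plane, distance parameter, and radar cross section.
    Each (grid cell, distance bin) forms a point cloud set; points whose RCS
    is within rcs_tol of the set's first point are merged into one centroid,
    which serves as the set's target point cloud."""
    sets = defaultdict(list)
    for u, v, d, rcs in points:
        sets[(int(u // cell), int(v // cell), int(d // dist_bin))].append((u, v, d, rcs))
    targets = []
    for pts in sets.values():
        ref_rcs = pts[0][3]
        cluster = [p for p in pts if abs(p[3] - ref_rcs) < rcs_tol]
        n = len(cluster)
        targets.append((sum(p[0] for p in cluster) / n,
                        sum(p[1] for p in cluster) / n))
    return targets

pts = [(10, 10, 12.0, 5.0), (12, 11, 13.0, 5.5), (200, 50, 40.0, 3.0)]
print(target_point_clouds(pts))  # [(11.0, 10.5), (200.0, 50.0)]
```

Three raw points collapse to two target point clouds here; anchors are then generated only for the targets, which is the redundancy reduction the scheme claims.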
  • the feedback of the detection result of the previous frame image can be combined, with the category identifiers detected in the previous frame image used as prior information to adjust the anchor boxes generated in the current frame image, which achieves the effect of using prior information to improve detection accuracy.
  • the target detection apparatus 300 may include: a data acquisition unit 301 and a processing unit 302 .
  • the data acquisition unit 301 is configured to acquire the first point cloud data collected by the radar sensor and the first image collected by the corresponding camera sensor, wherein the first point cloud data includes multiple point clouds; the processing unit 302 is configured to: map the first point cloud data to the image plane of the first image to obtain second point cloud data, wherein the second point cloud data includes multiple point clouds; perform grid division on the second point cloud data; determine, according to the feature data of the point clouds in the second point cloud data, multiple target point clouds of the second point cloud data after the grid division, wherein any target point cloud corresponds to at least one point cloud in the second point cloud data; generate, in the first image, at least one anchor box corresponding to each of the multiple target point clouds; and perform target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine the position of at least one target object to be detected.
  • the feature data is used to represent the radar echo intensity of the point cloud or the distribution feature of the radar echo intensity of the point cloud or the polarization feature of the point cloud.
  • any anchor box corresponding to any target point cloud contains the target point cloud.
  • when the processing unit 302 performs grid division on the second point cloud data, it is specifically configured to: divide the second point cloud data into multiple grids according to a set grid size; and divide the point clouds contained in each grid into multiple point cloud sets according to the distance parameters of the point clouds in the grid, wherein the distance parameter is used to indicate the horizontal distance between a point cloud and the radar sensor.
  • when the processing unit 302 determines, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds of the second point cloud data after the grid division, it is specifically configured to: in each grid, determine the target point cloud of each point cloud set according to the feature data of the point clouds in that point cloud set; take the target point clouds of the multiple point cloud sets contained in each grid as the target point clouds of the grid; and take the target point clouds of the multiple grids contained in the second point cloud data as the target point clouds of the second point cloud data after the grid division, to obtain the multiple target point clouds.
  • when generating at least one anchor box corresponding to each of the multiple target point clouds, the processing unit 302 is specifically configured to: acquire at least one category identifier, wherein different category identifiers are respectively used to represent the categories of different objects; and determine, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier of each target point cloud.
  • the at least one category identifier includes a set category identifier and/or a category identifier determined after performing target detection on a reference image, wherein the reference image is a frame of image on which target detection is performed before the first image.
  • the confidence level of the category identification determined after the target detection is performed on the reference image is greater than a set threshold.
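A small sketch of how the set category identifiers and the previous-frame detections can be combined into the prior list (the identifier values and the confidence threshold are illustrative, not from the source):

```python
def category_priors(default_ids, prev_detections, conf_thresh=0.5):
    """Combine configured category identifiers with those detected in the
    previous (reference) frame whose confidence exceeds conf_thresh."""
    prior = set(default_ids)
    for cat_id, conf in prev_detections:
        if conf > conf_thresh:
            prior.add(cat_id)
    return sorted(prior)

# Category 2 was detected confidently in the reference frame; category 3 was not:
print(category_priors([1], [(2, 0.9), (3, 0.4)]))  # [1, 2]
```

Anchor boxes would then be generated per target point cloud only for the categories in this prior list, which is how the previous frame's result narrows the current frame's anchors.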
  • when the processing unit 302 determines, according to the at least one category identifier, the at least one anchor box corresponding to the at least one category identifier of each target point cloud, it is specifically configured to: determine at least one anchor box size, wherein the at least one anchor box size includes at least one anchor box size corresponding to each category identifier in the at least one category identifier; and determine at least one anchor box of each target point cloud that conforms to the at least one anchor box size.
  • when the processing unit 302 determines, according to the at least one category identifier, the at least one anchor box corresponding to the at least one category identifier of each target point cloud, it is specifically configured to: determine at least one object size, wherein the at least one object size includes at least one object size corresponding to each category identifier in the at least one category identifier, and the object size corresponding to any category identifier is used to indicate the size of the object to which the category identifier belongs; determine at least one mapping size according to the at least one object size, wherein the at least one object size is in one-to-one correspondence with the at least one mapping size, the mapping size corresponding to any object size is the size of the object to which the target category identifier belongs after being mapped to the first image, and the target category identifier is the category identifier corresponding to that object size; and determine at least one anchor box of each target point cloud that conforms to the at least one mapping size.
  • the target point cloud, in any anchor box of the target point cloud, is located at any position in the area enclosed by the anchor box, for example, at the center of the anchor box, or at any position along any side of the anchor box, for example, at the midpoint of any side of the anchor box.
  • when the processing unit 302 performs target detection according to the generated at least one anchor box corresponding to each target point cloud and determines the position of the at least one target object to be detected, it is specifically configured to: identify the target category of each target object contained in the first image and determine the confidence of the target category; determine, in the at least one anchor box of each target point cloud, the target anchor box where each target object is located; and output a detection result, the detection result including: the target category of each target object, the category identifier of the target category of each target object, the confidence of the target category of each target object, and the target anchor box where each target object is located.
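The fields of the detection result listed above can be collected in a simple record; the field names here are illustrative, not prescribed by the source:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One entry of the detection result: category, its identifier,
    the category confidence, and the target anchor box (x1, y1, x2, y2)."""
    target_category: str
    category_id: int
    confidence: float
    target_anchor_box: tuple

det = Detection("car", 1, 0.92, (40, 30, 190, 80))
print(det.category_id, det.confidence)  # 1 0.92
```

One such record per detected target object would make up the output of the processing unit.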
  • the target detection apparatus 300 may further include a storage unit 303 for storing program codes and data of the target detection apparatus 300 .
  • the processing unit 302 may be a processor or a controller, such as a general-purpose central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various exemplary logical blocks, modules, and the like described in connection with this disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
  • the storage unit 303 may be a memory.
  • the data acquisition unit 301 may be an interface circuit of the target detection device, used for receiving data from other devices, for example, receiving the first point cloud data sent by a radar sensor. When the target detection device is implemented in the form of a chip, the data acquisition unit 301 may be an interface circuit used by the chip to receive data from or send data to other chips or devices.
  • the division of units in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be other division methods.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the software or firmware includes, but is not limited to, computer program instructions or code, and can be executed by a hardware processor.
  • the hardware includes, but is not limited to, various types of integrated circuits, such as a central processing unit (CPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or an application specific integrated circuit (ASIC).
  • the embodiments of the present application further provide a target detection apparatus for implementing the target detection method provided by the embodiments of the present application.
  • the target detection apparatus 400 may include: one or more processors 401, a memory 402, and one or more computer programs (not shown in the figure).
  • the above devices may be coupled through one or more communication lines 403 .
  • one or more computer programs are stored in the memory 402, and the one or more computer programs include instructions; the processor 401 invokes the instructions stored in the memory 402, so that the target detection apparatus 400 executes the instructions provided by the embodiments of the present application. object detection method.
  • the processor may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, which can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor, or the like.
  • the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the memory may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • by way of example, many forms of RAM are available, such as dynamic random access memory (DRAM), synchronous DRAM (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM).
  • the target detection apparatus 400 may further include a communication interface 404 for communicating with other apparatuses through a transmission medium.
  • the target detection device 400 may communicate with a device that collects first point cloud data, such as a radar sensor, through the communication interface 404, so as to receive the first point cloud data collected by the device.
  • the communication interface may be a transceiver, a circuit, a bus, a module, or other types of communication interfaces.
  • when the communication interface is a transceiver, the transceiver may include an independent receiver and an independent transmitter; it may also be a transceiver integrating transceiver functions, or an interface circuit.
  • the processor 401, the memory 402, and the communication interface 404 may be connected to each other through the communication line 403; the communication line 403 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the communication line 403 can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in FIG. 4, but this does not mean that there is only one bus or only one type of bus.
  • the methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, the methods can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, network equipment, user equipment, or other programmable apparatus.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • a computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media.
  • the available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tape), optical media (e.g., digital video disc (DVD)), or semiconductor media (e.g., solid state disk (SSD)), and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The present application provides a target detection method and apparatus, which belong to the technical field of sensors, and can be applied to aided driving and autonomous driving. The method comprises: acquiring first point cloud data collected by a radar sensor and a first image collected by a corresponding camera sensor; mapping the first point cloud data to an image plane of the first image, so as to obtain second point cloud data; performing grid division on the second point cloud data; determining a plurality of target point clouds of the second point cloud data after being subjected to the grid division, wherein any of the target point clouds corresponds to at least one point cloud in the second point cloud data; generating at least one anchor box corresponding to each of the plurality of target point clouds; and performing target detection according to the at least one generated anchor box corresponding to each target point cloud, so as to determine the position of at least one target object to be detected. By means of the method, the efficiency of target detection can be improved. Furthermore, the method can be applied to the Internet of Vehicles, such as vehicle to everything (V2X), and vehicle to vehicle (V2V).

Description

Target Detection Method and Apparatus
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application No. 202110345758.9, filed with the Chinese Patent Office on March 31, 2021 and entitled "Target Detection Method and Apparatus", which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present application relates to the technical field of sensor processing, and in particular, to a target detection method and apparatus.
BACKGROUND
Target detection is an indispensable key technology in autonomous driving. For example, target detection processing for targets such as vehicles and pedestrians can provide reference information for path planning, lane selection, pedestrian and vehicle tracking, and behavior prediction, and is therefore of great significance in autonomous driving.
At present, target detection can be performed by combining radar detection and image detection. Specifically, the radar points detected by the radar can be mapped to the image coordinate system, a predefined anchor box can be generated for each mapped radar point, and then, combined with the generated anchor boxes, a deep-learning-based convolutional neural network can be used for target detection. However, one target may reflect multiple radar points after radar detection. If an anchor box is generated in the image for every radar point, the result is overly redundant and the amount of data to be processed is large, consuming a great deal of processing time and making target detection inefficient.
SUMMARY
The present application provides a target detection method, which combines point cloud data and images to implement target detection and improve the efficiency of target detection.
In a first aspect, the present application provides a target detection method, including: acquiring first point cloud data collected by a radar sensor and a first image collected by a corresponding camera sensor, wherein the first point cloud data contains multiple point clouds; mapping the first point cloud data to the image plane of the first image to obtain second point cloud data, wherein the second point cloud data contains multiple point clouds; performing grid division on the second point cloud data; determining, according to feature data of the point clouds in the second point cloud data, multiple target point clouds of the second point cloud data after the grid division, wherein any target point cloud corresponds to at least one point cloud in the second point cloud data; generating, in the first image, at least one anchor box corresponding to each of the multiple target point clouds; and performing target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine the position of at least one target object to be detected.
In this method, after the point cloud data mapped to the image is divided into grids, the target point clouds after the grid division are determined according to the feature data of the point clouds, and anchor boxes are then generated in the image according to the target point clouds. This greatly reduces the number of point clouds for which anchor boxes need to be generated, thereby reducing the number of generated anchor boxes, while also synthesizing the features of each point cloud to ensure the accuracy of the generated anchor boxes. The method therefore solves problems such as network redundancy and excessive time consumption caused by generating an anchor box for every collected point cloud, and speeds up target detection. In addition, because the method determines the target point clouds by synthesizing the features of the point clouds in each grid, even when the point cloud data collected by the radar sensor contains radar noise, it can reduce the impact of locally dense noise and alleviate problems such as processing sluggishness caused by the noise.
In a possible design, the feature data is used to represent the radar echo intensity of a point cloud, the distribution feature of the radar echo intensity of a point cloud, or the polarization feature of a point cloud.
In this method, the target point clouds corresponding to the point cloud data can be determined according to feature data characterizing the radar echo intensity of the point clouds, so that the feature information of the point clouds in the point cloud data is fully utilized, improving the accuracy of target detection based on the point cloud data.
In a possible design, in the at least one anchor box corresponding to each target point cloud, any anchor box corresponding to any target point cloud contains that target point cloud.
In this method, after being mapped to the image, a target point cloud is mostly located in the image region where the target object to be detected is located. Therefore, by generating, for a target point cloud, anchor boxes that contain it, the region of the image in which the target object corresponding to the target point cloud is located can be determined as accurately as possible from the generated anchor boxes, thereby implementing target detection.
In a possible design, performing grid division on the second point cloud data includes: dividing the second point cloud data into multiple grids according to a set grid size; and dividing the point clouds contained in each grid into multiple point cloud sets according to the distance parameters of the point clouds in the grid, wherein the distance parameter is used to represent the horizontal distance from a point cloud to the radar sensor.
In this method, after the grids are divided, the point clouds are further divided with reference to their distances to the radar sensor, so that the coordinate position information of each dimension of the point clouds can be fully utilized in the detection process, thereby improving detection accuracy.
In a possible design, determining, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds of the second point cloud data after the grid division includes: in each grid, determining the target point cloud of each point cloud set according to the feature data of the point clouds in that point cloud set; taking the target point clouds of the multiple point cloud sets contained in each grid as the target point clouds of that grid; and taking the target point clouds of the multiple grids contained in the second point cloud data as the target point clouds of the second point cloud data after the grid division, to obtain the multiple target point clouds.
In this method, after the point cloud grid division, the target point cloud of each grid is determined in combination with the feature data of the point clouds, so that the distribution features of the point clouds in small regions of space (grids) can be fully considered when simplifying the point clouds. While reducing the number of anchor boxes generated in the image and improving detection efficiency, this also guarantees the accuracy of the generated anchor boxes and improves detection precision.
In a possible design, the feature data includes the radar cross section of a point cloud; and, in each grid, determining the target point cloud of each point cloud set according to the feature data of the point clouds in that point cloud set includes: taking the centroid point of at least one point cloud in each point cloud set as the target point cloud of that point cloud set, wherein, among the at least one point cloud, the difference between the radar cross sections of any two point clouds is less than a set threshold.
In this method, for the point clouds in a point cloud set whose radar cross section differences are less than the set threshold, the centroid point is determined and used as the target point cloud of the point cloud set, which reduces the number of point clouds while allowing the target point cloud to synthesize the feature information of the point clouds in the set.
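The centroid rule in this design can be sketched as follows (a greedy sweep over RCS-sorted points is used as a simplification of the "any two point clouds" condition; the threshold and coordinates are illustrative):

```python
def centroid_of_similar(points, rcs_threshold=1.0):
    """points: list of (x, y, rcs). Return the centroid of the largest
    RCS-sorted run in which every RCS value stays within rcs_threshold
    of the run's first (smallest) RCS value."""
    if not points:
        raise ValueError("empty point cloud set")
    pts = sorted(points, key=lambda p: p[2])
    best = []
    for i in range(len(pts)):
        j = i
        while j < len(pts) and pts[j][2] - pts[i][2] < rcs_threshold:
            j += 1
        if j - i > len(best):
            best = pts[i:j]
    n = len(best)
    return (sum(p[0] for p in best) / n, sum(p[1] for p in best) / n)

# Two points with similar RCS (1.0, 1.5) merge; the outlier (RCS 9.0) does not:
print(centroid_of_similar([(0, 0, 1.0), (2, 0, 1.5), (10, 10, 9.0)]))  # (1.0, 0.0)
```

The returned centroid stands in for the merged points as the point cloud set's target point cloud.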
在一种可能的设计中,在所述第一图像中,生成所述多个目标点云中每个目标点云对应的至少一个锚框,包括:获取至少一种类别标识,其中,不同类别标识分别用于表示不同对象的类别;根据所述至少一种类别标识,确定每个目标点云的所述至少一种类别标识对应的至少一个锚框。In a possible design, in the first image, generating at least one anchor frame corresponding to each target point cloud in the multiple target point clouds includes: acquiring at least one category identifier, wherein different categories The identifiers are respectively used to represent categories of different objects; according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier of each target point cloud is determined.
在该方法中,不同对象的实际尺寸是不同的,因此,通过为目标点云分别生成某些对象的类别标识对应的锚框,提高了生成的锚框的多样性,从而可以在多种不同类别的锚框中选择相对合适的锚框,进一步提高目标检测的准确性。In this method, the actual sizes of different objects are different. Therefore, by generating anchor boxes corresponding to the category of certain objects for the target point cloud, the diversity of the generated anchor boxes is improved, so that it can be used in a variety of different A relatively suitable anchor box is selected in the anchor box of the category to further improve the accuracy of target detection.
在一种可能的设计中,所述至少一种类别标识包括设定的类别标识和/或对参考图像进行目标检测后确定的类别标识,其中,所述参考图像为在所述第一图像之前进行目标检测 的一帧图像。In a possible design, the at least one category identification includes a set category identification and/or a category identification determined after performing target detection on a reference image, wherein the reference image is before the first image A frame of image for object detection.
在该方法中,可以参考前一帧图像中检测到的对象的类别,为当前帧图像中的目标点云生成锚框,因此可以根据相邻帧图像中包含的对象的相似性、目标类别的相似性,提高当前帧图像中生成的锚框的准确度,实现提高检测精度的效果。In this method, an anchor frame can be generated for the target point cloud in the current frame image with reference to the category of the object detected in the previous frame image. Similarity, improve the accuracy of the anchor frame generated in the current frame image, and achieve the effect of improving the detection accuracy.
In a possible design, the confidence of the category identifier determined by performing target detection on the reference image is greater than a set threshold.
In this method, a higher confidence of a determined category identifier indicates a higher accuracy of that identifier. Using, as prior information, only those category identifiers from the previous frame whose confidence exceeds a certain threshold further improves the accuracy of the anchor boxes generated in the image, and in turn further improves detection accuracy.
In a possible design, determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud includes: determining at least one anchor box size, where the at least one anchor box size includes at least one anchor box size corresponding to each of the at least one category identifier; and determining, for each target point cloud, at least one anchor box conforming to the at least one anchor box size.
In this method, the size of the anchor box generated for a target point cloud in the image can be determined simply and quickly from the preset anchor box size corresponding to each category identifier, improving detection speed.
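The preset-size lookup described above can be sketched as a table from category identifier to anchor box sizes. The category names and pixel sizes below are invented for illustration only:

```python
# Hypothetical preset table: category identifier -> list of (width, height)
# anchor box sizes in pixels. Real values would come from configuration.
PRESET_ANCHOR_SIZES = {
    "pedestrian": [(30, 80)],
    "car": [(120, 80), (200, 120)],
}

def anchors_for_point(point_xy, category_ids):
    """Return center-anchored candidates (cx, cy, w, h) for one target point:
    one anchor per preset size per requested category identifier."""
    x, y = point_xy
    anchors = []
    for cid in category_ids:
        for (w, h) in PRESET_ANCHOR_SIZES.get(cid, []):
            anchors.append((x, y, w, h))
    return anchors
```

Because the sizes are read from a table rather than computed, this variant trades some per-scene accuracy for speed, matching the trade-off the paragraph describes.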
In a possible design, determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud includes: determining at least one object size, where the at least one object size includes at least one object size corresponding to each of the at least one category identifier, and the object size corresponding to any category identifier represents the size of the object to which that category identifier belongs; determining at least one mapping size according to the at least one object size, where the at least one object size is in one-to-one correspondence with the at least one mapping size, the mapping size corresponding to any object size is the size of the object to which a target category identifier belongs after being mapped to the first image, and the target category identifier is the category identifier corresponding to that object size; and determining, for each target point cloud, at least one anchor box conforming to the at least one mapping size.
In this method, the anchor box size corresponding to a category identifier is determined from the actual size of the object corresponding to the category identifier and the distance parameter of the radar-detected point cloud, combined with the imaging principle of the visual detector. This improves the accuracy of the generated anchor box sizes and thus the accuracy of target detection.
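The scaling implied by the imaging principle above can be sketched with a pinhole camera model: an object of real size S at horizontal distance d maps to roughly f·S/d pixels, where f is the focal length in pixels. The focal length and sizes here are illustrative assumptions, not calibration data from the patent:

```python
def mapped_size(object_w_m, object_h_m, distance_m, focal_px):
    """Approximate on-image size (width, height) in pixels of an object of
    real size (object_w_m, object_h_m) meters at distance_m meters, under a
    simple pinhole model with focal length focal_px pixels."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    scale = focal_px / distance_m  # pixels per meter at this range
    return (object_w_m * scale, object_h_m * scale)
```

For example, with an assumed 1000-pixel focal length, a 2.0 m x 1.5 m object at 10 m would map to about a 200 x 150 pixel anchor.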
In a possible design, in any anchor box of any target point cloud, the target point cloud may be located at any position in the region enclosed by the anchor box, for example at the center of the anchor box, or at any position on any side of the anchor box, for example at the midpoint of any side.
In this method, the target point cloud may fall anywhere within the region of the image occupied by the target object. Therefore, for a given anchor box size, generating anchor boxes in different directions around the target point cloud provides more candidates and more accurate anchor boxes, ensuring that an anchor box enclosing the target object as completely as possible can be selected no matter where within the object's region the target point cloud is mapped.
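The placement variants described above (point at the box center or at an edge midpoint) can be sketched as follows; image coordinates are assumed with x growing right and y growing down, and the function name is illustrative:

```python
def place_anchors(px, py, w, h):
    """Return w x h boxes (x_min, y_min, x_max, y_max) placed so the target
    point (px, py) sits at the box center or at the midpoint of each edge."""
    centers = [
        (px, py),           # point at box center
        (px, py + h / 2),   # point at midpoint of top edge
        (px, py - h / 2),   # point at midpoint of bottom edge
        (px + w / 2, py),   # point at midpoint of left edge
        (px - w / 2, py),   # point at midpoint of right edge
    ]
    return [(cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
            for cx, cy in centers]
```

Every box in the returned set contains the target point, so whichever way the point landed inside the object's image region, at least one candidate covers the object well.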
In a possible design, performing target detection according to the generated at least one anchor box corresponding to each target point cloud and determining the position of at least one target object to be detected includes: identifying the target category of each target object contained in the first image, and determining the confidence of the target category; determining, among the at least one anchor box of each target point cloud, the target anchor box in which each target object is located; and outputting a detection result, the detection result including the target category of each target object, the category identifier of that target category, the confidence of that target category, and the target anchor box in which each target object is located.
In this method, the target point clouds are a reduced version of the original point cloud data. When target detection is performed based on the smaller number of anchor boxes generated for the target point clouds, the anchor box used to mark the position of a target object is selected from this small set, so detection is fast. At the same time, because the target point clouds aggregate the spatial distribution characteristics of the point clouds in the original data, detection accuracy is preserved.
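The patent leaves the selection of the target anchor box to the detection stage, but the basic idea of picking, from a target point's candidates, the box that best matches a detected region can be illustrated with an intersection-over-union criterion. This is an illustrative mechanism, not the one the patent specifies:

```python
def iou(a, b):
    """Intersection-over-union of axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_target_anchor(anchors, detected_box):
    """Pick the candidate anchor that best overlaps a detected object region."""
    return max(anchors, key=lambda a: iou(a, detected_box))
```

In a real pipeline this role is typically played by the detection network's box regression and scoring rather than a bare IoU argmax.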
According to a second aspect, this application provides a target detection apparatus, including a data acquisition unit and a processing unit. The data acquisition unit is configured to acquire first point cloud data collected by a radar sensor and a corresponding first image collected by a camera sensor, where the first point cloud data contains multiple point clouds. The processing unit is configured to: map the first point cloud data to the image plane of the first image to obtain second point cloud data, where the second point cloud data contains multiple point clouds; perform grid division on the second point cloud data; determine, according to feature data of the point clouds in the second point cloud data, multiple target point clouds after the grid division of the second point cloud data, where any target point cloud corresponds to at least one point cloud in the second point cloud data; generate, in the first image, at least one anchor box corresponding to each of the multiple target point clouds; and perform target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine the position of at least one target object to be detected.
In a possible design, the feature data represents the radar echo intensity of a point cloud, the distribution characteristics of the radar echo intensity of a point cloud, or the polarization characteristics of a point cloud.
In a possible design, among the at least one anchor box corresponding to each target point cloud, any anchor box corresponding to any target point cloud contains that target point cloud.
In a possible design, when performing grid division on the second point cloud data, the processing unit is specifically configured to: divide the second point cloud data into multiple grid cells according to a set grid size; and divide the point clouds contained in each grid cell into multiple point cloud sets according to the distance parameter of the point clouds in that cell, where the distance parameter represents the horizontal distance from a point cloud to the radar sensor.
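The two-level division above (fixed-size image grid cells, then distance bins within each cell) can be sketched as follows. The cell and bin sizes are illustrative assumptions:

```python
from collections import defaultdict

def grid_and_range_split(points, cell_px=32, range_bin_m=5.0):
    """points: list of (u, v, dist_m), with (u, v) pixel coordinates of a
    mapped point and dist_m its horizontal distance to the radar sensor.
    Returns a dict mapping (cell_u, cell_v, range_bin) -> list of points,
    i.e. one point cloud set per grid cell and distance bin."""
    sets = defaultdict(list)
    for (u, v, d) in points:
        key = (int(u // cell_px), int(v // cell_px), int(d // range_bin_m))
        sets[key].append((u, v, d))
    return dict(sets)
```

Points that fall in the same image cell but come from very different ranges end up in different sets, so a nearby pedestrian and a distant car projected to the same pixels are not merged.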
In a possible design, when determining, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds after the grid division, the processing unit is specifically configured to: in each grid cell, determine the target point cloud of each point cloud set according to the feature data of the point clouds in that set; take the target point clouds of the multiple point cloud sets contained in each grid cell as the target point clouds of that cell; and take the target point clouds of the multiple grid cells contained in the second point cloud data as the target point clouds after the grid division, to obtain the multiple target point clouds.
In a possible design, when generating, in the first image, at least one anchor box corresponding to each of the multiple target point clouds, the processing unit is specifically configured to: acquire at least one category identifier, where different category identifiers respectively represent categories of different objects; and determine, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud.
In a possible design, the at least one category identifier includes a preset category identifier and/or a category identifier determined by performing target detection on a reference image, where the reference image is a frame on which target detection was performed before the first image.
In a possible design, the confidence of the category identifier determined by performing target detection on the reference image is greater than a set threshold.
In a possible design, when determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud, the processing unit is specifically configured to: determine at least one anchor box size, where the at least one anchor box size includes at least one anchor box size corresponding to each of the at least one category identifier; and determine, for each target point cloud, at least one anchor box conforming to the at least one anchor box size.
In a possible design, when determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud, the processing unit is specifically configured to: determine at least one object size, where the at least one object size includes at least one object size corresponding to each of the at least one category identifier, and the object size corresponding to any category identifier represents the size of the object to which that category identifier belongs; determine at least one mapping size according to the at least one object size, where the at least one object size is in one-to-one correspondence with the at least one mapping size, the mapping size corresponding to any object size is the size of the object to which a target category identifier belongs after being mapped to the first image, and the target category identifier is the category identifier corresponding to that object size; and determine, for each target point cloud, at least one anchor box conforming to the at least one mapping size.
In a possible design, in any anchor box of any target point cloud, the target point cloud may be located at any position in the region enclosed by the anchor box, for example at the center of the anchor box, or at any position on any side of the anchor box, for example at the midpoint of any side.
In a possible design, when performing target detection according to the generated at least one anchor box corresponding to each target point cloud and determining the position of at least one target object to be detected, the processing unit is specifically configured to: identify the target category of each target object contained in the first image, and determine the confidence of the target category; determine, among the at least one anchor box of each target point cloud, the target anchor box in which each target object is located; and output a detection result, the detection result including the target category of each target object, the category identifier of that target category, the confidence of that target category, and the target anchor box in which each target object is located.
According to a third aspect, this application provides a target detection apparatus, including a memory and a processor. The memory is configured to store a computer program; the processor is configured to execute the computer program stored in the memory, to implement the method described in the first aspect or any possible design of the first aspect.
According to a fourth aspect, this application provides a computer-readable storage medium storing a computer program. When the computer program runs on a target detection apparatus, the target detection apparatus is caused to perform the method described in the first aspect or any possible design of the first aspect.
According to a fifth aspect, this application provides a computer program product including a computer program or instructions. When the computer program or instructions run on a target detection apparatus, the target detection apparatus is caused to perform the method described in the first aspect or any possible design of the first aspect.
According to a sixth aspect, this application provides a sensor or a fusion apparatus, where the sensor may be a detection sensor such as a radar sensor or a camera sensor. The sensor or fusion apparatus includes the target detection apparatus described in the second aspect or the third aspect.
According to a seventh aspect, this application provides a terminal, including the target detection apparatus described in the second aspect or the third aspect, or including the sensor or fusion apparatus described in the sixth aspect.
In a possible design, the terminal is any one of the following: an intelligent transportation device, a smart home device, an intelligent manufacturing device, or a robot.
In a possible design, the intelligent transportation device is any one of the following: a vehicle, an unmanned aerial vehicle, an automated guided vehicle, or an unmanned transport vehicle.
According to an eighth aspect, this application provides a system including a radar sensor, a corresponding camera sensor, and the target detection apparatus described in the second aspect or the third aspect.
For the beneficial effects of the second to eighth aspects, refer to the description of the beneficial effects of the first aspect; details are not repeated here.
Brief Description of the Drawings
FIG. 1 is a schematic architectural diagram of a possible application system of the target detection method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of a target detection method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a target detection apparatus provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of a target detection apparatus provided by an embodiment of this application.
Detailed Description of Embodiments
To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the embodiments are described in further detail below with reference to the accompanying drawings. In the descriptions of the embodiments of this application, the terms "first" and "second" are used for description only and should not be understood as indicating or implying relative importance, or as implicitly indicating the quantity of the technical features indicated. Therefore, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features.
For ease of understanding, explanations of concepts related to this application are given below by way of example for reference.
1) Target detection: Target detection refers to locating multiple target objects according to a collected image, including determining the category and position of each target object; the position is generally marked in the image with an anchor box (bounding box). Target classification refers to determining the category of a target object identified in the image.
2) Point cloud data: The set of point data on an object's surface obtained by scanning with a three-dimensional scanning device may be called point cloud data. Point cloud data is a set of vectors in a three-dimensional coordinate system. These vectors are usually expressed as three-dimensional coordinates and are mainly used to represent the shape of an object's outer surface. In addition to the geometric position information represented by the three-dimensional coordinates, a point cloud may also represent a point's RGB (red, green, blue) color, gray value, depth, reflecting-surface intensity, and the like. The point cloud coordinate system referred to in the embodiments of this application is the three-dimensional coordinate system in which the point clouds in the point cloud data are located.
3) Radar cross-section (RCS): RCS is the reflective cross-sectional area seen by a radar. Radar detection works by emitting electromagnetic waves toward an object's surface, receiving the electromagnetic wave signal reflected back by the object, and detecting the object from the received signal. After the emitted electromagnetic waves reach the object's surface, the less reflected energy the radar receives, the smaller the radar cross-section, the less distinguishable the object's features are to the radar, and the shorter the radar's detection range.
4) Image (plane) coordinate system: The image coordinate system, also called the pixel coordinate system, is usually a two-dimensional coordinate system whose origin is the feature point at the upper-left corner of the image, with pixels as the unit.
It should be understood that in the embodiments of this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist: for example, "A and/or B" may mean A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or a similar expression means any combination of these items, including a single item or any combination of multiple items. For example, "at least one of a, b, or c" may represent a, b, c, a and b, a and c, b and c, or a, b, and c, where each of a, b, and c may be singular or plural.
The specific operation methods in the method embodiments may also be applied to the apparatus embodiments or the system embodiments.
With the continuous development of technologies such as artificial intelligence and computer vision, autonomous driving is gradually becoming a new trend for smart vehicles. During autonomous driving, to ensure driving safety, target objects such as pedestrians, vehicles, and road signs must be detected in real time, and their categories and positions obtained, so that the vehicle can be controlled effectively and driven safely. Target detection is therefore one of the important tasks in an autonomous driving system.
Compared with traditional visual detection algorithms, target detection algorithms based on convolutional neural networks are more widely used because their detection results are more accurate and efficient. Current mainstream (two-dimensional) target detection algorithms are therefore mostly based on convolutional neural networks. These methods fall into two categories: one-stage algorithms and two-stage algorithms. A one-stage algorithm treats target detection as a regression problem and learns the classification probabilities and anchor boxes of target objects directly from the input image. A two-stage algorithm performs detection in two stages: in the first stage, a region proposal network (RPN) generates regions of interest, and in the second stage these region proposals are used for target classification and anchor box regression. Each approach has its advantages: one-stage algorithms are faster, while two-stage algorithms are more accurate.
Most of the above convolutional-neural-network-based target detection algorithms are implemented with a camera alone, that is, by performing image detection on images captured by the camera, while detection algorithms based on fusing camera and radar are relatively scarce. However, in the field of autonomous driving, most autonomous vehicles are equipped with various sensors such as cameras, millimeter-wave radars, and lidars. The redundancy of multiple homogeneous sensors and the diversity of heterogeneous sensors allow the strengths and weaknesses of different sensors to complement one another, providing more accurate and reliable detection. How to combine radar and camera to achieve faster and more accurate target detection is therefore a challenging problem.
At present, when a camera and a radar are combined for target detection, the radar detection points (point clouds) detected by the radar can be mapped into the image coordinate system of the image captured by the camera, a predefined anchor box can be generated for each mapped radar detection point as a target proposal, and the positions of objects in the image can then be detected from these anchor boxes. However, this method involves a large amount of data processing and consumes considerable processing time, so its target detection efficiency is low.
In view of this, the embodiments of this application provide a target detection method for performing fast and accurate target detection, thereby improving the efficiency of target detection.
The target detection method provided by the embodiments of this application can perform target detection by combining point cloud data collected by a radar sensor with images collected by a corresponding camera sensor. The method can be applied to a target detection apparatus with data processing capability. By way of example and not limitation, the target detection apparatus may be a vehicle with a data processing function, an in-vehicle device with a data processing function in a vehicle, or a component of a sensor that collects and processes point cloud data and image data. In-vehicle devices may include, but are not limited to, in-vehicle terminals, in-vehicle controllers, in-vehicle modules, in-vehicle components, in-vehicle chips, electronic control units (ECUs), and domain controllers (DCs). The target detection apparatus may also be another electronic device with a data processing function, including but not limited to smart home devices (such as televisions), intelligent robots, mobile terminals (such as mobile phones and tablet computers), and wearable devices (such as smart watches). The target detection apparatus may also be a controller, a chip, or another component inside a smart device.
The target detection method provided by the embodiments of this application is described in detail below with reference to the accompanying drawings. It can be understood that the embodiments described below are only some, not all, of the embodiments of this application.
FIG. 1 is a schematic architectural diagram of a possible application system of the target detection method provided by an embodiment of this application. As shown in FIG. 1, the system includes at least a point cloud acquisition module 101, a point cloud data processing module 102, an image acquisition module 103, and a target detection module 104.
The point cloud acquisition module 101 and the image acquisition module 103 are respectively configured to acquire point cloud data and corresponding images. In some embodiments of this application, the point cloud acquisition module 101 and the image acquisition module 103 are respectively configured to acquire first point cloud data and a first image corresponding to the same scene at the same moment.
For example, in an autonomous driving scenario, the point cloud acquisition module 101 and the image acquisition module 103 may be installed at the same position in an autonomous vehicle. The point cloud acquisition module 101 collects point cloud data of the vehicle's surroundings and sends it to the point cloud data processing module 102; at the same time, the image acquisition module 103 collects images of the vehicle's surroundings and sends them to the target detection module 104.
In the embodiments of this application, the point cloud acquisition module 101 may be a radar sensor, such as a millimeter-wave radar, or any other apparatus capable of collecting point cloud data; this is not specifically limited in the embodiments of this application. The image acquisition module 103 may be a camera sensor, such as a camera, a video camera, or a monitor, or any other apparatus capable of capturing images; this is likewise not specifically limited in the embodiments of this application.
The point cloud data processing module 102 is configured to process the first point cloud data from the point cloud acquisition module 101 and the first image from the image acquisition module 103, including: mapping the first point cloud data onto the image plane of the first image to obtain second point cloud data, where the second point cloud data contains multiple point clouds; performing grid division on the second point cloud data; and determining, according to the feature data of the point clouds in the second point cloud data, multiple target point clouds after grid division of the second point cloud data, where any target point cloud corresponds to at least one point cloud in the second point cloud data.
For example, when the point cloud acquisition module 101 is a radar and the image acquisition module 103 is a camera, the point cloud data processing module 102 may map the point cloud data collected by the radar onto the image collected by the camera according to the camera calibration parameters. By dividing the image plane into grids, and then further partitioning the point clouds within each grid by the distance from the mapped point clouds to the radar, the module partitions the three-dimensional space formed by the image plane on which the mapped point clouds lie and the bird's eye view (BEV) plane of that image plane. After performing grid division on the mapped point clouds, the point cloud data processing module 102 combines the radar cross section (RCS) features of the point clouds, computes statistics over point clouds with similar feature information, and then generates target point clouds with distinct point cloud feature information to serve as points of interest (POI) for anchor box generation.
The point cloud data processing module 102 may obtain the first image from the image acquisition module 103 or the target detection module 104. After determining the multiple target point clouds, the point cloud data processing module 102 notifies the target detection module 104 of them.
In some embodiments of this application, the point cloud data processing module 102 may include a mapping module 105 and a grid division module 106, where the mapping module 105 is configured to map the first point cloud data onto the image plane of the first image to obtain the second point cloud data, and the grid division module 106 is configured to perform grid division on the second point cloud data and, according to the feature data of the point clouds in the second point cloud data, determine the multiple target point clouds after grid division.
As an optional implementation, as shown in FIG. 1, in the point cloud data processing module 102, the mapping module 105 may perform point cloud mapping first and the grid division module 106 perform grid division afterwards. That is, the mapping module 105 first maps the first point cloud data onto the image plane of the first image to obtain the second point cloud data, and then the grid division module 106 performs grid division on the second point cloud data and determines the multiple target point clouds after grid division, so that the multiple target point clouds serve as points of interest for anchor box generation.
As another optional implementation, in the point cloud data processing module 102, the grid division module 106 may perform grid division first and the mapping module 105 perform point cloud mapping afterwards. That is, the grid division module 106 first performs grid division on the acquired first point cloud data and determines the multiple target point clouds after grid division, and then the mapping module 105 maps the multiple target point clouds determined by the grid division module 106 onto the image to serve as points of interest for anchor box generation.
The target detection module 104 is configured to generate, in the first image, at least one anchor box corresponding to each of the multiple target point clouds, and to perform target detection according to the generated anchor boxes of each target point cloud to determine the position of at least one target object to be detected.
In some embodiments of this application, as shown in FIG. 1, the target detection module 104 may include a feature extraction module 107 and a detection module 108. The feature extraction module 107 is configured to extract image features from the first image, so that the detection module 108 can perform target detection according to the extracted image features and the generated anchor boxes. The feature extraction module 107 may use a visual geometry group (VGG) network model, for example VGG16, or another deep learning network model to extract image features.
It should be understood that the VGG network model is only one specific example of a network model that the feature extraction module 107 may use. The network model that the feature extraction module 107 may use in the embodiments of this application is not limited to the VGG network model; for example, a residual neural network (ResNet) or another network model capable of performing the same function may also be used.
The detection module 108 is configured to generate at least one anchor box corresponding to each of the multiple target point clouds and, according to the region delimited by each anchor box, operate on the image features extracted by the feature extraction module 107 to complete the classification of the targets in the image and the regression analysis of the corresponding anchor boxes, thereby realizing the target detection function. After the detection module 108 generates at least one anchor box for each target point cloud, a deep learning network model may be used to complete the classification of the targets in the image and the regression analysis of the corresponding anchor boxes.
After the target detection module 104 performs target detection, the output detection result includes the category identifier of at least one target object detected in the first image, the confidence corresponding to each category identifier, and the position and size of the anchor box corresponding to each target object (which can be understood as the bounding box of the target object), where the category identifier indicates the classification category of the target object, and the anchor box marks the position of the target object in the first image.
It should be understood that the functions of the target detection module 104 may be implemented by a single deep learning network model, or jointly by multiple deep learning network models. In the latter case, different deep learning network models implement different functions of the target detection module 104; for example, the functions of the feature extraction module 107 and the detection module 108 described above may be implemented by different deep learning network models.
In some embodiments of this application, as shown in FIG. 1, the system may further include a category identifier management module 109, configured to provide the target detection module 104 with at least one category identifier, so that the target detection module 104 generates anchor boxes for the determined target point clouds according to the at least one category identifier and then performs target detection.
The at least one category identifier provided by the category identifier management module 109 to the target detection module 104 includes a preset category identifier and/or a category identifier determined after target detection is performed on a reference image, where the reference image is a frame on which target detection was performed before the image.
Specifically, the category identifier management module 109 stores at least one preset category identifier and can obtain the target detection result of a frame on which target detection was performed before the first image; that result includes at least the categories of the target objects detected in the previous frame and their confidences. For each category identifier detected in the previous frame, if the category identifier management module 109 determines that the confidence corresponding to that category identifier is greater than a set threshold, it treats the identifier as a piece of prior reference information and inputs it, together with the preset category identifiers, to the detection module 108, so that the detection module 108 generates anchor boxes of the specific sizes corresponding to these category identifiers.
For example, if the preset category identifiers include those of cars and people, the category identifier management module 109 inputs to the target detection module 104, as category identifier reference information, the category identifiers detected in the previous frame whose confidence is greater than the set threshold together with the category identifiers of cars and people.
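The selection logic described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name, the dict-based detection result, and the threshold value are all assumptions.

```python
def reference_category_ids(preset_ids, prev_frame_result, confidence_threshold):
    """Combine preset category identifiers with high-confidence identifiers
    carried over from the previous frame's detection result.

    prev_frame_result maps category identifier -> confidence. Only the
    identifiers whose confidence exceeds the threshold are kept as prior
    reference information.
    """
    carried = {cid for cid, conf in prev_frame_result.items()
               if conf > confidence_threshold}
    return set(preset_ids) | carried

# Example: cars and people are preset; a bicycle was detected in the
# previous frame with high confidence, a dog with low confidence.
ids = reference_category_ids(
    preset_ids={"car", "person"},
    prev_frame_result={"bicycle": 0.9, "dog": 0.3},
    confidence_threshold=0.5,
)
```

Here the low-confidence "dog" detection is discarded, so anchor boxes would only be generated for the preset categories plus "bicycle".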
In the system shown in FIG. 1, the functions of the point cloud data processing module 102 and the target detection module 104 may be implemented by one network model. The input of the network model may be the first point cloud data and the corresponding first image, and the output is the anchor boxes corresponding to the target objects detected in the first image, the category identifiers of the target objects, and the confidences corresponding to the category identifiers. Alternatively, the input of the network model may be the first point cloud data and the corresponding first image, together with the target detection result of a frame on which target detection was performed before the first image, and the output is the anchor boxes corresponding to the target objects detected in the first image, the category identifiers of the target objects, and the confidences corresponding to the category identifiers.
It should be understood that the structure of the system shown in FIG. 1 does not constitute a specific limitation on the system to which the target detection method provided in the embodiments of this application is applied. In other embodiments of this application, the system may include more or fewer modules than shown in FIG. 1, combine certain modules, split certain modules, or arrange the modules differently.
It should be noted that the devices, modules, and functions included in the system architecture shown in FIG. 1 may all be integrated into one device, or may be distributed across different devices. For example, in an autonomous driving scenario, the entire system shown in FIG. 1 may be contained in the autonomous vehicle. As another example, the point cloud acquisition module 101 and the image acquisition module 103 shown in FIG. 1 may each be an independent device, and the functions of the remaining modules may be integrated into one processing device, a server, or the cloud.
Of course, FIG. 1 is only an example, and the system to which the embodiments of this application are applied is not limited thereto.
The target detection method provided in this application is described in detail below with reference to specific embodiments.
FIG. 2 is a schematic diagram of a target detection method provided in an embodiment of this application.
For ease of description, the following takes as an example the target detection method provided in this application being executed by a target detection apparatus. The target detection apparatus may be, but is not limited to, a device with data processing capability provided in the embodiments of this application, for example, the above-mentioned vehicle or on-board device, or a server or cloud server.
As shown in FIG. 2, the target detection method provided in this application includes:
S201: The target detection apparatus acquires first point cloud data collected by a radar sensor and a corresponding first image collected by a camera sensor, where the first point cloud data contains multiple point clouds.
In the embodiments of this application, the target detection apparatus may separately receive the first point cloud data sent by the radar sensor and the first image sent by the camera sensor, and perform target detection based on the first point cloud data and the first image.
In some embodiments of this application, the target detection apparatus may also obtain the first point cloud data and the first image by receiving them as user input, or the target detection apparatus may collect the first point cloud data and the first image directly.
In some embodiments of this application, the first point cloud data and the first image are data corresponding to the same scene acquired by the radar sensor and the camera sensor at the same moment.
In some embodiments of this application, when the method is applied to the system shown in FIG. 1, the radar sensor in step S201 may serve as the point cloud acquisition module 101 shown in FIG. 1, and the camera sensor may serve as the image acquisition module 103 shown in FIG. 1.
S202: The target detection apparatus maps the first point cloud data onto the image plane of the first image to obtain second point cloud data, where the second point cloud data contains multiple point clouds.
After acquiring the first point cloud data and the corresponding first image, the target detection apparatus projects the first point cloud data onto the first image to obtain the second point cloud data. Specifically, according to the conversion relationship between coordinates in the coordinate system of the point clouds in the first point cloud data and coordinates in the image plane coordinate system of the first image, the target detection apparatus converts the coordinates of the point clouds in the first point cloud data to obtain the position coordinates, in the image plane, of those point clouds after they are mapped onto the image plane of the first image.
The multiple point clouds (which may also be called projection points) contained in the second point cloud data correspond one-to-one with the multiple point clouds contained in the first point cloud data. Any point cloud in the second point cloud data is the feature point obtained after the corresponding point cloud in the first point cloud data is mapped onto the image plane of the first image.
When the radar sensor is a millimeter-wave radar and the camera sensor is a camera, the target detection apparatus may map the first point cloud data onto the image plane of the first image according to the calibration parameters of the millimeter-wave radar and the camera to obtain the second point cloud data.
The mapping relationship between a point cloud in the first point cloud data and the corresponding point cloud in the second point cloud data satisfies the following formula:
p = HP
where P = [X, Y, Z, 1] denotes the coordinates of the point cloud in the first point cloud data in the point cloud coordinate system, p = [x, y, 1] denotes the coordinates of the corresponding point cloud in the second point cloud data in the image plane coordinate system, and H is the 3×4 matrix of camera calibration parameters.
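A minimal sketch of the projection p = HP in homogeneous coordinates. The numeric values of H here are hypothetical placeholders; in practice H comes from calibrating the camera against the radar.

```python
import numpy as np

# Hypothetical 3x4 calibration matrix H (focal length 800 px, principal
# point at (320, 240)); real values come from camera/radar calibration.
H = np.array([
    [800.0,   0.0, 320.0, 0.0],
    [  0.0, 800.0, 240.0, 0.0],
    [  0.0,   0.0,   1.0, 0.0],
])

def project_point(P_xyz, H):
    """Map a radar point (X, Y, Z) to image-plane coordinates via p = H @ P."""
    P = np.append(P_xyz, 1.0)   # homogeneous coordinates [X, Y, Z, 1]
    p = H @ P                   # [w*x, w*y, w]
    return p[:2] / p[2]         # normalize so that p = [x, y, 1]

xy = project_point(np.array([2.0, 1.0, 10.0]), H)  # -> pixel (480.0, 320.0)
```

The division by the third homogeneous component is what turns the 3D-to-2D linear map into the actual pixel coordinates.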
In some embodiments of this application, the parameters of a point cloud in the second point cloud data include at least the coordinates of the point cloud in the image plane of the first image, the distance parameter of the point cloud, and the RCS of the point cloud, where the distance parameter represents the horizontal distance from the point cloud to the radar sensor.
For example, in the second point cloud data obtained after the first point cloud data is mapped onto the first image, the position information of any point cloud A may be expressed as (x1, y1, d1), where x1 and y1 are the coordinates of point cloud A in the image plane coordinate system of the first image, and d1 is the distance parameter of point cloud A, representing, from the BEV perspective, the distance from point cloud A to the vertical plane in which the camera lies. The target detection apparatus may then perform grid division on the second point cloud data according to the position information of each point cloud in the second point cloud data.
In some embodiments of this application, when the method is applied to the system shown in FIG. 1, the method in step S202 may be executed by the mapping module 105 shown in FIG. 1.
S203: The target detection apparatus performs grid division on the second point cloud data.
After obtaining the second point cloud data, the target detection apparatus divides the second point cloud data into multiple grids according to a set grid size, and then, according to the distance parameter of the point clouds in each grid, divides the point clouds contained in each grid into multiple point cloud sets, completing the grid division of the second point cloud data, where the distance parameter represents the horizontal distance from a point cloud to the radar sensor.
For example, based on the above embodiment, when performing grid division on the second point cloud data, the target detection apparatus first divides the image plane into multiple grids according to the set grid size, where the image size is M×N pixels, each resulting grid is m×n pixels, and M, N, m, and n are all positive. The target detection apparatus can then determine the point clouds contained in each grid according to the coordinates of the point clouds of the second point cloud data in the image plane coordinate system of the first image. Next, in the BEV plane of the vertical plane in which the camera lies, the target detection apparatus performs longitudinal segmentation according to the distance parameters of the point clouds, from the point cloud closest to the camera to the point cloud farthest from the camera, at a set distance interval (such as unit distance), so that the point clouds within each set distance range corresponding to each grid form a point cloud set. The point clouds in the second point cloud data are thus divided into multiple cells of size m×n×L, where m×n is the set grid size and L is the set distance interval.
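The m×n×L partition above can be sketched as a simple binning step. A minimal illustration under assumed values of m, n, and L; the function and variable names are not from the patent.

```python
import numpy as np

def grid_divide(points, m, n, L):
    """Group mapped points into m x n pixel grids split by distance bins of length L.

    points is an array of rows (x, y, d): image-plane coordinates plus the
    BEV distance parameter. Returns a dict mapping a cell key
    (pixel column bin, pixel row bin, distance bin) to the indices of the
    points falling into that m x n x L cell.
    """
    cells = {}
    for i, (x, y, d) in enumerate(points):
        key = (int(x // m), int(y // n), int(d // L))
        cells.setdefault(key, []).append(i)
    return cells

pts = np.array([
    [ 10.0,  5.0,  3.2],   # cell (0, 0, 0) with m = n = 32, L = 5
    [ 12.0,  8.0,  4.0],   # same cell: nearby pixel, similar distance
    [ 12.0,  8.0, 12.5],   # same pixel grid, but a farther distance bin
    [100.0, 40.0,  3.0],   # a different pixel grid
])
cells = grid_divide(pts, m=32, n=32, L=5.0)  # -> 3 occupied cells
```

Each value of `cells` is one point cloud set in the sense used above: points that share both a pixel grid and a distance range.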
In some embodiments of this application, when the method is applied to the system shown in FIG. 1, the method in step S203 may be executed by the grid division module 106 shown in FIG. 1.
S204: The target detection apparatus determines, according to the feature data of the point clouds in the second point cloud data, multiple target point clouds after grid division of the second point cloud data, where any target point cloud corresponds to at least one point cloud in the second point cloud data.
After performing grid division on the second point cloud data, for each grid, the target detection apparatus determines the target point cloud of each point cloud set according to the feature data of the point clouds in the point cloud sets contained in that grid, and takes the target point clouds of the multiple point cloud sets contained in the grid as the target point clouds of that grid. The feature data of a point cloud represents the radar echo intensity corresponding to the point cloud and may specifically be the RCS; the target point cloud of each point cloud set represents the spatial distribution characteristics of the point clouds in that set; each grid corresponds to at least one target point cloud.
When determining the target point cloud of each point cloud set, the target detection apparatus may compute the centroid of the point clouds contained in the set and take the computed centroid as the target point cloud of that set.
In some embodiments of this application, the difference between the RCS values of any two point clouds in each point cloud set is less than a set threshold; or, among the point clouds contained in each point cloud set, only the subset of point clouds satisfying the condition that the RCS difference between any two of them is less than the set threshold is retained.
In the above manner, according to the similarity of the RCS values of the point clouds in the same grid, the target detection apparatus can select from the point clouds contained in a grid at least one point cloud whose RCS values differ by less than the set threshold, compute their centroid in the grid space, and take the computed centroid as a target point cloud of the grid for subsequent anchor box generation. This approach greatly reduces the number of target point clouds referenced when generating anchor boxes and lowers the data processing load of anchor box generation, thereby improving target detection efficiency.
After determining the target point clouds of each grid in the second point cloud data, the target detection apparatus takes the target point clouds of the multiple grids in the second point cloud data as the target point clouds after grid division of the second point cloud data, obtaining the multiple target point clouds.
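The centroid selection for one cell can be sketched as below. The greedy grouping order is an illustrative assumption of ours; the text above only requires that points grouped together have pairwise RCS differences under the threshold.

```python
import numpy as np

def cell_target_points(points, rcs, rcs_threshold):
    """Cluster the points of one cell by RCS similarity and return one
    centroid per cluster as that cluster's target point.

    Greedy grouping (illustrative only): a point joins the first cluster
    whose seed RCS differs from its own by less than rcs_threshold.
    """
    order = np.argsort(rcs)
    clusters = []
    for i in order:
        for c in clusters:
            if abs(rcs[i] - rcs[c[0]]) < rcs_threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    # centroid (mean position) of each RCS-similar cluster
    return [points[c].mean(axis=0) for c in clusters]

pts = np.array([[10.0, 5.0, 3.0], [12.0, 7.0, 3.4], [11.0, 6.0, 9.0]])
rcs = np.array([4.1, 4.3, 15.0])  # third point has a very different RCS
targets = cell_target_points(pts, rcs, rcs_threshold=2.0)
```

The two RCS-similar points collapse into one centroid, while the outlier keeps its own target point, which is exactly how the number of points of interest shrinks before anchor box generation.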
In some embodiments of this application, when the method is applied to the system shown in FIG. 1, the method in step S204 may be executed by the grid division module 106 shown in FIG. 1.
S205: The target detection apparatus generates, in the first image, at least one anchor box corresponding to each of the multiple target point clouds.
Among the multiple target point clouds, any anchor box of any target point cloud contains that target point cloud.
After determining the multiple target point clouds, the target detection apparatus takes them as points of interest for subsequent anchor box generation, and, by generating for each target point cloud at least one anchor box containing that target point cloud, obtains multiple anchor boxes, so as to determine the position of the target object to be detected according to the multiple anchor boxes.
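One simple way to guarantee that every generated anchor box contains its target point cloud is to center the boxes on the point, as sketched below. The particular sizes and aspect ratios are hypothetical; in the scheme above they would be tied to the category identifiers.

```python
def anchors_for_point(x, y, sizes, aspect_ratios):
    """Generate anchor boxes (x1, y1, x2, y2) centered on a target point.

    Centering each box on (x, y) guarantees the box contains the target
    point. sizes and aspect_ratios are illustrative placeholders, e.g.
    per-category box shapes.
    """
    boxes = []
    for s in sizes:
        for r in aspect_ratios:
            w, h = s * r ** 0.5, s / r ** 0.5   # keep area s*s, ratio w/h = r
            boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2))
    return boxes

# Two sizes x three aspect ratios -> six anchors per target point
boxes = anchors_for_point(100.0, 80.0, sizes=[32, 64], aspect_ratios=[0.5, 1.0, 2.0])
```

Generating anchors only at the target points, rather than densely over the whole feature map, is what reduces the candidate count in this method.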
In a specific implementation, the target detection apparatus may perform the following two steps:
Step 1: The target detection apparatus acquires at least one category identifier, where different category identifiers identify the categories of different objects.
In the embodiments of this application, a category identifier may specifically be the name or code of the category to which an object belongs, or any other information representing that category.
In some embodiments of this application, the at least one category identifier may include a preset category identifier as described below and/or a category identifier determined after target detection is performed on a reference image.
1) Preset category identifier.
In this manner, the preset category identifier may be set to the identifier of a category of objects that appears frequently in the target detection scenario. The preset category identifier may be input by the user.
For example, when the target detection scenario is autonomous driving, the preset category identifiers may be set to the identifiers of object categories that frequently appear in vehicle driving scenes, such as vehicles and pedestrians. These serve as the basic category identifiers: when target detection is performed on each frame acquired in this scenario, the anchor boxes corresponding to these category identifiers are generated, which improves the simplicity and accuracy of target detection.
2)对参考图像进行目标检测后确定的类别标识。2) The category identification determined after the target detection is performed on the reference image.
在该方式中,所述参考图像为在所述第一图像之前进行目标检测的一帧图像。In this manner, the reference image is a frame of images for which target detection is performed before the first image.
连续多帧图像（如视频流中的连续多帧图像）中显示内容一般比较接近，相邻图像包含对象的类别可能比较接近，因此，在对连续多帧图像进行目标检测时，可以参考当前帧图像的前一帧图像中包含的对象的类别，来确定在当前帧图像中需要生成哪些类别标识对应的锚框，从而使针对当前图像确定的类别标识更接近于当前图像中实际包含的对象的类别标识，进一步提高目标检测的准确度。The displayed content of consecutive frames of images (such as consecutive frames in a video stream) is generally similar, and adjacent images are likely to contain objects of similar categories. Therefore, when performing target detection on consecutive frames, the categories of the objects contained in the frame preceding the current frame can be consulted to determine which category identifiers anchor boxes need to be generated for in the current frame, so that the category identifiers determined for the current image are closer to the category identifiers of the objects actually contained in it, further improving the accuracy of target detection.
可选的,所述对参考图像进行目标检测后确定的类别标识的置信度大于设定阈值。Optionally, the confidence level of the category identifier determined after the target detection is performed on the reference image is greater than a set threshold.
具体的，目标检测装置在利用网络模型对参考图像进行目标检测后，可以得到所述网络模型输出的检测结果，该检测结果包括在所述参考图像中检测到的不同对象的类别对应的类别标识，以及各类别标识对应的置信度，目标检测装置从各类别标识中选择对应的置信度大于设定阈值的类别标识，作为对参考图像进行目标检测后确定的类别标识，可以保证根据参考图像确定的类别标识的可信度和准确度。Specifically, after performing target detection on the reference image by using a network model, the target detection apparatus can obtain the detection result output by the network model, where the detection result includes the category identifiers corresponding to the categories of the different objects detected in the reference image and the confidence level corresponding to each category identifier. The target detection apparatus selects, from these category identifiers, those whose confidence level is greater than the set threshold as the category identifiers determined after performing target detection on the reference image, which ensures the reliability and accuracy of the category identifiers determined from the reference image.
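The confidence-based selection described above can be sketched as follows. This is only an illustrative implementation of the described step; the function name, data structure, and the example detections are hypothetical and not part of the original disclosure.

```python
# Illustrative sketch: select category identifiers from the reference (previous)
# frame's detection result whose confidence exceeds a set threshold, so they can
# be reused as prior category identifiers for the current frame.
def select_reference_category_ids(detections, threshold=0.5):
    """detections: list of (category_id, confidence) pairs output by the
    network model for the reference image (hypothetical structure)."""
    return {cat_id for cat_id, conf in detections if conf > threshold}

prev_detections = [("vehicle", 0.92), ("pedestrian", 0.81), ("bicycle", 0.31)]
ids = select_reference_category_ids(prev_detections, threshold=0.5)
# ids == {"vehicle", "pedestrian"}
```

Anchor boxes would then be generated in the current frame only for the retained category identifiers (plus any set basic category identifiers).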
步骤2、目标检测装置根据所述至少一种类别标识,分别为所述多个目标点云中的每个目标点云生成包含该目标点云的至少一个锚框。Step 2: The target detection device generates at least one anchor frame including the target point cloud for each target point cloud in the plurality of target point clouds according to the at least one category identifier.
该步骤中,目标检测装置可以采用如下任一方式生成目标点云的至少一个锚框:In this step, the target detection device may generate at least one anchor frame of the target point cloud in any of the following ways:
方式1 Method 1
先确定至少一个锚框尺寸，其中所述至少一个锚框尺寸包含所述至少一种类别标识中每种类别标识对应的至少一个锚框尺寸；再确定每个目标点云的符合所述至少一个锚框尺寸的至少一个锚框。First, at least one anchor box size is determined, where the at least one anchor box size includes at least one anchor box size corresponding to each of the at least one category identifier; then, for each target point cloud, at least one anchor box conforming to the at least one anchor box size is determined.
在本申请一些实施例中，目标检测装置可以存储不同类别标识与锚框尺寸的对应关系，在该对应关系中，一个类别标识可以对应多个不同的锚框尺寸，一个锚框尺寸也可以对应多个不同的类别标识。In some embodiments of the present application, the target detection apparatus may store correspondences between different category identifiers and anchor box sizes. In these correspondences, one category identifier may correspond to multiple different anchor box sizes, and one anchor box size may also correspond to multiple different category identifiers.
其中，所述不同类别标识与锚框尺寸的对应关系可以为用户输入的。不同锚框的尺寸可以是用户设定的，也可以通过对包含不同对象实际尺寸的数据集真值进行分类或机器学习得到，或者是根据不同对象实例在实际中的常见大小和高宽比例确定。The correspondences between the different category identifiers and the anchor box sizes may be input by the user. The sizes of the different anchor boxes may be set by the user, obtained by classifying or applying machine learning to ground-truth datasets containing the actual sizes of different objects, or determined from the common real-world sizes and aspect ratios of instances of different objects.
在该方式中，锚框尺寸包括锚框的面积参数和高宽比参数，所述面积参数为锚框的面积，用于表示锚框的大小，所述高宽比参数为锚框的两个相邻边长的长度之间的比值。不同的锚框尺寸对应的面积参数和/或比例参数不同。In this method, the anchor box size includes an area parameter and an aspect ratio parameter of the anchor box. The area parameter is the area of the anchor box and represents its size; the aspect ratio parameter is the ratio between the lengths of two adjacent sides of the anchor box. Different anchor box sizes correspond to different area parameters and/or ratio parameters.
作为一种可选的实施方式，在不同类别标识与不同锚框尺寸的对应关系中，每个类别标识可以对应多个不同的面积参数，每个面积参数可以对应多个不同的高宽比参数；或者每个类别标识可以对应多个不同的高宽比参数，每个高宽比参数可以对应多个不同的面积参数。As an optional implementation, in the correspondences between different category identifiers and different anchor box sizes, each category identifier may correspond to multiple different area parameters and each area parameter may correspond to multiple different aspect ratio parameters; alternatively, each category identifier may correspond to multiple different aspect ratio parameters and each aspect ratio parameter may correspond to multiple different area parameters.
示例性的，在目标检测场景为自动驾驶场景时，不同类别标识与不同锚框尺寸的对应关系中，每种类别标识可以对应4个面积参数（128、256、512、1024像素）。考虑目标对象正对摄像头和侧对摄像头时在图像中呈现的大小不同，因此可以针对目标对象正对摄像头和侧对摄像头的状态分别设置两个角度的高宽比参数，即每种类别标识对应两种不同角度的高宽比参数，如下表1所示：Exemplarily, when the target detection scene is an autonomous driving scene, in the correspondences between different category identifiers and different anchor box sizes, each category identifier may correspond to 4 area parameters (128, 256, 512, and 1024 pixels). Considering that a target object facing the camera and one side-on to the camera appear at different sizes in the image, aspect ratio parameters can be set separately for these two orientations, i.e., each category identifier corresponds to aspect ratio parameters for two different orientations, as shown in Table 1 below:
表1自动驾驶场景中常见类别对象实例的高宽比Table 1. Aspect ratios of common category object instances in autonomous driving scenarios
Figure PCTCN2022082553-appb-000002
如上表1中所示，每种类别标识对应2个高宽比参数，则与上述4个面积参数组合后可得到8种不同大小的锚框尺寸，因此，基于上述面积参数和高宽比参数，目标检测装置生成锚框时，可以针对每个目标点云，分别生成8个不同大小的锚框。As shown in Table 1 above, each category identifier corresponds to 2 aspect ratio parameters; combined with the above 4 area parameters, 8 anchor box sizes of different dimensions are obtained. Therefore, based on the above area parameters and aspect ratio parameters, when generating anchor boxes, the target detection apparatus can generate 8 anchor boxes of different sizes for each target point cloud.
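The area-times-aspect-ratio enumeration above can be sketched as follows. The aspect ratio values used here are hypothetical placeholders (Table 1's values are provided as an image in the original); only the area parameters come from the example.

```python
import math

# Illustrative sketch: combine area parameters with aspect-ratio parameters to
# enumerate anchor sizes (width, height). With 4 areas and 2 aspect ratios per
# category, 8 anchor sizes are obtained for each target point cloud.
def anchor_sizes(areas, aspect_ratios):
    sizes = []
    for area in areas:
        for r in aspect_ratios:          # r = height / width
            w = math.sqrt(area / r)
            h = r * w                    # so that w * h == area and h / w == r
            sizes.append((w, h))
    return sizes

areas = [128, 256, 512, 1024]            # area parameters from the example above
ratios = [0.5, 2.0]                      # hypothetical front-facing / side-on ratios
sizes = anchor_sizes(areas, ratios)
# len(sizes) == 8
```

Each (width, height) pair preserves the requested area exactly while varying the shape, which is why areas and aspect ratios can be configured independently.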
可以理解的是，上述表1所示的不同类别对象实例及其高宽比参数仅为本申请实施例提供的示例性说明，本申请实施例中可以采用类似的方法来生成更多类别、更多尺寸的锚框，从而满足检测精度的要求。It can be understood that the object instances of different categories and their aspect ratio parameters shown in Table 1 above are merely exemplary descriptions provided in the embodiments of the present application; similar methods can be used in the embodiments of the present application to generate anchor boxes of more categories and more sizes, so as to meet detection accuracy requirements.
方式2 Method 2
先确定至少一个对象尺寸，其中，所述至少一个对象尺寸包含所述至少一种类别标识中每种类别标识对应的至少一个对象尺寸，任一个类别标识对应的对象尺寸用于表示所述类别标识所属的对象的尺寸；然后根据所述至少一个对象尺寸，确定至少一个映射尺寸，其中，所述至少一个对象尺寸与所述至少一个映射尺寸一一对应，任一个对象尺寸对应的映射尺寸为目标类别标识所属的对象映射到所述第一图像后的尺寸，所述目标类别标识为所述对象尺寸对应的类别标识；再确定每个目标点云的符合所述至少一个映射尺寸的至少一个锚框。First, at least one object size is determined, where the at least one object size includes at least one object size corresponding to each of the at least one category identifier, and the object size corresponding to any category identifier represents the size of objects belonging to that category. Then, at least one mapping size is determined from the at least one object size, where the at least one object size corresponds one-to-one with the at least one mapping size, and the mapping size corresponding to any object size is the size, after mapping to the first image, of an object belonging to the target category identifier, the target category identifier being the category identifier corresponding to that object size. Finally, for each target point cloud, at least one anchor box conforming to the at least one mapping size is determined.
其中，对象尺寸与对应的映射尺寸之间的关系符合如下公式：The relationship between an object size and the corresponding mapping size conforms to the following formula:

$$
s\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \begin{bmatrix} \tfrac{1}{dx} & 0 & u_0 \\ 0 & \tfrac{1}{dy} & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
$$

（公式原文以图像Figure PCTCN2022082553-appb-000003给出，此处按标准小孔成像模型重构。The original formula is given as image Figure PCTCN2022082553-appb-000003 and is reconstructed here according to the standard pinhole imaging model.）

其中，s表示缩放系数，u、v表示映射尺寸，dx、dy为一个像素的大小（长和高），v_0表示原点的平移量，前两个矩阵（原文图像Figure PCTCN2022082553-appb-000004）表示摄像头的内参数，f为摄像头的焦距，R为旋转矩阵，t为平移向量，第三个矩阵（原文图像Figure PCTCN2022082553-appb-000005）表示摄像头的外参数，X_w、Y_w、Z_w表示对象尺寸。Here, s denotes the scaling factor; u and v denote the mapping size; dx and dy are the size (width and height) of one pixel; v_0 denotes the translation of the origin; the first two matrices (image Figure PCTCN2022082553-appb-000004 in the original) represent the camera's intrinsic parameters, where f is the focal length of the camera; R is the rotation matrix and t is the translation vector, and the third matrix (image Figure PCTCN2022082553-appb-000005 in the original) represents the camera's extrinsic parameters; and X_w, Y_w, Z_w denote the object size.
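The pinhole mapping described above can be sketched in code as follows. All numeric values (focal length, pixel size, principal point, extrinsics, and the test point) are hypothetical illustrations, not values from the original disclosure.

```python
# Illustrative sketch: project a point (X_w, Y_w, Z_w) in world coordinates onto
# the image plane using intrinsic parameters (f, dx, dy, u0, v0) and extrinsic
# parameters (R, t), following the pinhole imaging model.
def project(point_w, f, dx, dy, u0, v0, R, t):
    # world -> camera coordinates: p_cam = R @ point_w + t
    p_cam = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    x, y, z = p_cam
    s = z                                  # scaling factor (depth along optical axis)
    u = (f / dx) * x / s + u0              # intrinsic projection to pixel coordinates
    v = (f / dy) * y / s + v0
    return u, v

R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # hypothetical extrinsics (identity)
t = [0.0, 0.0, 0.0]
u, v = project([1.0, 2.0, 10.0], f=0.004, dx=2e-6, dy=2e-6,
               u0=640.0, v0=360.0, R=R, t=t)
```

To turn an object size into a mapping size, one could project two opposite corners of the object's bounding box at the target point cloud's distance and take the differences in u and v as the anchor box width and height.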
在本申请一些实施例中,目标检测装置可以存储不同类别标识对应的对象尺寸。所述不同类别标识与对象尺寸的对应关系可以为用户输入的。不同对象尺寸可以是用户设定的, 也可以通过对包含不同对象实际尺寸的数据集真值进行分类或机器学习得到,或者是根据不同对象实例在实际中的常见大小确定。In some embodiments of the present application, the object detection apparatus may store object sizes corresponding to different category identifiers. The correspondence between the different category identifiers and the object size may be input by the user. The different object sizes may be set by the user, or obtained by classifying or machine learning the true values of the datasets containing the actual sizes of the different objects, or determined according to the common sizes of different object instances in practice.
在该方式中,锚框尺寸包括锚框的两个相邻边的边长大小。该参数可以通过类别反馈信息和标定参数来确定。In this manner, the anchor frame size includes the side lengths of two adjacent sides of the anchor frame. This parameter can be determined by category feedback information and calibration parameters.
示例性的，通过常见类别对象实例在实际的世界坐标系中的大小的统计，根据摄像头的小孔成像模型，可以根据目标点云到摄像头所在的竖直平面的距离即目标点云的距离参数，将常见类别的对象实例在世界坐标系中的大小映射到第一图像的图像平面，得到类别标识对应的锚框大小，进而可以得到特定类别、特定大小的锚框。Exemplarily, using statistics on the sizes of object instances of common categories in the real-world coordinate system, and according to the pinhole imaging model of the camera, the sizes of object instances of common categories in the world coordinate system can be mapped to the image plane of the first image based on the distance from the target point cloud to the vertical plane where the camera is located (i.e., the distance parameter of the target point cloud), thereby obtaining the anchor box size corresponding to the category identifier and, in turn, anchor boxes of a specific category and a specific size.
本申请实施例中，生成锚框是为了标记图像中的目标对象，因此，锚框的尺寸需要与目标对象的实际尺寸相对应。而目标对象距离图像采集装置的距离不同，其在图像中显示的大小也不同。因此，上述方式中，根据类别标识对应的目标对象的实际距离和雷达探测的点云的距离参数，结合摄像头的小孔成像原理，来确定类别标识对应的锚框尺寸，能够提高生成的锚框大小的准确度，进而提高目标检测的效率。In this embodiment of the present application, anchor boxes are generated to mark target objects in the image; therefore, the size of an anchor box needs to correspond to the actual size of the target object. Moreover, target objects at different distances from the image acquisition apparatus appear at different sizes in the image. Therefore, in the above method, determining the anchor box size corresponding to a category identifier based on the actual distance of the target object corresponding to the category identifier and the distance parameter of the radar-detected point cloud, combined with the pinhole imaging principle of the camera, can improve the accuracy of the generated anchor box sizes and thus the efficiency of target detection.
在本申请一些实施例中，在任一个目标点云的任一个锚框中，所述目标点云位于所述锚框包围的区域中的任意位置，例如位于所述锚框的中心位置，或者位于所述锚框的任一边长中的任意位置，例如位于所述锚框的任一边长的中点位置。In some embodiments of the present application, in any anchor box of any target point cloud, the target point cloud may be located at any position in the area enclosed by the anchor box, for example at the center of the anchor box, or at any position on any side of the anchor box, for example at the midpoint of any side of the anchor box.
具体的，目标检测装置确定针对每个目标点云生成的至少一个锚框尺寸后，在生成锚框时，可采用以目标点云作为中心点的方式生成目标点云对应的一些锚框。在此基础上，目标检测装置可以对已生成的目标点云的锚框的位置进行平移，使得目标点云变为锚框的任一边长上的一点（例如中点），来得到相对更多的锚框。Specifically, after the target detection apparatus determines the at least one anchor box size to be generated for each target point cloud, it may, when generating anchor boxes, generate some anchor boxes corresponding to a target point cloud with the target point cloud as the center point. On this basis, the target detection apparatus may translate the positions of the anchor boxes already generated for the target point cloud so that the target point cloud becomes a point (for example, the midpoint) on any side of an anchor box, thereby obtaining relatively more anchor boxes.
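The center-plus-translation placement described above can be sketched as follows; the function name and box representation (x1, y1, x2, y2) are hypothetical.

```python
# Illustrative sketch: for one target point cloud at pixel (px, py) and one
# anchor size (w, h), generate a center-aligned anchor plus four translated
# anchors in which the point lies at the midpoint of one side of the box.
def anchors_for_point(px, py, w, h):
    boxes = [(px - w / 2, py - h / 2, px + w / 2, py + h / 2)]  # point at center
    boxes.append((px, py - h / 2, px + w, py + h / 2))  # point at left-side midpoint
    boxes.append((px - w, py - h / 2, px, py + h / 2))  # point at right-side midpoint
    boxes.append((px - w / 2, py, px + w / 2, py + h))  # point at top-side midpoint
    boxes.append((px - w / 2, py - h, px + w / 2, py))  # point at bottom-side midpoint
    return boxes

boxes = anchors_for_point(100.0, 50.0, 32.0, 16.0)
# 5 boxes per (point, size) pair; with 8 sizes this yields 40 candidate anchors
```

In practice the number and placement of translated anchors can be tuned to the application scenario, as the surrounding text notes.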
需要说明的是,实际生成目标点云对应的锚框时,锚框的大小、位置及数量等信息可结合实际应用场景进行灵活设置及调整。It should be noted that when the anchor frame corresponding to the target point cloud is actually generated, the information such as the size, position and quantity of the anchor frame can be flexibly set and adjusted in combination with the actual application scenario.
在本申请一些实施例中，所述方法应用于如图1所示的***中时，步骤S205所述的方法可以由图1中所示的检测模块108和类别标识管理模块109执行，例如，上述的确定生成目标点云对应的锚框时所参考的至少一种类别标识的方法可以由所述类别标识管理模块109执行，上述根据所述至少一种类别标识来生成目标点云对应的目标锚框的方法可以由所述检测模块108执行。In some embodiments of the present application, when the method is applied to the system shown in FIG. 1, the method described in step S205 may be executed by the detection module 108 and the category identifier management module 109 shown in FIG. 1. For example, the above method of determining the at least one category identifier referred to when generating the anchor boxes corresponding to the target point clouds may be executed by the category identifier management module 109, and the above method of generating the target anchor boxes corresponding to the target point clouds according to the at least one category identifier may be executed by the detection module 108.
S206:目标检测装置根据生成的每个目标点云对应的至少一个锚框进行目标检测,确定待检测的至少一个目标对象的位置。S206: The target detection apparatus performs target detection according to at least one anchor frame corresponding to each generated target point cloud, and determines the position of at least one target object to be detected.
目标检测装置为目标点云生成对应的至少一个用于目标检测的锚框后，可以根据目标点云的距离参数，按照如下公式对生成的锚框大小进行补偿：After generating at least one anchor box used for target detection for a target point cloud, the target detection apparatus may compensate the size of the generated anchor boxes based on the distance parameter of the target point cloud, in accordance with the following formula:
Figure PCTCN2022082553-appb-000006
其中,s为目标点云对应的缩放系数,α、β为调整锚框大小的比例因子,可以通过模型训练获取,d为目标点云的距离参数。Among them, s is the scaling factor corresponding to the target point cloud, α and β are the scaling factors for adjusting the size of the anchor frame, which can be obtained through model training, and d is the distance parameter of the target point cloud.
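The exact compensation formula is provided only as an image (Figure PCTCN2022082553-appb-000006) in the original, so the relation between s, α, β, and d is not recoverable from the text. The sketch below therefore assumes, purely for illustration, a simple inverse-distance form s = α/d + β (nearer objects get larger anchors); this assumed form is not the patent's formula.

```python
# Illustrative sketch only: scale an anchor's width and height by a distance-
# dependent factor. The form s = alpha / d + beta is a hypothetical stand-in
# for the patent's compensation formula, which is given as an image.
def compensate(w, h, d, alpha, beta):
    s = alpha / d + beta          # hypothetical scaling factor from distance d
    return w * s, h * s

w2, h2 = compensate(32.0, 16.0, d=20.0, alpha=10.0, beta=0.5)
# with these hypothetical values s == 1.0, so the anchor size is unchanged
```

In the described scheme, α and β would be obtained through model training rather than set by hand.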
目标检测装置对生成的锚框大小进行补偿后，可以基于卷积神经网络模型对包含生成的多个锚框的第一图像进行识别，其中可结合通过图像特征提取得到的图像特征进行目标对象检测及分类，识别所述第一图像中包含的每个目标对象的目标类别，并确定所述目标类别的置信度；以及在每个目标点云的至少一个锚框中，确定每个目标对象所在的目标锚框。综上，目标检测装置结合点云数据和图像进行目标检测后，可以得到第一图像中包含的每个目标对象的目标类别，每个目标对象的目标类别的类别标识，每个目标对象的目标类别的置信度，以及每个目标对象所在的目标锚框，其中，任一个目标锚框用于在所述第一图像中标记对应的目标对象的位置。After compensating the sizes of the generated anchor boxes, the target detection apparatus may recognize, based on a convolutional neural network model, the first image containing the generated anchor boxes, where target object detection and classification may be performed in combination with image features obtained through image feature extraction, so as to recognize the target category of each target object contained in the first image and determine the confidence level of the target category, and to determine, among the at least one anchor box of each target point cloud, the target anchor box where each target object is located. In summary, after performing target detection by combining the point cloud data and the image, the target detection apparatus can obtain the target category of each target object contained in the first image, the category identifier of each target object's target category, the confidence level of each target object's target category, and the target anchor box where each target object is located, where any target anchor box is used to mark the position of the corresponding target object in the first image.
在本申请一些实施例中,所述方法应用于如图1所示的***中时,步骤S206所述的方法可以由图1中所示的特征提取模块107和检测模块108执行。In some embodiments of the present application, when the method is applied to the system shown in FIG. 1 , the method described in step S206 may be executed by the feature extraction module 107 and the detection module 108 shown in FIG. 1 .
需要说明的是，本申请实施例中所描述的各个实施例中的步骤编号仅为执行流程的一种示例，并不构成对步骤执行的先后顺序的限制，本申请实施例中相互之间没有时序依赖关系的步骤之间没有严格的执行顺序。It should be noted that the step numbers in the embodiments described herein are merely an example of the execution flow and do not constitute a limitation on the order in which the steps are executed; there is no strict execution order between steps that have no timing dependency on each other.
上述实施例中，通过将雷达传感器采集的第一点云数据映射到摄像头传感器采集的第一图像，得到第二点云数据，实现点云数据与图像的融合，对融合后的第二点云数据进行栅格划分，并结合栅格中相似RCS信息的点云体素特征，确定栅格对应的目标点云，大大减少了所需点云的数量，对应减小生成的锚框个数的同时，也综合了每个点云的特征，解决了基于每个点云都生成锚框造成的网络冗余等问题，加快了进行目标检测的检测速度。同时，上述方案通过小空间（栅格）的点云特征综合，进行点云数量精简，即便在存在雷达噪点的情况下，也能够减小局部噪点密集带来的影响，减缓噪点带来的处理迟钝等问题。In the above embodiment, the first point cloud data collected by the radar sensor is mapped to the first image collected by the camera sensor to obtain second point cloud data, realizing the fusion of point cloud data and image. The fused second point cloud data is divided into grids, and the target point cloud corresponding to each grid is determined by combining the voxel features of point clouds with similar RCS information within the grid. This greatly reduces the number of required point clouds and correspondingly reduces the number of generated anchor boxes while also synthesizing the features of each point cloud, which solves problems such as network redundancy caused by generating anchor boxes for every point cloud and speeds up target detection. Meanwhile, by synthesizing point cloud features within small spaces (grids) to reduce the number of point clouds, the above solution can, even in the presence of radar noise, reduce the impact of locally dense noise and mitigate problems such as sluggish processing caused by noise.
此外，上述实施例中可以结合上一帧图像的检测结果的反馈，将上一帧图像中检测出的类别标识作为先验信息，可以对当前帧图像中生成的锚框进行调整，能够进一步提高精度，实现利用先验信息提升检测准确度的效果。In addition, in the above embodiment, the feedback of the detection result of the previous frame of image can be combined, and the category identifiers detected in the previous frame of image can be used as prior information to adjust the anchor boxes generated in the current frame of image, which can further improve precision and achieve the effect of using prior information to improve detection accuracy.
基于以上实施例及相同构思,本申请实施例还提供了一种目标检测装置,如图3所示,所述目标检测装置300可以包括:数据获取单元301和处理单元302。Based on the above embodiments and the same concept, an embodiment of the present application further provides a target detection apparatus. As shown in FIG. 3 , the target detection apparatus 300 may include: a data acquisition unit 301 and a processing unit 302 .
所述数据获取单元301，用于获取雷达传感器采集的第一点云数据和对应的摄像头传感器采集的第一图像，其中，所述第一点云数据包含多个点云；所述处理单元302，用于将所述第一点云数据映射到所述第一图像的图像平面，得到第二点云数据，其中，所述第二点云数据包含多个点云；对所述第二点云数据进行栅格划分；根据所述第二点云数据中点云的特征数据，确定所述第二点云数据进行栅格划分后的多个目标点云，其中，任一个目标点云对应所述第二点云数据中的至少一个点云；在所述第一图像中，生成所述多个目标点云中每个目标点云对应的至少一个锚框；根据生成的每个目标点云对应的至少一个锚框进行目标检测，确定待检测的至少一个目标对象的位置。The data acquisition unit 301 is configured to acquire first point cloud data collected by a radar sensor and a first image collected by a corresponding camera sensor, where the first point cloud data contains multiple point clouds. The processing unit 302 is configured to: map the first point cloud data to the image plane of the first image to obtain second point cloud data, where the second point cloud data contains multiple point clouds; divide the second point cloud data into grids; determine, according to the feature data of the point clouds in the second point cloud data, multiple target point clouds after the grid division of the second point cloud data, where any target point cloud corresponds to at least one point cloud in the second point cloud data; generate, in the first image, at least one anchor box corresponding to each of the multiple target point clouds; and perform target detection according to the at least one anchor box corresponding to each generated target point cloud to determine the position of at least one target object to be detected.
在一种可能的设计中,所述特征数据用于表示点云的雷达回波强度或点云的雷达回波强度分布特征或点云的极化特征。In a possible design, the feature data is used to represent the radar echo intensity of the point cloud or the distribution feature of the radar echo intensity of the point cloud or the polarization feature of the point cloud.
在一种可能的设计中,在每个目标点云对应的至少一个锚框中,任一个目标点云对应的任一个锚框包含所述目标点云。In a possible design, in at least one anchor box corresponding to each target point cloud, any anchor box corresponding to any target point cloud contains the target point cloud.
在一种可能的设计中，所述处理单元302对所述第二点云数据进行栅格划分时，具体用于：按照设定栅格大小，将所述第二点云数据划分为多个栅格；根据每个栅格中点云的距离参数，将每个栅格中包含的点云划分为多个点云集合，其中，所述距离参数用于表示点云到所述雷达传感器的水平距离。In a possible design, when dividing the second point cloud data into grids, the processing unit 302 is specifically configured to: divide the second point cloud data into multiple grids according to a set grid size; and divide the point clouds contained in each grid into multiple point cloud sets according to the distance parameters of the point clouds in the grid, where the distance parameter represents the horizontal distance from a point cloud to the radar sensor.
在一种可能的设计中，所述处理单元302根据所述第二点云数据中点云的特征数据，确定所述第二点云数据进行栅格划分后的多个目标点云时，具体用于：在每个栅格中，分别根据每个点云集合中点云的特征数据，确定每个点云集合的目标点云；将每个栅格中包含的多个点云集合的目标点云，作为每个栅格的目标点云；将所述第二点云数据包含的多个栅格的目标点云，作为所述第二点云数据进行栅格划分后的目标点云，得到所述多个目标点云。In a possible design, when determining, according to the feature data of the point clouds in the second point cloud data, the multiple target point clouds after the grid division of the second point cloud data, the processing unit 302 is specifically configured to: in each grid, determine the target point cloud of each point cloud set according to the feature data of the point clouds in that point cloud set; take the target point clouds of the multiple point cloud sets contained in each grid as the target point clouds of that grid; and take the target point clouds of the multiple grids contained in the second point cloud data as the target point clouds after the grid division of the second point cloud data, thereby obtaining the multiple target point clouds.
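The grid-division-then-selection flow in the two designs above can be sketched as follows. The cell size, distance-bin width, and the selection criterion (taking the point whose RCS value is closest to its set's mean) are hypothetical choices for illustration; the original only states that target points are determined from feature data such as radar echo intensity.

```python
from collections import defaultdict

# Illustrative sketch: points mapped onto the image plane are binned into grid
# cells by pixel position, each cell is further split into point cloud sets by
# a distance bin, and one target point per set is chosen from its feature data.
def target_points(points, cell=16.0, dist_bin=5.0):
    sets = defaultdict(list)      # (cell_x, cell_y, dist_index) -> points
    for u, v, d, rcs in points:   # pixel coords, distance parameter, RCS feature
        key = (int(u // cell), int(v // cell), int(d // dist_bin))
        sets[key].append((u, v, d, rcs))
    targets = []
    for pts in sets.values():
        mean_rcs = sum(p[3] for p in pts) / len(pts)
        # hypothetical criterion: the point most representative of the set's RCS
        targets.append(min(pts, key=lambda p: abs(p[3] - mean_rcs)))
    return targets

pts = [(10, 10, 12.0, 3.0), (12, 11, 13.0, 3.2), (200, 40, 30.0, 7.0)]
tps = target_points(pts)
# two occupied (cell, distance-bin) sets -> two target points
```

Each (cell, distance-bin) key corresponds to one point cloud set, so the number of anchor-generating points shrinks from the raw point count to the number of occupied sets.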
在一种可能的设计中，所述处理单元302在所述第一图像中，生成所述多个目标点云中每个目标点云对应的至少一个锚框时，具体用于：获取至少一种类别标识，其中，不同类别标识分别用于表示不同对象的类别；根据所述至少一种类别标识，确定每个目标点云的所述至少一种类别标识对应的至少一个锚框。In a possible design, when generating, in the first image, at least one anchor box corresponding to each of the multiple target point clouds, the processing unit 302 is specifically configured to: acquire at least one category identifier, where different category identifiers are respectively used to represent the categories of different objects; and determine, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud.
在一种可能的设计中，所述至少一种类别标识包括设定的类别标识和/或对参考图像进行目标检测后确定的类别标识，其中，所述参考图像为在所述第一图像之前进行目标检测的一帧图像。In a possible design, the at least one category identifier includes a set category identifier and/or a category identifier determined after performing target detection on a reference image, where the reference image is a frame of image on which target detection is performed before the first image.
在一种可能的设计中,所述对参考图像进行目标检测后确定的类别标识的置信度大于设定阈值。In a possible design, the confidence level of the category identification determined after the target detection is performed on the reference image is greater than a set threshold.
在一种可能的设计中，所述处理单元302根据所述至少一种类别标识，确定每个目标点云的所述至少一种类别标识对应的至少一个锚框时，具体用于：确定至少一个锚框尺寸，其中，所述至少一个锚框尺寸包含所述至少一种类别标识中每种类别标识对应的至少一个锚框尺寸；确定每个目标点云的符合所述至少一个锚框尺寸的至少一个锚框。In a possible design, when determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud, the processing unit 302 is specifically configured to: determine at least one anchor box size, where the at least one anchor box size includes at least one anchor box size corresponding to each of the at least one category identifier; and determine, for each target point cloud, at least one anchor box conforming to the at least one anchor box size.
在一种可能的设计中，所述处理单元302根据所述至少一种类别标识，确定每个目标点云的所述至少一种类别标识对应的至少一个锚框，具体用于：确定至少一个对象尺寸，其中，所述至少一个对象尺寸包含所述至少一种类别标识中每种类别标识对应的至少一个对象尺寸，任一个类别标识对应的对象尺寸用于表示所述类别标识所属的对象的尺寸；根据所述至少一个对象尺寸，确定至少一个映射尺寸，其中，所述至少一个对象尺寸与所述至少一个映射尺寸一一对应，任一个对象尺寸对应的映射尺寸为目标类别标识所属的对象映射到所述第一图像后的尺寸，所述目标类别标识为所述对象尺寸对应的类别标识；确定每个目标点云的符合所述至少一个映射尺寸的至少一个锚框。In a possible design, when determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud, the processing unit 302 is specifically configured to: determine at least one object size, where the at least one object size includes at least one object size corresponding to each of the at least one category identifier, and the object size corresponding to any category identifier represents the size of objects belonging to that category; determine at least one mapping size according to the at least one object size, where the at least one object size corresponds one-to-one with the at least one mapping size, and the mapping size corresponding to any object size is the size, after mapping to the first image, of an object belonging to the target category identifier, the target category identifier being the category identifier corresponding to that object size; and determine, for each target point cloud, at least one anchor box conforming to the at least one mapping size.
在一种可能的设计中，在任一个目标点云的任一个锚框中，所述目标点云位于所述锚框包围的区域中的任意位置，例如位于所述锚框的中心位置，或者位于所述锚框的任一边长中的任意位置，例如位于所述锚框的任一边长的中点位置。In a possible design, in any anchor box of any target point cloud, the target point cloud may be located at any position in the area enclosed by the anchor box, for example at the center of the anchor box, or at any position on any side of the anchor box, for example at the midpoint of any side of the anchor box.
在一种可能的设计中，所述处理单元302根据生成的每个目标点云对应的至少一个锚框进行目标检测，确定待检测的至少一个目标对象的位置，具体用于：识别所述第一图像中包含的每个目标对象的目标类别，并确定所述目标类别的置信度；在每个目标点云的至少一个锚框中，确定每个目标对象所在的目标锚框；输出检测结果，所述检测结果包含：每个目标对象的目标类别，每个目标对象的目标类别的类别标识，每个目标对象的目标类别的置信度，以及每个目标对象所在的目标锚框。In a possible design, when performing target detection according to the at least one anchor box corresponding to each generated target point cloud to determine the position of at least one target object to be detected, the processing unit 302 is specifically configured to: recognize the target category of each target object contained in the first image and determine the confidence level of the target category; determine, among the at least one anchor box of each target point cloud, the target anchor box where each target object is located; and output a detection result, where the detection result includes the target category of each target object, the category identifier of each target object's target category, the confidence level of each target object's target category, and the target anchor box where each target object is located.
作为一种实现方式，所述目标检测装置300还可以包括存储单元303，用于存储所述目标检测装置300的程序代码和数据。其中，所述处理单元302可以是处理器或控制器，例如可以是通用中央处理器（central processing unit，CPU），通用处理器，数字信号处理（digital signal processing，DSP），专用集成电路（application specific integrated circuits，ASIC），现场可编程门阵列（field programmable gate array，FPGA）或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框，模块等。所述处理器也可以是实现计算功能的组合，例如包括一个或多个微处理器组合，DSP和微处理器的组合等等。所述存储单元303可以是存储器。所述数据获取单元301可以是一种该目标检测装置的接口电路，用于从其它装置接收数据，例如接收雷达传感器发送的第一点云数据。当该目标检测装置以芯片的方式实现时，数据获取单元301可以是该芯片用于从其它芯片或装置接收数据或者向其它芯片或装置发送数据的接口电路。As an implementation, the target detection apparatus 300 may further include a storage unit 303 configured to store the program code and data of the target detection apparatus 300. The processing unit 302 may be a processor or a controller, for example, a general-purpose central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and can implement or execute the various exemplary logical blocks, modules, etc. described in connection with the disclosure of the present application. The processor may also be a combination that implements a computing function, for example, a combination including one or more microprocessors, or a combination of a DSP and a microprocessor. The storage unit 303 may be a memory. The data acquisition unit 301 may be an interface circuit of the target detection apparatus, configured to receive data from other apparatuses, for example, to receive the first point cloud data sent by the radar sensor. When the target detection apparatus is implemented as a chip, the data acquisition unit 301 may be an interface circuit used by the chip to receive data from, or send data to, other chips or apparatuses.
本申请实施例中对单元的划分是示意性的，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，另外，在本申请各个实施例中的各功能单元可以集成在一个处理器中，也可以是单独物理存在，也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。The division of units in the embodiments of the present application is schematic and is merely a logical function division; in actual implementation, there may be other division manners. In addition, the functional units in the embodiments of the present application may be integrated into one processor, may exist physically alone, or two or more units may be integrated into one unit. The above integrated units may be implemented in the form of hardware or in the form of software functional units.
图3中的各个单元中的一个或多个可以由软件、硬件、固件或其结合实现。所述软件或固件包括但不限于计算机程序指令或代码，并可以被硬件处理器所执行。所述硬件包括但不限于各类集成电路，如中央处理单元（CPU）、数字信号处理器（DSP）、现场可编程门阵列（FPGA）或专用集成电路（ASIC）。One or more of the units in FIG. 3 may be implemented in software, hardware, firmware, or a combination thereof. The software or firmware includes, but is not limited to, computer program instructions or code, and may be executed by a hardware processor. The hardware includes, but is not limited to, various types of integrated circuits, such as a central processing unit (CPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or an application-specific integrated circuit (ASIC).
基于以上实施例及相同构思，本申请实施例还提供了一种目标检测装置，用于实现本申请实施例提供的目标检测方法。如图4所示，所述目标检测装置400可以包括：一个或多个处理器401，存储器402，以及一个或多个计算机程序（图中未示出）。作为一种实现方式，上述各器件可以通过一个或多个通信线路403耦合。其中，存储器402中存储有一个或多个计算机程序，所述一个或多个计算机程序包括指令；处理器401调用存储器402中存储的所述指令，使得目标检测装置400执行本申请实施例提供的目标检测方法。Based on the above embodiments and the same concept, an embodiment of the present application further provides a target detection apparatus for implementing the target detection method provided by the embodiments of the present application. As shown in FIG. 4, the target detection apparatus 400 may include one or more processors 401, a memory 402, and one or more computer programs (not shown in the figure). As an implementation, the above components may be coupled through one or more communication lines 403. The memory 402 stores one or more computer programs, and the one or more computer programs include instructions; the processor 401 invokes the instructions stored in the memory 402, so that the target detection apparatus 400 performs the target detection method provided by the embodiments of the present application.
In the embodiments of this application, the processor may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed with reference to the embodiments of this application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
In the embodiments of this application, the memory may be a volatile memory, a non-volatile memory, or both. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory. The memory in the embodiments of this application may also be a circuit or any other apparatus capable of implementing a storage function.
As an implementation, the target detection apparatus 400 may further include a communication interface 404 for communicating with other apparatuses through a transmission medium. For example, when the apparatus that collects the first point cloud data is not the target detection apparatus 400, the target detection apparatus 400 may communicate, through the communication interface 404, with the apparatus that collects the first point cloud data, such as a radar sensor, so as to receive the first point cloud data collected by that apparatus. In the embodiments of this application, the communication interface may be a transceiver, a circuit, a bus, a module, or another type of communication interface. When the communication interface is a transceiver, the transceiver may include an independent receiver and an independent transmitter, or may be a transceiver integrating transmit and receive functions, or an interface circuit.
In some embodiments of this application, the processor 401, the memory 402, and the communication interface 404 may be connected to one another through the communication line 403. The communication line 403 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one bold line is used in FIG. 4, but this does not mean that there is only one bus or one type of bus.
The methods provided in the embodiments of this application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more available media. The available media may be magnetic media (for example, a floppy disk, a hard disk, or a magnetic tape), optical media (for example, a digital video disc (DVD)), or semiconductor media (for example, an SSD), and the like.
Obviously, those skilled in the art can make various changes and modifications to this application without departing from its scope. Thus, if these modifications and variations of this application fall within the scope of the claims of this application and their technical equivalents, this application is also intended to include them.

Claims (24)

  1. A target detection method, comprising:
    acquiring first point cloud data collected by a radar sensor and a corresponding first image collected by a camera sensor, wherein the first point cloud data comprises a plurality of point clouds;
    mapping the first point cloud data to an image plane of the first image to obtain second point cloud data, wherein the second point cloud data comprises a plurality of point clouds;
    performing grid division on the second point cloud data;
    determining, according to feature data of the point clouds in the second point cloud data, a plurality of target point clouds after the grid division of the second point cloud data, wherein any target point cloud corresponds to at least one point cloud in the second point cloud data;
    generating, in the first image, at least one anchor box corresponding to each of the plurality of target point clouds; and
    performing target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine a position of at least one target object to be detected.
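For illustration only (not part of the claimed subject matter), the flow of claim 1 can be sketched as follows. The pinhole intrinsics `fx`, `fy`, `cx`, `cy` and the helper callables `make_anchors` and `run_detector` are hypothetical placeholders; the grid-division and target-selection steps are collapsed for brevity:

```python
def project_point(p, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Map one 3-D radar point (x right, y down, z forward, in metres)
    onto the image plane of the first image (the mapping step)."""
    x, y, z = p
    if z <= 0:            # point behind the camera cannot be projected
        return None
    return (fx * x / z + cx, fy * y / z + cy, z)

def detect(points, make_anchors, run_detector):
    """End-to-end flow of claim 1, with hypothetical helper callables."""
    # first point cloud data -> second point cloud data on the image plane
    second = [q for q in (project_point(p) for p in points) if q]
    # grid division + target point cloud selection are collapsed here
    # into "keep every projected point" purely for brevity
    targets = second
    # at least one anchor box per target point cloud
    anchors = [box for t in targets for box in make_anchors(t)]
    # target detection inside the generated anchor boxes
    return run_detector(anchors)
```

A usage example would pass radar points in the camera frame plus an anchor generator and a detector; real implementations would also need the extrinsic radar-to-camera transform, which is omitted here.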
  2. The method according to claim 1, wherein the feature data represents a radar echo intensity of a point cloud, a radar echo intensity distribution feature of a point cloud, or a polarization feature of a point cloud.
  3. The method according to claim 1 or 2, wherein, among the at least one anchor box corresponding to each target point cloud, any anchor box corresponding to any target point cloud contains that target point cloud.
  4. The method according to any one of claims 1 to 3, wherein performing grid division on the second point cloud data comprises:
    dividing the second point cloud data into a plurality of grids according to a set grid size; and
    dividing the point clouds contained in each grid into a plurality of point cloud sets according to a distance parameter of the point clouds in that grid, wherein the distance parameter represents a horizontal distance from a point cloud to the radar sensor.
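A minimal sketch of the two-stage division of claim 4, assuming projected points are `(u, v, range)` tuples and using illustrative bin sizes (a 64-pixel grid, 10-metre range bins) that are not specified by the claim:

```python
from collections import defaultdict

def divide(points, grid=64.0, range_bin=10.0):
    """Split projected points into grids by pixel position (the set
    grid size), then split each grid into point cloud sets by each
    point's horizontal distance to the radar sensor."""
    cells = defaultdict(lambda: defaultdict(list))
    for u, v, rng in points:
        cell = (int(u // grid), int(v // grid))        # pixel grid cell
        cells[cell][int(rng // range_bin)].append((u, v, rng))
    # each grid now holds several point cloud sets keyed by range bin
    return {c: list(bins.values()) for c, bins in cells.items()}
```

Binning by range keeps points from objects at different depths, which may overlap in the image, in separate point cloud sets.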
  5. The method according to claim 4, wherein determining, according to the feature data of the point clouds in the second point cloud data, the plurality of target point clouds after the grid division of the second point cloud data comprises:
    determining, in each grid, a target point cloud of each point cloud set according to the feature data of the point clouds in that set;
    taking the target point clouds of the plurality of point cloud sets contained in each grid as the target point clouds of that grid; and
    taking the target point clouds of the plurality of grids contained in the second point cloud data as the target point clouds after the grid division of the second point cloud data, to obtain the plurality of target point clouds.
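The per-set selection of claim 5 can be sketched by taking one representative point from every point cloud set and pooling them over all grids. Using the strongest radar echo as the feature criterion is an illustrative assumption; points are assumed to be `(u, v, range, echo_intensity)` tuples:

```python
def select_targets(grids, feature=lambda p: p[3]):
    """grids: {cell: [point_cloud_set, ...]}.  Return the target point
    cloud of every point cloud set, pooled over all grids (the three
    steps of claim 5)."""
    targets = []
    for sets in grids.values():
        # one target point cloud per point cloud set in this grid
        grid_targets = [max(s, key=feature) for s in sets if s]
        # the grid's target point clouds are the union over its sets
        targets.extend(grid_targets)
    return targets
```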
  6. The method according to any one of claims 1 to 5, wherein generating, in the first image, the at least one anchor box corresponding to each of the plurality of target point clouds comprises:
    obtaining at least one category identifier, wherein different category identifiers represent categories of different objects; and
    determining, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud.
  7. The method according to claim 6, wherein the at least one category identifier comprises a set category identifier and/or a category identifier determined after target detection is performed on a reference image, the reference image being a frame of image on which target detection is performed before the first image.
  8. The method according to claim 7, wherein a confidence of the category identifier determined after target detection is performed on the reference image is greater than a set threshold.
  9. The method according to any one of claims 6 to 8, wherein determining, according to the at least one category identifier, the at least one anchor box corresponding to the at least one category identifier for each target point cloud comprises:
    determining at least one anchor box size, wherein the at least one anchor box size comprises at least one anchor box size corresponding to each of the at least one category identifier; and
    determining, for each target point cloud, at least one anchor box conforming to the at least one anchor box size.
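For claim 9, a sketch that expands one target point cloud into one anchor box per (category, size) entry, centring each box on the point as claim 11 permits. The size table and category names are invented for illustration:

```python
# hypothetical per-category anchor sizes in pixels: (width, height)
ANCHOR_SIZES = {"pedestrian": [(24, 64)], "vehicle": [(96, 48), (128, 64)]}

def anchors_for(point, categories=("pedestrian", "vehicle")):
    """Centre one anchor box of every matching size on the target
    point cloud; edge-midpoint placement would be an alternative."""
    u, v = point[0], point[1]
    boxes = []
    for cat in categories:
        for w, h in ANCHOR_SIZES[cat]:
            boxes.append((cat, u - w / 2, v - h / 2, u + w / 2, v + h / 2))
    return boxes
```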
  10. The method according to any one of claims 6 to 8, wherein determining, according to the at least one category identifier, the at least one anchor box corresponding to the at least one category identifier for each target point cloud comprises:
    determining at least one object size, wherein the at least one object size comprises at least one object size corresponding to each of the at least one category identifier, and the object size corresponding to any category identifier represents the size of the object to which that category identifier belongs;
    determining at least one mapping size according to the at least one object size, wherein the at least one object size is in one-to-one correspondence with the at least one mapping size, the mapping size corresponding to any object size is the size of the object to which a target category identifier belongs after the object is mapped to the first image, and the target category identifier is the category identifier corresponding to that object size; and
    determining, for each target point cloud, at least one anchor box conforming to the at least one mapping size.
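The object-size-to-mapping-size step of claim 10 follows from pinhole projection: a physical extent S at depth z spans roughly f·S/z pixels on the image plane. A sketch with a hypothetical focal length of 500 pixels:

```python
def mapped_size(object_size_m, depth_m, focal_px=500.0):
    """Project a physical (width, height) in metres, at the target
    point cloud's depth, into an on-image (width, height) in pixels."""
    w, h = object_size_m
    return (focal_px * w / depth_m, focal_px * h / depth_m)
```

For example, a 1.8 m × 1.5 m object at 15 m depth would map to roughly 60 × 50 pixels under this assumed focal length.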
  11. The method according to any one of claims 1 to 10, wherein, in any anchor box of any target point cloud, the target point cloud is located at the center of the anchor box or at the midpoint of any side of the anchor box.
  12. The method according to any one of claims 1 to 11, wherein performing target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine the position of the at least one target object to be detected, comprises:
    identifying a target category of each target object contained in the first image, and determining a confidence of the target category;
    determining, among the at least one anchor box of each target point cloud, a target anchor box in which each target object is located; and
    outputting a detection result, the detection result comprising: the target category of each target object, a category identifier of the target category of each target object, the confidence of the target category of each target object, and the target anchor box in which each target object is located.
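The detection result of claim 12 can be modelled as a small record per detected object; the fields mirror the claim, while the type and field names are purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    category: str          # target category of the object
    category_id: int       # category identifier of the target category
    confidence: float      # confidence of the target category
    anchor: tuple          # target anchor box, e.g. (x1, y1, x2, y2)

def output_results(detections, min_conf=0.0):
    """Emit the detection result of claim 12, optionally filtered by a
    confidence threshold (the threshold itself is an assumption)."""
    return [d for d in detections if d.confidence >= min_conf]
```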
  13. A target detection apparatus, comprising a data acquisition unit and a processing unit, wherein:
    the data acquisition unit is configured to acquire first point cloud data collected by a radar sensor and a corresponding first image collected by a camera sensor, wherein the first point cloud data comprises a plurality of point clouds; and
    the processing unit is configured to: map the first point cloud data to an image plane of the first image to obtain second point cloud data, wherein the second point cloud data comprises a plurality of point clouds; perform grid division on the second point cloud data; determine, according to feature data of the point clouds in the second point cloud data, a plurality of target point clouds after the grid division of the second point cloud data, wherein any target point cloud corresponds to at least one point cloud in the second point cloud data; generate, in the first image, at least one anchor box corresponding to each of the plurality of target point clouds; and perform target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine a position of at least one target object to be detected.
  14. The target detection apparatus according to claim 13, wherein the feature data represents a radar echo intensity of a point cloud, a radar echo intensity distribution feature of a point cloud, or a polarization feature of a point cloud; and
    among the at least one anchor box corresponding to each target point cloud, any anchor box corresponding to any target point cloud contains that target point cloud.
  15. The target detection apparatus according to claim 13 or 14, wherein, when performing grid division on the second point cloud data, the processing unit is specifically configured to:
    divide the second point cloud data into a plurality of grids according to a set grid size; and
    divide the point clouds contained in each grid into a plurality of point cloud sets according to a distance parameter of the point clouds in that grid, wherein the distance parameter represents a horizontal distance from a point cloud to the radar sensor.
  16. The target detection apparatus according to claim 15, wherein, when determining, according to the feature data of the point clouds in the second point cloud data, the plurality of target point clouds after the grid division of the second point cloud data, the processing unit is specifically configured to:
    determine, in each grid, a target point cloud of each point cloud set according to the feature data of the point clouds in that set;
    take the target point clouds of the plurality of point cloud sets contained in each grid as the target point clouds of that grid; and
    take the target point clouds of the plurality of grids contained in the second point cloud data as the target point clouds after the grid division of the second point cloud data, to obtain the plurality of target point clouds.
  17. The target detection apparatus according to any one of claims 13 to 16, wherein, when generating, in the first image, the at least one anchor box corresponding to each of the plurality of target point clouds, the processing unit is specifically configured to:
    obtain at least one category identifier, wherein different category identifiers represent categories of different objects, and determine, according to the at least one category identifier, at least one anchor box corresponding to the at least one category identifier for each target point cloud; or
    determine at least one anchor box size, wherein the at least one anchor box size comprises at least one anchor box size corresponding to each of the at least one category identifier, and determine, for each target point cloud, at least one anchor box conforming to the at least one anchor box size.
  18. The target detection apparatus according to claim 17, wherein the at least one category identifier comprises a set category identifier and/or a category identifier determined after target detection is performed on a reference image, the reference image being a frame of image on which target detection is performed before the first image;
    a confidence of the category identifier determined after target detection is performed on the reference image is greater than a set threshold; and
    in any anchor box of any target point cloud, the target point cloud is located at the center of the anchor box or at the midpoint of any side of the anchor box.
  19. The target detection apparatus according to claim 17 or 18, wherein, when determining, according to the at least one category identifier, the at least one anchor box corresponding to the at least one category identifier for each target point cloud, the processing unit is specifically configured to:
    determine at least one object size, wherein the at least one object size comprises at least one object size corresponding to each of the at least one category identifier, and the object size corresponding to any category identifier represents the size of the object to which that category identifier belongs;
    determine at least one mapping size according to the at least one object size, wherein the at least one object size is in one-to-one correspondence with the at least one mapping size, the mapping size corresponding to any object size is the size of the object to which a target category identifier belongs after the object is mapped to the first image, and the target category identifier is the category identifier corresponding to that object size; and
    determine, for each target point cloud, at least one anchor box conforming to the at least one mapping size.
  20. The target detection apparatus according to any one of claims 13 to 19, wherein, when performing target detection according to the generated at least one anchor box corresponding to each target point cloud, to determine the position of the at least one target object to be detected, the processing unit is specifically configured to:
    identify a target category of each target object contained in the first image, and determine a confidence of the target category;
    determine, among the at least one anchor box of each target point cloud, a target anchor box in which each target object is located; and
    output a detection result, the detection result comprising: the target category of each target object, a category identifier of the target category of each target object, the confidence of the target category of each target object, and the target anchor box in which each target object is located.
  21. A target detection apparatus, comprising a memory and a processor, wherein:
    the memory is configured to store a computer program; and
    the processor is configured to execute the computer program stored in the memory to implement the method according to any one of claims 1 to 12.
  22. A computer-readable storage medium storing computer program instructions which, when run on a target detection apparatus, cause the target detection apparatus to perform the method according to any one of claims 1 to 12.
  23. A terminal, comprising the target detection apparatus according to any one of claims 13 to 20, or comprising the target detection apparatus according to claim 21.
  24. A computer program product comprising a computer program or instructions which, when run on a target detection apparatus, cause the target detection apparatus to perform the method according to any one of claims 1 to 12.
PCT/CN2022/082553 2021-03-31 2022-03-23 Target detection method and apparatus WO2022206517A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110345758.9A CN115147333A (en) 2021-03-31 2021-03-31 Target detection method and device
CN202110345758.9 2021-03-31

Publications (1)

Publication Number Publication Date
WO2022206517A1 true WO2022206517A1 (en) 2022-10-06

Family

ID=83404575

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/082553 WO2022206517A1 (en) 2021-03-31 2022-03-23 Target detection method and apparatus

Country Status (2)

Country Link
CN (1) CN115147333A (en)
WO (1) WO2022206517A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115469292A (en) * 2022-11-01 2022-12-13 天津卡尔狗科技有限公司 Environment sensing method and device, electronic equipment and storage medium
US20230161000A1 (en) * 2021-11-24 2023-05-25 Smart Radar System, Inc. 4-Dimensional Radar Signal Processing Apparatus
CN116469014A (en) * 2023-01-10 2023-07-21 南京航空航天大学 Small sample satellite radar image sailboard identification and segmentation method based on optimized Mask R-CNN

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110988912A (en) * 2019-12-06 2020-04-10 中国科学院自动化研究所 Road target and distance detection method, system and device for automatic driving vehicle
US20200200912A1 (en) * 2018-12-19 2020-06-25 Andrew Chen Detection and tracking of road-side pole-shaped static objects from lidar point cloud data
CN111352112A (en) * 2020-05-08 2020-06-30 泉州装备制造研究所 Target detection method based on vision, laser radar and millimeter wave radar
CN111652097A (en) * 2020-05-25 2020-09-11 南京莱斯电子设备有限公司 Image millimeter wave radar fusion target detection method
CN112560972A (en) * 2020-12-21 2021-03-26 北京航空航天大学 Target detection method based on millimeter wave radar prior positioning and visual feature fusion


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230161000A1 (en) * 2021-11-24 2023-05-25 Smart Radar System, Inc. 4-Dimensional Radar Signal Processing Apparatus
CN115469292A (en) * 2022-11-01 2022-12-13 天津卡尔狗科技有限公司 Environment sensing method and device, electronic equipment and storage medium
CN115469292B (en) * 2022-11-01 2023-03-24 天津卡尔狗科技有限公司 Environment sensing method and device, electronic equipment and storage medium
CN116469014A (en) * 2023-01-10 2023-07-21 南京航空航天大学 Small sample satellite radar image sailboard identification and segmentation method based on optimized Mask R-CNN
CN116469014B (en) * 2023-01-10 2024-04-30 南京航空航天大学 Small sample satellite radar image sailboard identification and segmentation method based on optimized Mask R-CNN

Also Published As

Publication number Publication date
CN115147333A (en) 2022-10-04

Similar Documents

Publication Publication Date Title
US20230014874A1 (en) Obstacle detection method and apparatus, computer device, and storage medium
WO2022206517A1 (en) Target detection method and apparatus
US20230072289A1 (en) Target detection method and apparatus
CN114022830A (en) Target determination method and target determination device
CN108027877A (en) System and method for the detection of non-barrier
US20220319146A1 (en) Object detection method, object detection device, terminal device, and medium
CN113284163B (en) Three-dimensional target self-adaptive detection method and system based on vehicle-mounted laser radar point cloud
US20220343758A1 (en) Data Transmission Method and Apparatus
US20220301277A1 (en) Target detection method, terminal device, and medium
CN114463736A (en) Multi-target detection method and device based on multi-mode information fusion
WO2023279584A1 (en) Target detection method, target detection apparatus, and robot
CN115147328A (en) Three-dimensional target detection method and device
CN113325388A (en) Method and device for filtering floodlight noise of laser radar in automatic driving
TW202225730A (en) High-efficiency LiDAR object detection method based on deep learning through direct processing of 3D point data to obtain a concise and fast 3D feature to solve the shortcomings of complexity and time-consuming of the current voxel network model
CN115100741A (en) Point cloud pedestrian distance risk detection method, system, equipment and medium
CN112712066B (en) Image recognition method and device, computer equipment and storage medium
TW202017784A (en) Car detection method based on LiDAR by proceeding the three-dimensional feature extraction and the two-dimensional feature extraction on the three-dimensional point cloud map and the two-dimensional map
CN113256709A (en) Target detection method, target detection device, computer equipment and storage medium
CN116797894A (en) Radar and video fusion target detection method for enhancing characteristic information
CN116665179A (en) Data processing method, device, domain controller and storage medium
US20220301176A1 (en) Object detection method, object detection device, terminal device, and medium
WO2022048193A1 (en) Map drawing method and apparatus
CN115601275A (en) Point cloud augmentation method and device, computer readable storage medium and terminal equipment
CN115829898B (en) Data processing method, device, electronic equipment, medium and automatic driving vehicle
TWI819613B (en) Dual sensing method of object and computing apparatus for object sensing

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 22778700
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 22778700
    Country of ref document: EP
    Kind code of ref document: A1