WO2021056516A1 - Method and device for target detection, and movable platform - Google Patents

Method and device for target detection, and movable platform

Info

Publication number
WO2021056516A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
dimensional
target
grid
feature
Application number
PCT/CN2019/108897
Other languages
French (fr)
Chinese (zh)
Inventor
Xu Bin (徐斌)
Chen Xiaozhi (陈晓智)
Original Assignee
SZ DJI Technology Co., Ltd. (深圳市大疆创新科技有限公司)
Priority date
Filing date
Publication date
Application filed by SZ DJI Technology Co., Ltd. (深圳市大疆创新科技有限公司)
Priority to PCT/CN2019/108897 (WO2021056516A1)
Priority to CN201980033741.0A (CN112154448A)
Publication of WO2021056516A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • the embodiments of the present application relate to detection technology, and in particular, to a target detection method, device, and movable platform.
  • point clouds are commonly used to detect targets in order to guide airplanes, cars, robots, and the like in obstacle avoidance and path planning.
  • a lidar sensor is usually equipped, and target detection is performed on the laser point cloud output by the lidar sensor.
  • clustering algorithms are usually used when performing target detection on a point cloud; clustering groups similar elements together. The accuracy of the clustering results is often low, which leads to a high error rate in the above-mentioned target detection results and to subsequent failure to guide airplanes, cars, robots, etc. in obstacle avoidance and path planning.
  • the embodiments of the present application provide a target detection method and device to overcome at least one of the above-mentioned problems.
  • an embodiment of the present application provides a target detection method, including:
  • the target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
  • an embodiment of the present application provides a target detection device, including a memory, a processor, and computer-executable instructions stored in the memory and executable on the processor; when the processor executes the computer-executable instructions:
  • the target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
  • an embodiment of the present application provides a movable platform, including:
  • the target detection device is installed on the movable platform body.
  • an embodiment of the present application provides a computer-readable storage medium that stores computer-executable instructions.
  • when the processor executes the computer-executable instructions, the target detection method described in the first aspect and its various possible designs is implemented.
  • the target detection method, device, and movable platform provided by the embodiments of the present application determine a point cloud vector based on the number of points in the point cloud to be processed and the attributes of each point; a three-dimensional convolutional neural network is then used to extract target three-dimensional point cloud features from the point cloud vector, retaining height information and thus preserving the three-dimensional structure information of the point cloud to the greatest extent; the target information in the point cloud to be processed is then determined according to the extracted target three-dimensional point cloud features, where the target information includes the geometric position information of the target in each dimension in the three-dimensional coordinate system. This accurately locates the three-dimensional target in the point cloud and solves the problem that existing target detection results have a high error rate and subsequently fail to guide aircraft, cars, robots, etc. in obstacle avoidance and path planning.
  • FIG. 1 is a schematic diagram of the architecture of a target detection system provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of a target detection method provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of a three-dimensional convolutional neural network provided by an embodiment of the application.
  • FIG. 4 is a schematic flowchart of another target detection method provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a target detection device provided by an embodiment of this application.
  • FIG. 6 is a schematic structural diagram of another target detection device provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of the hardware structure of a target detection device provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a movable platform provided by an embodiment of the application.
  • this embodiment provides a target detection method.
  • the point cloud vector is determined from the number of points in the point cloud to be processed and the attributes of each point; a three-dimensional convolutional neural network is then used to extract target three-dimensional point cloud features from the point cloud vector, and the target information in the point cloud to be processed is determined according to the extracted features, where the target information includes the geometric position information of the target in each dimension in the three-dimensional coordinate system.
  • when the three-dimensional convolutional neural network is used to extract target three-dimensional point cloud features from the point cloud vector, three-dimensional convolution ensures that height information is retained, so the three-dimensional structure information of the point cloud is preserved to the greatest extent.
  • determining the target information in the point cloud to be processed according to the extracted target three-dimensional point cloud features makes it possible to accurately locate the three-dimensional target in the point cloud, solving the problem that existing target detection results have a high error rate and subsequently fail to guide aircraft, cars, robots, etc. in obstacle avoidance and path planning.
  • FIG. 1 is a schematic structural diagram of a target detection system provided by an embodiment of the application. As shown in FIG. 1, it includes a sensor 101, a first processor 102, and a second processor 103. Taking a vehicle as an example, the sensor 101 can generate a point cloud to be processed in real time, where the point cloud is used to identify the environment surrounding the vehicle. The first processor 102 can determine the point cloud vector based on the number of points in the point cloud generated by the sensor 101 and the attributes of each point.
  • the point cloud vector includes the geometric position information of the points in each dimension in the three-dimensional coordinate system. The first processor 102 uses a three-dimensional convolutional neural network to process the point cloud vector, extracts the target three-dimensional point cloud feature of the point cloud vector, and determines the target information in the above-mentioned point cloud according to the target three-dimensional point cloud feature. The target information, which includes the geometric position information of the target in each dimension in the three-dimensional coordinate system, is sent to the second processor 103 for use in subsequent driving planning.
  • the first processor 102 and the second processor 103 may be a vehicle computing platform, an unmanned aerial vehicle processor, or the like. This embodiment does not particularly limit the implementation of the first processor 102 and the second processor 103, as long as the first processor 102 and the second processor 103 can perform the above-mentioned corresponding functions.
  • the foregoing architecture is only an exemplary system architecture block diagram. In a specific implementation, it can be configured according to application requirements; for example, the first processor 102 and the second processor 103 can be set up separately or combined to meet different application requirements.
  • the above-mentioned target detection system may also include a receiving device, a display device, and the like.
  • the receiving device can be an input/output interface or a communication interface.
  • the receiving device may receive a user's instruction, for example, the receiving device may be an input interface connected to a mouse.
  • the display device can be used to display the above-mentioned target information.
  • the display device may also be a touch screen, which is used to receive user instructions while displaying the above-mentioned target information, so as to realize interaction with the user.
  • FIG. 2 is a schematic flowchart of a target detection method provided by an embodiment of this application.
  • the execution subject of this embodiment may be the first processor in the embodiment shown in FIG. 1.
  • the method includes:
  • S201 Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, where the point cloud vector includes geometric position information of the points in each dimension in the three-dimensional coordinate system.
  • the above-mentioned point cloud may be an image point cloud, a radar point cloud, a laser point cloud, etc., and one or more of the above-mentioned point clouds may be used in the subsequent processing according to actual conditions.
  • the point cloud to be processed can be acquired by the sensor.
  • the acquisition range can be limited in the three-dimensional space. For example, points are taken F meters in front of the sensor, B meters behind it, L meters to its left, R meters to its right, U meters above it, and D meters below it.
  • the point cloud processing range of the entire three-dimensional space is thus limited to (F+B)x(L+R)x(U+D).
  • the values of F, B, L, R, U, and D can be set according to actual conditions.
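The range limitation described above can be sketched as a simple filter. This is a minimal illustration, not the patented implementation; the axis convention (sensor at the origin, x forward, y left, z up) is an assumption.

```python
def clip_point_cloud(points, F, B, L, R, U, D):
    """Keep only points inside the (F+B) x (L+R) x (U+D) box around the
    sensor. points: iterable of (x, y, z, reflectivity) tuples, with the
    sensor assumed at the origin (x forward, y left, z up)."""
    return [
        p for p in points
        if -B <= p[0] <= F and -R <= p[1] <= L and -D <= p[2] <= U
    ]
```

A point 100 m ahead of the sensor would be dropped when F is set to 50, while a point 10 m ahead is kept.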
  • the point cloud vector can be determined according to the number of points in the point cloud to be processed and the attributes of each point.
  • the determining the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point includes:
  • the attributes of each point determine the vector of each grid after division, where the attributes of each point include the three-dimensional coordinates and reflectivity of the point in the three-dimensional coordinate system;
  • the point cloud vector is determined according to the vector of each grid after the division.
  • the determining of the vector of each grid after division according to the number of points in each grid after division and the attributes of each point includes:
  • determining the vector of each grid after division according to the product of the number of points in each adjusted grid and the attributes of each point.
  • the grid is divided along the direction of each coordinate axis of the point cloud coordinate system: the forward direction (X axis) is divided every resx meters, the left-right direction (Y axis) every resy meters, and the upward direction (Z axis) every resz meters.
  • each grid is therefore a small cuboid of resx*resy*resz.
  • the values of resx, resy, and resz can be set according to actual conditions.
  • for each grid, the number of laser points it contains can be limited to N.
  • if the number of points is greater than N, random sampling can be performed to keep N of them, and the redundant laser points are discarded.
  • if the number of points is less than N, points can be randomly duplicated to reach N, so that all small grids contain the same number of points.
  • the value of the aforementioned number N can be set according to actual conditions.
  • the entire point cloud is represented by a K*N*4 vector, where K is the number of grids with a non-zero number of points, N is the maximum number of points in each grid, and 4 means that each point has four-dimensional attributes, namely the x, y, z coordinates and the reflectivity. It should be understood that, in addition to using the K*N*4 vector to represent the above-mentioned point cloud, other information can also be added to the point cloud vector, such as density, height, and so on.
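The grid division, subsampling, and padding steps above can be sketched as follows. This is an illustrative encoding only; the grid keying and the use of a seeded random generator are assumptions made for the example.

```python
import math
import random

def voxelize(points, resx, resy, resz, N, seed=0):
    """Encode a point cloud as a K*N*4 structure: K non-empty grids, each
    holding exactly N points of 4 attributes (x, y, z, reflectivity).
    Grids with more than N points are randomly subsampled; grids with
    fewer are padded by randomly duplicating their own points."""
    rng = random.Random(seed)
    grids = {}
    for p in points:
        key = (math.floor(p[0] / resx),
               math.floor(p[1] / resy),
               math.floor(p[2] / resz))
        grids.setdefault(key, []).append(p)
    encoded = {}
    for key, pts in grids.items():
        if len(pts) > N:
            pts = rng.sample(pts, N)  # discard redundant points
        elif len(pts) < N:
            # randomly duplicate existing points up to N
            pts = pts + [rng.choice(pts) for _ in range(N - len(pts))]
        encoded[key] = pts
    return encoded  # len(encoded) == K; every value has length N
```

With N = 2, a grid containing three points is sampled down to two, and a grid containing one point is padded up to two, so all non-empty grids end up the same size.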
  • the entire three-dimensional space is divided into a fixed number of small grids, and then the feature vector of each grid is determined; that is, point cloud preprocessing (point cloud coding) is carried out through a structured method, preserving the original structure information of the point cloud to the greatest extent.
  • S202 Use a three-dimensional convolutional neural network to process the point cloud vector to extract a target three-dimensional point cloud feature of the point cloud vector.
  • the embodiment of the present application adopts three-dimensional convolution to extract features from three-dimensional space and retain spatial structure information.
  • the three-dimensional convolutional neural network includes a third convolutional neural network.
  • the processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
  • the third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
  • the third convolutional neural network is obtained through three-dimensional grid vector and three-dimensional grid feature training.
  • the feature extraction is performed on each small grid using a three-dimensional convolutional neural network.
  • This part can be realized by a convolutional layer, an up-sampling layer, and a fully connected layer.
  • the resulting feature vector is K*C, where the feature corresponding to each small grid is C-dimensional. If a grid contains no points, its feature is a C-dimensional zero vector. If the total number of small grids is X1*Y1*Z1, the resulting feature vector is X1*Y1*Z1*C.
  • the three-dimensional convolutional neural network further includes a fourth convolutional neural network.
  • the method further includes:
  • the fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
  • the fourth convolutional neural network is obtained through three-dimensional grid feature and three-dimensional point cloud feature training.
  • in a two-dimensional convolution scheme, the feature vector needs to be reshaped to X1*Y1*(Z1*C), where Z1*C is taken as the feature dimension; because the height dimension (Z direction) is merged into the feature channel dimension, some height information is lost.
  • the embodiment of the present application uses three-dimensional convolution, which can maintain structural information in various directions and maximize the preservation of the spatial information of the original point cloud. After a series of three-dimensional convolution and other operations, the final feature vector will be obtained, denoted as X2*Y2*Z2*C2.
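The contrast between the two routes can be shown with shapes alone. The concrete sizes (16, 16, 4, 32) below are illustrative assumptions, not values from this application.

```python
# Assumed grid feature dimensions: X1 x Y1 x Z1 grids, C-dimensional features.
X1, Y1, Z1, C = 16, 16, 4, 32

# 2D-convolution route: the height axis Z1 is folded into the channel axis,
# so it is no longer a separate spatial dimension and some height
# information is lost.
shape_2d = (X1, Y1, Z1 * C)

# 3D-convolution route (as in this application): all three spatial axes
# are kept, preserving the height (Z) structure.
shape_3d = (X1, Y1, Z1, C)
```

In the 2D route the tensor becomes (16, 16, 128), with height indistinguishable from feature channels; in the 3D route it stays (16, 16, 4, 32), with the Z axis intact.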
  • the above-mentioned three-dimensional convolutional neural network may include multiple convolutional layers, an up-sampling layer, and a fully connected layer:
  • the up-sampling layer is connected to at least one of the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by at least one of the multiple convolutional layers and to process the obtained three-dimensional point cloud features so as to output processed three-dimensional point cloud features;
  • the fully connected layer is connected to a convolutional layer of the multiple convolutional layers and to the up-sampling layer, and is used to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate a fused three-dimensional point cloud feature, and input the fused three-dimensional point cloud feature into another convolutional layer, so that the target three-dimensional point cloud feature is determined after the convolution operation of that other convolutional layer.
  • the depths of the convolutional layers in the multiple convolutional layers are different;
  • the fully connected layer is connected to the first convolutional layer, the second convolutional layer, and the up-sampling layer among the plurality of convolutional layers, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, and feature fusion is performed on them to generate the fused three-dimensional point cloud feature.
  • the number of the fully connected layers is multiple.
  • the number of up-sampling layers is multiple.
  • the above-mentioned three-dimensional convolutional neural network includes a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, a first up-sampling layer, a first fully connected layer, a fifth convolutional layer, a second up-sampling layer, a second fully connected layer, and a sixth convolutional layer.
  • the activation functions of the first, second, third, fourth, fifth, and sixth convolutional layers all use ReLU.
  • the first convolutional layer, the second convolutional layer, the third convolutional layer, and the fourth convolutional layer sequentially perform three-dimensional feature extraction on the extracted three-dimensional point cloud features.
  • the first up-sampling layer is used to up-sample the three-dimensional features extracted by the fourth convolutional layer according to a first preset spatial resolution.
  • the first fully connected layer is used to perform feature fusion on the three-dimensional features up-sampled by the first up-sampling layer and the three-dimensional features extracted by the third convolutional layer.
  • the fifth convolutional layer is used to extract three-dimensional features from the fused features of the first fully connected layer.
  • the second up-sampling layer is used to up-sample the three-dimensional features extracted by the fifth convolutional layer according to a second preset spatial resolution.
  • the second fully connected layer is used to perform feature fusion on the three-dimensional features up-sampled by the second up-sampling layer and the three-dimensional features extracted by the second convolutional layer.
  • the sixth convolutional layer is used to perform three-dimensional feature extraction on the fused features of the second fully connected layer and determine the target three-dimensional point cloud features.
  • the depths of the convolutional layers in the above-mentioned multiple convolutional layers are different.
  • the depths of the first convolutional layer, the second convolutional layer, the third convolutional layer, and the fourth convolutional layer increase in order.
  • the above-mentioned up-sampling layer can enlarge the image and increase the information of the image.
  • the convolutional layer reduces the resolution of the image.
  • the quality of the up-sampled image can exceed the quality of the original image, which facilitates subsequent processing.
  • the above-mentioned fully connected layers realize the splicing and fusion of data: the data output by multiple convolutional layers (some of it first up-sampled by an up-sampling layer) are input into different fully connected layers, and the deep and shallow features are merged.
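The layer arrangement described above (four progressively deeper convolutional layers, two up-sampling layers, two fusion layers, plus fifth and sixth convolutional layers) can be sketched by tracking tensor shapes only. The input size, strides, and channel counts below are illustrative assumptions, not the patented configuration, and the "layers" compute no actual values.

```python
def conv(shape, c_out, stride=1):
    """Toy stand-in for a 3D convolutional layer (shapes only): a
    stride-2 layer halves each spatial dimension; stride-1 keeps it."""
    x, y, z, _ = shape
    if stride == 2:
        x, y, z = x // 2, y // 2, z // 2
    return (x, y, z, c_out)

def upsample(shape):
    """Toy stand-in for a 2x spatial up-sampling layer."""
    x, y, z, c = shape
    return (x * 2, y * 2, z * 2, c)

def fuse(deep, shallow):
    """Toy stand-in for the fully connected fusion layer: deep and
    shallow features with matching spatial size are concatenated."""
    assert deep[:3] == shallow[:3]
    return (deep[0], deep[1], deep[2], deep[3] + shallow[3])

x0 = (64, 64, 16, 4)          # encoded point-cloud vector (assumed size)
s1 = conv(x0, 32, stride=2)   # first convolutional layer
s2 = conv(s1, 64, stride=2)   # second convolutional layer
s3 = conv(s2, 128, stride=2)  # third convolutional layer
s4 = conv(s3, 256, stride=2)  # fourth (deepest) convolutional layer
f1 = fuse(upsample(s4), s3)   # first up-sampling + first fusion layer
s5 = conv(f1, 128)            # fifth convolutional layer
f2 = fuse(upsample(s5), s2)   # second up-sampling + second fusion layer
s6 = conv(f2, 64)             # sixth convolutional layer: target features
```

The point of the sketch is the wiring: each up-sampled deep feature is fused with the shallower feature of matching spatial size before further convolution, which is how deep and shallow information is merged.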
  • S203 Determine target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud, where the target information includes geometric position information of the target in each dimension in a three-dimensional coordinate system.
  • the point cloud to be processed includes the point cloud to be processed corresponding to the target.
  • the point cloud can be coded and used as the input of the neural network, and the characteristics of the point cloud can be learned through the neural network, which can be directly used for the prediction of the three-dimensional target.
  • the prediction is a dense prediction on the feature map of the neural network, and the final detection result can be obtained through end-to-end learning.
  • the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
  • the first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
  • the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
  • the above-mentioned target information can be expressed by the center point, the three-dimensional size, and the yaw angle, or alternatively by the coordinates of the corner points, the coordinates of the bottom rectangle together with the height, and the like.
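Converting between two of the equivalent representations mentioned above (center point, size, and yaw angle versus bottom-rectangle corner coordinates) is a simple rotation. The axis convention and corner ordering here are illustrative assumptions.

```python
import math

def box_bottom_corners(cx, cy, length, width, yaw):
    """Convert a box given by center (cx, cy), size (length, width), and
    yaw angle into the four corner coordinates of its bottom rectangle."""
    c, s = math.cos(yaw), math.sin(yaw)
    hl, hw = length / 2.0, width / 2.0
    return [
        (cx + dx * c - dy * s, cy + dx * s + dy * c)
        for dx, dy in ((hl, hw), (hl, -hw), (-hl, -hw), (-hl, hw))
    ]
```

For a 4 m by 2 m box centered at the origin with zero yaw, this yields the corners (2, 1), (2, -1), (-2, -1), (-2, 1).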
  • the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
  • the target information is determined according to the effective points in the target three-dimensional point cloud feature.
  • before determining the target information according to the effective points in the target three-dimensional point cloud feature, the method further includes:
  • an effective point in the target three-dimensional point cloud feature is determined.
  • each grid is mapped to the feature map to obtain Ri.
  • the judgment method can use the pixel Euclidean distance L between this point and the center point.
  • a distance threshold T is set at the same time: if L is less than T, the point is a valid point; otherwise it is an invalid point.
  • the distance threshold T can be set according to actual conditions. For example, taking the target as a vehicle as an example, the above threshold can be set according to the length of the vehicle body.
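The validity test above reduces to one distance comparison; a minimal sketch (the function name and coordinate arguments are illustrative):

```python
import math

def is_valid_point(px, py, cx, cy, T):
    """A feature-map location (px, py) is a valid point if its pixel
    Euclidean distance L to the target center (cx, cy) is less than the
    threshold T (T is application-dependent, e.g. set from the vehicle
    body length as described above)."""
    L = math.hypot(px - cx, py - cy)
    return L < T
```

For a point 5 pixels from the center, a threshold of 6 accepts it and a threshold of 5 rejects it (the comparison is strict).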
  • a second convolutional neural network can be used to determine the category probability of the target based on the target three-dimensional point cloud feature;
  • erroneous targets are then removed according to the category probability of the target.
  • the second convolutional neural network is obtained through training on point cloud features and target category probabilities.
  • the target information determined above in the point cloud to be processed may be numerous (obtained by dense prediction at each three-dimensional voxel point).
  • non-maximum suppression and a corresponding score threshold can therefore be applied to obtain the final detection result.
  • the category probability of the target can be determined based on the above-mentioned three-dimensional point cloud characteristics of the target, for example, the probability of being a vehicle is A%.
  • the correspondence between each piece of target information and the category probability of the target can be determined.
  • for example, the category probability corresponding to the first target information may indicate a 99% probability of being a vehicle, while the category probability corresponding to the second target information may indicate only a 10% probability of being a vehicle.
  • the wrong target, for example the above-mentioned second target, is then removed, and so on, to obtain the final detection result.
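The score-threshold and non-maximum-suppression steps above can be sketched as follows. The axis-aligned 2D IoU and the greedy keep-the-highest-score strategy are common illustrative choices, not necessarily the procedure used in this application.

```python
def iou(a, b):
    """Axis-aligned 2D intersection-over-union; boxes are (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_detections(dets, score_thresh, iou_thresh):
    """Score thresholding followed by greedy non-maximum suppression:
    keep the highest-scoring box first, then drop any remaining box that
    overlaps a kept box by at least iou_thresh."""
    dets = sorted((d for d in dets if d["score"] >= score_thresh),
                  key=lambda d: d["score"], reverse=True)
    keep = []
    for d in dets:
        if all(iou(d["box"], k["box"]) < iou_thresh for k in keep):
            keep.append(d)
    return keep
```

With a 0.5 score threshold and 0.5 IoU threshold, a low-score detection (10%) is dropped by the threshold, and a near-duplicate of the best box (99%) is suppressed by NMS, leaving a single final detection.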
  • the point cloud vector is determined from the number of points in the point cloud to be processed and the attributes of each point; a three-dimensional convolutional neural network is then used to extract target three-dimensional point cloud features from the point cloud vector while retaining height information, which preserves the three-dimensional structure information of the point cloud to the greatest extent; the target information in the point cloud to be processed is then determined according to the extracted target three-dimensional point cloud features, where the target information includes the geometric position information of the target in each dimension in the three-dimensional coordinate system, so that the three-dimensional target in the point cloud is accurately located.
  • a convolutional neural network is used to perform three-dimensional target detection on point cloud data, including detecting the coordinates of the object relative to the sensor, its three-dimensional size, and its yaw angle in the real world, so that the point cloud data can be used to detect dynamic obstacles and guide airplanes, cars, robots, etc. in obstacle avoidance and path planning.
  • FIG. 4 is a schematic flowchart of another target detection method provided by an embodiment of the application. This embodiment, on the basis of the embodiment in FIG. 2, describes in detail the specific implementation process of this embodiment. As shown in Figure 4, the method includes:
  • S401 Perform grid division on the point cloud to be processed.
  • S402 Adjust the number of points in each grid after division according to the preset number of points.
  • S403. Determine the vector of each grid after division according to the product of the number of points in each grid after adjustment and the attributes of each point, where the attributes of each point include the three-dimensional coordinates of the point in the three-dimensional coordinate system and Reflectivity.
  • the entire three-dimensional space is divided into a fixed number of small grids, and then the feature vector of each grid is determined; that is, the point cloud is preprocessed (point cloud coding) through a structured method so as to preserve the original structure information of the point cloud to the greatest extent.
  • S404 Use the third convolutional neural network to extract the three-dimensional grid features of the divided vectors of each grid, where the third convolutional neural network is obtained by training the three-dimensional grid vector and the three-dimensional grid feature.
  • S405 Perform target 3D point cloud feature extraction on the extracted 3D grid features using the fourth convolutional neural network, where the fourth convolutional neural network is obtained by training on the 3D grid feature and the 3D point cloud feature.
  • the third convolutional neural network and the fourth convolutional neural network are both three-dimensional convolutional neural networks.
  • the embodiment of the present application adopts three-dimensional convolution to extract features from three-dimensional space and retain spatial structure information.
  • S406 Use the first convolutional neural network to determine, based on the above-mentioned target three-dimensional point cloud features, the center point coordinates, the three-dimensional size, and the yaw angle in the three-dimensional coordinate system of the target in the point cloud to be processed, where the first convolutional neural network is obtained by training on point cloud features and the target's center point coordinates, three-dimensional size, and yaw angle in the three-dimensional coordinate system.
  • after the point cloud is encoded as above, it is used as the input of the neural network; the characteristics of the point cloud are learned through the neural network and used directly for the prediction of the three-dimensional target.
  • the prediction is a dense prediction on the feature map of the neural network, and the final detection result can be obtained through end-to-end learning.
  • S408 Perform wrong target removal on the target according to the class probability of the target.
  • Since the target information determined above in the point cloud to be processed may be numerous (it is obtained by dense prediction at each three-dimensional voxel), erroneous targets are removed in order to obtain an accurate final detection result.
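The pruning in S408 can be sketched as a class-probability threshold followed by greedy suppression of near-duplicate detections, a simplified stand-in for non-maximum suppression; the threshold and distance values below are arbitrary example numbers, not values from the patent.

```python
import numpy as np

def remove_false_targets(targets, prob_threshold=0.5, min_center_dist=1.0):
    """Keep only candidates whose class probability clears a threshold, then
    greedily drop detections whose centers lie too close to an already-kept,
    higher-probability detection (simplified NMS).
    targets: list of (class_prob, center_xyz) tuples."""
    survivors = [t for t in targets if t[0] >= prob_threshold]
    survivors.sort(key=lambda t: t[0], reverse=True)   # best candidates first
    kept = []
    for prob, center in survivors:
        if all(np.linalg.norm(np.asarray(center) - np.asarray(c)) >= min_center_dist
               for _, c in kept):
            kept.append((prob, center))
    return kept
```

A production system would suppress by 3D box overlap (IoU) rather than center distance, but the filtering logic is the same.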
  • The point cloud vector is determined by the number of points in the point cloud to be processed and the attributes of each point; the three-dimensional convolutional neural network then extracts the target three-dimensional point cloud feature from the point cloud vector while retaining height information, thereby preserving the three-dimensional structure information of the point cloud to the greatest extent. The target information in the point cloud to be processed is then determined according to the extracted target three-dimensional point cloud feature, where the target information includes the geometric position information of the target in each dimension of the three-dimensional coordinate system. In this way, three-dimensional targets in the point cloud are found accurately, which solves the problem that existing target detection results have a high error rate and consequently cannot guide airplanes, cars, robots, etc. in obstacle avoidance and path planning.
  • FIG. 5 is a schematic structural diagram of a target detection device provided by an embodiment of the application. For ease of description, only the parts related to the embodiments of the present application are shown.
  • the target detection device 50 includes: a first determination module 501, an extraction module 502, and a second determination module 503.
  • the first determining module 501 is configured to determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, and the point cloud vector includes the geometric position information of the points in each dimension in the three-dimensional coordinate system.
  • the extraction module 502 is configured to process the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector.
  • the second determining module 503 is configured to determine target information in the point cloud to be processed according to the characteristics of the target three-dimensional point cloud, where the target information includes geometric position information of the target in each dimension in a three-dimensional coordinate system.
  • the device provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and details are not described herein again in this embodiment.
  • Fig. 6 is a schematic structural diagram of another target detection device provided by an embodiment of the present invention. As shown in FIG. 6, this embodiment is based on the embodiment in FIG. 5, and the foregoing target detection device further includes: a third determination module 504 and a removal module 505.
  • the second determining module 503 is specifically configured to:
  • the first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
  • the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
  • The third determining module 504 is configured to, before the second determining module 503 determines the target information in the to-be-processed point cloud according to the target three-dimensional point cloud feature, use the second convolutional neural network to determine the class probability of the target based on the target three-dimensional point cloud feature.
  • The removal module 505 is configured to remove erroneous targets according to the class probability of the target.
  • The second convolutional neural network is obtained by training on point cloud features and target class probabilities.
  • the first determining module 501 is specifically configured to:
  • Determine the vector of each grid after division according to the number of points in each grid after division and the attributes of each point, where the attributes of each point include the point's three-dimensional coordinates and reflectivity in the three-dimensional coordinate system;
  • The point cloud vector is determined according to the vector of each grid after division.
  • the first determining module 501 determines the vector of each grid after division according to the number of points in each grid after division and the attributes of each point, including:
  • the vector of each grid after division is determined.
  • the three-dimensional convolutional neural network includes a third convolutional neural network.
  • the extraction module 502 is specifically used for:
  • the third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
  • the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
  • the three-dimensional convolutional neural network further includes a fourth convolutional neural network.
  • After the extraction module 502 uses the third convolutional neural network to extract the three-dimensional grid features of the vector of each divided grid, the extraction module 502 is further configured to:
  • the fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
  • The fourth convolutional neural network is obtained by training on three-dimensional grid features and three-dimensional point cloud features.
  • the second determining module 503 is specifically configured to:
  • the target information is determined according to the effective points in the target three-dimensional point cloud feature.
  • Before determining the target information according to the effective points in the target three-dimensional point cloud feature, the second determining module 503 is further configured to:
  • an effective point in the target three-dimensional point cloud feature is determined.
  • the three-dimensional convolutional neural network includes:
  • The up-sampling layer is connected to at least one of the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by at least one of the multiple convolutional layers and to process the obtained three-dimensional point cloud features so as to output processed three-dimensional point cloud features;
  • The fully connected layer is connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, and is used to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer; the target three-dimensional point cloud feature is determined after the convolution operation of that other convolutional layer.
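The up-sampling-plus-fusion topology described above can be sketched as follows: the coarser deep features are up-sampled back to the shallow layer's spatial resolution and concatenated channel-wise. Nearest-neighbor repetition and channel concatenation are assumptions made for the illustration; the patent only specifies that the processed and raw features are fused before a further convolution operation.

```python
import numpy as np

def upsample_nearest(feat, factor=2):
    """Nearest-neighbor up-sampling of a (D, H, W, C) feature volume along its
    three spatial axes (the channel axis is left untouched)."""
    return feat.repeat(factor, axis=0).repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_features(shallow, deep):
    """Up-sample the deeper (coarser) features to the shallow layer's spatial
    resolution, then concatenate along the channel axis. A fully connected or
    1x1 convolution layer would follow to mix the fused channels."""
    factor = shallow.shape[0] // deep.shape[0]   # assumes isotropic downscaling
    up = upsample_nearest(deep, factor)
    return np.concatenate([shallow, up], axis=-1)
```

Fusing a high-resolution shallow feature with an up-sampled deep feature is the standard way to combine fine localization detail with large receptive-field context before the final detection convolution.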
  • the depths of the convolutional layers in the multiple convolutional layers are different;
  • The fully connected layer is connected to the first convolutional layer, the second convolutional layer, and the up-sampling layer among the plurality of convolutional layers, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, and perform feature fusion on the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features to generate the fused three-dimensional point cloud features,
  • the number of the fully connected layers is multiple.
  • the number of the up-sampling layers is multiple.
  • the device provided in the embodiment of the present application can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and the details of the embodiments of the present application are not repeated here.
  • FIG. 7 is a schematic diagram of the hardware structure of a target detection device provided by an embodiment of the application.
  • the target detection device 70 of this embodiment includes: a memory 701 and a processor 702; wherein
  • the memory 701 is used to store program instructions
  • the processor 702 is configured to execute program instructions stored in the memory. When the program instructions are executed, the processor executes the following steps:
  • the target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
  • the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
  • the first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
  • the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
  • the method further includes:
  • Erroneous targets are removed according to the class probability of the target.
  • the second convolutional neural network is trained by point cloud features and target category probabilities.
  • the determining the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point includes:
  • Determine the vector of each grid after division according to the number of points in each grid after division and the attributes of each point, where the attributes of each point include the point's three-dimensional coordinates and reflectivity in the three-dimensional coordinate system;
  • The point cloud vector is determined according to the vector of each grid after division.
  • the determination of the vector of each grid after division according to the number of points in each grid after division and the attributes of each point includes:
  • the vector of each grid after division is determined.
  • the three-dimensional convolutional neural network includes a third convolutional neural network
  • the processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
  • the third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
  • the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
  • the three-dimensional convolutional neural network further includes a fourth convolutional neural network
  • the method further includes:
  • the fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
  • the fourth convolutional neural network is obtained through training of three-dimensional grid features and three-dimensional point cloud features.
  • the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
  • the target information is determined according to the effective points in the target three-dimensional point cloud feature.
  • Before the target information is determined according to the effective points in the target three-dimensional point cloud feature, the method further includes:
  • an effective point in the target three-dimensional point cloud feature is determined.
  • the three-dimensional convolutional neural network includes:
  • The up-sampling layer is connected to at least one of the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by at least one of the multiple convolutional layers and to process the obtained three-dimensional point cloud features so as to output processed three-dimensional point cloud features;
  • The fully connected layer is connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, and is used to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer; the target three-dimensional point cloud feature is determined after the convolution operation of that other convolutional layer.
  • the depths of the convolutional layers in the multiple convolutional layers are different;
  • The fully connected layer is connected to the first convolutional layer, the second convolutional layer, and the up-sampling layer among the plurality of convolutional layers, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, and perform feature fusion on the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features to generate the fused three-dimensional point cloud features,
  • the number of the fully connected layers is multiple.
  • the number of the up-sampling layers is multiple.
  • the memory 701 may be independent or integrated with the processor 702.
  • the target detection device further includes a bus 703 for connecting the memory 701 and the processor 702.
  • the target detection device 70 may be a single device, and the device includes a complete set of the foregoing memory 701, processor 702, and so on.
  • the components of the target detection device 70 may be distributed and integrated on the vehicle, that is, the memory 701, the processor 702, etc. may be respectively arranged in different positions of the vehicle.
  • FIG. 8 is a schematic structural diagram of a movable platform provided by an embodiment of the application.
  • The movable platform 80 of this embodiment includes: a movable platform body 801 and a target detection device 802; the target detection device 802 is provided on the movable platform body 801, and the movable platform body 801 and the target detection device 802 are connected wirelessly or by wire.
  • the target detection device 802 determines a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, and the point cloud vector includes geometric position information of the points in each dimension in a three-dimensional coordinate system;
  • the target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
  • the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
  • the first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
  • the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
  • the method further includes:
  • Erroneous targets are removed according to the class probability of the target.
  • the second convolutional neural network is obtained through point cloud feature and target category probability training.
  • the determining the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point includes:
  • Determine the vector of each grid after division according to the number of points in each grid after division and the attributes of each point, where the attributes of each point include the point's three-dimensional coordinates and reflectivity in the three-dimensional coordinate system;
  • The point cloud vector is determined according to the vector of each grid after division.
  • the determination of the vector of each grid after division according to the number of points in each grid after division and the attributes of each point includes:
  • the vector of each grid after division is determined.
  • the three-dimensional convolutional neural network includes a third convolutional neural network
  • the processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
  • the third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
  • the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
  • the three-dimensional convolutional neural network further includes a fourth convolutional neural network
  • the method further includes:
  • the fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
  • the fourth convolutional neural network is obtained through training of three-dimensional grid features and three-dimensional point cloud features.
  • the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
  • the target information is determined according to the effective points in the target three-dimensional point cloud feature.
  • Before the target information is determined according to the effective points in the target three-dimensional point cloud feature, the method further includes:
  • an effective point in the target three-dimensional point cloud feature is determined.
  • the three-dimensional convolutional neural network includes:
  • The up-sampling layer is connected to at least one of the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by at least one of the multiple convolutional layers and to process the obtained three-dimensional point cloud features so as to output processed three-dimensional point cloud features;
  • The fully connected layer is connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, and is used to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer; the target three-dimensional point cloud feature is determined after the convolution operation of that other convolutional layer.
  • the depths of the convolutional layers in the multiple convolutional layers are different;
  • The fully connected layer is connected to the first convolutional layer, the second convolutional layer, and the up-sampling layer among the plurality of convolutional layers, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, and perform feature fusion on the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features to generate the fused three-dimensional point cloud features,
  • the number of the fully connected layers is multiple.
  • the number of the up-sampling layers is multiple.
  • the movable platform includes: a movable platform body and a target detection device.
  • the target detection device is set on the movable platform body.
  • The target detection device determines the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point; it then uses the three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature from the point cloud vector while retaining height information, thereby preserving the three-dimensional structure information of the point cloud to the greatest extent, and then determines the target information according to the extracted target three-dimensional point cloud feature.
  • This solves the problem that existing target detection results have a high error rate and consequently cannot guide aircraft, cars, robots, etc. in obstacle avoidance and path planning.
  • the embodiment of the present application provides a computer-readable storage medium having program instructions stored in the computer-readable storage medium, and when a processor executes the program instructions, the target detection method as described above is implemented.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the modules is only a logical function division; there may be other divisions in actual implementation. For example, multiple modules may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or modules, and may be in electrical, mechanical or other forms.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules may be integrated into one unit.
  • the units formed by the above modules can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the above-mentioned integrated modules implemented in the form of software functional modules may be stored in a computer readable storage medium.
  • The above-mentioned software function modules are stored in a storage medium and include a number of instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute part of the steps of the methods described in the various embodiments of the present application.
  • The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), and so on.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the present application may be directly embodied as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • The memory may include high-speed RAM, and may also include non-volatile memory (NVM), such as at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
  • The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the buses in the drawings of this application are not limited to only one bus or one type of bus.
  • The above-mentioned storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • The storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
  • An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium may also be an integral part of the processor.
  • The processor and the storage medium may be located in an application-specific integrated circuit (ASIC).
  • the processor and the storage medium may also exist as discrete components in the electronic device or the main control device.
  • a person of ordinary skill in the art can understand that all or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware.
  • The aforementioned program may be stored in a computer-readable storage medium. When the program is executed, the steps of the foregoing method embodiments are executed; the foregoing storage medium includes: ROM, RAM, magnetic disk, optical disk, or other media that can store program code.


Abstract

A method and device for target detection, and a movable platform. The method: determining a point cloud vector on the basis of the number of points in a point cloud to be processed and attributes of each point; then utilizing a three-dimensional convolutional neural network to extract a target three-dimensional point cloud feature of the point cloud vector, retaining height information and thus retaining the three-dimensional structure information of the point cloud to the greatest extent; and determining target information in said point cloud on the basis of the extracted target three-dimensional point cloud feature, where the target information comprises geometric location information of a target in each dimension of a three-dimensional coordinate system. This allows a three-dimensional target in a point cloud to be found accurately, and solves the existing problem of a high error rate in target detection results and the consequent inability to guide a plane, vehicle, or robot in obstacle avoidance and route planning.

Description

目标检测方法、设备及可移动平台Target detection method, equipment and movable platform 技术领域Technical field
本申请实施例涉及检测技术,尤其涉及一种目标检测方法、设备及可移动平台。The embodiments of the present application relate to detection technology, and in particular, to a target detection method, device, and movable platform.
背景技术Background technique
随着经济技术的不断发展,飞机、汽车、机器人等应用到人们生活的很多方面,相应地,如何指导飞机、汽车、机器人等进行避障和路径规划成为人们关注焦点。With the continuous development of economy and technology, airplanes, automobiles, robots, etc. are applied to many aspects of people's lives. Accordingly, how to guide airplanes, automobiles, robots, etc. to avoid obstacles and plan paths has become the focus of attention.
在相关技术中,通常利用点云检测目标,从而指导飞机、汽车、机器人等进行避障和路径规划。尤其对于智能驾驶而言,通常都会配备激光雷达传感器,通过激光雷达传感器输出的激光点云进行目标检测。In related technologies, point clouds are usually used to detect targets to guide airplanes, cars, robots, etc. to avoid obstacles and plan paths. Especially for intelligent driving, a lidar sensor is usually equipped, and the target detection is performed through the laser point cloud output by the lidar sensor.
然而,上述利用点云进行目标检测时通常采用聚类算法。聚类就是将相似元素进行集合,这样,聚类结果往往准确率较低,导致上述目标检测结果出错率较高,后续无法指导飞机、汽车、机器人等进行避障和路径规划。However, the clustering algorithm is usually used when using the point cloud for target detection. Clustering is the collection of similar elements. In this way, the accuracy of the clustering results is often low, resulting in a high error rate in the above-mentioned target detection results, and subsequent failure to guide airplanes, cars, robots, etc. to avoid obstacles and path planning.
发明内容Summary of the invention
本申请实施例提供一种目标检测方法及设备,以克服上述至少一个问题。The embodiments of the present application provide a target detection method and device to overcome at least one of the above-mentioned problems.
第一方面,本申请实施例提供一种目标检测方法,包括:In the first aspect, an embodiment of the present application provides a target detection method, including:
根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息;Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, the point cloud vector including geometric position information of the points in each dimension in the three-dimensional coordinate system;
利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征;Processing the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector;
根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
第二方面，本申请实施例提供一种目标检测设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机执行指令，所述处理器执行所述计算机执行指令时实现如下步骤：In a second aspect, an embodiment of the present application provides a target detection device, including a memory, a processor, and computer-executable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-executable instructions:
根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息;Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, the point cloud vector including geometric position information of the points in each dimension in the three-dimensional coordinate system;
利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征;Processing the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector;
根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
第三方面,本申请实施例提供一种可移动平台,包括:In the third aspect, an embodiment of the present application provides a movable platform, including:
可移动设备平台;以及Mobile device platform; and
如上第二方面以及第二方面各种可能的设计所述的目标检测设备，所述目标检测设备安装于所述可移动平台本体。The target detection device according to the second aspect above and its various possible designs, where the target detection device is mounted on the movable platform body.
第四方面，本申请实施例提供一种计算机可读存储介质，所述计算机可读存储介质中存储有计算机执行指令，当处理器执行所述计算机执行指令时，实现如上第一方面以及第一方面各种可能的设计所述的目标检测方法。In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions, where, when a processor executes the computer-executable instructions, the target detection method according to the first aspect above and its various possible designs is implemented.
本申请实施例提供的目标检测方法、设备及可移动平台，该方法通过待处理点云中点的数目和每个点的属性，确定点云向量；然后利用三维卷积神经网络对点云向量进行目标三维点云特征提取，保留高度信息，进而最大程度保留点云的三维结构信息，再根据提取的目标三维点云特征确定待处理点云中的目标信息，其中，该目标信息包括目标在三维坐标系中各维度的几何位置信息，准确找出点云中的三维目标，解决现有目标检测结果出错率较高，后续无法指导飞机、汽车、机器人等进行避障和路径规划的问题。According to the target detection method, device, and movable platform provided by the embodiments of the present application, a point cloud vector is determined from the number of points in the point cloud to be processed and the attributes of each point; a three-dimensional convolutional neural network then extracts target three-dimensional point cloud features from the point cloud vector, retaining height information and thereby preserving the three-dimensional structure information of the point cloud to the greatest extent; and the target information in the point cloud to be processed, including the geometric position information of the target in each dimension of the three-dimensional coordinate system, is determined from the extracted features. In this way, three-dimensional targets in the point cloud can be found accurately, solving the problem that existing target detection results have a high error rate and cannot subsequently guide aircraft, cars, robots, etc. in obstacle avoidance and path planning.
附图说明Description of the drawings
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本申请的实施例,并与说明书一起用于解释本申请的原理。The drawings here are incorporated into the specification and constitute a part of the specification, show embodiments that conform to the application, and are used together with the specification to explain the principle of the application.
图1为本申请实施例提供的目标检测系统架构示意图；FIG. 1 is a schematic diagram of the architecture of a target detection system provided by an embodiment of the application;
图2为本申请实施例提供的一种目标检测方法的流程示意图;2 is a schematic flowchart of a target detection method provided by an embodiment of the application;
图3为本申请实施例提供的三维卷积神经网络的示意图;FIG. 3 is a schematic diagram of a three-dimensional convolutional neural network provided by an embodiment of the application;
图4为本申请实施例提供的另一种目标检测方法的流程示意图;4 is a schematic flowchart of another target detection method provided by an embodiment of the application;
图5为本申请实施例提供的一种目标检测设备的结构示意图;FIG. 5 is a schematic structural diagram of a target detection device provided by an embodiment of this application;
图6为本申请实施例提供的另一目标检测设备的结构示意图;FIG. 6 is a schematic structural diagram of another target detection device provided by an embodiment of the application;
图7为本申请实施例提供的目标检测设备的硬件结构示意图;FIG. 7 is a schematic diagram of the hardware structure of a target detection device provided by an embodiment of the application;
图8为本申请实施例提供的一种可移动平台的结构示意图。FIG. 8 is a schematic structural diagram of a movable platform provided by an embodiment of the application.
通过上述附图,已示出本申请明确的实施例,后文中将有更详细的描述。这些附图和文字描述并不是为了通过任何方式限制本申请构思的范围,而是通过参考特定实施例为本领域技术人员说明本申请的概念。Through the above drawings, the specific embodiments of the present application have been shown, which will be described in more detail below. These drawings and text descriptions are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the present application for those skilled in the art by referring to specific embodiments.
具体实施方式Detailed Description
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本申请相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本申请的一些方面相一致的装置和方法的例子。The exemplary embodiments will be described in detail here, and examples thereof are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings indicate the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with the present application. On the contrary, they are merely examples of devices and methods consistent with some aspects of the application as detailed in the appended claims.
在相关技术中,通常利用点云检测目标,从而指导飞机、汽车、机器人等进行避障和路径规划。尤其对于智能驾驶而言,通常都会配备激光雷达传感器,通过激光雷达传感器输出的激光点云进行目标检测。In related technologies, point clouds are usually used to detect targets to guide airplanes, cars, robots, etc. to avoid obstacles and plan paths. Especially for intelligent driving, a lidar sensor is usually equipped, and the target detection is performed through the laser point cloud output by the lidar sensor.
然而，上述利用点云进行目标检测时通常采用聚类算法。聚类就是将相似元素进行集合，这样，聚类结果往往准确率较低，导致上述目标检测结果出错率较高，后续无法指导飞机、汽车、机器人等进行避障和路径规划。However, such point-cloud-based target detection usually relies on a clustering algorithm. Clustering groups similar elements together, and the clustering results are often of low accuracy, which leads to a high error rate in the target detection results and makes them unsuitable for subsequently guiding aircraft, cars, robots, etc. in obstacle avoidance and path planning.
为了解决该技术问题,本实施例提供一种目标检测方法,通过待处理点云中点的数目和每个点的属性,确定点云向量;然后利用三维卷积神经网络对点云向量进行目标三维点云特征提取,再根据提取的目标三维点云特征确定待处理点云中的目标信息,其中,该目标信息包括目标在三维坐标系中各维度的几何位置信息。In order to solve this technical problem, this embodiment provides a target detection method. The point cloud vector is determined by the number of points in the point cloud to be processed and the attributes of each point; then a three-dimensional convolutional neural network is used to target the point cloud vector The three-dimensional point cloud feature is extracted, and the target information in the point cloud to be processed is determined according to the extracted target three-dimensional point cloud feature, where the target information includes the geometric position information of the target in each dimension in the three-dimensional coordinate system.
由于利用三维卷积神经网络对点云向量进行目标三维点云特征提取,在进行三维点云特征提取时,使用三维卷积来保证高度信息的保留,进而 最大程度保留点云的三维结构信息,根据提取的目标三维点云特征确定待处理点云中的目标信息,能够准确找出点云中的三维目标,解决现有目标检测结果出错率较高,后续无法指导飞机、汽车、机器人等进行避障和路径规划的问题。Since the three-dimensional convolutional neural network is used to extract the target three-dimensional point cloud feature from the point cloud vector, when performing the three-dimensional point cloud feature extraction, three-dimensional convolution is used to ensure the retention of height information, and then the three-dimensional structure information of the point cloud is retained to the greatest extent. Determine the target information in the point cloud to be processed according to the extracted target three-dimensional point cloud features, which can accurately find the three-dimensional target in the point cloud, solve the problem of high error rate of existing target detection results, and fail to guide aircraft, cars, robots, etc. The problem of obstacle avoidance and path planning.
图1为本申请实施例提供的目标检测系统的架构示意图。如图1所示，包括：传感器101、第一处理器102和第二处理器103。以目标为车辆为例，传感器101可以实时生成待处理点云，其中，该点云用于标识车辆周围环境的地面。第一处理器102可以结合传感器101生成的点云中点的数目和每个点的属性，确定点云向量，该点云向量包括点在三维坐标系中各维度的几何位置信息，并利用三维卷积神经网络对点云向量处理，以提取点云向量的目标三维点云特征，根据目标三维点云特征确定上述点云中的目标信息，该目标信息包括目标在三维坐标系中各维度的几何位置信息，并将目标信息送入至第二处理器103做后续的行驶规划和使用。FIG. 1 is a schematic structural diagram of a target detection system provided by an embodiment of the application. As shown in FIG. 1, it includes: a sensor 101, a first processor 102, and a second processor 103. Taking a vehicle as the target as an example, the sensor 101 can generate a point cloud to be processed in real time, where the point cloud is used to identify the ground of the environment around the vehicle. The first processor 102 can determine a point cloud vector from the number of points in the point cloud generated by the sensor 101 and the attributes of each point, the point cloud vector including the geometric position information of the points in each dimension of the three-dimensional coordinate system; process the point cloud vector with a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector; and determine the target information in the point cloud according to the target three-dimensional point cloud feature, the target information including the geometric position information of the target in each dimension of the three-dimensional coordinate system. The target information is then sent to the second processor 103 for subsequent driving planning and use.
这里，第一处理器102和第二处理器103可以为车用计算平台、无人飞行器处理器等。本实施例对第一处理器102和第二处理器103的实现方式不做特别限制，只要第一处理器102和第二处理器103能够实现上述相应功能即可。Here, the first processor 102 and the second processor 103 may be a vehicle computing platform, an unmanned aerial vehicle processor, or the like. This embodiment does not particularly limit the implementation of the first processor 102 and the second processor 103, as long as they can perform the corresponding functions described above.
应理解上述架构仅为一种示例性***架构框图,具体实施时,可以根据应用需求设置,例如第一处理器102和第二处理器103可以单独设置,也可以合到一起,满足不同应用需求。It should be understood that the foregoing architecture is only an exemplary system architecture block diagram. During specific implementation, it can be set according to application requirements. For example, the first processor 102 and the second processor 103 can be set separately or combined to meet different application requirements. .
另外,上述目标检测***还可以包括接收装置、显示装置等。In addition, the above-mentioned target detection system may also include a receiving device, a display device, and the like.
在具体实现过程中,接收装置可以是输入/输出接口,也可以是通信接口。接收装置可以接收用户的指令,例如接收装置可以是连接鼠标的输入接口。In the specific implementation process, the receiving device can be an input/output interface or a communication interface. The receiving device may receive a user's instruction, for example, the receiving device may be an input interface connected to a mouse.
显示装置可以用于对上述目标信息进行显示。显示装置还可以是触摸显示屏,用于在显示上述目标信息的同时接收用户指令,以实现与用户的交互。The display device can be used to display the above-mentioned target information. The display device may also be a touch screen, which is used to receive user instructions while displaying the above-mentioned target information, so as to realize interaction with the user.
下面以具体地实施例对本申请的技术方案以及本申请的技术方案如何解决上述技术问题进行详细说明。下面这几个具体的实施例可以相互结合,对于相同或相似的概念或过程可能在某些实施例中不再赘述。下面将 结合附图,对本申请的实施例进行描述。The technical solutions of the present application and how the technical solutions of the present application solve the above technical problems will be described in detail below with specific embodiments. The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application will be described below in conjunction with the drawings.
图2为本申请实施例提供的目标检测方法的流程示意图,本实施例的执行主体可以为图1所示实施例中的第一处理器,如图2所示,该方法包括:FIG. 2 is a schematic flowchart of a target detection method provided by an embodiment of this application. The execution subject of this embodiment may be the first processor in the embodiment shown in FIG. 1. As shown in FIG. 2, the method includes:
S201、根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息。S201: Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, where the point cloud vector includes geometric position information of the points in each dimension in the three-dimensional coordinate system.
其中,上述点云可以为图像点云、雷达点云、激光点云等,在后续处理中可以根据实际情况采用上述一种或多种点云。The above-mentioned point cloud may be an image point cloud, a radar point cloud, a laser point cloud, etc., and one or more of the above-mentioned point clouds may be used in the subsequent processing according to actual conditions.
这里，如上述，在根据待处理点云中点的数目和每个点的属性，确定点云向量之前，可以通过传感器获取待处理点云，具体的，可以在三维空间对获取范围进行限制，例如获取传感器前F米，传感器后B米，传感器左右各L、R米，传感器上U米，下D米。这样，整个三维空间的点云处理范围便限制在了(F+B)x(L+R)x(U+D)的范围之内。其中，F、B、L、R、U、D的取值可以根据实际情况设置。Here, as described above, before the point cloud vector is determined according to the number of points in the point cloud to be processed and the attributes of each point, the point cloud to be processed may be acquired by a sensor. Specifically, the acquisition range may be limited in three-dimensional space, for example, to F meters in front of the sensor, B meters behind the sensor, L meters to the left and R meters to the right of the sensor, U meters above the sensor, and D meters below it. In this way, the point cloud processing range of the entire three-dimensional space is limited to (F+B)x(L+R)x(U+D). The values of F, B, L, R, U, and D can be set according to actual conditions.
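The range limitation above can be sketched as a simple box filter over the raw points. A minimal numpy sketch — the concrete range values F, B, L, R, U, D and the axis sign convention (X forward, Y left-positive, Z up) are illustrative assumptions, since the patent leaves them to be set according to actual conditions:

```python
import numpy as np

# Hypothetical range values; the patent only names F, B, L, R, U, D.
F, B = 40.0, 10.0   # metres in front of / behind the sensor (X axis)
L, R = 20.0, 20.0   # metres to the left / right of the sensor (Y axis)
U, D = 2.0, 3.0     # metres above / below the sensor (Z axis)

def crop_to_range(points):
    """Keep only points inside the (F+B) x (L+R) x (U+D) box around the sensor.

    points: (M, 4) array of x, y, z, reflectivity.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    mask = (x >= -B) & (x < F) & (y >= -L) & (y < R) & (z >= -D) & (z < U)
    return points[mask]

pts = np.array([[5.0, 0.0, 0.5, 0.3],     # inside the box
                [60.0, 0.0, 0.5, 0.3],    # too far in front (x >= F)
                [5.0, -25.0, 0.5, 0.3]])  # too far to the left (y < -L)
print(crop_to_range(pts).shape[0])  # → 1
```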
在获取待处理点云后,可以根据待处理点云中点的数目和每个点的属性,确定点云向量。After obtaining the point cloud to be processed, the point cloud vector can be determined according to the number of points in the point cloud to be processed and the attributes of each point.
可选地,所述根据待处理点云中点的数目和每个点的属性,确定点云向量,包括:Optionally, the determining the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point includes:
对所述待处理点云进行网格划分;Meshing the point cloud to be processed;
根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,其中,每个点的属性包括点在三维坐标系中的三维坐标和反射率;According to the number of points in each grid after division and the attributes of each point, determine the vector of each grid after division, where the attributes of each point include the three-dimensional coordinates and reflectivity of the point in the three-dimensional coordinate system;
根据所述划分后每个网格的向量,确定所述点云向量。The point cloud vector is determined according to the vector of each grid after the division.
具体地,所述根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,包括:Specifically, the determining the vector of each grid after division according to the number of points in each grid after division and the attributes of each point includes:
根据预设点数对划分后每个网格中点的数目进行调整;Adjust the number of points in each grid after division according to the preset number of points;
根据调整后每个网格中点的数目和每个点的属性的乘积,确定划分后每个网格的向量。According to the product of the number of points in each grid after adjustment and the attributes of each point, the vector of each grid after division is determined.
示例性的，沿着点云坐标系的每个坐标轴的方向，进行网格的划分，前视方向(X轴)每resx米进行划分，左右方向(Y轴)每resy米进行划分，向上方向(Z轴)每resz米进行划分。这样，对于整个三维空间，共划分出了T=((F+B)/resx)*((L+R)/resy)*((U+D)/resz)个小网格，每个小网格都是一个resx*resy*resz的小长方体。其中，resx、resy和resz的取值可以根据实际情况设置。Exemplarily, the grid is divided along the direction of each coordinate axis of the point cloud coordinate system: the forward direction (X axis) is divided every resx meters, the left-right direction (Y axis) every resy meters, and the upward direction (Z axis) every resz meters. In this way, the entire three-dimensional space is divided into T=((F+B)/resx)*((L+R)/resy)*((U+D)/resz) small grids, each of which is a small cuboid of resx*resy*resz. The values of resx, resy, and resz can be set according to actual conditions.
而且，对于上述所有的点，按照网格的范围进行划分，判断每个点所属的网格，对于任意一个网格，可以限制其所含的激光点的个数N，当其点数大于N时，可以进行随机采样，得到其中的N个点，舍弃多余的激光点。当点数不足N时，可以进行随机复制，达到N个点，这样所有的小网格便包含了相同的点数。其中，上述个数N的取值可以根据实际情况设置。Moreover, all the above points are assigned to grids according to the grid ranges, and the grid to which each point belongs is determined. For any grid, the number of laser points it contains can be limited to N: when the number of points is greater than N, random sampling can be performed to obtain N of the points and discard the redundant laser points; when the number of points is less than N, random duplication can be performed to reach N points, so that all small grids contain the same number of points. The value of N can be set according to actual conditions.
这样，通过上述处理，整个点云便用K*N*4的向量表示，其中，K表示点数不为0的网格的个数。N为每个网格最大点数。4表示每个点有4维属性，分别为xyz坐标和反射率。应理解，除使用K*N*4的向量表示上述点云外，还可以把其它信息加入点云向量中，例如密度、高度等等信息。In this way, through the above processing, the entire point cloud is represented by a K*N*4 vector, where K is the number of grids whose number of points is not zero, N is the maximum number of points in each grid, and 4 means that each point has four-dimensional attributes, namely the xyz coordinates and the reflectivity. It should be understood that, in addition to representing the point cloud with a K*N*4 vector, other information, such as density and height, can also be added to the point cloud vector.
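The grid division, random sampling/duplication to N points, and the resulting K*N*4 encoding described above can be sketched as follows. The resolutions, origin, and n_max are placeholder values, and the dictionary-based binning is just one straightforward implementation choice:

```python
import numpy as np

def voxelize(points, res=(0.2, 0.2, 0.4), n_max=32,
             origin=(-10.0, -20.0, -3.0), seed=0):
    """Encode the point cloud as a K*N*4 tensor: K non-empty grid cells,
    each randomly sub-sampled or randomly duplicated to exactly n_max points.

    points: (M, 4) array of x, y, z, reflectivity.
    Returns (coords, feats): (K, 3) integer cell indices, (K, n_max, 4) features.
    """
    rng = np.random.default_rng(seed)
    idx = np.floor((points[:, :3] - np.asarray(origin)) / np.asarray(res)).astype(np.int64)
    cells = {}
    for i, key in enumerate(map(tuple, idx)):       # bin each point into its grid cell
        cells.setdefault(key, []).append(i)
    coords, feats = [], []
    for key, members in cells.items():
        members = np.asarray(members)
        if len(members) > n_max:                    # too many: random sampling, drop the rest
            members = rng.choice(members, n_max, replace=False)
        elif len(members) < n_max:                  # too few: random duplication up to n_max
            extra = rng.choice(members, n_max - len(members), replace=True)
            members = np.concatenate([members, extra])
        coords.append(key)
        feats.append(points[members])
    return np.asarray(coords), np.stack(feats)

pts = np.random.default_rng(1).uniform(0.0, 1.0, size=(100, 4))
coords, feats = voxelize(pts)
print(feats.shape[1:])  # every non-empty cell holds exactly n_max points: (32, 4)
```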
上述将整个三维空间划分为定量的小网格，再确定每个网格的特征向量，即通过一定的结构化方式进行点云的预处理(点云编码)，最大化地保留点云的原始结构信息。As described above, the entire three-dimensional space is divided into a fixed number of small grids, and the feature vector of each grid is then determined; that is, the point cloud is preprocessed (point cloud encoding) in a structured manner, preserving the original structure information of the point cloud to the greatest extent.
S202、利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征。S202: Use a three-dimensional convolutional neural network to process the point cloud vector to extract a target three-dimensional point cloud feature of the point cloud vector.
这里,对于特征提取,为了保留三维结构信息,本申请实施例采用三维卷积,从三维空间提取特征,保留空间结构信息。Here, for feature extraction, in order to retain three-dimensional structure information, the embodiment of the present application adopts three-dimensional convolution to extract features from three-dimensional space and retain spatial structure information.
可选地,所述三维卷积神经网络包括第三卷积神经网络。Optionally, the three-dimensional convolutional neural network includes a third convolutional neural network.
所述利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征,包括:The processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取。The third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
其中,所述第三卷积神经网络通过三维网格向量和三维网格特征训练得到。Wherein, the third convolutional neural network is obtained through three-dimensional grid vector and three-dimensional grid feature training.
这里，在上述通过一定的结构化方式进行点云的预处理之后，对每个小网格利用三维卷积神经网络进行特征提取，这部分可以用卷积层、上采样层和全连层实现，即输入上述K*N*4的向量，通过一系列卷积层、上采样层和全连层之后，得到的特征向量为K*C，其中每个小网格所对应的特征为C维。对于三维空间所有网格而言，每个网格对应的特征为C维，若是该网格没有点，则特征为C维的零向量。若小网格的总数为X1*Y1*Z1，则得到的特征向量为X1*Y1*Z1*C，得到了此结构化的数据，便可以使用卷积操作直接提取特征了。Here, after the point cloud is preprocessed in the structured manner described above, feature extraction is performed on each small grid using the three-dimensional convolutional neural network. This part can be implemented with convolutional layers, up-sampling layers, and fully connected layers: the above K*N*4 vector is input and, after passing through a series of convolutional layers, up-sampling layers, and fully connected layers, a K*C feature vector is obtained, in which the feature corresponding to each small grid is C-dimensional. For all grids in the three-dimensional space, the feature corresponding to each grid is C-dimensional; if a grid contains no points, its feature is a C-dimensional zero vector. If the total number of small grids is X1*Y1*Z1, the resulting feature vector is X1*Y1*Z1*C. With this structured data, features can be extracted directly using convolution operations.
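The K*N*4 → K*C step can be illustrated with a toy stand-in for the per-grid encoder: a shared fully connected layer (random, untrained weights here) followed by a max-pool over the N points of each grid. The real network is trained and deeper; only the shapes and the pooling idea are shown:

```python
import numpy as np

def grid_features(feats, W, b):
    """Map a (K, N, 4) grid encoding to a (K, C) feature matrix."""
    h = np.maximum(feats @ W + b, 0.0)   # shared FC + ReLU over every point: (K, N, C)
    return h.max(axis=1)                 # pool across the N points of each grid: (K, C)

rng = np.random.default_rng(0)
K, N, C = 5, 32, 16
feats = rng.normal(size=(K, N, 4))                          # K*N*4 input vector
out = grid_features(feats, rng.normal(size=(4, C)), np.zeros(C))
print(out.shape)  # → (5, 16)
```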
可选地,所述三维卷积神经网络还包括第四卷积神经网络。Optionally, the three-dimensional convolutional neural network further includes a fourth convolutional neural network.
在所述利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取之后,还包括:After the third convolutional neural network is used to extract the three-dimensional grid features of the divided vectors of each grid, the method further includes:
利用所述第四卷积神经网络对提取的三维网格特征进行目标三维点云特征提取。The fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
其中,所述第四卷积神经网络通过三维网格特征和三维点云特征训练得到。Wherein, the fourth convolutional neural network is obtained through three-dimensional grid feature and three-dimensional point cloud feature training.
示例性的，如果使用二维卷积，则需要将特征向量reshape为X1*Y1*(Z1*C)，其中Z1*C记为特征的维度，使用二维卷积时，需要将高度维度(Z方向)与特征通道维度进行合并，此时会损失一定的高度信息。本申请实施例使用三维卷积，能够保持各个方向的结构信息，最大化地保留原始点云的空间信息。经过一系列的三维卷积等操作后，将会得到最终的特征向量，记为X2*Y2*Z2*C2。Exemplarily, if two-dimensional convolution were used, the feature vector would need to be reshaped to X1*Y1*(Z1*C), where Z1*C is taken as the feature dimension; when using two-dimensional convolution, the height dimension (Z direction) is merged with the feature channel dimension, and some height information is lost. The embodiment of the present application uses three-dimensional convolution, which maintains the structural information in each direction and preserves the spatial information of the original point cloud to the greatest extent. After a series of operations such as three-dimensional convolution, the final feature vector is obtained, denoted as X2*Y2*Z2*C2.
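The difference can be seen purely at the level of tensor shapes (the sizes below are arbitrary examples): with 2-D convolution the height axis is folded into the channel axis, so a 2-D kernel can no longer treat Z as a spatial direction, whereas a 3-D kernel keeps it.

```python
import numpy as np

X1, Y1, Z1, C = 16, 16, 8, 4
vol = np.zeros((X1, Y1, Z1, C))          # X1*Y1*Z1*C feature volume

flat_2d = vol.reshape(X1, Y1, Z1 * C)    # 2-D route: Z merged into channels
print(flat_2d.shape)                      # → (16, 16, 32) — height folded away

def conv3d_out_shape(shape, k=3):
    """Output spatial shape of a k*k*k 'valid' 3-D convolution."""
    x, y, z, _ = shape
    return (x - k + 1, y - k + 1, z - k + 1)

print(conv3d_out_shape(vol.shape))        # → (14, 14, 6) — height survives as an axis
```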
另外,上述三维卷积神经网络可以包括:In addition, the above-mentioned three-dimensional convolutional neural network may include:
多个卷积层,用于对所述点云向量进行卷积操作,以输出三维点云特征;Multiple convolutional layers for performing convolution operations on the point cloud vector to output three-dimensional point cloud features;
上采样层,与所述多个卷积层中的至少一个卷积层相连接,用于获取所述多个卷积层中的至少一个卷积层输出的三维点云特征,并对获取的三维点云特征进行处理,以输出处理后的三维点云特征;The up-sampling layer is connected to at least one convolutional layer of the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by at least one of the multiple convolutional layers, and to compare the obtained 3D point cloud features are processed to output the processed 3D point cloud features;
全连接层,与所述多个卷积层中的卷积层和所述上采样层相连接,用于获取一卷积层输出的三维点云特征和所述处理后的三维点云特征,并对获取的三维点云特征和所述处理后的三维点云特征进行特征融合,以生成融合后的三维点云特征,将所述融合后的三维点云特征输入另一卷积层, 以经过所述另一卷积层的卷积操作后,确定所述目标三维点云特征。The fully connected layer is connected to the convolutional layer and the upsampling layer of the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by a convolutional layer and the processed three-dimensional point cloud features, And perform feature fusion on the acquired three-dimensional point cloud features and the processed three-dimensional point cloud features to generate a fused three-dimensional point cloud feature, and input the fused three-dimensional point cloud feature into another convolutional layer to After the convolution operation of the other convolution layer, the target three-dimensional point cloud feature is determined.
可选地,所述多个卷积层中卷积层的深度不同;Optionally, the depths of the convolutional layers in the multiple convolutional layers are different;
所述全连接层与所述多个卷积层中的第一卷积层、第二卷积层和所述上采样层连接,以获取所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征,并对所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征进行特征融合,以生成所述融合后的三维点云特征,将所述融合后的三维点云特征输入所述第二卷积层,以经过所述第二卷积层的卷积操作后,确定所述目标三维点云特征,其中,所述第二卷积层的深度大于所述第一卷积层的深度。The fully connected layer is connected to the first convolutional layer, the second convolutional layer, and the up-sampling layer among the plurality of convolutional layers to obtain the three-dimensional point cloud features output by the first convolutional layer and The processed three-dimensional point cloud feature, and feature fusion is performed on the three-dimensional point cloud feature output by the first convolution layer and the processed three-dimensional point cloud feature to generate the fused three-dimensional point cloud feature , Input the fused three-dimensional point cloud feature into the second convolutional layer, and after the convolution operation of the second convolutional layer, determine the target three-dimensional point cloud feature, wherein the second The depth of the convolutional layer is greater than the depth of the first convolutional layer.
可选地,所述全连接层的个数为多个。Optionally, the number of the fully connected layers is multiple.
可选地,所述上采样层的个数为多个。Optionally, the number of up-sampling layers is multiple.
示例性的，如图3所示，上述三维卷积神经网络包括第一卷积层、第二卷积层、第三卷积层、第四卷积层、第一上采样层、第一全连接层、第五卷积层、第二上采样层、第二全连接层和第六卷积层，其中，第一卷积层、第二卷积层、第三卷积层、第四卷积层、第五卷积层和第六卷积层的激活函数均使用Relu。Exemplarily, as shown in FIG. 3, the above three-dimensional convolutional neural network includes a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, a first up-sampling layer, a first fully connected layer, a fifth convolutional layer, a second up-sampling layer, a second fully connected layer, and a sixth convolutional layer, where the first, second, third, fourth, fifth, and sixth convolutional layers all use ReLU as the activation function.
第一卷积层、第二卷积层、第三卷积层和第四卷积层依次对提取的三维点云特征进行三维特征提取，第一上采样层用于根据第一预设空间分辨率对第四卷积层提取的三维特征进行上采样，第一全连接层用于对第一上采样层上采样的三维特征和第三卷积层提取的三维特征进行特征融合，第五卷积层用于对第一全连接层融合后的特征进行三维特征提取，第二上采样层用于根据第二预设空间分辨率对第五卷积层提取的三维特征进行上采样，第二全连接层用于对第二上采样层上采样的三维特征和第二卷积层提取的三维特征进行特征融合，第六卷积层用于对第二全连接层融合后的特征进行三维特征提取，确定目标三维点云特征。The first, second, third, and fourth convolutional layers successively perform three-dimensional feature extraction on the extracted three-dimensional point cloud features. The first up-sampling layer up-samples the three-dimensional features extracted by the fourth convolutional layer according to a first preset spatial resolution; the first fully connected layer fuses the features up-sampled by the first up-sampling layer with the three-dimensional features extracted by the third convolutional layer; the fifth convolutional layer performs three-dimensional feature extraction on the features fused by the first fully connected layer; the second up-sampling layer up-samples the three-dimensional features extracted by the fifth convolutional layer according to a second preset spatial resolution; the second fully connected layer fuses the features up-sampled by the second up-sampling layer with the three-dimensional features extracted by the second convolutional layer; and the sixth convolutional layer performs three-dimensional feature extraction on the features fused by the second fully connected layer to determine the target three-dimensional point cloud feature.
其中,上述多个卷积层中卷积层的深度不同。第一卷积层、第二卷积层、第三卷积层、第四卷积层的深度依次增大。Wherein, the depths of the convolutional layers in the above-mentioned multiple convolutional layers are different. The depths of the first convolutional layer, the second convolutional layer, the third convolutional layer, and the fourth convolutional layer increase in order.
上述上采样层能够放大图像,增加图像的信息,例如卷积层降低图像的分辨率,通过上采样层上采样后能够使上采样后的图像质量超过原图质量,方便后续处理。The above-mentioned up-sampling layer can enlarge the image and increase the information of the image. For example, the convolutional layer reduces the resolution of the image. After up-sampling by the up-sampling layer, the quality of the up-sampled image can exceed the quality of the original image, which facilitates subsequent processing.
上述全连接层实现数据的拼接、融合,多个卷积层输出的数据(有些通过上采样层上采样),分别输入不同的全连接层,融合深层和浅层的特征。The above-mentioned fully connected layer realizes the splicing and fusion of data, and the data output by multiple convolutional layers (some of which are upsampled by the upsampling layer) are input into different fully connected layers respectively, and the deep and shallow features are merged.
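The layer topology of FIG. 3 can be sketched purely at the level of tensor shapes. Every function below is a stand-in for a trained layer: the strides, channel widths, and concatenation-style fusion are illustrative assumptions, not the actual trained network.

```python
import numpy as np

def conv(x, c_out, stride=1):              # stand-in for a 3-D convolutional layer
    X, Y, Z, _ = x.shape
    return np.zeros((X // stride, Y // stride, Z // stride, c_out))

def upsample(x):                            # stand-in for an up-sampling layer
    X, Y, Z, C = x.shape
    return np.zeros((X * 2, Y * 2, Z * 2, C))

def fc_fuse(a, b):                          # stand-in for the fully connected fusion
    return np.concatenate([a, b], axis=-1)

x  = np.zeros((32, 32, 16, 8))              # X1*Y1*Z1*C grid features
c1 = conv(x,   16, stride=2)                # depth grows layer by layer
c2 = conv(c1,  32, stride=2)
c3 = conv(c2,  64, stride=2)
c4 = conv(c3, 128, stride=2)
f1 = fc_fuse(upsample(c4), c3)              # fuse deep (conv4) with shallower (conv3)
c5 = conv(f1,  64)
f2 = fc_fuse(upsample(c5), c2)              # fuse again with conv2 features
c6 = conv(f2,  64)                          # final X2*Y2*Z2*C2 target feature
print(c6.shape)  # → (8, 8, 4, 64)
```

Note how each up-sampling step restores a spatial resolution that matches an earlier, shallower layer, which is what makes the channel-wise fusion of deep and shallow features possible.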
S203、根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。S203: Determine target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud, where the target information includes geometric position information of the target in each dimension in a three-dimensional coordinate system.
所述待处理点云包括与目标相对应的待处理点云。这里,可以通过对点云进行编码,用于神经网络的输入,通过神经网络学习点云的特征,直接用于三维目标的预测。该预测是在神经网络特征图上的密集预测,能够通过端到端的学习,得到最终的检测结果。The point cloud to be processed includes the point cloud to be processed corresponding to the target. Here, the point cloud can be coded and used as the input of the neural network, and the characteristics of the point cloud can be learned through the neural network, which can be directly used for the prediction of the three-dimensional target. The prediction is a dense prediction on the feature map of the neural network, and the final detection result can be obtained through end-to-end learning.
可选地,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:Optionally, the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
利用第一卷积神经网络,基于所述目标三维点云特征确定所述待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角。The first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
其中,所述第一卷积神经网络通过点云特征以及目标在三维坐标系中的中心点坐标、三维尺寸和航偏角训练得到。Wherein, the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
这里，对于每一幅场景点云，首先提取其特征，然后利用卷积神经网络基于上述特征确定目标信息，可以用7个参数进行表示，(x,y,z)表示其中心点的坐标，(l,h,w)表示其三维尺寸，r表示其航偏角。其中，上述目标信息除选择中心点、三维尺寸、航偏角的方式来表示外，还可以使用角点的坐标、底面矩形的坐标、高度等来表示。Here, for each scene point cloud, its features are first extracted, and a convolutional neural network then determines the target information based on those features. The target information can be represented by 7 parameters: (x,y,z) denotes the coordinates of the center point, (l,h,w) denotes the three-dimensional size, and r denotes the yaw angle. Besides the center point, three-dimensional size, and yaw angle, the target information can also be represented by the coordinates of the corner points, the coordinates of the bottom rectangle, the height, and the like.
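One way to convert between the 7-parameter representation (x, y, z, l, h, w, r) and the corner representation mentioned above is to expand the box into its 8 corner points. The axis convention below (l along X, w along Y, h along Z, yaw r about Z) is an assumption for illustration:

```python
import numpy as np

def box_to_corners(x, y, z, l, h, w, r):
    """Return the (8, 3) corner points of a yaw-rotated 3-D box."""
    dx, dy, dz = l / 2.0, w / 2.0, h / 2.0
    local = np.array([[sx * dx, sy * dy, sz * dz]          # corners around the origin
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    rot = np.array([[np.cos(r), -np.sin(r), 0.0],          # yaw rotation about Z
                    [np.sin(r),  np.cos(r), 0.0],
                    [0.0,        0.0,       1.0]])
    return local @ rot.T + np.array([x, y, z])

corners = box_to_corners(1.0, 2.0, 0.0, l=4.0, h=1.5, w=2.0, r=0.0)
print(corners.shape)        # → (8, 3)
print(corners[:, 0].max())  # → 3.0 (centre x=1 plus half-length 2)
```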
具体地,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:Specifically, the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
根据所述目标三维点云特征中的有效点确定所述目标信息。The target information is determined according to the effective points in the target three-dimensional point cloud feature.
可选地,在所述根据所述目标三维点云特征中的有效点确定所述目标信息之前,还包括:Optionally, before the determining the target information according to the effective points in the target three-dimensional point cloud feature, the method further includes:
获得所述目标三维点云特征中的每个点距离其所在网格的中心点的像素欧氏距离;Obtaining the pixel Euclidean distance between each point in the target three-dimensional point cloud feature and the center point of the grid where it is located;
根据所述像素欧氏距离与预设距离阈值,确定所述目标三维点云特征中的有效点。According to the Euclidean distance of the pixel and a preset distance threshold, an effective point in the target three-dimensional point cloud feature is determined.
这里，在对待处理点云进行网格划分后，将每个网格映射到特征图上，得到Ri。这样，对于上述目标三维点云特征上每一个体素点，首先找出离它最近的Ri，判断方式可以用该点与中心点的像素欧氏距离L。此时，会同时设定一个距离阈值T，若L小于T，则此点为有效点，否则为无效点。根据所有的有效点，确定上述目标信息，即中心点的坐标、三维尺寸以及航偏角。其中，距离阈值T可以根据实际情况设置，例如以目标为车辆为例，上述阈值可以根据车身长度设置。Here, after the point cloud to be processed is divided into grids, each grid is mapped onto the feature map to obtain Ri. In this way, for each voxel point on the above-mentioned target three-dimensional point cloud feature, the nearest Ri is found first, and the pixel Euclidean distance L between the point and the center point can be used as the criterion. A distance threshold T is also set: if L is less than T, the point is a valid point; otherwise it is an invalid point. The target information, namely the center point coordinates, the three-dimensional size, and the yaw angle, is then determined from all the valid points. The distance threshold T can be set according to actual conditions; for example, when the target is a vehicle, the threshold can be set according to the length of the vehicle body.
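The valid-point decision just described can be sketched as follows (a simplified illustration on a 2D feature map; the function and variable names are hypothetical, and a real implementation would operate on the voxel grid of the feature):

```python
import math

def select_valid_points(feature_points, grid_centers, threshold_T):
    """For each point on the feature map, find the nearest mapped grid
    center Ri; the point is valid if the pixel Euclidean distance L to
    that center is below the preset threshold T."""
    valid = []
    for px, py in feature_points:
        # L = distance to the nearest grid center
        L = min(math.hypot(px - cx, py - cy) for cx, cy in grid_centers)
        valid.append(L < threshold_T)
    return valid
```

With grid centers at (0, 0) and (10, 10) and T = 2, the point (1, 1) is valid while (5, 5) is not, matching the L &lt; T rule above.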
另外，在上述根据所述目标三维点云特征确定所述待处理点云中的目标信息之后，还可以利用第二卷积神经网络，基于所述目标三维点云特征确定所述目标的类别概率；In addition, after the target information in the point cloud to be processed is determined according to the target three-dimensional point cloud feature, a second convolutional neural network can further be used to determine the class probability of the target based on the target three-dimensional point cloud feature;
根据所述目标的类别概率对所述目标进行错误目标去除。Error target removal is performed on the target according to the category probability of the target.
其中,所述第二卷积神经网络通过点云特征和目标类别概率训练得到。Wherein, the second convolutional neural network is obtained through point cloud feature and target category probability training.
这里，上述确定的待处理点云中的目标信息可能是大量的（对每个三维体素点进行密集预测获得），为了得到准确结果，可以通过非极大值抑制以及设置相应的得分阈值，得到最终的检测结果。具体地，可以基于上述目标三维点云特征确定目标的类别概率，例如为车辆的概率为A%等。最终可以确定目标信息与目标的类别概率的对应关系，例如第一目标信息对应的目标的类别概率：为车辆的概率为99%；第二目标信息对应的目标的类别概率：为车辆的概率为10%。根据目标信息与目标的类别概率的对应关系，进行错误目标去除，例如去除上述第二目标，以此类推，得到最终的检测结果。Here, the target information determined in the point cloud to be processed may be large in quantity (it is obtained by dense prediction on every three-dimensional voxel point). To obtain accurate results, non-maximum suppression can be applied and a corresponding score threshold can be set, yielding the final detection result. Specifically, the class probability of the target can be determined based on the above-mentioned target three-dimensional point cloud feature, for example, a probability of A% of being a vehicle. Finally, the correspondence between each piece of target information and the class probability of the target can be determined; for example, the target corresponding to the first piece of target information has a 99% probability of being a vehicle, while the target corresponding to the second piece has a 10% probability. According to this correspondence, erroneous targets are removed, for example the above-mentioned second target, and so on, to obtain the final detection result.
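The score-threshold filtering and non-maximum suppression step can be sketched as follows (a simplification using axis-aligned bird's-eye-view boxes and hypothetical threshold values; the embodiment's boxes additionally carry a yaw angle, which a full implementation would take into account):

```python
def iou_2d(a, b):
    """Axis-aligned IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def filter_detections(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    """Drop low-probability (erroneous) targets via a score threshold,
    then apply greedy non-maximum suppression to the remaining dense
    per-voxel predictions; returns the kept indices."""
    order = sorted((i for i, s in enumerate(scores) if s >= score_thresh),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou_2d(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep
```

For two heavily overlapping vehicle boxes with probabilities 0.99 and 0.95 plus a stray 0.10 detection, only the 0.99 box survives: the 0.10 box fails the score threshold and the 0.95 box is suppressed by NMS.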
本实施例提供的目标检测方法，通过待处理点云中点的数目和每个点的属性，确定点云向量；然后利用三维卷积神经网络对点云向量进行目标三维点云特征提取，保留高度信息，进而最大程度保留点云的三维结构信息；再根据提取的目标三维点云特征确定待处理点云中的目标信息，其中，该目标信息包括目标在三维坐标系中各维度的几何位置信息，准确找出点云中的三维目标。In the target detection method provided in this embodiment, a point cloud vector is determined from the number of points in the point cloud to be processed and the attributes of each point; a three-dimensional convolutional neural network then extracts target three-dimensional point cloud features from the point cloud vector, retaining height information and thereby preserving the three-dimensional structure information of the point cloud to the greatest extent; the target information in the point cloud to be processed is then determined from the extracted target three-dimensional point cloud features, where the target information includes the geometric position information of the target in each dimension of the three-dimensional coordinate system, so that three-dimensional targets in the point cloud are accurately found.
其中，利用卷积神经网络在点云数据上进行三维目标物体检测，包括检测出物体相对于传感器的坐标、三维尺寸、物体在真实世界的偏航角等信息，从而可以利用点云数据检测出动态障碍物，指导飞机、汽车、机器人等进行避障和路径规划。Here, a convolutional neural network is used to perform three-dimensional target object detection on point cloud data, including detecting the coordinates of the object relative to the sensor, its three-dimensional size, its real-world yaw angle, and other information, so that dynamic obstacles can be detected from the point cloud data to guide aircraft, vehicles, robots, and the like in obstacle avoidance and path planning.
尤其对于自动驾驶汽车而言，通常都会配备激光雷达传感器，通过激光点云进行障碍物的检测，是整个技术环节中非常重要的一环。Especially for self-driving cars, which are usually equipped with lidar sensors, obstacle detection from laser point clouds is a very important part of the entire technical pipeline.
图4为本申请实施例提供的另一种目标检测方法的流程示意图,本实施例在图2实施例的基础上,对本实施例的具体实现过程进行了详细说明。如图4所示,该方法包括:FIG. 4 is a schematic flowchart of another target detection method provided by an embodiment of the application. This embodiment, on the basis of the embodiment in FIG. 2, describes in detail the specific implementation process of this embodiment. As shown in Figure 4, the method includes:
S401、对待处理点云进行网格划分。S401: Perform grid division on the point cloud to be processed.
S402、根据预设点数对划分后每个网格中点的数目进行调整。S402: Adjust the number of points in each grid after division according to the preset number of points.
S403、根据调整后每个网格中点的数目和每个点的属性的乘积，确定划分后每个网格的向量，其中，每个点的属性包括点在三维坐标系中的三维坐标和反射率。S403: Determine the vector of each divided grid according to the product of the adjusted number of points in each grid and the attributes of each point, where the attributes of each point include the three-dimensional coordinates of the point in the three-dimensional coordinate system and its reflectivity.
这里，将整个三维空间划分为定量的小网格，再确定每个网格的特征向量，即通过一定的结构化方式进行点云的预处理（点云编码），最大化地保留点云的原始结构信息。Here, the entire three-dimensional space is divided into a fixed number of small grids, and the feature vector of each grid is then determined; that is, the point cloud is preprocessed (point cloud encoding) in a structured manner to preserve the original structure information of the point cloud as much as possible.
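The grid division and point-number adjustment described in S401–S403 can be sketched as follows (a minimal illustration; the grid size, preset point number, and random subsampling strategy are assumptions for this sketch):

```python
import random

def voxelize(points, grid_size=1.0, max_points=4, seed=0):
    """Divide the point cloud into fixed-size grids (voxels) and adjust
    each grid to a preset number of points: randomly subsample grids
    with too many points, zero-pad grids with too few.  Each point
    carries (x, y, z, reflectivity), so every occupied grid becomes a
    fixed-length vector."""
    rng = random.Random(seed)
    grids = {}
    for x, y, z, refl in points:
        key = (int(x // grid_size), int(y // grid_size), int(z // grid_size))
        grids.setdefault(key, []).append((x, y, z, refl))
    vectors = {}
    for key, pts in grids.items():
        if len(pts) > max_points:
            pts = rng.sample(pts, max_points)  # subsample to the preset number
        else:
            pts = pts + [(0.0, 0.0, 0.0, 0.0)] * (max_points - len(pts))  # pad
        vectors[key] = [v for p in pts for v in p]  # flatten to one grid vector
    return vectors
```

Every grid vector has the same length (max_points × 4 values), which is what allows the grids to be fed into a convolutional network as a regular tensor.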
S404、利用第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取,其中,第三卷积神经网络通过三维网格向量和三维网格特征训练得到。S404: Use the third convolutional neural network to extract the three-dimensional grid features of the divided vectors of each grid, where the third convolutional neural network is obtained by training the three-dimensional grid vector and the three-dimensional grid feature.
S405、利用第四卷积神经网络对提取的三维网格特征进行目标三维点云特征提取,其中,第四卷积神经网络通过三维网格特征和三维点云特征训练得到。S405: Perform target 3D point cloud feature extraction on the extracted 3D grid features using the fourth convolutional neural network, where the fourth convolutional neural network is obtained by training on the 3D grid feature and the 3D point cloud feature.
其中,第三卷积神经网络和第四卷积神经网络均为三维卷积神经网络。Among them, the third convolutional neural network and the fourth convolutional neural network are both three-dimensional convolutional neural networks.
对于特征提取,为了保留三维结构信息,本申请实施例采用三维卷积,从三维空间提取特征,保留空间结构信息。For feature extraction, in order to retain three-dimensional structure information, the embodiment of the present application adopts three-dimensional convolution to extract features from three-dimensional space and retain spatial structure information.
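The difference from a 2D (bird's-eye-view) convolution is that the kernel also slides along the height axis. A minimal "valid"-mode 3D convolution can be sketched as follows (plain Python over nested lists purely for illustration; a real implementation would use an optimized 3D convolution over the voxel features):

```python
def conv3d_valid(volume, kernel):
    """Minimal 'valid' 3D convolution over a (D, H, W) volume.  Unlike a
    2D convolution on a flattened top-down view, the kernel also moves
    along the depth (height) axis, so vertical structure is preserved."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - kd + 1):
        plane = []
        for h in range(H - kh + 1):
            row = []
            for w in range(W - kw + 1):
                s = sum(volume[d + i][h + j][w + k] * kernel[i][j][k]
                        for i in range(kd) for j in range(kh) for k in range(kw))
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out
```

Note that the output still has a depth dimension (a 3×3×3 input with a 2×2×2 kernel yields a 2×2×2 output), i.e. the height information is carried through rather than collapsed.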
S406、利用第一卷积神经网络，基于上述目标三维点云特征确定待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角，其中，第一卷积神经网络通过点云特征以及目标在三维坐标系中的中心点坐标、三维尺寸和航偏角训练得到。S406: Use the first convolutional neural network to determine, based on the above-mentioned target three-dimensional point cloud feature, the center point coordinates, three-dimensional size, and yaw angle in the three-dimensional coordinate system of the target in the point cloud to be processed, where the first convolutional neural network is trained on point cloud features together with the center point coordinates, three-dimensional sizes, and yaw angles of targets in the three-dimensional coordinate system.
上述对点云进行编码后，用于神经网络的输入，通过神经网络学习点云的特征，直接用于三维目标的预测。该预测是在神经网络特征图上的密集预测，能够通过端到端的学习，得到最终的检测结果。After the point cloud is encoded as above, it is used as the input of the neural network; the neural network learns the features of the point cloud, which are used directly for the prediction of three-dimensional targets. The prediction is a dense prediction on the feature map of the neural network, and the final detection result can be obtained through end-to-end learning.
S407、利用第二卷积神经网络，基于目标三维点云特征确定目标的类别概率，其中，第二卷积神经网络通过点云特征和目标类别概率训练得到。S407: Use the second convolutional neural network to determine the class probability of the target based on the target three-dimensional point cloud feature, where the second convolutional neural network is trained on point cloud features and target class probabilities.
S408、根据目标的类别概率对目标进行错误目标去除。S408: Perform wrong target removal on the target according to the class probability of the target.
由于上述确定的待处理点云中的目标信息可能是大量的（对每个三维体素点进行密集预测获得），为了得到准确结果，对错误目标进行去除，得到最终的检测结果。Since the target information determined in the point cloud to be processed may be large in quantity (obtained by dense prediction on every three-dimensional voxel point), erroneous targets are removed to obtain an accurate final detection result.
本实施例提供的目标检测方法，通过待处理点云中点的数目和每个点的属性，确定点云向量；然后利用三维卷积神经网络对点云向量进行目标三维点云特征提取，保留高度信息，进而最大程度保留点云的三维结构信息；再根据提取的目标三维点云特征确定待处理点云中的目标信息，其中，该目标信息包括目标在三维坐标系中各维度的几何位置信息，准确找出点云中的三维目标，解决现有目标检测结果出错率较高，后续无法指导飞机、汽车、机器人等进行避障和路径规划的问题。In the target detection method provided in this embodiment, a point cloud vector is determined from the number of points in the point cloud to be processed and the attributes of each point; a three-dimensional convolutional neural network then extracts target three-dimensional point cloud features from the point cloud vector, retaining height information and thereby preserving the three-dimensional structure information of the point cloud to the greatest extent; the target information in the point cloud to be processed is then determined from the extracted target three-dimensional point cloud features, where the target information includes the geometric position information of the target in each dimension of the three-dimensional coordinate system. Three-dimensional targets in the point cloud are thus found accurately, solving the problem that existing target detection results have a high error rate and therefore cannot guide aircraft, vehicles, robots, and the like in obstacle avoidance and path planning.
图5为本申请实施例提供的一种目标检测设备的结构示意图。为了便于说明,仅示出了与本申请实施例相关的部分。如图5所示,该目标检测设备50包括:第一确定模块501、提取模块502和第二确定模块503。FIG. 5 is a schematic structural diagram of a target detection device provided by an embodiment of the application. For ease of description, only the parts related to the embodiments of the present application are shown. As shown in FIG. 5, the target detection device 50 includes: a first determination module 501, an extraction module 502, and a second determination module 503.
其中,第一确定模块501,用于根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息。The first determining module 501 is configured to determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, and the point cloud vector includes the geometric position information of the points in each dimension in the three-dimensional coordinate system.
提取模块502,用于利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征。The extraction module 502 is configured to process the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector.
第二确定模块503,用于根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The second determining module 503 is configured to determine target information in the point cloud to be processed according to the characteristics of the target three-dimensional point cloud, where the target information includes geometric position information of the target in each dimension in a three-dimensional coordinate system.
本实施例提供的设备,可用于执行上述方法实施例的技术方案,其实现原理和技术效果类似,本实施例此处不再赘述。The device provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and details are not described herein again in this embodiment.
图6为本发明实施例提供的另一目标检测设备的结构示意图。如图6所示,本实施例在图5实施例的基础上,上述目标检测设备还包括:第三确定模块504和去除模块505。Fig. 6 is a schematic structural diagram of another target detection device provided by an embodiment of the present invention. As shown in FIG. 6, this embodiment is based on the embodiment in FIG. 5, and the foregoing target detection device further includes: a third determination module 504 and a removal module 505.
在一种可能的设计中,所述第二确定模块503,具体用于:In a possible design, the second determining module 503 is specifically configured to:
利用第一卷积神经网络，基于所述目标三维点云特征确定所述待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角。The first convolutional neural network is used to determine, based on the target three-dimensional point cloud feature, the center point coordinates, three-dimensional size, and yaw angle in the three-dimensional coordinate system of the target in the point cloud to be processed.
在一种可能的设计中,所述第一卷积神经网络通过点云特征以及目标在三维坐标系中的中心点坐标、三维尺寸和航偏角训练得到。In a possible design, the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
在一种可能的设计中，所述第三确定模块504，用于在所述第二确定模块503根据所述目标三维点云特征确定所述待处理点云中的目标信息之后，利用第二卷积神经网络，基于所述目标三维点云特征确定所述目标的类别概率。In a possible design, the third determining module 504 is configured to, after the second determining module 503 determines the target information in the point cloud to be processed according to the target three-dimensional point cloud feature, use a second convolutional neural network to determine the class probability of the target based on the target three-dimensional point cloud feature.
所述去除模块505,用于根据所述目标的类别概率对所述目标进行错误目标去除。The removal module 505 is configured to remove erroneous targets for the target according to the class probability of the target.
在一种可能的设计中,所述第二卷积神经网络通过点云特征和目标类别概率训练得到。In a possible design, the second convolutional neural network is obtained through point cloud feature and target category probability training.
在一种可能的设计中,所述第一确定模块501,具体用于:In a possible design, the first determining module 501 is specifically configured to:
对所述待处理点云进行网格划分;Meshing the point cloud to be processed;
根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,其中,每个点的属性包括点在三维坐标系中的三维坐标和反射率;According to the number of points in each grid after division and the attributes of each point, determine the vector of each grid after division, where the attributes of each point include the three-dimensional coordinates and reflectivity of the point in the three-dimensional coordinate system;
根据所述划分后每个网格的向量,确定所述点云向量。The point cloud vector is determined according to the vector of each grid after the division.
在一种可能的设计中,所述第一确定模块501根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,包括:In a possible design, the first determining module 501 determines the vector of each grid after division according to the number of points in each grid after division and the attributes of each point, including:
根据预设点数对划分后每个网格中点的数目进行调整;Adjust the number of points in each grid after division according to the preset number of points;
根据调整后每个网格中点的数目和每个点的属性的乘积,确定划分后每个网格的向量。According to the product of the number of points in each grid after adjustment and the attributes of each point, the vector of each grid after division is determined.
在一种可能的设计中,所述三维卷积神经网络包括第三卷积神经网络。In a possible design, the three-dimensional convolutional neural network includes a third convolutional neural network.
所述提取模块502,具体用于:The extraction module 502 is specifically used for:
利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取。The third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
在一种可能的设计中,所述第三卷积神经网络通过三维网格向量和三维网格特征训练得到。In a possible design, the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
在一种可能的设计中,所述三维卷积神经网络还包括第四卷积神经网络。In a possible design, the three-dimensional convolutional neural network further includes a fourth convolutional neural network.
所述提取模块502在利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取之后,还用于:After the extraction module 502 uses the third convolutional neural network to extract the three-dimensional grid features of the divided vectors of each grid, it is also used to:
利用所述第四卷积神经网络对提取的三维网格特征进行目标三维点云特征提取。The fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
在一种可能的设计中，所述第四卷积神经网络通过三维网格特征和三维点云特征训练得到。In a possible design, the fourth convolutional neural network is trained on three-dimensional grid features and three-dimensional point cloud features.
在一种可能的设计中,所述第二确定模块503,具体用于:In a possible design, the second determining module 503 is specifically configured to:
根据所述目标三维点云特征中的有效点确定所述目标信息。The target information is determined according to the effective points in the target three-dimensional point cloud feature.
在一种可能的设计中,所述第二确定模块503在根据所述目标三维点云特征中的有效点确定所述目标信息之前,还用于:In a possible design, before determining the target information according to the effective points in the target three-dimensional point cloud feature, the second determining module 503 is further configured to:
获得所述目标三维点云特征中的每个点距离其所在网格的中心点的像素欧氏距离;Obtaining the pixel Euclidean distance between each point in the target three-dimensional point cloud feature and the center point of the grid where it is located;
根据所述像素欧氏距离与预设距离阈值,确定所述目标三维点云特征中的有效点。According to the Euclidean distance of the pixel and a preset distance threshold, an effective point in the target three-dimensional point cloud feature is determined.
在一种可能的设计中,所述三维卷积神经网络包括:In a possible design, the three-dimensional convolutional neural network includes:
多个卷积层,用于对所述点云向量进行卷积操作,以输出三维点云特征;Multiple convolutional layers for performing convolution operations on the point cloud vector to output three-dimensional point cloud features;
上采样层，与所述多个卷积层中的至少一个卷积层相连接，用于获取所述多个卷积层中的至少一个卷积层输出的三维点云特征，并对获取的三维点云特征进行处理，以输出处理后的三维点云特征；An up-sampling layer, connected to at least one of the multiple convolutional layers, configured to obtain the three-dimensional point cloud features output by the at least one convolutional layer and to process the obtained three-dimensional point cloud features so as to output processed three-dimensional point cloud features;
全连接层，与所述多个卷积层中的卷积层和所述上采样层相连接，用于获取一卷积层输出的三维点云特征和所述处理后的三维点云特征，并对获取的三维点云特征和所述处理后的三维点云特征进行特征融合，以生成融合后的三维点云特征，将所述融合后的三维点云特征输入另一卷积层，以经过所述另一卷积层的卷积操作后，确定所述目标三维点云特征。A fully connected layer, connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, configured to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer, so that the target three-dimensional point cloud feature is determined after the convolution operation of the other convolutional layer.
在一种可能的设计中,所述多个卷积层中卷积层的深度不同;In a possible design, the depths of the convolutional layers in the multiple convolutional layers are different;
所述全连接层与所述多个卷积层中的第一卷积层、第二卷积层和所述上采样层连接，以获取所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征，并对所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征进行特征融合，以生成所述融合后的三维点云特征，将所述融合后的三维点云特征输入所述第二卷积层，以经过所述第二卷积层的卷积操作后，确定所述目标三维点云特征，其中，所述第二卷积层的深度大于所述第一卷积层的深度。The fully connected layer is connected to the first convolutional layer and the second convolutional layer among the multiple convolutional layers and to the up-sampling layer, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate the fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into the second convolutional layer, so that the target three-dimensional point cloud feature is determined after the convolution operation of the second convolutional layer, where the depth of the second convolutional layer is greater than the depth of the first convolutional layer.
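The fusion path described above (a shallow convolutional layer's output combined with the up-sampled deep feature, then passed on to the deeper second convolutional layer) can be illustrated at the level of per-position feature channels as follows (a heavily simplified sketch; the function name and the channel-averaging stand-in for the second convolutional layer are assumptions, not the embodiment's actual operations):

```python
def fuse_and_refine(shallow_feat, upsampled_feat):
    """Channel-wise concatenation (feature fusion) of a shallow layer's
    output with the up-sampled deep feature at each spatial position,
    followed by a simple channel-mixing step standing in for the second
    convolutional layer."""
    assert len(shallow_feat) == len(upsampled_feat)  # same spatial size
    fused = [s + u for s, u in zip(shallow_feat, upsampled_feat)]  # concat channels
    # stand-in for the second convolutional layer: mix the fused channels
    return [sum(channels) / len(channels) for channels in fused]
```

Up-sampling the deep feature first is what makes the concatenation possible: both inputs must have the same spatial resolution before their channels can be fused position by position.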
在一种可能的设计中,所述全连接层的个数为多个。In a possible design, the number of the fully connected layers is multiple.
在一种可能的设计中,所述上采样层的个数为多个。In a possible design, the number of the up-sampling layers is multiple.
本申请实施例提供的设备,可用于执行上述方法实施例的技术方案,其实现原理和技术效果类似,本申请实施例此处不再赘述。The device provided in the embodiment of the present application can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar, and the details of the embodiments of the present application are not repeated here.
图7为本申请实施例提供的目标检测设备的硬件结构示意图。如图7所示,本实施例的目标检测设备70包括:存储器701和处理器702;其中FIG. 7 is a schematic diagram of the hardware structure of a target detection device provided by an embodiment of the application. As shown in FIG. 7, the target detection device 70 of this embodiment includes: a memory 701 and a processor 702; wherein
存储器701,用于存储程序指令;The memory 701 is used to store program instructions;
处理器702,用于执行存储器存储的程序指令,当所述程序指令被执行时,处理器执行如下步骤:The processor 702 is configured to execute program instructions stored in the memory. When the program instructions are executed, the processor executes the following steps:
根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息;Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, the point cloud vector including geometric position information of the points in each dimension in the three-dimensional coordinate system;
利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征;Processing the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector;
根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
在一种可能的设计中,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:In a possible design, the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
利用第一卷积神经网络,基于所述目标三维点云特征确定所述待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角。The first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
在一种可能的设计中,所述第一卷积神经网络通过点云特征以及目标在三维坐标系中的中心点坐标、三维尺寸和航偏角训练得到。In a possible design, the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
在一种可能的设计中,在所述根据所述目标三维点云特征确定所述待处理点云中的目标信息之后,还包括:In a possible design, after the target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, the method further includes:
利用第二卷积神经网络,基于所述目标三维点云特征确定所述目标的类别概率;Using a second convolutional neural network to determine the class probability of the target based on the characteristics of the target three-dimensional point cloud;
根据所述目标的类别概率对所述目标进行错误目标去除。Error target removal is performed on the target according to the category probability of the target.
在一种可能的设计中，所述第二卷积神经网络通过点云特征和目标类别概率训练得到。In a possible design, the second convolutional neural network is trained on point cloud features and target class probabilities.
在一种可能的设计中,所述根据待处理点云中点的数目和每个点的属性,确定点云向量,包括:In a possible design, the determining the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point includes:
对所述待处理点云进行网格划分;Meshing the point cloud to be processed;
根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,其中,每个点的属性包括点在三维坐标系中的三维坐标和反射率;According to the number of points in each grid after division and the attributes of each point, determine the vector of each grid after division, where the attributes of each point include the three-dimensional coordinates and reflectivity of the point in the three-dimensional coordinate system;
根据所述划分后每个网格的向量,确定所述点云向量。The point cloud vector is determined according to the vector of each grid after the division.
在一种可能的设计中,所述根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,包括:In a possible design, the determination of the vector of each grid after division according to the number of points in each grid after division and the attributes of each point includes:
根据预设点数对划分后每个网格中点的数目进行调整;Adjust the number of points in each grid after division according to the preset number of points;
根据调整后每个网格中点的数目和每个点的属性的乘积,确定划分后每个网格的向量。According to the product of the number of points in each grid after adjustment and the attributes of each point, the vector of each grid after division is determined.
在一种可能的设计中,所述三维卷积神经网络包括第三卷积神经网络;In a possible design, the three-dimensional convolutional neural network includes a third convolutional neural network;
所述利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征,包括:The processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取。The third convolutional neural network is used to extract the three-dimensional grid feature of each grid vector after division.
在一种可能的设计中,所述第三卷积神经网络通过三维网格向量和三维网格特征训练得到。In a possible design, the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
在一种可能的设计中,所述三维卷积神经网络还包括第四卷积神经网络;In a possible design, the three-dimensional convolutional neural network further includes a fourth convolutional neural network;
在所述利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取之后,还包括:After the third convolutional neural network is used to extract the three-dimensional grid features of the divided vectors of each grid, the method further includes:
利用所述第四卷积神经网络对提取的三维网格特征进行目标三维点云特征提取。The fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
在一种可能的设计中,所述第四卷积神经网络通过三维网格特征和三维点云特征训练得到。In a possible design, the fourth convolutional neural network is obtained through training of three-dimensional grid features and three-dimensional point cloud features.
在一种可能的设计中,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:In a possible design, the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
根据所述目标三维点云特征中的有效点确定所述目标信息。The target information is determined according to the effective points in the target three-dimensional point cloud feature.
在一种可能的设计中,在所述根据所述目标三维点云特征中的有效点确定所述目标信息之前,还包括:In a possible design, before the determining the target information according to the effective points in the target three-dimensional point cloud feature, the method further includes:
获得所述目标三维点云特征中的每个点距离其所在网格的中心点的像素欧氏距离;Obtaining the pixel Euclidean distance between each point in the target three-dimensional point cloud feature and the center point of the grid where it is located;
根据所述像素欧氏距离与预设距离阈值,确定所述目标三维点云特征中的有效点。According to the Euclidean distance of the pixel and a preset distance threshold, an effective point in the target three-dimensional point cloud feature is determined.
在一种可能的设计中,所述三维卷积神经网络包括:In a possible design, the three-dimensional convolutional neural network includes:
多个卷积层,用于对所述点云向量进行卷积操作,以输出三维点云特征;Multiple convolutional layers for performing convolution operations on the point cloud vector to output three-dimensional point cloud features;
上采样层，与所述多个卷积层中的至少一个卷积层相连接，用于获取所述多个卷积层中的至少一个卷积层输出的三维点云特征，并对获取的三维点云特征进行处理，以输出处理后的三维点云特征；An up-sampling layer, connected to at least one of the multiple convolutional layers, configured to obtain the three-dimensional point cloud features output by the at least one convolutional layer and to process the obtained three-dimensional point cloud features so as to output processed three-dimensional point cloud features;
全连接层，与所述多个卷积层中的卷积层和所述上采样层相连接，用于获取一卷积层输出的三维点云特征和所述处理后的三维点云特征，并对获取的三维点云特征和所述处理后的三维点云特征进行特征融合，以生成融合后的三维点云特征，将所述融合后的三维点云特征输入另一卷积层，以经过所述另一卷积层的卷积操作后，确定所述目标三维点云特征。A fully connected layer, connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, configured to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer, so that the target three-dimensional point cloud feature is determined after the convolution operation of the other convolutional layer.
在一种可能的设计中,所述多个卷积层中卷积层的深度不同;In a possible design, the depths of the convolutional layers in the multiple convolutional layers are different;
所述全连接层与所述多个卷积层中的第一卷积层、第二卷积层和所述上采样层连接，以获取所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征，并对所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征进行特征融合，以生成所述融合后的三维点云特征，将所述融合后的三维点云特征输入所述第二卷积层，以经过所述第二卷积层的卷积操作后，确定所述目标三维点云特征，其中，所述第二卷积层的深度大于所述第一卷积层的深度。The fully connected layer is connected to the first convolutional layer and the second convolutional layer among the multiple convolutional layers and to the up-sampling layer, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate the fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into the second convolutional layer, so that the target three-dimensional point cloud feature is determined after the convolution operation of the second convolutional layer, where the depth of the second convolutional layer is greater than the depth of the first convolutional layer.
在一种可能的设计中,所述全连接层的个数为多个。In a possible design, the number of the fully connected layers is multiple.
在一种可能的设计中,所述上采样层的个数为多个。In a possible design, the number of the up-sampling layers is multiple.
在一种可能的设计中,存储器701既可以是独立的,也可以跟处理器702集成在一起。In a possible design, the memory 701 may be independent or integrated with the processor 702.
当存储器701独立设置时,该目标检测设备还包括总线703,用于连接所述存储器701和处理器702。When the memory 701 is set independently, the target detection device further includes a bus 703 for connecting the memory 701 and the processor 702.
在一种可能的设计中,目标检测设备70可以是一个单独的设备,该设备包括上述存储器701、处理器702等一整套。另外,以车辆为例,目标检测设备70的各组成部分可以分布式地集成在车辆上,即存储器701、处理器702等可以分别设置在车辆的不同位置。In a possible design, the target detection device 70 may be a single device that includes the complete set of the foregoing memory 701, processor 702, and so on. Alternatively, taking a vehicle as an example, the components of the target detection device 70 may be integrated on the vehicle in a distributed manner, that is, the memory 701, the processor 702, and so on may be arranged at different positions of the vehicle.
图8为本申请实施例提供的一种可移动平台的结构示意图。如图8所示,本实施例的可移动平台80包括:可移动平台本体801,以及目标检测设备802;目标检测设备802设置在可移动平台本体801,所述可移动平台本体801和所述目标检测设备802无线连接或有线连接。FIG. 8 is a schematic structural diagram of a movable platform provided by an embodiment of this application. As shown in FIG. 8, the movable platform 80 of this embodiment includes a movable platform body 801 and a target detection device 802; the target detection device 802 is provided on the movable platform body 801, and the movable platform body 801 and the target detection device 802 are connected wirelessly or by wire.
所述目标检测设备802根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息;The target detection device 802 determines a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, and the point cloud vector includes geometric position information of the points in each dimension in a three-dimensional coordinate system;
利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征;Processing the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector;
根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
在一种可能的设计中,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:In a possible design, the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
利用第一卷积神经网络,基于所述目标三维点云特征确定所述待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角。The first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
在一种可能的设计中,所述第一卷积神经网络通过点云特征以及目标在三维坐标系中的中心点坐标、三维尺寸和航偏角训练得到。In a possible design, the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
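As a purely illustrative, shape-level sketch of such a regression head (the layer sizes, the random weights, and the 7-value box encoding cx, cy, cz, l, w, h, yaw are assumptions for illustration, not the claimed network):

```python
import numpy as np

def detection_head(features, weights, bias):
    """Hypothetical 1x1-conv regression head: maps an (H, W, C) feature map
    to (H, W, 7) boxes = (cx, cy, cz, l, w, h, yaw) per grid cell."""
    h, w, c = features.shape
    out = features.reshape(-1, c) @ weights + bias  # 1x1 conv == per-cell matmul
    return out.reshape(h, w, 7)

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 4, 16))     # toy 4x4 bird's-eye-view map, 16 channels
W = rng.normal(size=(16, 7)) * 0.1     # assumed, untrained weights
b = np.zeros(7)
boxes = detection_head(feat, W, b)
print(boxes.shape)  # (4, 4, 7)
```

In an actual network the weights would be learned from point cloud features paired with ground-truth centre coordinates, sizes, and yaw angles, as the design above describes.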
在一种可能的设计中,在所述根据所述目标三维点云特征确定所述待处理点云中的目标信息之后,还包括:In a possible design, after the target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, the method further includes:
利用第二卷积神经网络,基于所述目标三维点云特征确定所述目标的类别概率;Using a second convolutional neural network to determine the class probability of the target based on the characteristics of the target three-dimensional point cloud;
根据所述目标的类别概率对所述目标进行错误目标去除。Error target removal is performed on the target according to the category probability of the target.
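The class-probability filtering described above can be sketched as follows; the softmax normalization and the 0.5 probability threshold are assumptions for illustration, not limitations of the design:

```python
import numpy as np

def remove_false_targets(targets, logits, prob_threshold=0.5):
    """Keep only targets whose maximum class probability (softmax over the
    per-target logits) reaches the threshold; the rest are removed as
    false targets. `prob_threshold` is an assumed parameter."""
    z = logits - logits.max(axis=1, keepdims=True)          # stable softmax
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    keep = probs.max(axis=1) >= prob_threshold
    return [t for t, k in zip(targets, keep) if k]

targets = ["car", "noise", "pedestrian"]
logits = np.array([[4.0, 0.0, 0.0],    # confident car -> kept
                   [0.3, 0.2, 0.1],    # ambiguous -> removed
                   [0.0, 0.0, 3.0]])   # confident pedestrian -> kept
kept = remove_false_targets(targets, logits)
print(kept)  # ['car', 'pedestrian']
```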
在一种可能的设计中,所述第二卷积神经网络通过点云特征和目标类别概率训练得到。In a possible design, the second convolutional neural network is obtained through point cloud feature and target category probability training.
在一种可能的设计中,所述根据待处理点云中点的数目和每个点的属性,确定点云向量,包括:In a possible design, the determining the point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point includes:
对所述待处理点云进行网格划分;Meshing the point cloud to be processed;
根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,其中,每个点的属性包括点在三维坐标系中的三维坐标和反射率;According to the number of points in each grid after division and the attributes of each point, determine the vector of each grid after division, where the attributes of each point include the three-dimensional coordinates and reflectivity of the point in the three-dimensional coordinate system;
根据所述划分后每个网格的向量,确定所述点云向量。The point cloud vector is determined according to the vector of each grid after the division.
在一种可能的设计中,所述根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,包括:In a possible design, the determination of the vector of each grid after division according to the number of points in each grid after division and the attributes of each point includes:
根据预设点数对划分后每个网格中点的数目进行调整;Adjust the number of points in each grid after division according to the preset number of points;
根据调整后每个网格中点的数目和每个点的属性的乘积,确定划分后每个网格的向量。According to the product of the number of points in each grid after adjustment and the attributes of each point, the vector of each grid after division is determined.
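The grid division and per-grid vector construction described above can be sketched as follows; the grid size, the preset point count, and the pad/subsample policy are assumptions for illustration only:

```python
import numpy as np

def grid_vectors(points, grid_size=2.0, n_preset=4):
    """points: (N, 4) array of (x, y, z, reflectivity).
    Returns {(ix, iy): (n_preset, 4) array}: points are binned into grid
    cells, then each cell is zero-padded or randomly subsampled to the
    preset number of points (assumed adjustment policy)."""
    rng = np.random.default_rng(0)
    cells = {}
    for p in points:
        ix, iy = int(p[0] // grid_size), int(p[1] // grid_size)
        cells.setdefault((ix, iy), []).append(p)
    out = {}
    for key, pts in cells.items():
        pts = np.array(pts)
        if len(pts) > n_preset:                    # subsample surplus points
            pts = pts[rng.choice(len(pts), n_preset, replace=False)]
        elif len(pts) < n_preset:                  # zero-pad missing points
            pts = np.vstack([pts, np.zeros((n_preset - len(pts), 4))])
        out[key] = pts
    return out

pts = np.array([[0.5, 0.5, 0.1, 0.9],
                [1.5, 0.5, 0.2, 0.8],
                [3.0, 3.5, 0.0, 0.7]])
cells = grid_vectors(pts)
print({k: v.shape for k, v in cells.items()})
```

Stacking the resulting per-cell arrays then yields the point cloud vector fed to the three-dimensional convolutional neural network.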
在一种可能的设计中,所述三维卷积神经网络包括第三卷积神经网络;In a possible design, the three-dimensional convolutional neural network includes a third convolutional neural network;
所述利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征,包括:The processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取。The third convolutional neural network is used to perform three-dimensional grid feature extraction on the vector of each grid after division.
在一种可能的设计中,所述第三卷积神经网络通过三维网格向量和三维网格特征训练得到。In a possible design, the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
在一种可能的设计中,所述三维卷积神经网络还包括第四卷积神经网络;In a possible design, the three-dimensional convolutional neural network further includes a fourth convolutional neural network;
在所述利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取之后,还包括:After the third convolutional neural network is used to extract the three-dimensional grid features of the divided vectors of each grid, the method further includes:
利用所述第四卷积神经网络对提取的三维网格特征进行目标三维点云特征提取。The fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
在一种可能的设计中,所述第四卷积神经网络通过三维网格特征和三维点云特征训练得到。In a possible design, the fourth convolutional neural network is obtained through training of three-dimensional grid features and three-dimensional point cloud features.
在一种可能的设计中,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:In a possible design, the determining the target information in the to-be-processed point cloud according to the characteristics of the target three-dimensional point cloud includes:
根据所述目标三维点云特征中的有效点确定所述目标信息。The target information is determined according to the effective points in the target three-dimensional point cloud feature.
在一种可能的设计中,在所述根据所述目标三维点云特征中的有效点确定所述目标信息之前,还包括:In a possible design, before the determining the target information according to the effective points in the target three-dimensional point cloud feature, the method further includes:
获得所述目标三维点云特征中的每个点距离其所在网格的中心点的像素欧氏距离;Obtaining the pixel Euclidean distance between each point in the target three-dimensional point cloud feature and the center point of the grid where it is located;
根据所述像素欧氏距离与预设距离阈值,确定所述目标三维点云特征中的有效点。According to the Euclidean distance of the pixel and a preset distance threshold, an effective point in the target three-dimensional point cloud feature is determined.
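The valid-point selection described above can be sketched as follows; the cell size and the preset distance threshold are assumed values for illustration:

```python
import numpy as np

def valid_points(points_px, cell_size=8, dist_threshold=3.0):
    """points_px: (N, 2) pixel coordinates in the feature map.
    A point is 'valid' if its pixel Euclidean distance to the centre of
    the grid cell containing it is below the preset threshold."""
    cell_idx = np.floor(points_px / cell_size)
    centres = (cell_idx + 0.5) * cell_size        # centre pixel of each cell
    dist = np.linalg.norm(points_px - centres, axis=1)
    return dist < dist_threshold

pts = np.array([[4.0, 4.0],    # at the centre of cell (0, 0) -> valid
                [7.5, 7.5]])   # near the corner of cell (0, 0) -> invalid
print(valid_points(pts))  # [ True False]
```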
在一种可能的设计中,所述三维卷积神经网络包括:In a possible design, the three-dimensional convolutional neural network includes:
多个卷积层,用于对所述点云向量进行卷积操作,以输出三维点云特征;Multiple convolutional layers for performing convolution operations on the point cloud vector to output three-dimensional point cloud features;
上采样层,与所述多个卷积层中的至少一个卷积层相连接,用于获取所述多个卷积层中的至少一个卷积层输出的三维点云特征,并对获取的三维点云特征进行处理,以输出处理后的三维点云特征;The up-sampling layer is connected to at least one convolutional layer among the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by the at least one convolutional layer and process the obtained three-dimensional point cloud features to output processed three-dimensional point cloud features;
全连接层,与所述多个卷积层中的卷积层和所述上采样层相连接,用于获取一卷积层输出的三维点云特征和所述处理后的三维点云特征,并对获取的三维点云特征和所述处理后的三维点云特征进行特征融合,以生成融合后的三维点云特征,将所述融合后的三维点云特征输入另一卷积层,以经过所述另一卷积层的卷积操作后,确定所述目标三维点云特征。The fully connected layer is connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, and is used to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on the obtained three-dimensional point cloud features and the processed three-dimensional point cloud features to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer, so that the target three-dimensional point cloud features are determined after the convolution operation of the other convolutional layer.
在一种可能的设计中,所述多个卷积层中卷积层的深度不同;In a possible design, the depths of the convolutional layers in the multiple convolutional layers are different;
所述全连接层与所述多个卷积层中的第一卷积层、第二卷积层和所述上采样层连接,以获取所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征,并对所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征进行特征融合,以生成所述融合后的三维点云特征,将所述融合后的三维点云特征输入所述第二卷积层,以经过所述第二卷积层的卷积操作后,确定所述目标三维点云特征,其中,所述第二卷积层的深度大于所述第一卷积层的深度。The fully connected layer is connected to a first convolutional layer and a second convolutional layer among the multiple convolutional layers and to the up-sampling layer, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features to generate the fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into the second convolutional layer, so that the target three-dimensional point cloud features are determined after the convolution operation of the second convolutional layer, where the depth of the second convolutional layer is greater than the depth of the first convolutional layer.
在一种可能的设计中,所述全连接层的个数为多个。In a possible design, the number of the fully connected layers is multiple.
在一种可能的设计中,所述上采样层的个数为多个。In a possible design, the number of the up-sampling layers is multiple.
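At a shape level, the wiring of convolutional layers, up-sampling layer, and fusion described in this design can be sketched as follows, with plain NumPy stand-ins for the layers (stride-2 mean pooling in place of a strided convolution, nearest-neighbour up-sampling, and channel concatenation in place of the fully connected fusion; all sizes are assumptions for illustration):

```python
import numpy as np

def conv_stride2(x):
    """Stand-in for a stride-2 convolutional layer: 2x2 mean pooling over (H, W, C)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample2(x):
    """Stand-in for the up-sampling layer: nearest-neighbour 2x up-sampling."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.default_rng(0).normal(size=(8, 8, 4))  # toy point cloud vector
f1 = conv_stride2(x)           # first (shallower) convolutional layer: (4, 4, 4)
f2 = conv_stride2(f1)          # deeper feature: (2, 2, 4)
up = upsample2(f2)             # up-sampling layer output: (4, 4, 4)
fused = np.concatenate([f1, up], axis=-1)  # fusion of f1 and up: (4, 4, 8)
target_feat = conv_stride2(fused)          # second convolutional layer: (2, 2, 8)
print(target_feat.shape)  # (2, 2, 8)
```

The sketch only illustrates how the shallower feature and the up-sampled deeper feature are fused before the deeper (second) convolutional layer produces the target three-dimensional point cloud features.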
本实施例提供的可移动平台,包括:可移动平台本体,以及目标检测设备,目标检测设备设置在可移动平台本体,其中,目标检测设备通过待处理点云中点的数目和每个点的属性,确定点云向量;然后利用三维卷积神经网络对点云向量进行目标三维点云特征提取,保留高度信息,进而最大程度保留点云的三维结构信息,再根据提取的目标三维点云特征确定待处理点云中的目标信息,其中,该目标信息包括目标在三维坐标系中各维度的几何位置信息,准确找出点云中的三维目标,解决现有目标检测结果出错率较高,后续无法指导飞机、汽车、机器人等进行避障和路径规划的问题。The movable platform provided in this embodiment includes a movable platform body and a target detection device arranged on the movable platform body. The target detection device determines a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point; it then uses a three-dimensional convolutional neural network to extract target three-dimensional point cloud features from the point cloud vector, retaining height information and thereby preserving the three-dimensional structure of the point cloud to the greatest extent; and it then determines the target information in the point cloud to be processed according to the extracted target three-dimensional point cloud features, where the target information includes the geometric position information of the target in each dimension of the three-dimensional coordinate system. This accurately locates three-dimensional targets in the point cloud and solves the problem that existing target detection results have a high error rate and therefore cannot guide aircraft, vehicles, robots, and the like in obstacle avoidance and path planning.
本申请实施例提供一种计算机可读存储介质,所述计算机可读存储介质中存储有程序指令,当处理器执行所述程序指令时,实现如上所述的目标检测方法。The embodiment of the present application provides a computer-readable storage medium having program instructions stored in the computer-readable storage medium, and when a processor executes the program instructions, the target detection method as described above is implemented.
在本发明所提供的几个实施例中,应该理解到,所揭露的设备和方法,可以通过其它的方式实现。例如,以上所描述的设备实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个模块可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或模块的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple modules may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
所述作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本发明各个实施例中的各功能模块可以集成在一个处理单元中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个单元中。上述模块成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。In addition, the functional modules in the various embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules may be integrated into one unit. The units formed by the above modules can be implemented in the form of hardware, or in the form of hardware plus software functional units.
上述以软件功能模块的形式实现的集成的模块,可以存储在一个计算机可读取存储介质中。上述软件功能模块存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器(英文:processor)执行本申请各个实施例所述方法的部分步骤。The above integrated modules implemented in the form of software functional modules may be stored in a computer-readable storage medium. The software functional modules are stored in a storage medium and include a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute part of the steps of the methods described in the embodiments of this application.
应理解,上述处理器可以是中央处理单元(Central Processing Unit,简称CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,简称DSP)、专用集成电路(Application Specific Integrated Circuit,简称ASIC)等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合发明所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。It should be understood that the foregoing processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the invention may be directly embodied as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
存储器可能包含高速RAM存储器,也可能还包括非易失性存储NVM,例如至少一个磁盘存储器,还可以为U盘、移动硬盘、只读存储器、磁盘或光盘等。The memory may include high-speed RAM, and may also include non-volatile memory (NVM), such as at least one disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, an optical disc, or the like.
总线可以是工业标准体系结构(Industry Standard Architecture,简称ISA)总线、外部设备互连(Peripheral Component,简称PCI)总线或扩展工业标准体系结构(Extended Industry Standard Architecture,简称EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,本申请附图中的总线并不限定仅有一根总线或一种类型的总线。The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, the buses in the drawings of this application are not limited to only one bus or one type of bus.
上述存储介质可以是由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。存储介质可以是通用或专用计算机能够存取的任何可用介质。The above storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
一种示例性的存储介质耦合至处理器,从而使处理器能够从该存储介质读取信息,且可向该存储介质写入信息。当然,存储介质也可以是处理器的组成部分。处理器和存储介质可以位于专用集成电路(Application Specific Integrated Circuits,简称ASIC)中。当然,处理器和存储介质也可以作为分立组件存在于电子设备或主控设备中。An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an application-specific integrated circuit (ASIC). Of course, the processor and the storage medium may also exist as discrete components in the electronic device or the main control device.
本领域普通技术人员可以理解:实现上述各方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成。前述的程序可以存储于一计算机可读取存储介质中。该程序在执行时,执行包括上述各方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。A person of ordinary skill in the art can understand that all or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware. The aforementioned program can be stored in a computer readable storage medium. When the program is executed, it executes the steps including the foregoing method embodiments; and the foregoing storage medium includes: ROM, RAM, magnetic disk, or optical disk and other media that can store program codes.
最后应说明的是:以上各实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述各实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or equivalently replace some or all of the technical features therein; and these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (36)

  1. 一种目标检测方法,其特征在于,包括:A target detection method is characterized in that it comprises:
    根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息;Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, the point cloud vector including geometric position information of the points in each dimension in the three-dimensional coordinate system;
    利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征;Processing the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector;
    根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:The method according to claim 1, wherein the determining the target information in the point cloud to be processed according to the characteristics of the target three-dimensional point cloud comprises:
    利用第一卷积神经网络,基于所述目标三维点云特征确定所述待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角。The first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
  3. 根据权利要求2所述的方法,其特征在于,所述第一卷积神经网络通过点云特征以及目标在三维坐标系中的中心点坐标、三维尺寸和航偏角训练得到。The method according to claim 2, wherein the first convolutional neural network is obtained by training of point cloud features and the center point coordinates, three-dimensional size, and yaw angle of the target in a three-dimensional coordinate system.
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,在所述根据所述目标三维点云特征确定所述待处理点云中的目标信息之后,还包括:The method according to any one of claims 1 to 3, wherein after the determining the target information in the point cloud to be processed according to the characteristics of the target three-dimensional point cloud, the method further comprises:
    利用第二卷积神经网络,基于所述目标三维点云特征确定所述目标的类别概率;Using a second convolutional neural network to determine the class probability of the target based on the characteristics of the target three-dimensional point cloud;
    根据所述目标的类别概率对所述目标进行错误目标去除。Error target removal is performed on the target according to the category probability of the target.
  5. 根据权利要求4所述的方法,其特征在于,所述第二卷积神经网络通过点云特征和目标类别概率训练得到。The method according to claim 4, wherein the second convolutional neural network is obtained through point cloud feature and target category probability training.
  6. 根据权利要求1至5中任一项所述的方法,其特征在于,所述根据待处理点云中点的数目和每个点的属性,确定点云向量,包括:The method according to any one of claims 1 to 5, wherein the determining a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point comprises:
    对所述待处理点云进行网格划分;Meshing the point cloud to be processed;
    根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,其中,每个点的属性包括点在三维坐标系中的三维坐标和反射率;According to the number of points in each grid after division and the attributes of each point, determine the vector of each grid after division, where the attributes of each point include the three-dimensional coordinates and reflectivity of the point in the three-dimensional coordinate system;
    根据所述划分后每个网格的向量,确定所述点云向量。The point cloud vector is determined according to the vector of each grid after the division.
  7. 根据权利要求6中所述的方法,其特征在于,所述根据划分后每个网格中点的数目和每个点的属性,确定划分后每个网格的向量,包括:The method according to claim 6, wherein the determining the vector of each grid after division according to the number of points in each grid after division and the attributes of each point comprises:
    根据预设点数对划分后每个网格中点的数目进行调整;Adjust the number of points in each grid after division according to the preset number of points;
    根据调整后每个网格中点的数目和每个点的属性的乘积,确定划分后每个网格的向量。According to the product of the number of points in each grid after adjustment and the attributes of each point, the vector of each grid after division is determined.
  8. 根据权利要求6或7所述的方法,其特征在于,所述三维卷积神经网络包括第三卷积神经网络;The method according to claim 6 or 7, wherein the three-dimensional convolutional neural network comprises a third convolutional neural network;
    所述利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征,包括:The processing the point cloud vector using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector includes:
    利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取。The third convolutional neural network is used to perform three-dimensional grid feature extraction on the vector of each grid after division.
  9. 根据权利要求8所述的方法,其特征在于,所述第三卷积神经网络通过三维网格向量和三维网格特征训练得到。The method according to claim 8, wherein the third convolutional neural network is obtained through training of three-dimensional grid vectors and three-dimensional grid features.
  10. 根据权利要求8或9所述的方法,其特征在于,所述三维卷积神经网络还包括第四卷积神经网络;The method according to claim 8 or 9, wherein the three-dimensional convolutional neural network further comprises a fourth convolutional neural network;
    在所述利用所述第三卷积神经网络对划分后每个网格的向量进行三维网格特征提取之后,还包括:After the third convolutional neural network is used to extract the three-dimensional grid features of the divided vectors of each grid, the method further includes:
    利用所述第四卷积神经网络对提取的三维网格特征进行目标三维点云特征提取。The fourth convolutional neural network is used to extract features of the target three-dimensional point cloud from the extracted three-dimensional grid features.
  11. 根据权利要求10所述的方法,其特征在于,所述第四卷积神经网络通过三维网格特征和三维点云特征训练得到。The method according to claim 10, wherein the fourth convolutional neural network is obtained through training of three-dimensional grid features and three-dimensional point cloud features.
  12. 根据权利要求1至11中任一项所述的方法,其特征在于,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:The method according to any one of claims 1 to 11, wherein the determining the target information in the point cloud to be processed according to the characteristics of the target three-dimensional point cloud comprises:
    根据所述目标三维点云特征中的有效点确定所述目标信息。The target information is determined according to the effective points in the target three-dimensional point cloud feature.
  13. 根据权利要求12所述的方法,其特征在于,在所述根据所述目标三维点云特征中的有效点确定所述目标信息之前,还包括:The method according to claim 12, wherein before the determining the target information according to the effective points in the target three-dimensional point cloud feature, the method further comprises:
    获得所述目标三维点云特征中的每个点距离其所在网格的中心点的像素欧氏距离;Obtaining the pixel Euclidean distance between each point in the target three-dimensional point cloud feature and the center point of the grid where it is located;
    根据所述像素欧氏距离与预设距离阈值,确定所述目标三维点云特征中的有效点。According to the Euclidean distance of the pixel and a preset distance threshold, an effective point in the target three-dimensional point cloud feature is determined.
  14. 根据权利要求1至13中任一项所述的方法,其特征在于,所述三维卷积神经网络包括:The method according to any one of claims 1 to 13, wherein the three-dimensional convolutional neural network comprises:
    多个卷积层,用于对所述点云向量进行卷积操作,以输出三维点云特征;Multiple convolutional layers for performing convolution operations on the point cloud vector to output three-dimensional point cloud features;
    上采样层,与所述多个卷积层中的至少一个卷积层相连接,用于获取所述多个卷积层中的至少一个卷积层输出的三维点云特征,并对获取的三维点云特征进行处理,以输出处理后的三维点云特征;The up-sampling layer is connected to at least one convolutional layer among the multiple convolutional layers, and is used to obtain the three-dimensional point cloud features output by the at least one convolutional layer and process the obtained three-dimensional point cloud features to output processed three-dimensional point cloud features;
    全连接层,与所述多个卷积层中的卷积层和所述上采样层相连接,用于获取一卷积层输出的三维点云特征和所述处理后的三维点云特征,并对获取的三维点云特征和所述处理后的三维点云特征进行特征融合,以生成融合后的三维点云特征,将所述融合后的三维点云特征输入另一卷积层,以经过所述另一卷积层的卷积操作后,确定所述目标三维点云特征。The fully connected layer is connected to a convolutional layer among the multiple convolutional layers and to the up-sampling layer, and is used to obtain the three-dimensional point cloud features output by that convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on the obtained three-dimensional point cloud features and the processed three-dimensional point cloud features to generate fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into another convolutional layer, so that the target three-dimensional point cloud features are determined after the convolution operation of the other convolutional layer.
  15. 根据权利要求14所述的方法,其特征在于,所述多个卷积层中卷积层的深度不同;The method according to claim 14, wherein the depths of the convolutional layers in the plurality of convolutional layers are different;
    所述全连接层与所述多个卷积层中的第一卷积层、第二卷积层和所述上采样层连接,以获取所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征,并对所述第一卷积层输出的三维点云特征和所述处理后的三维点云特征进行特征融合,以生成所述融合后的三维点云特征,将所述融合后的三维点云特征输入所述第二卷积层,以经过所述第二卷积层的卷积操作后,确定所述目标三维点云特征,其中,所述第二卷积层的深度大于所述第一卷积层的深度。The fully connected layer is connected to a first convolutional layer and a second convolutional layer among the multiple convolutional layers and to the up-sampling layer, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features to generate the fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into the second convolutional layer, so that the target three-dimensional point cloud features are determined after the convolution operation of the second convolutional layer, where the depth of the second convolutional layer is greater than the depth of the first convolutional layer.
  16. 根据权利要求14或15所述的方法,其特征在于,所述全连接层的个数为多个。The method according to claim 14 or 15, wherein the number of the fully connected layers is multiple.
  17. 根据权利要求14至16中任一项所述的方法,其特征在于,所述上采样层的个数为多个。The method according to any one of claims 14 to 16, wherein the number of the up-sampling layer is multiple.
  18. 一种目标检测设备,其特征在于,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机执行指令,所述处理器执行所述计算机执行指令时实现如下步骤:A target detection device, which is characterized by comprising a memory, a processor, and computer-executable instructions stored in the memory and running on the processor, and the processor implements the following steps when the processor executes the computer-executable instructions :
    根据待处理点云中点的数目和每个点的属性,确定点云向量,所述点云向量包括点在三维坐标系中各维度的几何位置信息;Determine a point cloud vector according to the number of points in the point cloud to be processed and the attributes of each point, the point cloud vector including geometric position information of the points in each dimension in the three-dimensional coordinate system;
    利用三维卷积神经网络对所述点云向量处理,以提取所述点云向量的目标三维点云特征;Processing the point cloud vector by using a three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector;
    根据所述目标三维点云特征确定所述待处理点云中的目标信息,所述目标信息包括目标在三维坐标系中各维度的几何位置信息。The target information in the point cloud to be processed is determined according to the characteristics of the target three-dimensional point cloud, and the target information includes geometric position information of the target in each dimension in the three-dimensional coordinate system.
  19. 根据权利要求18所述的设备,其特征在于,所述根据所述目标三维点云特征确定所述待处理点云中的目标信息,包括:The device according to claim 18, wherein the determining the target information in the point cloud to be processed according to the characteristics of the target three-dimensional point cloud comprises:
    利用第一卷积神经网络,基于所述目标三维点云特征确定所述待处理点云中的目标在三维坐标系中的中心点坐标、三维尺寸和航偏角。The first convolutional neural network is used to determine the center point coordinates, the three-dimensional size, and the yaw angle of the target in the to-be-processed point cloud in the three-dimensional coordinate system based on the characteristics of the target three-dimensional point cloud.
20. The device according to claim 19, wherein the first convolutional neural network is trained on point cloud features together with center point coordinates, three-dimensional sizes, and yaw angles of targets in the three-dimensional coordinate system.
21. The device according to any one of claims 18 to 20, wherein after the determining the target information in the point cloud to be processed according to the target three-dimensional point cloud feature, the steps further comprise:
  determining, with a second convolutional neural network and based on the target three-dimensional point cloud feature, a class probability of the target; and
  removing false targets according to the class probability of the target.
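A minimal sketch of the false-target removal of claim 21: detections whose best class probability falls below a threshold are discarded. The threshold value and the [x, y, z, l, w, h, yaw] box layout are assumptions for illustration, not fixed by the claims:

```python
import numpy as np

def remove_false_targets(boxes, class_probs, threshold=0.5):
    """Keep only detections whose highest class probability clears the threshold.

    boxes:       (N, 7) detections, one row per target
    class_probs: (N, C) per-class probabilities from the second network
    """
    keep = class_probs.max(axis=1) >= threshold
    return boxes[keep], class_probs[keep]

boxes = np.arange(21, dtype=float).reshape(3, 7)
probs = np.array([[0.9, 0.1], [0.3, 0.2], [0.1, 0.8]])
kept_boxes, kept_probs = remove_false_targets(boxes, probs, threshold=0.5)
```

Here the middle detection (best probability 0.3) is removed as a false target.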
22. The device according to claim 21, wherein the second convolutional neural network is trained on point cloud features and target class probabilities.
23. The device according to any one of claims 18 to 22, wherein the determining a point cloud vector according to the number of points in the point cloud to be processed and the attribute of each point comprises:
  dividing the point cloud to be processed into grids;
  determining a vector for each divided grid according to the number of points in the grid and the attribute of each point, wherein the attribute of each point comprises the three-dimensional coordinates of the point in the three-dimensional coordinate system and its reflectivity; and
  determining the point cloud vector according to the vector of each divided grid.
24. The device according to claim 23, wherein the determining a vector for each divided grid according to the number of points in the grid and the attribute of each point comprises:
  adjusting the number of points in each divided grid to a preset number of points; and
  determining the vector of each divided grid according to the product of the adjusted number of points in the grid and the attribute of each point.
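The grid division and point-count adjustment of claims 23 and 24 can be sketched as follows: points carrying [x, y, z, reflectivity] are binned into x-y grids, overfull grids are subsampled and sparse grids zero-padded to a preset count. Grid size, preset point count, and the x-y binning are illustrative assumptions:

```python
import numpy as np

def voxelize(points, grid_size=0.5, n_per_grid=8, rng=None):
    """Bin points into x-y grids and adjust each grid to a fixed point count.

    points: (N, 4) array of [x, y, z, reflectivity]
    Returns a dict mapping a grid index (ix, iy) to an (n_per_grid, 4) array.
    """
    rng = rng or np.random.default_rng(0)
    keys = np.floor(points[:, :2] / grid_size).astype(int)
    grids = {}
    for key in {tuple(k) for k in keys}:
        mask = (keys == key).all(axis=1)
        pts = points[mask]
        if len(pts) >= n_per_grid:
            # Subsample overfull grids down to the preset count.
            pts = pts[rng.choice(len(pts), n_per_grid, replace=False)]
        else:
            # Zero-pad sparse grids up to the preset count.
            pad = np.zeros((n_per_grid - len(pts), pts.shape[1]))
            pts = np.vstack([pts, pad])
        grids[key] = pts
    return grids

pts = np.array([[0.1, 0.1, 0.0, 0.5],
                [0.2, 0.3, 0.1, 0.4],
                [1.9, 1.9, 0.2, 0.7]])
grids = voxelize(pts, grid_size=1.0, n_per_grid=4)
```

Every grid then contributes a fixed-size vector, regardless of how many raw points fell into it.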
25. The device according to claim 23 or 24, wherein the three-dimensional convolutional neural network comprises a third convolutional neural network; and
  the processing the point cloud vector with the three-dimensional convolutional neural network to extract the target three-dimensional point cloud feature of the point cloud vector comprises:
  performing three-dimensional grid feature extraction on the vector of each divided grid with the third convolutional neural network.
26. The device according to claim 25, wherein the third convolutional neural network is trained on three-dimensional grid vectors and three-dimensional grid features.
27. The device according to claim 25 or 26, wherein the three-dimensional convolutional neural network further comprises a fourth convolutional neural network; and
  after the performing three-dimensional grid feature extraction on the vector of each divided grid with the third convolutional neural network, the steps further comprise:
  performing target three-dimensional point cloud feature extraction on the extracted three-dimensional grid features with the fourth convolutional neural network.
28. The device according to claim 27, wherein the fourth convolutional neural network is trained on three-dimensional grid features and three-dimensional point cloud features.
29. The device according to any one of claims 18 to 28, wherein the determining the target information in the point cloud to be processed according to the target three-dimensional point cloud feature comprises:
  determining the target information according to valid points in the target three-dimensional point cloud feature.
30. The device according to claim 29, wherein before the determining the target information according to the valid points in the target three-dimensional point cloud feature, the steps further comprise:
  obtaining, for each point in the target three-dimensional point cloud feature, the Euclidean distance in pixels between the point and the center point of the grid in which it is located; and
  determining the valid points in the target three-dimensional point cloud feature according to the pixel Euclidean distance and a preset distance threshold.
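The valid-point selection of claim 30 can be sketched in numpy: a feature-map point counts as valid when its pixel Euclidean distance to the center of its grid is within the preset threshold. The grid size and threshold values here are illustrative assumptions:

```python
import numpy as np

def valid_point_mask(pixels, grid_size=8, threshold=3.0):
    """Mark feature-map points whose pixel Euclidean distance to the
    center of their grid is within a preset threshold.

    pixels: (N, 2) pixel coordinates on the feature map
    """
    pixels = np.asarray(pixels, dtype=float)
    grid_origin = np.floor(pixels / grid_size) * grid_size
    center = grid_origin + grid_size / 2.0         # center of each point's grid
    dist = np.linalg.norm(pixels - center, axis=1)  # pixel Euclidean distance
    return dist <= threshold

mask = valid_point_mask(np.array([[4, 4], [0, 0], [7, 7]]), grid_size=8, threshold=3.0)
```

Points near the grid center pass; points in the grid corners are discarded before the target information is computed.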
31. The device according to any one of claims 18 to 30, wherein the three-dimensional convolutional neural network comprises:
  a plurality of convolutional layers configured to perform convolution operations on the point cloud vector to output three-dimensional point cloud features;
  an up-sampling layer connected to at least one of the plurality of convolutional layers and configured to obtain the three-dimensional point cloud features output by the at least one convolutional layer and process them to output processed three-dimensional point cloud features; and
  a fully connected layer connected to a convolutional layer of the plurality of convolutional layers and to the up-sampling layer, configured to obtain the three-dimensional point cloud features output by the convolutional layer and the processed three-dimensional point cloud features, to perform feature fusion on them to generate fused three-dimensional point cloud features, and to input the fused three-dimensional point cloud features into another convolutional layer, such that the target three-dimensional point cloud feature is determined after the convolution operation of the other convolutional layer.
32. The device according to claim 31, wherein the convolutional layers of the plurality of convolutional layers have different depths; and
  the fully connected layer is connected to a first convolutional layer and a second convolutional layer of the plurality of convolutional layers and to the up-sampling layer, so as to obtain the three-dimensional point cloud features output by the first convolutional layer and the processed three-dimensional point cloud features, perform feature fusion on them to generate the fused three-dimensional point cloud features, and input the fused three-dimensional point cloud features into the second convolutional layer, such that the target three-dimensional point cloud feature is determined after the convolution operation of the second convolutional layer, wherein the depth of the second convolutional layer is greater than the depth of the first convolutional layer.
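The fusion step of claims 31 and 32 — combining a shallow feature map with an up-sampled deeper one — can be sketched as follows. The claims do not fix the fusion operator; nearest-neighbour 2x up-sampling followed by channel concatenation is one common choice, used here as an assumption:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse(shallow, deep):
    """Fuse a shallow feature map with an up-sampled deeper one by
    channel concatenation."""
    up = upsample2x(deep)
    assert up.shape[1:] == shallow.shape[1:], "spatial sizes must match after up-sampling"
    return np.concatenate([shallow, up], axis=0)

shallow = np.ones((16, 8, 8))  # output of the shallower (first) conv layer
deep = np.zeros((32, 4, 4))    # output of a deeper layer, half the resolution
fused = fuse(shallow, deep)    # fused features fed to the next conv layer
```

The fused map keeps the shallow layer's spatial resolution while carrying the deeper layer's channels, which the following convolutional layer then processes into the target feature.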
33. The device according to claim 31 or 32, wherein there is a plurality of the fully connected layers.
34. The device according to any one of claims 31 to 33, wherein there is a plurality of the up-sampling layers.
35. A movable platform, comprising:
  a movable platform body; and
  the target detection device according to any one of claims 18 to 34, the target detection device being mounted on the movable platform body.
36. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the target detection method according to any one of claims 1 to 17.
PCT/CN2019/108897 2019-09-29 2019-09-29 Method and device for target detection, and movable platform WO2021056516A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/108897 WO2021056516A1 (en) 2019-09-29 2019-09-29 Method and device for target detection, and movable platform
CN201980033741.0A CN112154448A (en) 2019-09-29 2019-09-29 Target detection method and device and movable platform


Publications (1)

Publication Number Publication Date
WO2021056516A1 true WO2021056516A1 (en) 2021-04-01

Family

ID=73892057


Country Status (2)

Country Link
CN (1) CN112154448A (en)
WO (1) WO2021056516A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113052131A (en) * 2021-04-20 2021-06-29 深圳市商汤科技有限公司 Point cloud data processing and automatic driving vehicle control method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108427438A (en) * 2018-04-11 2018-08-21 北京木业邦科技有限公司 Flight environment of vehicle detection method, device, electronic equipment and storage medium
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN109902702A (en) * 2018-07-26 2019-06-18 华为技术有限公司 The method and apparatus of target detection
CN110228484A (en) * 2019-06-17 2019-09-13 福州视驰科技有限公司 A kind of low time delay intelligent remote control loop driving function with auxiliary


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111787A (en) * 2021-04-15 2021-07-13 北京沃东天骏信息技术有限公司 Target detection method, device, equipment and storage medium
CN113312966A (en) * 2021-04-21 2021-08-27 广东工业大学 Action identification method and device based on first-person visual angle
CN113312966B (en) * 2021-04-21 2023-08-08 广东工业大学 Action recognition method and device based on first person viewing angle
CN113780078A (en) * 2021-08-05 2021-12-10 广州西威科智能科技有限公司 Method for quickly and accurately identifying fault object in unmanned visual navigation
CN113780078B (en) * 2021-08-05 2024-03-19 广州西威科智能科技有限公司 Rapid and accurate fault object identification method in unmanned visual navigation
CN114373358A (en) * 2022-03-07 2022-04-19 中国人民解放军空军工程大学航空机务士官学校 Aviation aircraft maintenance operation simulation training system based on rapid modeling
CN114373358B (en) * 2022-03-07 2023-11-24 中国人民解放军空军工程大学航空机务士官学校 Aviation aircraft maintenance operation simulation training system based on rapid modeling

Also Published As

Publication number Publication date
CN112154448A (en) 2020-12-29


Legal Events

Date / Code / Title / Description
- 121 — Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19946518; Country of ref document: EP; Kind code of ref document: A1)
- NENP — Non-entry into the national phase (Ref country code: DE)
- 122 — Ep: PCT application non-entry in European phase (Ref document number: 19946518; Country of ref document: EP; Kind code of ref document: A1)