CN110706288A - Target detection method, device, equipment and readable storage medium - Google Patents

Target detection method, device, equipment and readable storage medium Download PDF

Info

Publication number
CN110706288A
Authority
CN
China
Prior art keywords
point
data
coordinate system
target
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910957822.1A
Other languages
Chinese (zh)
Inventor
周康明
郭义波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN201910957822.1A priority Critical patent/CN110706288A/en
Publication of CN110706288A publication Critical patent/CN110706288A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

Embodiments of the invention provide a target detection method, apparatus, device, and readable storage medium. The method includes: acquiring center point data of a target to be detected in a pixel coordinate system and point cloud data of the target to be detected in a camera coordinate system; acquiring conversion data of the center point data in the camera coordinate system; determining, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system; generating a corresponding three-dimensional anchor box with each point in the point set as a center point; and performing target detection on the point cloud data falling within the three-dimensional anchor boxes. The method restricts the generation of three-dimensional anchor boxes to only part of the camera space, so the number of generated anchor boxes is reduced, the later computation for matching anchor boxes against the ground-truth detection box is reduced, matching is faster, and target detection efficiency is improved.

Description

Target detection method, device, equipment and readable storage medium
Technical Field
The present invention relates to the field of target detection, and in particular, to a target detection method, apparatus, device, and readable storage medium.
Background
With the development of computer technology and the wide application of computer vision, research on detecting targets with computer image processing techniques has become increasingly popular. Target detection has broad application value in intelligent transportation systems, intelligent monitoring systems, military target detection, medical care, and other fields.
Prior-art target detection methods generate anchor boxes in advance, compare the anchor boxes with ground-truth detection boxes, label each anchor box as positive or negative, feed a feature map of the target to be detected into a classification model, feed the positive results into a regression model, and finally select the best result with a non-maximum suppression algorithm. For two-dimensional target detection, anchor box generation is simple; for example, nine anchor boxes of different areas are generated at each point of the feature map of the target to be detected, covering three aspect ratios: 1:2, 2:1, and 1:1.
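By way of illustration only (this code is not part of the original disclosure), a minimal Python sketch of this conventional two-dimensional anchor generation is given below; the base areas are placeholder choices, while the three aspect ratios follow the text:

```python
import numpy as np

def anchors_2d(cx, cy, areas=(32**2, 64**2, 128**2), ratios=(0.5, 1.0, 2.0)):
    """Generate 9 anchor boxes (3 areas x 3 aspect ratios) centered at (cx, cy).

    Each box is (x1, y1, x2, y2); ratio = width / height, covering 1:2, 1:1, 2:1.
    """
    boxes = []
    for area in areas:
        for r in ratios:
            w = np.sqrt(area * r)   # w / h = r and w * h = area
            h = np.sqrt(area / r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(boxes)

print(anchors_2d(100, 100).shape)  # (9, 4): nine anchors per feature-map point
```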
However, unlike two-dimensional detection, which only estimates the center coordinates, length, and width of an object on the pixel plane, three-dimensional detection must estimate the object's length, width, height, three-dimensional center coordinates, and yaw angle. The existing anchor box generation methods designed for two-dimensional detection therefore do not transfer to three-dimensional detection: the number of anchor boxes generated in three-dimensional space is enormous, so matching the anchor boxes against the ground-truth detection box at a later stage is computationally expensive, matching is slow, and target detection efficiency is low.
Disclosure of Invention
Embodiments of the invention provide a target detection method, apparatus, device, and readable storage medium to solve the technical problems of existing target detection methods: the number of anchor boxes generated in three-dimensional space is enormous, so the later computation for matching anchor boxes against the ground-truth detection box is huge, matching is slow, and target detection efficiency is low.
In a first aspect, an embodiment of the present invention provides a target detection method, including:

acquiring center point data of a target to be detected in a pixel coordinate system;

acquiring point cloud data of the target to be detected in a camera coordinate system;

acquiring conversion data of the center point data in the camera coordinate system;

determining, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system;

generating a corresponding three-dimensional anchor box with each point in the point set as a center point; and

performing target detection on the point cloud data falling within the three-dimensional anchor boxes.
Further, in the method as described above, acquiring the conversion data of the center point data in the camera coordinate system includes:

taking values on the Z axis of the camera coordinate system at preset intervals to obtain depth coordinate values, and determining each depth coordinate value as the conversion data.
Further, in the method as described above, the center point data is a center point coordinate value, and determining the point set associated with the center point data in the camera coordinate system according to the conversion data and the center point data includes:

inputting the center point coordinate value and each depth coordinate value into a transformation formula to output a horizontal coordinate value and a vertical coordinate value corresponding to each point;

composing the depth coordinate value, horizontal coordinate value, and vertical coordinate value corresponding to each point into the coordinate values of a corresponding point in the camera coordinate system; and

forming the point set from the coordinate values of the corresponding points.
Further, in the method as described above, generating the corresponding three-dimensional anchor box with each point in the point set as a center point includes:

acquiring preset scale data of the three-dimensional anchor box; and

generating, with each point as a center point, at least one three-dimensional anchor box matching the scale data;

wherein the preset scale data includes: length, width, and height.
Further, in the method as described above, acquiring the center point data of the target to be detected in the pixel coordinate system includes:

acquiring two-dimensional frame coordinate data of the target to be detected in the pixel coordinate system, and calculating the center point data from the two-dimensional frame coordinate data.
Further, in the method as described above, acquiring the point cloud data of the target to be detected in the camera coordinate system includes:

acquiring laser point cloud data returned by the target to be detected, and converting the laser point cloud data from the lidar coordinate system to the camera coordinate system to obtain the point cloud data in the camera coordinate system.
In a second aspect, an embodiment of the present invention provides a target detection apparatus, including:

a center point data acquisition module, configured to acquire center point data of a target to be detected in a pixel coordinate system;

a point cloud data acquisition module, configured to acquire point cloud data of the target to be detected in a camera coordinate system;

a conversion data acquisition module, configured to acquire conversion data of the center point data in the camera coordinate system;

a determining module, configured to determine, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system;

a generating module, configured to generate a corresponding three-dimensional anchor box with each point in the point set as a center point; and

a detection module, configured to perform target detection on the point cloud data falling within the three-dimensional anchor boxes.
Further, in the apparatus as described above, the conversion data acquisition module includes:

a value-taking submodule, configured to take values on the Z axis of the camera coordinate system at preset intervals to obtain depth coordinate values; and

a determining submodule, configured to determine each depth coordinate value as the conversion data.
Further, in the apparatus as described above, the center point data is a center point coordinate value, and the determining module is specifically configured to:

input the center point coordinate value and each depth coordinate value into a transformation formula to output a horizontal coordinate value and a vertical coordinate value corresponding to each point; compose the depth coordinate value, horizontal coordinate value, and vertical coordinate value corresponding to each point into the coordinate values of a corresponding point in the camera coordinate system; and form the point set from the coordinate values of the corresponding points.
Further, in the apparatus as described above, the generating module is specifically configured to:

acquire preset scale data of the three-dimensional anchor box, and generate, with each point as a center point, at least one three-dimensional anchor box matching the scale data, the preset scale data including: length, width, and height.
Further, in the apparatus as described above, the center point data acquisition module is specifically configured to:

acquire two-dimensional frame coordinate data of the target to be detected in the pixel coordinate system, and calculate the center point data from the two-dimensional frame coordinate data.

Further, in the apparatus as described above, the point cloud data acquisition module is specifically configured to:

acquire laser point cloud data returned by the target to be detected, and convert the laser point cloud data from the lidar coordinate system to the camera coordinate system to obtain the point cloud data in the camera coordinate system.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any of the first aspects.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, the computer program being executed by a processor to implement the method according to any one of the first aspect.
Embodiments of the invention provide a target detection method, apparatus, device, and readable storage medium. The center point data of the target to be detected in the two-dimensional coordinate system is converted into an associated point set in the three-dimensional camera coordinate system, a corresponding three-dimensional anchor box is generated with each point in the point set as a center point, and target detection is performed on the point cloud data of the target to be detected that falls within the three-dimensional anchor boxes. Because the three-dimensional anchor boxes are generated only in part of the camera space, the number of generated anchor boxes is reduced, the later computation for matching anchor boxes against the ground-truth detection box is reduced, matching is faster, and target detection efficiency is improved.
It should be understood that the content described in this Summary is not intended to identify key or critical features of the embodiments of the invention, nor to limit the scope of the invention. Other features of the invention will become apparent from the following description.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a target detection method according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a target detection method according to Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of a target detection apparatus according to Embodiment 3 of the present invention;
Fig. 4 is a schematic structural diagram of a target detection apparatus according to Embodiment 4 of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to Embodiment 5 of the present invention.
Certain embodiments of the invention have been illustrated by the above figures and are described in more detail below. The drawings and written description are not intended to limit the scope of the inventive concepts in any way, but rather to explain the concepts of the invention to those skilled in the art by reference to specific embodiments.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the invention. It should be understood that the drawings and embodiments of the present invention are illustrative only and are not intended to limit the scope of the invention.
The terms "comprises," "comprising," and any other variation thereof in the description of the embodiments of the invention are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, terms involved in the embodiments of the present invention are explained:

Yaw angle: the angle of deviation from the heading, i.e. the included angle between the x-axis and the projection, onto the xoy plane of the geographic coordinate system, of the object's heading vector during motion, where the positive direction of the x-axis points from left to right.
Fig. 1 is a flowchart of a target detection method according to Embodiment 1 of the present invention. As shown in Fig. 1, the target detection method of this embodiment includes the following steps.
Step 101, acquiring center point data of the target to be detected in the pixel coordinate system.

The target to be detected may be a vehicle, a pedestrian, or another moving or stationary object, which is not limited in this embodiment.

Specifically, in this embodiment, the center point data of the target to be detected in the pixel coordinate system may be obtained as follows: first, an image of the target to be detected is acquired; then, the image is processed to obtain the center point data of the target to be detected in the pixel coordinate system.

Optionally, the image of the target to be detected may be captured by an image acquisition sensor, or read from a storage device holding such images.

When processing the image to obtain the center point data of the target in the pixel coordinate system, one optional implementation extracts the contour information of the target, determines the center point of the contour, and uses it as the center point data of the target in the pixel coordinate system. Another optional implementation feeds the image of the target into a two-dimensional target detection algorithm, which detects the target and outputs two-dimensional frame coordinate data, the two-dimensional frame enclosing the target; the center point data of the target in the pixel coordinate system is then determined from the two-dimensional frame coordinate data.

The type of the two-dimensional target detection algorithm is not limited.

It should be noted that the center point data of the target to be detected in the pixel coordinate system may also be obtained in other ways, which is not limited in this embodiment.
Step 102, acquiring point cloud data of the target to be detected in the camera coordinate system.

Specifically, the point cloud data of the target to be detected carries information about the target; the point cloud data in the camera coordinate system can be obtained by applying a coordinate system transformation to the laser point cloud data, in the lidar coordinate system, returned by the target to be detected.
Step 103, acquiring the conversion data of the center point data in the camera coordinate system.

Specifically, values are taken at certain intervals on the Z axis of the camera coordinate system, and this series of values is called the conversion data of the center point data in the camera coordinate system.
Step 104, determining a point set associated with the center point data in the camera coordinate system according to the conversion data and the center point data.

Specifically, the point set is determined as follows:

the center point data and the conversion data acquired in step 103 are input into a coordinate system transformation formula to output the horizontal coordinate value and the vertical coordinate value corresponding to each depth value;

in the camera coordinate system, each depth value together with its horizontal and vertical coordinate values forms the coordinate values of a corresponding point, and the coordinate values of the corresponding points form the point set.
Step 105, generating a corresponding three-dimensional anchor box with each point in the point set as a center point.

Specifically, in the camera coordinate system, each point in the point set is taken as a center point, and the scale data of objects of the same class as the target to be detected is taken as input to generate three-dimensional anchor boxes corresponding to the scale data. The number of three-dimensional anchor boxes generated per center point can be limited according to actual application requirements. The three-dimensional anchor boxes are evenly distributed around the center point.
Step 106, performing target detection on the point cloud data falling within the three-dimensional anchor boxes.

Specifically, the point cloud data of the target to be detected is distributed throughout the three-dimensional camera space, and target detection is performed only on the point cloud data that falls within the three-dimensional anchor boxes.
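To make the notion of point cloud data "falling within" a three-dimensional anchor box concrete, the following is a hedged Python sketch of such a point-in-box test; the patent does not prescribe this implementation, and the axis convention (Y vertical, yaw about the Y axis) is an assumption:

```python
import numpy as np

def points_in_box(points, center, size, yaw):
    """Return the points inside a 3D box of size (l, w, h) rotated by yaw.

    points: (N, 3) array of camera-frame coordinates (Xc, Yc, Zc).
    """
    l, w, h = size
    # Rotate the shifted points by -yaw so the box becomes axis-aligned.
    c, s = np.cos(-yaw), np.sin(-yaw)
    shifted = points - np.asarray(center, dtype=float)
    x = c * shifted[:, 0] - s * shifted[:, 2]
    z = s * shifted[:, 0] + c * shifted[:, 2]
    y = shifted[:, 1]
    mask = (np.abs(x) <= l / 2) & (np.abs(y) <= h / 2) & (np.abs(z) <= w / 2)
    return points[mask]

pts = np.random.rand(1000, 3) * 20.0
inside = points_in_box(pts, center=(10.0, 1.0, 10.0), size=(3.9, 1.6, 1.56), yaw=0.3)
```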
There are various applicable target detection methods, for example Faster R-CNN (Faster Region-based Convolutional Neural Networks) and SSD (Single Shot MultiBox Detector); the method is not specifically limited here, as long as it can effectively detect the point cloud data of the target to be detected that falls within the three-dimensional anchor boxes.
The embodiment of the invention provides a target detection method: the center point data of the target to be detected in the two-dimensional coordinate system is converted into an associated point set in the three-dimensional camera coordinate system, a corresponding three-dimensional anchor box is generated with each point in the point set as a center point, and target detection is performed on the point cloud data of the target to be detected that falls within the three-dimensional anchor boxes. Because the three-dimensional anchor boxes are generated only in part of the camera space, the number of generated anchor boxes is reduced, the later computation for matching anchor boxes against the ground-truth detection box is reduced, matching is faster, and target detection efficiency is improved.
Fig. 2 is a flowchart of a target detection method according to Embodiment 2 of the present invention. As shown in Fig. 2, the method of this embodiment further details steps 101 to 106 of the embodiment shown in Fig. 1 and includes the following steps.
Step 201, acquiring two-dimensional frame coordinate data of a target to be detected in a pixel coordinate system.
Specifically, in this embodiment, image information of the target to be detected is first acquired, for example by an electronic device with an image acquisition sensor; the electronic device is not specifically limited and may be a camera, a mobile phone, a tablet computer, or the like.

Then, the acquired image information of the target to be detected is processed with a two-dimensional target detection algorithm, for example Faster R-CNN (Faster Region-based Convolutional Neural Networks), to obtain the two-dimensional frame coordinate data of the target to be detected.

Optionally, the obtained two-dimensional frame coordinate data of the target to be detected is recorded as (x1, y1, x2, y2), where (x1, y1) are the abscissa and ordinate of the upper-left corner of the two-dimensional frame in the pixel coordinate system, and (x2, y2) are the abscissa and ordinate of its lower-right corner.
Step 202, acquiring center point data of the target to be detected in the pixel coordinate system.

In this embodiment, the center point data of the target to be detected in the pixel coordinate system is calculated from the two-dimensional frame coordinate data (x1, y1, x2, y2) acquired in step 201.

Specifically, the center point data is denoted (x, y) in the pixel coordinate system, where x is the abscissa of the center point, y is its ordinate, x = (x1 + x2)/2, and y = (y1 + y2)/2.

The center point data of the target to be detected in the pixel coordinate system is thus ((x1 + x2)/2, (y1 + y2)/2).
Step 203, acquiring laser point cloud data returned by the target to be detected, and converting it from the lidar coordinate system to the camera coordinate system to obtain the point cloud data of the target to be detected in the camera coordinate system.

Specifically, in this embodiment, a laser is first emitted onto the target to be detected and the returned laser point cloud data is received; the laser point cloud data is expressed as three-dimensional coordinates in the lidar coordinate system.

Second, the laser point cloud data of the target to be detected in the lidar coordinate system is converted into the camera coordinate system according to the lidar-to-camera conversion formula, which is the standard transformation between the two coordinate systems.
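For illustration (a sketch, not the patent's prescribed formula), such a rigid-body change of coordinates applies the extrinsic rotation R and translation t from the lidar frame to the camera frame; the calibration values below are placeholders:

```python
import numpy as np

def lidar_to_camera(points_lidar, R, t):
    """Map (N, 3) lidar-frame points into the camera frame: p_cam = R @ p_lidar + t."""
    return points_lidar @ R.T + t

R = np.eye(3)                       # extrinsic rotation, from calibration (placeholder)
t = np.array([0.0, -0.08, -0.27])   # extrinsic translation in meters (made-up values)
pts_cam = lidar_to_camera(np.random.rand(100, 3), R, t)
```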
Step 204, acquiring the conversion data of the center point data in the camera coordinate system.

Optionally, step 204 includes the following steps:

Step 2041, taking values on the Z axis of the camera coordinate system at preset intervals to obtain each depth coordinate value.

Step 2042, determining each depth coordinate value as the conversion data.
Specifically, in this embodiment, values are taken on the Z axis of the camera coordinate system at a preset interval a to obtain the depth coordinate values Zc; each interval a corresponds to one Zc, and these Zc values are the conversion data of the center point data in the camera coordinate system.
Optionally, the value of the preset interval a may be set according to the length and the width of the target to be detected. The preset interval a may be 0.5m, 1m or other suitable values, which is not limited in this embodiment.
Optionally, the maximum value of the depth coordinate value Zc does not exceed the length of the target to be detected.
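For instance, sampling the candidate depth values might look like the following sketch; the interval and range are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

a = 0.5                                    # preset interval in meters (assumed)
z_max = 70.0                               # maximum sampling depth (assumed)
depth_values = np.arange(a, z_max + a, a)  # candidate Zc values along the Z axis
```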
Step 205, determining a point set associated with the center point data in the camera coordinate system according to the conversion data and the center point data.
Optionally, step 205 comprises the steps of:
step 2051, the coordinate value of the center point and each depth coordinate value are respectively input into a transformation formula to output a lateral coordinate value and a longitudinal coordinate value corresponding to each point.
And step 2052, forming the depth coordinate value, the horizontal coordinate value and the longitudinal coordinate value corresponding to each point into coordinate values of corresponding points in a camera coordinate system.
And step 2053, forming a point set by the coordinate values of all the corresponding points.
Specifically, in this embodiment, the center point data is the center point coordinate value obtained in step 202, i.e. ((x1 + x2)/2, (y1 + y2)/2), and the conversion data is the plurality of depth coordinate values Zc acquired in step 204.

First, the transformation formula from the pixel coordinate system to the camera coordinate system is:

Zc · [u, v, 1]^T = [[f/dx, 0, u0], [0, f/dy, v0], [0, 0, 1]] · [Xc, Yc, Zc]^T    (1)

where f, dx, dy, u0 and v0 are camera parameters: f is the focal length of the camera, 1/dx and 1/dy are the numbers of pixels contained in a physical unit of 1 mm along the x-axis and y-axis directions respectively, and (u0, v0) is the position of the origin of the physical image coordinate system in the pixel coordinate system.

The center point coordinate values (x, y) and a depth coordinate value Zc are substituted into coordinate system transformation formula (1), with the parameters u and v set to x and y. The camera parameters f, 1/dx, 1/dy, u0 and v0 can be obtained from the camera's factory parameters or by manual calibration. This yields, for each Zc in the camera coordinate system, the horizontal coordinate value Xc and the vertical coordinate value Yc of the corresponding center point, i.e. Xc = (x - u0) · Zc · dx / f and Yc = (y - v0) · Zc · dy / f.

Second, in the camera coordinate system, the computed horizontal coordinate value Xc and vertical coordinate value Yc are combined with the corresponding depth coordinate value Zc into the coordinate values of a corresponding point. That is, the center point of the target to be detected in the pixel coordinate system is mapped to a plurality of corresponding points (Xc, Yc, Zc) in the camera coordinate system, and the coordinate values of these corresponding points constitute the point set.
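A hedged sketch of this back-projection, assuming the pinhole model of formula (1) with fx = f/dx and fy = f/dy (all intrinsic values below are made up):

```python
import numpy as np

def backproject(u, v, depths, fx, fy, u0, v0):
    """Map one pixel (u, v) to camera-frame points (Xc, Yc, Zc), one per depth."""
    depths = np.asarray(depths, dtype=float)
    xc = (u - u0) * depths / fx   # inverse of u = fx * Xc / Zc + u0
    yc = (v - v0) * depths / fy   # inverse of v = fy * Yc / Zc + v0
    return np.stack([xc, yc, depths], axis=1)

point_set = backproject(640.0, 360.0, [5.0, 5.5, 6.0],
                        fx=720.0, fy=720.0, u0=640.0, v0=360.0)
print(point_set.shape)  # (3, 3): one camera-frame point per sampled depth
```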
Step 206, generating a corresponding three-dimensional anchor box with each point in the point set as a center point.

Optionally, step 206 includes the following steps:

Step 2061, acquiring preset scale data of the three-dimensional anchor box.

Step 2062, generating, with each point as a center point, at least one three-dimensional anchor box matching the scale data.

The preset scale data includes: length, width, and height.

Specifically, in this embodiment, scale data of objects of the same class as the target to be detected is first obtained from a training set of that class, the scale data including the length, width, and height of the object.

Optionally, if the target to be detected is a vehicle, the preset scale data is the length, width, and height of a vehicle. Obviously, the target to be detected may also be another object, which is not specifically limited here.

Next, in the camera coordinate system, each point in the point set obtained in step 205 is taken as a center point, and a three-dimensional anchor box corresponding to the scale data is generated; that is, the length, width, and height of the three-dimensional anchor box are those of the scale data. The number of three-dimensional anchor boxes generated per center point can be limited according to actual application requirements.
Optionally, taking the generation of 12 three-dimensional anchor boxes as an example, the yaw angle θ of the n-th anchor box is:

θ = n · π/6

where n ∈ [0, 11] and n is a non-negative integer, so that the 12 yaw angles are evenly spaced over a full circle.

Optionally, the generated three-dimensional anchor boxes are evenly distributed around the center point.
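Combining steps 2061-2062 with the yaw discretization above, a minimal sketch (the box size is a placeholder class average, not data from the patent):

```python
import numpy as np

def anchors_3d(center, size, num_yaw=12):
    """Generate num_yaw 3D anchor boxes at one center with evenly spaced yaw angles.

    Each anchor is (Xc, Yc, Zc, l, w, h, theta), theta = n * 2*pi / num_yaw.
    """
    l, w, h = size
    xc, yc, zc = center
    return np.array([(xc, yc, zc, l, w, h, n * 2.0 * np.pi / num_yaw)
                     for n in range(num_yaw)])

# One anchor set per point in the point set; the size is e.g. a mean vehicle scale:
boxes = anchors_3d(center=(1.2, 1.5, 12.0), size=(3.9, 1.6, 1.56))
```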
Step 207, performing target detection on the point cloud data falling within the three-dimensional anchor boxes.
Specifically, in this embodiment, the point cloud data falling within each three-dimensional anchor box is first input into a convolutional neural network, for example a PointNet network, to extract features of the point cloud data and obtain a feature map. Through this step, the point clouds in the anchor boxes, which contain different numbers of points, are converted into shallow feature maps of the same dimension.

Second, the shallow feature maps are concatenated and input into a fully convolutional network to obtain a deep feature map.
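The property used here is that a shared per-point transform followed by max pooling yields a fixed-dimension feature regardless of how many points fall in an anchor box. A hedged numpy sketch of that idea, with random weights standing in for a trained PointNet-like network:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 64))    # stand-ins for learned layer weights
W2 = rng.standard_normal((64, 128))

def anchor_feature(points):
    """Map an (N, 3) point cloud to a fixed 128-d feature: shared MLP + max pool."""
    hidden = np.maximum(points @ W1, 0.0)      # per-point ReLU layer
    per_point = np.maximum(hidden @ W2, 0.0)   # second shared layer
    return per_point.max(axis=0)               # symmetric pooling over points

# Anchors with different point counts yield features of the same dimension:
f1 = anchor_feature(rng.standard_normal((17, 3)))
f2 = anchor_feature(rng.standard_normal((230, 3)))
assert f1.shape == f2.shape == (128,)
```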
Third, the deep feature map is input into a detection head and a regression head. The detection head may be a common convolutional neural network model and performs binary classification on the deep feature map; its structure is not specifically limited here. Before classification, the detection head must be trained: the training data uses the plurality of three-dimensional anchor boxes generated in step 206; the intersection-over-union (IoU) of each anchor box with the ground-truth detection box of the target to be detected is computed, each anchor box is labeled according to the IoU result and a set threshold, and the anchor boxes are divided into a positive class and a negative class.

Optionally, since positive samples are typically few and negative samples many, screening may be performed during training, i.e. the number of negative samples is reduced so that the ratio of positive to negative samples falls within a reasonable range, improving the output accuracy of the detection head.
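A hedged sketch of this labeling and screening; for brevity it uses axis-aligned 3D IoU rather than yaw-aware box overlap, and the thresholds and ratio are illustrative assumptions:

```python
import numpy as np

def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    lo, hi = np.maximum(a[:3], b[:3]), np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol = lambda box: np.prod(box[3:] - box[:3])
    return inter / (vol(a) + vol(b) - inter)

def label_and_balance(anchors, gt, pos_thr=0.6, neg_thr=0.45, neg_per_pos=3, seed=0):
    """Label anchors positive/negative by IoU, then subsample the negatives."""
    rng = np.random.default_rng(seed)
    ious = np.array([iou_3d(a, gt) for a in anchors])
    pos = np.flatnonzero(ious >= pos_thr)
    neg = np.flatnonzero(ious < neg_thr)
    keep = min(len(neg), neg_per_pos * max(len(pos), 1))
    return pos, rng.choice(neg, size=keep, replace=False)
```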
Further, when the trained detection head processes the deep feature map, positive and negative anchor boxes are obtained.

Finally, regression is performed on the positive anchor boxes output by the detection head, and the optimal anchor box is selected by a non-maximum suppression algorithm.
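Finally, a hedged sketch of greedy non-maximum suppression over the regressed boxes, again with an axis-aligned IoU for simplicity (the helper is repeated from the previous sketch so this block runs standalone):

```python
import numpy as np

def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    lo, hi = np.maximum(a[:3], b[:3]), np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol = lambda box: np.prod(box[3:] - box[:3])
    return inter / (vol(a) + vol(b) - inter)

def nms_3d(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep boxes in descending score order, drop high-overlap ones.

    boxes: (M, 6) array of axis-aligned boxes; scores: length-M array.
    """
    keep = []
    for i in np.argsort(scores)[::-1]:
        if all(iou_3d(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep
```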
In the target detection method provided by this embodiment of the invention, the center point data of the target to be detected in the two-dimensional coordinate system is converted into an associated point set in the three-dimensional camera coordinate system, and a corresponding three-dimensional anchor box is generated with each point in the point set as a center point. Three-dimensional anchor boxes are therefore generated only in part of the camera space, which reduces the number of generated anchor boxes, reduces the later computation for matching anchor boxes against the ground-truth detection box, speeds up matching, and improves detection efficiency. In addition, by the camera imaging principle, the center point of the target to be detected in the camera coordinate system is necessarily close to some point in the point set; the three-dimensional anchor boxes generated with these points as center points can cover the target to be detected and lie closer to its ground-truth detection box, which improves the accuracy of target detection.
Fig. 3 is a schematic structural diagram of a target detection apparatus according to Embodiment 3 of the present invention. As shown in Fig. 3, the target detection apparatus of this embodiment includes:
a center point data acquisition module 10, a point cloud data acquisition module 15, a conversion data acquisition module 11, a determining module 12, a generating module 13, and a detection module 14, wherein:
The center point data acquisition module 10 is configured to acquire center point data of the target to be detected in the pixel coordinate system.

Specifically, in this embodiment, the center point data acquisition module may calculate the center point data from the two-dimensional frame coordinate data of the target to be detected in the pixel coordinate system.

The point cloud data acquisition module 15 is configured to acquire point cloud data of the target to be detected in the camera coordinate system.

Specifically, in this embodiment, the point cloud data acquisition module may obtain the point cloud data of the target to be detected in the camera coordinate system by performing a coordinate system transformation on the laser point cloud data, in the lidar coordinate system, returned by the target to be detected.

The conversion data acquisition module 11 is configured to acquire conversion data of the center point data in the camera coordinate system.

Specifically, in this embodiment, the conversion data acquisition module takes values at certain intervals on the Z axis of the camera coordinate system, and this series of values is the conversion data of the center point data in the camera coordinate system.
The determining module 12 is configured to determine, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system.

Specifically, in this embodiment, the center point data and its conversion data in the camera coordinate system are input into a coordinate system transformation formula to output the horizontal coordinate value and the vertical coordinate value corresponding to each depth value; in the camera coordinate system, each depth value and its horizontal and vertical coordinate values are composed into the coordinate values of a corresponding point, and the coordinate values of the corresponding points form the point set.
The generating module 13 is configured to generate a corresponding three-dimensional anchor box with each point in the point set as a center point.

Specifically, in this embodiment, in the camera coordinate system, each point in the point set is taken as a center point, and the scale data of objects of the same class as the target to be detected is taken as input to generate the three-dimensional anchor boxes corresponding to the scale data. The number of three-dimensional anchor boxes generated per center point can be limited according to actual application requirements.

The detection module 14 is configured to perform target detection on the point cloud data falling within the three-dimensional anchor boxes.

Specifically, in this embodiment, the point cloud data of the target to be detected is distributed in the three-dimensional camera space, and only the point cloud data falling within the three-dimensional anchor boxes undergoes target detection. The detection method may be, for example, Faster R-CNN (Faster Region-based Convolutional Neural Networks) or SSD (Single Shot MultiBox Detector); it is not specifically limited here, as long as it can effectively detect the point cloud data of the target to be detected that falls within the three-dimensional anchor boxes.
The center point data acquisition module 10 is connected to the conversion data acquisition module 11, the conversion data acquisition module 11 is connected to the determining module 12, and the generating module 13 is connected to the determining module 12, the point cloud data acquisition module 15, and the detection module 14, respectively.
The target detection apparatus provided in this embodiment may execute the technical solution of the method embodiment shown in Fig. 1; the implementation principle and technical effects are similar and are not repeated here.
Fig. 4 is a schematic structural diagram of a target detection apparatus according to Embodiment 4 of the present invention. As shown in Fig. 4, the apparatus of this embodiment further refines the apparatus of the embodiment shown in Fig. 3 and includes:
a center point data acquisition module 10, a point cloud data acquisition module 15, a conversion data acquisition module 11, a determining module 12, a generating module 13, and a detection module 14, wherein:
the center point data acquisition module 10 is configured to acquire center point data of the target to be detected in the pixel coordinate system;

the point cloud data acquisition module 15 is configured to acquire point cloud data of the target to be detected in the camera coordinate system;

the conversion data acquisition module 11 is configured to acquire conversion data of the center point data in the camera coordinate system;

the determining module 12 is configured to determine, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system;

the generating module 13 is configured to generate a corresponding three-dimensional anchor box with each point in the point set as a center point; and

the detection module 14 is configured to perform target detection on the point cloud data falling within the three-dimensional anchor boxes.
Further, the conversion data acquisition module 11 specifically includes:

a value-taking submodule 11a, configured to take values on the Z axis of the camera coordinate system at preset intervals to obtain depth coordinate values; and

a determining submodule 11b, configured to determine each depth coordinate value as the conversion data.

The determining submodule 11b is connected to the center point data acquisition module 10, the value-taking submodule 11a, and the determining module 12; the generating module 13 is connected to the determining module 12, the point cloud data acquisition module 15, and the detection module 14.
Further, the center point data is a center point coordinate value, and the determining module 12 is specifically configured to:

input the center point coordinate value and each depth coordinate value into a transformation formula to output the horizontal coordinate value and the vertical coordinate value corresponding to each point; compose the depth coordinate value, horizontal coordinate value, and vertical coordinate value corresponding to each point into the coordinate values of the corresponding point in the camera coordinate system; and form the point set from the coordinate values of the corresponding points.
Further, the generating module 13 is specifically configured to:

acquire preset scale data of the three-dimensional anchor box, and generate, with each point as a center point, at least one three-dimensional anchor box matching the scale data, the preset scale data including: length, width, and height.
Further, the center point data acquisition module 10 is specifically configured to:

acquire two-dimensional frame coordinate data of the target to be detected in the pixel coordinate system, and calculate the center point data from the two-dimensional frame coordinate data.
Further, the point cloud data acquisition module 15 is specifically configured to:

acquire laser point cloud data returned by the target to be detected, and convert the laser point cloud data from the lidar coordinate system to the camera coordinate system to obtain the point cloud data in the camera coordinate system.

The target detection apparatus provided in this embodiment may execute the technical solution of the method embodiment shown in Fig. 2; the implementation principle and technical effects are similar and are not repeated here.
Fig. 5 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention, and as shown in fig. 5, the electronic device according to the present embodiment includes: a memory 1001, a processor 1002, and computer programs.
The computer program is stored in the memory 1001 and configured to be executed by the processor 1002 to implement the method for object detection according to any one of the embodiments of the present invention corresponding to fig. 1-2.
The memory 1001 and the processor 1002 are connected by a bus 1003.
For the relevant description, reference may be made to the corresponding descriptions and effects of the steps in Figs. 1-2, which are not repeated here.
One embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the method for object detection provided by any one of the embodiments corresponding to fig. 1-2 of the present invention.
The computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules is merely a division by logical function, and other divisions are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or modules, and may be electrical, mechanical, or in another form.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A target detection method, comprising:
acquiring center point data of a target to be detected in a pixel coordinate system;
acquiring point cloud data of the target to be detected in a camera coordinate system;
acquiring conversion data of the center point data in the camera coordinate system;
determining, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system;
generating a corresponding three-dimensional anchor box with each point in the point set as a center point; and
performing target detection on the point cloud data falling within the three-dimensional anchor box.
2. The method of claim 1, wherein acquiring the conversion data of the center point data in the camera coordinate system comprises:
taking values on a Z axis of the camera coordinate system at preset intervals to obtain depth coordinate values; and
determining each depth coordinate value as the conversion data.
3. The method of claim 1, wherein the center point data is a center point coordinate value, and
determining, according to the conversion data and the center point data, the point set associated with the center point data in the camera coordinate system comprises:
inputting the center point coordinate value and each depth coordinate value into a transformation formula to output a horizontal coordinate value and a vertical coordinate value corresponding to each point;
composing the depth coordinate value, the horizontal coordinate value, and the vertical coordinate value corresponding to each point into coordinate values of a corresponding point in the camera coordinate system; and
forming the point set from the coordinate values of the corresponding points.
4. The method of claim 1, wherein generating the corresponding three-dimensional anchor box with each point in the point set as a center point comprises:
acquiring preset scale data of the three-dimensional anchor box; and
generating, with each point as a center point, at least one three-dimensional anchor box matching the scale data;
wherein the preset scale data comprises: length, width, and height.
5. The method according to any one of claims 1 to 4, wherein acquiring the center point data of the target to be detected in the pixel coordinate system comprises:
acquiring two-dimensional frame coordinate data of the target to be detected in the pixel coordinate system; and
calculating the center point data from the two-dimensional frame coordinate data.
6. The method according to any one of claims 1 to 4, wherein acquiring the point cloud data of the target to be detected in the camera coordinate system comprises:
acquiring laser point cloud data returned by the target to be detected; and
converting the laser point cloud data from a lidar coordinate system to the camera coordinate system to obtain the point cloud data in the camera coordinate system.
7. A target detection apparatus, comprising:
a center point data acquisition module, configured to acquire center point data of a target to be detected in a pixel coordinate system;
a point cloud data acquisition module, configured to acquire point cloud data of the target to be detected in a camera coordinate system;
a conversion data acquisition module, configured to acquire conversion data of the center point data in the camera coordinate system;
a determining module, configured to determine, according to the conversion data and the center point data, a point set associated with the center point data in the camera coordinate system;
a generating module, configured to generate a corresponding three-dimensional anchor box with each point in the point set as a center point; and
a detection module, configured to perform target detection on the point cloud data falling within the three-dimensional anchor box.
8. The apparatus of claim 7, wherein the conversion data acquisition module comprises:
a value-taking submodule, configured to take values on a Z axis of the camera coordinate system at preset intervals to obtain depth coordinate values; and
a determining submodule, configured to determine each depth coordinate value as the conversion data.
9. An electronic device, comprising:
a memory, a processor, and a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-6.
10. A computer-readable storage medium, having stored thereon a computer program for execution by a processor to perform the method of any one of claims 1-6.
CN201910957822.1A 2019-10-10 2019-10-10 Target detection method, device, equipment and readable storage medium Pending CN110706288A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910957822.1A CN110706288A (en) 2019-10-10 2019-10-10 Target detection method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910957822.1A CN110706288A (en) 2019-10-10 2019-10-10 Target detection method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN110706288A (en) 2020-01-17

Family

ID=69200152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910957822.1A Pending CN110706288A (en) 2019-10-10 2019-10-10 Target detection method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110706288A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110168559A (en) * 2017-12-11 2019-08-23 北京嘀嘀无限科技发展有限公司 For identification with positioning vehicle periphery object system and method
CN108597009A (en) * 2018-04-10 2018-09-28 上海工程技术大学 A method of objective detection is carried out based on direction angle information
CN108898628A (en) * 2018-06-21 2018-11-27 北京纵目安驰智能科技有限公司 Three-dimensional vehicle object's pose estimation method, system, terminal and storage medium based on monocular
CN108858199A (en) * 2018-07-27 2018-11-23 中国科学院自动化研究所 The method of the service robot grasp target object of view-based access control model
CN109543601A (en) * 2018-11-21 2019-03-29 电子科技大学 A kind of unmanned vehicle object detection method based on multi-modal deep learning
CN109784333A (en) * 2019-01-22 2019-05-21 中国科学院自动化研究所 Based on an objective detection method and system for cloud bar power channel characteristics
CN110148169A (en) * 2019-03-19 2019-08-20 长安大学 A kind of vehicle target 3 D information obtaining method based on PTZ holder camera
CN110264416A (en) * 2019-05-28 2019-09-20 深圳大学 Sparse point cloud segmentation method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BUYU LI et al.: "GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving", arXiv *
赵华卿等 (ZHAO Huaqing et al.): "三维目标检测中的先验方向角估计" ("Prior direction angle estimation in three-dimensional object detection"), 《传感器与微系统》 (Transducer and Microsystem Technologies) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401190A (en) * 2020-03-10 2020-07-10 上海眼控科技股份有限公司 Vehicle detection method, device, computer equipment and storage medium
CN112102409A (en) * 2020-09-21 2020-12-18 杭州海康威视数字技术股份有限公司 Target detection method, device, equipment and storage medium
CN112102409B (en) * 2020-09-21 2023-09-01 杭州海康威视数字技术股份有限公司 Target detection method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111210429B (en) Point cloud data partitioning method and device and obstacle detection method and device
CN108198145B (en) Method and device for point cloud data restoration
CN112132972B (en) Three-dimensional reconstruction method and system for fusing laser and image data
CN111582054B (en) Point cloud data processing method and device and obstacle detection method and device
CN111553946B (en) Method and device for removing ground point cloud and method and device for detecting obstacle
CN105335955A (en) Object detection method and object detection apparatus
CN111699410B (en) Processing method, equipment and computer readable storage medium of point cloud
CN111815707A (en) Point cloud determining method, point cloud screening device and computer equipment
CN111091023B (en) Vehicle detection method and device and electronic equipment
CN110349186B (en) Large-displacement motion optical flow calculation method based on depth matching
CN111142514B (en) Robot and obstacle avoidance method and device thereof
CN110706288A (en) Target detection method, device, equipment and readable storage medium
CN110751677A (en) Visual vibration measurement method and device based on improved CMT algorithm
CN114217665A (en) Camera and laser radar time synchronization method, device and storage medium
CN111724432B (en) Object three-dimensional detection method and device
CN114092771A (en) Multi-sensing data fusion method, target detection device and computer equipment
CN117292076A (en) Dynamic three-dimensional reconstruction method and system for local operation scene of engineering machinery
CN111862208A (en) Vehicle positioning method and device based on screen optical communication and server
CN111899277A (en) Moving object detection method and device, storage medium and electronic device
CN109859313B (en) 3D point cloud data acquisition method and device, and 3D data generation method and system
CN113884025B (en) Method and device for detecting optical loop of additive manufacturing structure, electronic equipment and storage medium
CN115601275A (en) Point cloud augmentation method and device, computer readable storage medium and terminal equipment
CN113808196A (en) Plane fusion positioning method and device, electronic equipment and storage medium
CN114140659A (en) Social distance monitoring method based on human body detection under view angle of unmanned aerial vehicle
CN109089100B (en) Method for synthesizing binocular stereo video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20220920