CN113223091A - Three-dimensional target detection method, three-dimensional target capture device and electronic equipment - Google Patents


Info

Publication number
CN113223091A
Authority
CN
China
Prior art keywords
point cloud
camera
world
coordinate system
target
Prior art date
Legal status
Granted
Application number
CN202110473106.3A
Other languages
Chinese (zh)
Other versions
CN113223091B (en)
Inventor
刘亦芃
杜国光
赵开勇
Current Assignee
Cloudminds Robotics Co Ltd
Original Assignee
Cloudminds Robotics Co Ltd
Priority date
Filing date
Publication date
Application filed by Cloudminds Robotics Co Ltd filed Critical Cloudminds Robotics Co Ltd
Priority to CN202110473106.3A priority Critical patent/CN113223091B/en
Publication of CN113223091A publication Critical patent/CN113223091A/en
Priority to PCT/CN2021/143443 priority patent/WO2022227678A1/en
Application granted granted Critical
Publication of CN113223091B publication Critical patent/CN113223091B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 7/85 Stereo camera calibration
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 Determination of transform parameters for the alignment of images using feature-based methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention relates to the technical field of computer vision, and discloses a three-dimensional target detection method, a three-dimensional target grabbing device and electronic equipment. The method comprises the following steps: acquiring a depth image containing a target recognition object; generating a camera point cloud corresponding to the depth image according to the depth image and the camera internal parameters, wherein the camera point cloud is a point cloud under a camera coordinate system; converting the camera point cloud into a world point cloud, wherein the world point cloud is a point cloud under a world coordinate system; performing target detection on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under a world coordinate system; and generating the bounding box of the target identification object in the camera coordinate system according to the bounding box of the target identification object in the world coordinate system. In this way, the detection quality of the three-dimensional target is improved.

Description

Three-dimensional target detection method, three-dimensional target capture device and electronic equipment
Technical Field
The embodiment of the invention relates to the technical field of computer vision, in particular to a three-dimensional target detection method, a three-dimensional target grabbing device and electronic equipment.
Background
Three-dimensional target detection refers to a technique for detecting the three-dimensional space coordinates of an object. In the field of automatic driving, a vehicle can be controlled on the basis of three-dimensional target detection so as to avoid collisions; in the field of service robots, objects can be accurately grabbed by means of three-dimensional target detection.
Three-dimensional target detection generally outputs the bounding boxes, categories and corresponding confidences of target recognition objects according to the input point cloud data. However, in the related art, detecting a three-dimensional target generally requires obtaining the external parameters of the camera and converting the point cloud data in the camera coordinate system into point cloud data in the world coordinate system according to those external parameters. When the external parameters of the camera cannot be acquired, the accuracy with which the related art detects the three-dimensional target is low.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a three-dimensional target detection method, a three-dimensional target capture device, and an electronic device, so as to solve the problem in the prior art that the three-dimensional target detection accuracy is low.
According to an aspect of an embodiment of the present invention, there is provided a three-dimensional target detection method, including:
acquiring a depth image containing a target recognition object;
generating a camera point cloud corresponding to the depth image according to the depth image and the camera internal parameters, wherein the camera point cloud is a point cloud under a camera coordinate system;
converting the camera point cloud into a world point cloud, wherein the world point cloud is a point cloud under a world coordinate system;
carrying out target detection on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under a world coordinate system;
and generating the bounding box of the target identification object in the camera coordinate system according to the bounding box of the target identification object in the world coordinate system.
In an alternative approach, the converting the camera point cloud to a world point cloud comprises:
registering the camera point cloud with a preset plane point cloud to generate a transformation matrix from a camera coordinate system to a world coordinate system;
converting the camera point cloud to a world point cloud according to the transformation matrix.
In an alternative manner, the registering the camera point cloud with a preset plane point cloud to generate a transformation matrix from a camera coordinate system to a world coordinate system includes:
respectively calculating the mean values of the camera point clouds in three dimensions;
constructing a homogeneous transformation matrix according to the mean value, and setting the homogeneous transformation matrix as an initial value of an iterative closest point algorithm;
and generating a transformation matrix from a camera coordinate system to a world coordinate system according to the iterative closest point algorithm and the plane point cloud perpendicular to the gravity axis.
In an alternative approach, the converting the camera point cloud to a world point cloud according to the transformation matrix comprises:
determining a rotation matrix corresponding to the transformation matrix;
if the rotation angle corresponding to the rotation matrix is larger than 90 degrees, generating a world point cloud according to the rotation matrix and the camera point cloud;
and if the rotation angle corresponding to the rotation matrix is not more than 90 degrees, generating the world point cloud according to the supplementary-angle rotation amount corresponding to the rotation matrix and the camera point cloud.
In an optional manner, the method further comprises:
constructing a point cloud data training set, wherein the point cloud data training set comprises a plurality of groups of world point cloud data and label information corresponding to each group of world point cloud data;
and training a preset target recognition algorithm by using the point cloud data training set to generate the target recognition model.
In an alternative approach, the constructing a training set of point cloud data includes:
constructing a three-dimensional model library, wherein the three-dimensional model library comprises three-dimensional models of a plurality of identification objects;
after each identification object is placed in a world coordinate system, calculating an initial value of a bounding box of each identification object;
carrying out simulation arrangement on each identification object, and calculating a bounding box simulation value of each identification object at a simulation position;
randomly generating a camera view angle, and rendering based on the camera view angle to generate camera point cloud data of each identification object;
converting the camera point cloud data of each identification object into corresponding world point cloud data;
and adding label information to the corresponding world point cloud data.
According to another aspect of the embodiments of the present invention, there is provided a three-dimensional object capturing method, which includes the three-dimensional object detection method described above and further includes:
determining the space position of the target recognition object according to the bounding box of the target recognition object under the camera coordinate system;
and generating a grabbing instruction according to the space position, so that the grabber grabs the target recognition object according to the grabbing instruction.
According to another aspect of the embodiments of the present invention, there is provided a three-dimensional object detecting apparatus, the apparatus including:
the acquisition module is used for acquiring a depth image containing a target recognition object;
the first generation module is used for generating a camera point cloud corresponding to the depth image according to the depth image and camera internal parameters, and the camera point cloud is a point cloud under a camera coordinate system;
the conversion module is used for converting the camera point cloud into a world point cloud, and the world point cloud is a point cloud under a world coordinate system;
the second generation module is used for carrying out target detection on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under a world coordinate system;
and the third generation module is used for generating the bounding box of the target identification object in the camera coordinate system according to the bounding box of the target identification object in the world coordinate system.
According to another aspect of the embodiments of the present invention, there is provided an electronic device including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the three-dimensional object detection method or the three-dimensional object grabbing method.
According to another aspect of the embodiments of the present invention, there is provided a computer-readable storage medium, in which at least one executable instruction is stored, and when the executable instruction is executed on an electronic device, the electronic device executes the operations of the three-dimensional object detection method or the three-dimensional object capture method.
According to the embodiment of the invention, the camera point cloud corresponding to the depth image can be generated from the depth image and the camera internal parameters; after the camera point cloud is converted into the world point cloud, target detection can be performed on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under the world coordinate system; furthermore, the bounding box of the target identification object in the camera coordinate system can be generated according to the bounding box of the target identification object in the world coordinate system, so that the detection of the target identification object is completed. In this way, the embodiment of the invention can generate the bounding box of the target identification object under the camera coordinate system based on the camera point cloud even when the external parameters of the camera are not acquired, and can thus improve the detection accuracy of the target identification object.
The foregoing description is only an overview of the technical solutions of the embodiments of the present invention. In order that the technical means of the embodiments of the present invention may be more clearly understood and implemented according to the content of this description, and in order to make the foregoing and other objects, features and advantages of the embodiments of the present invention more clearly understandable, the detailed description of the invention is provided below.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flow chart illustrating a three-dimensional target detection method according to an embodiment of the present invention;
fig. 2(a) is a schematic diagram illustrating an identifier placement scene and a corresponding simulated camera position according to an embodiment of the present invention;
FIG. 2(b) is a schematic diagram showing rendering effects of the camera in FIG. 2 (a);
fig. 3(a) is a schematic diagram illustrating another identifier placement scene and a corresponding simulated camera position according to an embodiment of the present invention;
FIG. 3(b) is a schematic diagram showing rendering effects of the camera in FIG. 3 (a);
fig. 4 is a schematic flow chart illustrating a three-dimensional object capture method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a three-dimensional object detection apparatus provided in an embodiment of the present invention;
fig. 6 shows a schematic structural diagram of an electronic device provided in an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein.
Fig. 1 shows a flow chart of a three-dimensional object detection method according to an embodiment of the invention, which is performed by an electronic device. The memory of the electronic device is used for storing at least one executable instruction, and the executable instruction enables the processor of the electronic device to execute the operation of the three-dimensional target detection method. The electronic device may be a robot, car, computer or other terminal device. As shown in fig. 1, the method comprises the steps of:
step 110: and acquiring a depth image containing the target recognition object.
The depth image may be an RGBD image, that is, an RGB color image with a corresponding depth channel. The target recognition object in the depth image is the recognition object on which target detection needs to be performed. The target recognition object may be, for example, a cup, a bottle or a piece of fruit. Generally, the depth image containing the target recognition object can be acquired by photographing a scene containing the target recognition object with a depth camera.
Step 120: and generating a camera point cloud corresponding to the depth image according to the depth image and the camera internal parameters, wherein the camera point cloud is a point cloud under a camera coordinate system.
A camera point cloud corresponding to the depth image is generated according to the depth image and the camera internal parameters; the camera point cloud is a point cloud under the camera coordinate system. The camera internal parameters are parameters related to the characteristics of the camera that captures the depth image, and generally include the focal length, pixel size and the like of the camera.
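For illustration only, the back-projection from a depth image to a camera point cloud described in this step can be sketched as follows in Python. The function name, the pinhole intrinsics fx, fy, cx, cy and the depth scale are assumptions made for the example; the patent itself does not prescribe a specific implementation.

```python
import numpy as np

def depth_to_camera_point_cloud(depth, fx, fy, cx, cy, depth_scale=1000.0):
    """Back-project a depth image into a point cloud in the camera coordinate system.

    depth: HxW array of raw depth values (e.g. millimetres for many RGBD cameras).
    fx, fy, cx, cy: assumed pinhole intrinsics (focal lengths and principal point).
    Returns an (N, 3) array of points in the camera coordinate system.
    """
    h, w = depth.shape
    z = depth.astype(np.float64) / depth_scale        # convert raw depth to metres
    u, v = np.meshgrid(np.arange(w), np.arange(h))    # pixel coordinates
    x = (u - cx) * z / fx                             # pinhole back-projection
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                   # drop pixels with no depth
```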
Step 130: and converting the camera point cloud into a world point cloud, wherein the world point cloud is a point cloud under a world coordinate system.
In an alternative manner, the camera point cloud and the preset plane point cloud may be registered to generate a transformation matrix from the camera coordinate system to the world coordinate system, and the camera point cloud is converted into the world point cloud according to the transformation matrix. In order to obtain a transformation matrix, the mean values of the camera point cloud in three dimensions can be respectively calculated, a homogeneous transformation matrix is constructed according to the mean values, the homogeneous transformation matrix is set as an initial value of an iterative closest point algorithm, and the transformation matrix from a camera coordinate system to a world coordinate system is generated according to the iterative closest point algorithm and the plane point cloud perpendicular to the gravity axis.
For example, the mean values $\bar{x}$, $\bar{y}$ and $\bar{z}$ of the camera point cloud in each dimension of the three-dimensional space are first calculated, and a homogeneous transformation matrix built from these means, for example the pure translation

$$T_0=\begin{bmatrix}1&0&0&-\bar{x}\\0&1&0&-\bar{y}\\0&0&1&-\bar{z}\\0&0&0&1\end{bmatrix},$$

is used as the initial value of the iterative closest point algorithm. A plane point cloud perpendicular to the gravity axis (z axis) of the world coordinate system is then generated, the transformation matrix from the camera point cloud to the plane point cloud is obtained, and the camera point cloud is converted into the world point cloud through this transformation matrix.
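A minimal sketch of this registration step is given below, assuming the Open3D library is used for the iterative closest point algorithm; the plane extent, point spacing and ICP distance threshold are illustrative assumptions, not values taken from the patent.

```python
import numpy as np
import open3d as o3d

def camera_to_world_transform(camera_points, plane_size=1.0, step=0.01, icp_threshold=0.05):
    """Register the camera point cloud to a synthetic plane perpendicular to the
    gravity (z) axis and return the estimated camera-to-world transformation."""
    # Initial guess: a pure translation that moves the point-cloud centroid to the origin.
    init = np.eye(4)
    init[:3, 3] = -camera_points.mean(axis=0)

    # Synthetic plane point cloud in the z = 0 plane (perpendicular to the gravity axis).
    xs = np.arange(-plane_size, plane_size, step)
    gx, gy = np.meshgrid(xs, xs)
    plane_points = np.stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)], axis=1)

    source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(camera_points))
    target = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(plane_points))

    result = o3d.pipelines.registration.registration_icp(
        source, target, icp_threshold, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation  # 4x4 transformation from camera to world coordinates
```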
In an optional manner, when the camera point cloud is converted into the world point cloud according to the transformation matrix, the rotation matrix corresponding to the transformation matrix is first determined. If the rotation angle corresponding to the rotation matrix is greater than 90 degrees, the world point cloud is generated according to the rotation matrix and the camera point cloud; if the rotation angle corresponding to the rotation matrix is not greater than 90 degrees, the world point cloud is generated according to the supplementary-angle rotation amount corresponding to the rotation matrix and the camera point cloud. For example, if the rotation angle does not exceed 90 degrees, the difference between 180 degrees and the rotation angle is taken as the rotation angle of the rotation matrix, as sketched below.
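The rotation-angle check just described might be implemented along the following lines. Extracting the angle from the trace of the rotation matrix and applying the supplementary rotation about the same axis are interpretations of the text, not details fixed by the patent.

```python
import numpy as np

def rotation_angle_deg(R):
    """Rotation angle (in degrees) of a 3x3 rotation matrix, from its trace."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))

def adjust_rotation(R):
    """If the rotation angle does not exceed 90 degrees, replace it with its
    supplement (180 degrees minus the angle) about the same rotation axis."""
    angle = rotation_angle_deg(R)
    if angle > 90.0:
        return R
    # Rotation axis from the skew-symmetric part of R (assumes angle is not close to 0 or 180).
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    axis = axis / np.linalg.norm(axis)
    theta = np.radians(180.0 - angle)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    # Rodrigues' formula for the supplementary-angle rotation about the same axis.
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def camera_to_world_points(R, camera_points):
    """Apply the (possibly adjusted) rotation to take camera points to world points."""
    return camera_points @ adjust_rotation(R).T
```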
Step 140: and carrying out target detection on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under a world coordinate system.
Target detection can be performed on the world point cloud according to a preset target recognition model to generate the bounding box of the target recognition object under the world coordinate system. A bounding box, i.e. a minimum enclosing box, is the result of an algorithm for solving for the optimal enclosing space of a set of discrete points; the basic idea is to approximately replace a complex geometric object with a slightly larger geometry of simple shape (the bounding box). The bounding box of the target recognition object may be, for example, an axis-aligned bounding box (AABB), a bounding sphere, an oriented bounding box (OBB) or a fixed-direction convex hull (FDH). Before target detection is performed on the world point cloud according to the preset target recognition model, a target recognition algorithm can be trained based on deep learning to generate the target recognition model. The training process of the target recognition algorithm is explained in detail below.
Before the target recognition algorithm is trained, a point cloud data training set needs to be constructed; the point cloud data training set comprises a plurality of groups of world point cloud data and the label information corresponding to each group of world point cloud data. A preset target recognition algorithm is then trained with the point cloud data training set to generate the target recognition model. In one embodiment of the invention, the target recognition model may be, for example, a Vote Net network (a three-dimensional target detection network). The Vote Net network is an end-to-end 3D object detection network based on deep point set networks and Hough voting.
In an alternative way, a point cloud data training set can be constructed by:
constructing a three-dimensional model library that comprises three-dimensional models of a plurality of recognition objects, and placing each recognition object into a world coordinate system (x axis pointing right, y axis pointing forward, z axis pointing up) so that, when the object stands upright, its long axis corresponds to the y axis, its width to the x axis and its height to the z axis. The bounding box of each recognition object can then be calculated using principal component analysis. Further, a recognition object placement scene is constructed for simulation, each recognition object is placed at a simulated position in the placement scene, and the bounding box of each recognition object at its simulated position is calculated. If the placement position contains a plurality of recognition objects, collision detection can be performed to ensure that the recognition objects do not collide with one another. The placement position is the spatial position of each recognition object within a preset spatial range under the world coordinate system; after a recognition object is placed into the world coordinate system, its initial position is determined, and its placement position is then determined through a translation matrix and a rotation matrix, the rotation matrix being a rotation about the z axis. Further, a plurality of camera view angles can be randomly generated, the world point cloud data is rendered based on each camera view angle to generate camera point cloud data of each recognition object under the corresponding camera view angle, and the recognition object category corresponding to the camera point cloud data, together with the centroid, length, width and height of the corresponding bounding box and its rotation angle about the z axis, are stored.
Fig. 2(a) shows a schematic diagram of an object placement scene and a corresponding simulated camera position provided by an embodiment of the present invention, and fig. 2(b) shows a schematic diagram of a rendering effect of the camera in fig. 2 (a); in the object placement scene in fig. 2(a), a camera view angle is randomly generated, and the object point cloud data in the world coordinate system is rendered based on the camera view angle, so that the rendering effect in fig. 2(b) can be obtained. Similarly, fig. 3(a) shows a schematic diagram of another object placement scene and a corresponding simulated camera position provided by the embodiment of the present invention, and fig. 3(b) shows a schematic diagram of a rendering effect of the camera in fig. 3 (a); in the object placement scene in fig. 3(a), a camera view angle is randomly generated, and the object point cloud data in the world coordinate system is rendered based on the camera view angle, so that the rendering effect in fig. 3(b) can be obtained. It should be noted that, for any object placement scene, multiple camera view angles may be randomly generated, and the world point cloud of the identification object is rendered based on each camera view angle, so as to obtain the camera point cloud under the corresponding camera view angle.
The following describes a process of calculating a bounding box of a recognition object by a principal component analysis method using a formula.
Let M be a 3 × n matrix representing the coordinates of the point cloud in three-dimensional space, where n is the number of points. Let mean(M) denote the matrix formed by the mean values of M in the three dimensions, i.e. mean(M) is also a 3 × n matrix in which the elements of each row are equal, and equal to the mean value of M in the corresponding dimension. Define the centred point cloud

$$\widetilde{M}=M-\mathrm{mean}(M)$$

and compute its covariance matrix

$$\mathrm{Corr}=\frac{1}{n}\widetilde{M}\widetilde{M}^{T},$$

and determine the eigenvalues A and eigenvectors V of Corr such that Corr V = VA, where A is the diagonal matrix of eigenvalues. Further, the column vectors of the eigenvector matrix V are rearranged to obtain the eigenvector matrices V' corresponding to the 6 different placement modes of the recognition object.

Further, by calculating M' = V'M, the corrected point clouds M' of the 6 different placement states of the recognition object can be obtained. After M' is translated to the origin, i.e. M' is replaced by M' - mean(M'), the bounding box B of the corrected point cloud M' can be calculated from its extrema along each axis,

$$B=\begin{bmatrix}x_{\min}&x_{\max}\\y_{\min}&y_{\max}\\z_{\min}&z_{\max}\end{bmatrix},$$

where xmin, ymin and zmin are the minimum values of the corrected point cloud M' in the x-axis, y-axis and z-axis directions, respectively, and xmax, ymax and zmax are the corresponding maximum values.

By means of the rotation matrix about the z axis

$$R=\begin{bmatrix}\cos\theta&-\sin\theta&0\\\sin\theta&\cos\theta&0\\0&0&1\end{bmatrix}$$

and the translation matrix $t=[t_x,t_y,t_z]^{T}$, the corrected point cloud M' may be randomly placed and then updated to RM' + t, where θ is the rotation angle of the corrected point cloud M' about the z axis, and $t_x$, $t_y$ and $t_z$ are the translation amounts of the corrected point cloud M' along the x axis, y axis and z axis, respectively.
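As a numerical illustration of the principal component analysis above, a sketch follows. Sorting the principal axes by descending eigenvalue and using the transpose of the eigenvector matrix to realise the M' = V'M step are assumptions for the example; the patent instead speaks of rearranging the columns of V to cover the 6 placement modes.

```python
import numpy as np

def pca_bounding_box(M):
    """Align a 3xN point cloud with its principal axes and return the aligned
    points together with their axis-aligned bounding box extents."""
    M_centered = M - M.mean(axis=1, keepdims=True)        # M - mean(M)
    corr = (M_centered @ M_centered.T) / M.shape[1]       # 3x3 covariance matrix Corr
    eigvals, eigvecs = np.linalg.eigh(corr)               # Corr V = V A
    V = eigvecs[:, np.argsort(eigvals)[::-1]]             # principal axes, largest first
    M_prime = V.T @ M_centered                            # corrected point cloud M'
    mins, maxs = M_prime.min(axis=1), M_prime.max(axis=1) # xmin..zmin and xmax..zmax
    return M_prime, np.stack([mins, maxs], axis=1)        # bounding box B

def place_randomly(M_prime, theta, t):
    """Rotate the corrected cloud about the z axis by theta and translate by t (RM' + t)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return R @ M_prime + np.asarray(t, dtype=float).reshape(3, 1)
```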
The following describes a process of randomly generating a camera view angle and rendering a point cloud under a world coordinate system based on the camera view angle through a formula.
A position matrix $C_P=[x_p,y_p,z_p]^{T}$, a front-direction matrix $C_f=[x_f,y_f,z_f]^{T}$ and an up-direction matrix $C_t=[x_t,y_t,z_t]^{T}$ of the virtual camera can be set; the left-direction matrix of the camera is then

$$C_l=[\,y_t z_f-z_t y_f,\; z_t x_f-x_t z_f,\; x_t y_f-y_t x_f\,]^{T}.$$

The camera view angle of the virtual camera at the corresponding position can be determined through the front-direction, up-direction and left-direction matrices. Let $T_C$ be the homogeneous transformation matrix of the camera coordinate system relative to the world coordinate system,

$$T_C=\begin{bmatrix}R_C&C_P\\0^{T}&1\end{bmatrix},$$

where $C_P$ is the translation part of the camera extrinsics and $R_C$ is the transformation matrix for the direction of the camera coordinate system relative to the world coordinate system, which can be obtained by solving the linear equations formed by the direction matrices $C_l$, $C_t$ and $C_f$. Further, by inverting $T_C$, the homogeneous transformation matrix of the world coordinate system relative to the camera coordinate system is obtained,

$$T_C^{-1}=\begin{bmatrix}R_C^{T}&-R_C^{T}C_P\\0^{T}&1\end{bmatrix}.$$

Further, the camera point cloud coordinates $M_C$ of the recognition object are obtained by applying $T_C^{-1}$ to the homogeneous coordinates of the recognition object's world point cloud, i.e.

$$M_C=T_C^{-1}\begin{bmatrix}M_W\\\mathbf{1}^{T}\end{bmatrix},$$

where $M_W$ denotes the world point cloud coordinates of the recognition object (the placed point cloud RM' + t from the previous subsection) and $\mathbf{1}^{T}$ is a row of ones.
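The construction of the virtual camera pose described above can be sketched as follows. Normalising the direction vectors and the column ordering (left, up, front) of the rotation part R_C are assumptions, since the exact convention is not spelled out in the reconstructed text.

```python
import numpy as np

def camera_extrinsics(position, front, up):
    """Build the homogeneous transform T_C of the camera frame relative to the
    world frame from the camera position and its front/up direction vectors."""
    front = np.asarray(front, dtype=float)
    front /= np.linalg.norm(front)
    up = np.asarray(up, dtype=float)
    up /= np.linalg.norm(up)
    left = np.cross(up, front)                 # C_l = C_t x C_f
    R_c = np.stack([left, up, front], axis=1)  # assumed column order of R_C
    T_c = np.eye(4)
    T_c[:3, :3] = R_c
    T_c[:3, 3] = np.asarray(position, dtype=float)
    return T_c

def world_to_camera(points_world, T_c):
    """Transform an (N, 3) world point cloud into camera coordinates, i.e. apply T_C^{-1}."""
    R_c, p = T_c[:3, :3], T_c[:3, 3]
    return (points_world - p) @ R_c            # each row becomes R_c^T (x - p)
```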
The embodiment of the invention selects the Vote Net network for training so as to obtain the target recognition model. Vote Net only predicts rotation about a single axis relatively well, so before the Vote Net network is trained based on deep learning, the camera point cloud of the recognition object needs to be transformed into the world point cloud, that is, the direction of gravity is aligned with the -z axis. Further, the camera point cloud of the recognition object may be converted into the world point cloud of the recognition object based on the iterative closest point algorithm. The following describes the process of converting the camera point cloud of a recognition object into a world point cloud.
In an alternative mode, the mean values $\bar{x}$, $\bar{y}$ and $\bar{z}$ of the recognition object's camera point cloud in each dimension of three-dimensional space are first calculated; then, based on these per-dimension means, a homogeneous transformation matrix, for example the pure translation

$$T_0=\begin{bmatrix}1&0&0&-\bar{x}\\0&1&0&-\bar{y}\\0&0&1&-\bar{z}\\0&0&0&1\end{bmatrix},$$

is constructed as the initial value of the iterative closest point algorithm. Because the background desktop occupies a large part of the scene in which the recognition objects are placed, and the proportion of the point cloud corresponding to the background desktop is therefore large, a plane point cloud perpendicular to the z axis is generated; plane registration can then be performed using the iterative closest point algorithm, and the transformation matrix from the camera point cloud of the recognition object to the plane point cloud is calculated. The transformation matrix comprises a translation matrix and a rotation matrix, and the rotation angle corresponding to the rotation matrix can further be determined.

In addition, since a top-down view is assumed by default when the robot grasps an object, the rotation angle of the $(0,0,1)^{T}$ vector should exceed 90 degrees; if the rotation angle of the $(0,0,1)^{T}$ vector does not exceed 90 degrees, the difference between 180 degrees and that rotation angle is taken as the rotation angle of the rotation matrix. Finally, the camera point cloud is converted into the world point cloud through the rotation matrix, that is, the -z axis is made consistent with the direction of gravity.
And converting the camera point cloud data of the camera view angle corresponding to each placing position into world point cloud data, and adding label information to the world point cloud data to construct a point cloud data training set. The tag information may include, for example, the category of the corresponding identifier, and the centroid, length, width, height, and rotation angle about the z-axis of the bounding box corresponding to the simulated location.
The Vote Net network takes the world point cloud as input and outputs the 3D bounding box, confidence and category of the target recognition object in the actual placement scene. Detecting the three-dimensional target with the Vote Net network requires only the coordinate information of the world point cloud, does not depend heavily on the density of the world point cloud, and generalizes well. Although Vote Net has achieved good results in the task of 3D object detection in indoor scenes, what it handles there is real data of large indoor objects. In this specification, Vote Net is trained with simulation data and used to detect world point clouds obtained from real captured data. Since the geometric characteristics of the simulation data and the real captured data do not differ greatly, the embodiment of the invention remains feasible.
The following describes the training of the Vote Net network based on a point cloud data training set.
When the Vote Net network is trained, 2.5D point clouds in the simulation scene are first constructed with densities similar to those of real data, a virtual camera is then used for shooting, world point cloud data are generated from the camera point cloud data obtained by shooting, and the label information of each group of world point cloud data is obtained automatically, which can improve the training speed of the target recognition model. The world point cloud data containing the label information are input into the Vote Net network for training, and the total number of training rounds is determined according to the amount of point cloud data. After the Vote Net network has been trained, three-dimensional target detection is performed on the world point cloud processed by the iterative closest point algorithm, and the 3D bounding box, confidence and category of the target recognition object corresponding to the camera point cloud data can be obtained.
Step 150: and generating the bounding box of the target identification object in the camera coordinate system according to the bounding box of the target identification object in the world coordinate system.
The bounding box of the target identification object in the world coordinate system can be converted into the bounding box of the target identification object in the camera coordinate system according to the rotation matrix. Further, the bounding box matrix of the target identification object under the camera coordinate system can be obtained by right-multiplying the bounding box matrix of the target identification object under the world coordinate system by the rotation matrix.
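A brief sketch of this final conversion is given below. It assumes that the bounding box is represented by its eight corner points stacked as rows and that the rotation matrix is the camera-to-world rotation obtained in step 130; under those assumptions, right-multiplying the corner matrix by that rotation maps the corners back into the camera coordinate system.

```python
import numpy as np

def bbox_world_to_camera(bbox_corners_world, R_cam_to_world):
    """Convert the eight bounding-box corner points from world to camera coordinates.

    bbox_corners_world: (8, 3) corner points in the world coordinate system.
    R_cam_to_world: rotation used earlier to take camera points to world points;
    right-multiplying by it applies its transpose to each corner (world -> camera).
    """
    return bbox_corners_world @ R_cam_to_world
```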
According to the embodiment of the invention, the camera point cloud corresponding to the depth image can be generated from the depth image and the camera internal parameters; after the camera point cloud is converted into the world point cloud, target detection can be performed on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under the world coordinate system; furthermore, the bounding box of the target identification object in the camera coordinate system can be generated according to the bounding box of the target identification object in the world coordinate system, so that the detection of the target identification object is completed. In this way, the embodiment of the invention can generate the bounding box of the target identification object under the camera coordinate system based on the camera point cloud even when the external parameters of the camera are not acquired, and can thus improve the detection accuracy of the target identification object.
Fig. 4 is a flowchart illustrating a three-dimensional object capture method according to another embodiment of the invention, which is performed by an electronic device. The memory of the electronic device is used for storing at least one executable instruction, and the executable instruction enables the processor of the electronic device to execute the operation of the three-dimensional object grabbing method. As shown in fig. 4, the method comprises the steps of:
step 210: and determining the space position of the target recognition object according to the bounding box of the target recognition object under the camera coordinate system.
The space position of the target recognition object can be determined according to the bounding box of the target recognition object under the camera coordinate system. The spatial position of the target recognition object includes the spatial coordinates of the target recognition object and the rotation angle of the target recognition object in the three-dimensional space.
Step 220: and generating a grabbing instruction according to the space position, so that the grabber grabs the target recognition object according to the grabbing instruction.
The grabbing instruction can be generated according to the space position of the target recognition object, and the grabbing instruction is sent to the grabber used for grabbing the target recognition object. The grabber can determine a grabbing path of the target identification object according to the grabbing instruction, and grab the target identification object according to the grabbing path.
According to the embodiment of the invention, the bounding box of the target identification object under the camera coordinate system is generated based on the camera point cloud, the spatial position of the target identification object is determined according to the bounding box of the target identification object under the camera coordinate system, and the grabbing instruction is generated according to the spatial position, so that the grabber can accurately grab the target identification object according to the grabbing instruction.
Fig. 5 shows a schematic structural diagram of a three-dimensional target detection apparatus according to an embodiment of the present invention. As shown in fig. 5, the apparatus 300 includes: an acquisition module 310, a first generation module 320, a conversion module 330, a second generation module 340, and a third generation module 350.
The acquiring module 310 is configured to acquire a depth image including a target recognition object;
a first generating module 320, configured to generate a camera point cloud corresponding to the depth image according to the depth image and the camera internal parameters, where the camera point cloud is a point cloud under a camera coordinate system;
a conversion module 330, configured to convert the camera point cloud into a world point cloud, where the world point cloud is a point cloud under a world coordinate system;
the second generating module 340 is configured to perform target detection on the world point cloud according to a preset target identification model to generate a bounding box of the target identifier in a world coordinate system;
a third generating module 350, configured to generate a bounding box of the target identifier in the camera coordinate system according to the bounding box of the target identifier in the world coordinate system.
In an alternative manner, the conversion module 330 is configured to:
registering the camera point cloud with a preset plane point cloud to generate a transformation matrix from a camera coordinate system to a world coordinate system;
converting the camera point cloud to a world point cloud according to the transformation matrix.
In an alternative manner, the conversion module 330 is configured to:
respectively calculating the mean values of the camera point clouds in three dimensions;
constructing a homogeneous transformation matrix according to the mean value, and setting the homogeneous transformation matrix as an initial value of an iterative closest point algorithm;
and generating a transformation matrix from a camera coordinate system to a world coordinate system according to the iterative closest point algorithm and the plane point cloud perpendicular to the gravity axis.
In an alternative manner, the conversion module 330 is configured to:
determining a rotation matrix corresponding to the transformation matrix;
if the rotation angle corresponding to the rotation matrix is larger than 90 degrees, generating a world point cloud according to the rotation matrix and the camera point cloud;
and if the rotation angle corresponding to the rotation matrix is not more than 90 degrees, generating the world point cloud according to the supplementary-angle rotation amount corresponding to the rotation matrix and the camera point cloud.
In an optional manner, the apparatus 300 further comprises a training module for:
constructing a point cloud data training set, wherein the point cloud data training set comprises a plurality of groups of world point cloud data and label information corresponding to each group of world point cloud data;
and training a preset target recognition algorithm by using the point cloud data training set to generate the target recognition model.
In an alternative, the training module is configured to:
constructing a three-dimensional model library, wherein the three-dimensional model library comprises three-dimensional models of a plurality of identification objects;
after each identification object is placed in a world coordinate system, calculating an initial value of a bounding box of each identification object;
carrying out simulation arrangement on each identification object, and calculating a bounding box simulation value of each identification object at a simulation position;
randomly generating a camera view angle, and rendering based on the camera view angle to generate camera point cloud data of each identification object;
converting the camera point cloud data of each identification object into corresponding world point cloud data;
and adding label information to the corresponding world point cloud data.
According to the embodiment of the invention, the camera point cloud corresponding to the depth image can be generated from the depth image and the camera internal parameters; after the camera point cloud is converted into the world point cloud, target detection can be performed on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under the world coordinate system; furthermore, the bounding box of the target identification object in the camera coordinate system can be generated according to the bounding box of the target identification object in the world coordinate system, so that the detection of the target identification object is completed. In this way, the embodiment of the invention can generate the bounding box of the target identification object under the camera coordinate system based on the camera point cloud even when the external parameters of the camera are not acquired, and can thus improve the detection accuracy of the target identification object.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and a specific implementation of the electronic device is not limited in the specific embodiment of the present invention.
As shown in fig. 6, the electronic device may include: a processor (processor)402, a Communications Interface 404, a memory 406, and a Communications bus 408.
Wherein: the processor 402, communication interface 404, and memory 406 communicate with each other via a communication bus 408. A communication interface 404 for communicating with network elements of other devices, such as clients or other servers. The processor 402 is configured to execute the program 410, and may specifically perform the relevant steps in the embodiment of the three-dimensional object detection method described above.
In particular, program 410 may include program code comprising computer-executable instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The electronic device comprises one or more processors, which may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
And a memory 406 for storing a program 410. The memory 406 may comprise high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
The program 410 may specifically be invoked by the processor 402 to cause the electronic device to perform the following operations:
acquiring a depth image containing a target recognition object;
generating a camera point cloud corresponding to the depth image according to the depth image and the camera internal parameters, wherein the camera point cloud is a point cloud under a camera coordinate system;
converting the camera point cloud into a world point cloud, wherein the world point cloud is a point cloud under a world coordinate system;
carrying out target detection on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under a world coordinate system;
and generating the bounding box of the target identification object in the camera coordinate system according to the bounding box of the target identification object in the world coordinate system.
In an alternative, the program 410 is invoked by the processor 402 to cause the electronic device to perform the following operations:
registering the camera point cloud with a preset plane point cloud to generate a transformation matrix from a camera coordinate system to a world coordinate system;
converting the camera point cloud to a world point cloud according to the transformation matrix.
In an alternative, the program 410 is invoked by the processor 402 to cause the electronic device to perform the following operations:
respectively calculating the mean values of the camera point clouds in three dimensions;
constructing a homogeneous transformation matrix according to the mean value, and setting the homogeneous transformation matrix as an initial value of an iterative closest point algorithm;
and generating a transformation matrix from a camera coordinate system to a world coordinate system according to the iterative closest point algorithm and the plane point cloud perpendicular to the gravity axis.
In an alternative, the program 410 is invoked by the processor 402 to cause the electronic device to perform the following operations:
determining a rotation matrix corresponding to the transformation matrix;
if the rotation angle corresponding to the rotation matrix is larger than 90 degrees, generating a world point cloud according to the rotation matrix and the camera point cloud;
and if the rotation angle corresponding to the rotation matrix is not more than 90 degrees, generating the world point cloud according to the supplementary-angle rotation amount corresponding to the rotation matrix and the camera point cloud.
In an alternative, the program 410 is invoked by the processor 402 to cause the electronic device to perform the following operations:
constructing a point cloud data training set, wherein the point cloud data training set comprises a plurality of groups of world point cloud data and label information corresponding to each group of world point cloud data;
and training a preset target recognition algorithm by using the point cloud data training set to generate the target recognition model.
In an alternative, the program 410 is invoked by the processor 402 to cause the electronic device to perform the following operations:
constructing a three-dimensional model library, wherein the three-dimensional model library comprises three-dimensional models of a plurality of identification objects;
after each identification object is placed in a world coordinate system, calculating an initial value of a bounding box of each identification object;
carrying out simulation arrangement on each identification object, and calculating a bounding box simulation value of each identification object at a simulation position;
randomly generating a camera view angle, and rendering based on the camera view angle to generate camera point cloud data of each identification object;
converting the camera point cloud data of each identification object into corresponding world point cloud data;
and adding label information to the corresponding world point cloud data.
In an alternative, the program 410 is invoked by the processor 402 to cause the electronic device to perform the following operations:
determining the space position of the target recognition object according to the bounding box of the target recognition object under the camera coordinate system;
and generating a grabbing instruction according to the space position, so that the grabber grabs the target recognition object according to the grabbing instruction.
According to the embodiment of the invention, the camera point cloud corresponding to the depth image can be generated from the depth image and the camera internal parameters; after the camera point cloud is converted into the world point cloud, target detection can be performed on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under the world coordinate system; furthermore, the bounding box of the target identification object in the camera coordinate system can be generated according to the bounding box of the target identification object in the world coordinate system, so that the detection of the target identification object is completed. In this way, the embodiment of the invention can generate the bounding box of the target identification object under the camera coordinate system based on the camera point cloud even when the external parameters of the camera are not acquired, and can thus improve the detection accuracy of the target identification object.
An embodiment of the present invention provides a computer-readable storage medium, where the storage medium stores at least one executable instruction, and when the executable instruction is executed on an electronic device, the electronic device is enabled to execute a three-dimensional object detection method in any method embodiment described above.
The embodiment of the invention provides a three-dimensional target detection device, which is used for executing the three-dimensional target detection method.
Embodiments of the present invention provide a computer program, which can be called by a processor to enable an electronic device to execute a three-dimensional object detection method in any of the above method embodiments.
Embodiments of the present invention provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions that, when run on a computer, cause the computer to perform the three-dimensional object detection method in any of the above-mentioned method embodiments.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second, third and so on does not indicate any ordering; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (10)

1. A method for three-dimensional object detection, the method comprising:
acquiring a depth image containing a target recognition object;
generating a camera point cloud corresponding to the depth image according to the depth image and the camera internal parameters, wherein the camera point cloud is a point cloud under a camera coordinate system;
converting the camera point cloud into a world point cloud, wherein the world point cloud is a point cloud under a world coordinate system;
carrying out target detection on the world point cloud according to a preset target identification model so as to generate a bounding box of the target identification object under a world coordinate system;
and generating the bounding box of the target identification object in the camera coordinate system according to the bounding box of the target identification object in the world coordinate system.
2. The method of claim 1, wherein the converting the camera point cloud to a world point cloud comprises:
registering the camera point cloud with a preset plane point cloud to generate a transformation matrix from a camera coordinate system to a world coordinate system;
converting the camera point cloud to a world point cloud according to the transformation matrix.
3. The method of claim 2, wherein registering the camera point cloud with a pre-defined planar point cloud to generate a transformation matrix of a camera coordinate system to a world coordinate system comprises:
respectively calculating the mean values of the camera point clouds in three dimensions;
constructing a homogeneous transformation matrix according to the mean value, and setting the homogeneous transformation matrix as an initial value of an iterative closest point algorithm;
and generating a transformation matrix from a camera coordinate system to a world coordinate system according to the iterative closest point algorithm and the plane point cloud perpendicular to the gravity axis.
4. The method of claim 2 or 3, wherein the converting the camera point cloud into the world point cloud according to the transformation matrix comprises:
determining a rotation matrix corresponding to the transformation matrix;
if the rotation angle corresponding to the rotation matrix is larger than 90 degrees, generating the world point cloud according to the rotation matrix and the camera point cloud;
and if the rotation angle corresponding to the rotation matrix is not larger than 90 degrees, generating the world point cloud according to the complementary-angle rotation amount corresponding to the rotation matrix and the camera point cloud.
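The branch above is driven by the rotation angle encoded in the rotation matrix, which can be read off as arccos((trace(R) − 1)/2). The sketch below only extracts that angle and applies a homogeneous transform to a point array; how the complementary-angle rotation amount is constructed is left open by the claim and is deliberately not filled in here.

```python
import numpy as np

def rotation_angle_deg(transform):
    """Rotation angle of the 3x3 rotation block R of a 4x4 homogeneous transform,
    computed as arccos((trace(R) - 1) / 2) and clipped for numerical safety."""
    R = transform[:3, :3]
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))

def transform_points(transform, points):
    """Apply a 4x4 homogeneous transform to an (N, 3) array of points."""
    R, t = transform[:3, :3], transform[:3, 3]
    return points @ R.T + t
```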
5. The method of claim 1, further comprising:
constructing a point cloud data training set, wherein the point cloud data training set comprises a plurality of groups of world point cloud data and label information corresponding to each group of world point cloud data;
and training a preset target recognition algorithm by using the point cloud data training set to generate the target recognition model.
6. The method of claim 5, wherein the constructing a training set of point cloud data comprises:
constructing a three-dimensional model library, wherein the three-dimensional model library comprises three-dimensional models of a plurality of recognition objects;
after each recognition object is placed in the world coordinate system, calculating an initial bounding box value of each recognition object;
performing simulated placement of each recognition object, and calculating a simulated bounding box value of each recognition object at its simulated position;
randomly generating a camera viewing angle, and rendering based on the camera viewing angle to generate camera point cloud data of each recognition object;
converting the camera point cloud data of each recognition object into corresponding world point cloud data;
and adding label information to the corresponding world point cloud data.
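As an illustrative sketch of the labelling step, one group of world point cloud data and its label information could be packaged as follows; the axis-aligned (centre, size) box parameterisation and the dictionary layout are assumptions, since the claim does not prescribe a particular label format.

```python
import numpy as np

def make_training_sample(world_points, class_id):
    """Package one group of world point cloud data with its label information.
    The axis-aligned (centre, size) box and the dict layout are illustrative
    assumptions only."""
    lo, hi = world_points.min(axis=0), world_points.max(axis=0)
    return {
        "points": world_points.astype(np.float32),   # world point cloud data
        "bbox_center": (lo + hi) / 2.0,              # bounding box label
        "bbox_size": hi - lo,
        "class_id": class_id,                        # which recognition object this is
    }
```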
7. A three-dimensional object grasping method comprising the three-dimensional object detection method according to any one of claims 1 to 6, the three-dimensional object grasping method further comprising:
determining the spatial position of the target recognition object according to the bounding box of the target recognition object in the camera coordinate system;
and generating a grasping instruction according to the spatial position, so that a gripper grasps the target recognition object according to the grasping instruction.
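A minimal sketch of these two steps, assuming the bounding box is available as an array of its eight corners in the camera coordinate system and that a simple dictionary stands in for the grasping instruction; both assumptions are illustrative.

```python
import numpy as np

def grasp_instruction_from_bbox(bbox_corners_cam):
    """Take the eight bounding box corners in the camera coordinate system,
    use their centroid as the spatial position of the target recognition object,
    and wrap it in a simple grasping instruction (format is an assumption)."""
    position = bbox_corners_cam.mean(axis=0)
    return {"command": "grasp", "position": position.tolist()}
```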
8. A three-dimensional object detection apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a depth image containing a target recognition object;
the first generation module is used for generating a camera point cloud corresponding to the depth image according to the depth image and camera internal parameters, wherein the camera point cloud is a point cloud in the camera coordinate system;
the conversion module is used for converting the camera point cloud into a world point cloud, wherein the world point cloud is a point cloud in the world coordinate system;
the second generation module is used for performing target detection on the world point cloud according to a preset target recognition model to generate a bounding box of the target recognition object in the world coordinate system;
and the third generation module is used for generating the bounding box of the target recognition object in the camera coordinate system according to the bounding box of the target recognition object in the world coordinate system.
9. An electronic device, comprising a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations of the three-dimensional object detection method according to any one of claims 1 to 6 or the three-dimensional object grasping method according to claim 7.
10. A computer-readable storage medium having stored therein at least one executable instruction which, when run on an electronic device, causes the electronic device to perform the operations of the three-dimensional object detection method according to any one of claims 1 to 6 or the three-dimensional object grasping method according to claim 7.
CN202110473106.3A 2021-04-29 2021-04-29 Three-dimensional target detection method, three-dimensional target capture device and electronic equipment Active CN113223091B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110473106.3A CN113223091B (en) 2021-04-29 2021-04-29 Three-dimensional target detection method, three-dimensional target capture device and electronic equipment
PCT/CN2021/143443 WO2022227678A1 (en) 2021-04-29 2021-12-30 Three-dimensional target detection method and grabbing method, apparatus, and electronic device

Publications (2)

Publication Number Publication Date
CN113223091A true CN113223091A (en) 2021-08-06
CN113223091B CN113223091B (en) 2023-01-24

Family

ID=77090035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110473106.3A Active CN113223091B (en) 2021-04-29 2021-04-29 Three-dimensional target detection method, three-dimensional target capture device and electronic equipment

Country Status (2)

Country Link
CN (1) CN113223091B (en)
WO (1) WO2022227678A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116330306B (en) * 2023-05-31 2023-08-15 之江实验室 Object grabbing method and device, storage medium and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614889B (en) * 2018-11-23 2020-09-18 华为技术有限公司 Object detection method, related device and computer storage medium
CN111986232B (en) * 2020-08-13 2021-09-14 上海高仙自动化科技发展有限公司 Target object detection method, target object detection device, robot and storage medium
CN113223091B (en) * 2021-04-29 2023-01-24 达闼机器人股份有限公司 Three-dimensional target detection method, three-dimensional target capture device and electronic equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180157920A1 (en) * 2016-12-01 2018-06-07 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for recognizing obstacle of vehicle
CN108986161A (en) * 2018-06-19 2018-12-11 亮风台(上海)信息科技有限公司 A kind of three dimensional space coordinate estimation method, device, terminal and storage medium
WO2020151109A1 (en) * 2019-01-22 2020-07-30 中国科学院自动化研究所 Three-dimensional target detection method and system based on point cloud weighted channel feature
CN110344621A (en) * 2019-06-13 2019-10-18 武汉大学 A kind of wheel points cloud detection method of optic towards intelligent garage
WO2021027710A1 (en) * 2019-08-12 2021-02-18 阿里巴巴集团控股有限公司 Method, device, and equipment for object detection
CN111950426A (en) * 2020-08-06 2020-11-17 东软睿驰汽车技术(沈阳)有限公司 Target detection method and device and delivery vehicle
CN112200851A (en) * 2020-12-09 2021-01-08 北京云测信息技术有限公司 Point cloud-based target detection method and device and electronic equipment thereof

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022227678A1 (en) * 2021-04-29 2022-11-03 达闼机器人股份有限公司 Three-dimensional target detection method and grabbing method, apparatus, and electronic device
CN115222799A (en) * 2021-08-12 2022-10-21 达闼机器人股份有限公司 Method and device for acquiring image gravity direction, electronic equipment and storage medium
CN113689351A (en) * 2021-08-24 2021-11-23 北京石油化工学院 Dangerous chemical storage monitoring method, device and equipment based on depth camera
CN113689351B (en) * 2021-08-24 2023-10-10 北京石油化工学院 Dangerous chemical storage monitoring method, device and equipment based on depth camera
CN114627239A (en) * 2022-03-04 2022-06-14 北京百度网讯科技有限公司 Bounding box generation method, device, equipment and storage medium
CN114627239B (en) * 2022-03-04 2024-04-30 北京百度网讯科技有限公司 Bounding box generation method, device, equipment and storage medium
CN114754779A (en) * 2022-04-27 2022-07-15 镁佳(北京)科技有限公司 Positioning and mapping method and device and electronic equipment
CN114643588A (en) * 2022-05-19 2022-06-21 睿驰(深圳)智能有限公司 Control method, system and medium for autonomous mobile disinfection robot
CN115272791A (en) * 2022-07-22 2022-11-01 仲恺农业工程学院 Multi-target detection positioning method for tea based on YoloV5
CN117689678A (en) * 2024-02-04 2024-03-12 法奥意威(苏州)机器人***有限公司 Workpiece weld joint identification method, device, equipment and storage medium
CN117689678B (en) * 2024-02-04 2024-05-28 法奥意威(苏州)机器人***有限公司 Workpiece weld joint identification method, device, equipment and storage medium

Also Published As

Publication number Publication date
WO2022227678A1 (en) 2022-11-03
CN113223091B (en) 2023-01-24

Similar Documents

Publication Publication Date Title
CN113223091B (en) Three-dimensional target detection method, three-dimensional target capture device and electronic equipment
CN107063228B (en) Target attitude calculation method based on binocular vision
CN108898676B (en) Method and system for detecting collision and shielding between virtual and real objects
CN111443359B (en) Positioning method, device and equipment
JP4865557B2 (en) Computer vision system for classification and spatial localization of bounded 3D objects
CN108401461A (en) Three-dimensional mapping method, device and system, cloud platform, electronic equipment and computer program product
CN111738261A (en) Pose estimation and correction-based disordered target grabbing method for single-image robot
JP2016161569A (en) Method and system for obtaining 3d pose of object and 3d location of landmark point of object
CN109887030A (en) Texture-free metal parts image position and posture detection method based on the sparse template of CAD
CN112070782B (en) Method, device, computer readable medium and electronic equipment for identifying scene contour
CN111079565B (en) Construction method and identification method of view two-dimensional attitude template and positioning grabbing system
CN113927597B (en) Robot connecting piece six-degree-of-freedom pose estimation system based on deep learning
CN112509036B (en) Pose estimation network training and positioning method, device, equipment and storage medium
CN111754579A (en) Method and device for determining external parameters of multi-view camera
CN107680125A (en) The system and method that three-dimensional alignment algorithm is automatically selected in vision system
CN112348890B (en) Space positioning method, device and computer readable storage medium
CN114092553A Disordered grabbing attitude estimation method based on an improved FPFH and ICP algorithm
CN111415420A (en) Spatial information determination method and device and electronic equipment
CN116309882A (en) Tray detection and positioning method and system for unmanned forklift application
CN114882085B (en) Three-dimensional point cloud registration method and system based on single cube
CN115284279A (en) Mechanical arm grabbing method and device based on aliasing workpiece and readable medium
CN112668596A (en) Three-dimensional object recognition method and device and recognition model training method and device
CN115063485A (en) Three-dimensional reconstruction method, device and computer-readable storage medium
CN114820899A (en) Attitude estimation method and device based on multi-view rendering
CN114653629A (en) Sorting method based on visual identification, intelligent sorting system and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 201111 Building 8, No. 207, Zhongqing Road, Minhang District, Shanghai
Applicant after: Dayu robot Co.,Ltd.
Address before: 200000 second floor, building 2, no.1508, Kunyang Road, Minhang District, Shanghai
Applicant before: Dalu Robot Co.,Ltd.

GR01 Patent grant