CN114972495A - Grabbing method and device for object with pure plane structure and computing equipment - Google Patents

Grabbing method and device for object with pure plane structure and computing equipment

Info

Publication number
CN114972495A
Authority
CN
China
Prior art keywords
point
point cloud
edge
points
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110217389.5A
Other languages
Chinese (zh)
Inventor
魏海永
盛文波
刘迪一
丁有爽
邵天兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mech Mind Robotics Technologies Co Ltd
Original Assignee
Mech Mind Robotics Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mech Mind Robotics Technologies Co Ltd filed Critical Mech Mind Robotics Technologies Co Ltd
Priority to CN202110217389.5A
Publication of CN114972495A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/0014 Image feed-back for automatic industrial control, e.g. robot with camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection
    • G06T 2207/30164 Workpiece; Machine component
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Robotics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a grabbing method and apparatus for objects with a pure plane structure, and a computing device. The method comprises: acquiring the point clouds corresponding to a plurality of objects in the current scene; for each object, performing edge extraction on the point cloud corresponding to the object to obtain the edge point cloud corresponding to the object, and calculating the tangent vectors of the 3D points in the edge point cloud; for any two 3D points in the edge point cloud, constructing a point pair comprising the two 3D points, and generating a point pair feature vector of the point pair according to the tangent vectors of the two 3D points; and matching the edge point cloud corresponding to the object with a preset template point cloud according to the point pair feature vectors of the object's point pairs to obtain the pose information of the object. In this scheme, the point pair feature vectors are generated from the tangent vectors of the 3D points in the edge point cloud, and the edge point cloud is matched with the preset template point cloud according to these feature vectors, so that the pose information of the object is identified quickly and accurately.

Description

Grabbing method and device for object with pure plane structure and computing equipment
Technical Field
The invention relates to the technical field of computers, and in particular to a grabbing method and apparatus for objects with a pure plane structure, and a computing device.
Background
With the development of industrial intelligence, it is increasingly common for robots to manipulate objects (e.g., industrial parts, boxes, etc.) in place of humans. During robot operation it is generally necessary to grasp an object, move it from one location and place it at another, for example grasping the object from a conveyor belt and placing it on a pallet or in a cage car, or grasping the object from a pallet and placing it on a conveyor belt or another pallet as required. However, in the prior art the pose information of the object to be grabbed is identified inaccurately and inefficiently, which makes it difficult to meet the requirements of high-speed industrial automation.
Disclosure of Invention
In view of the above, the present invention is proposed to provide a grabbing method and apparatus for objects with a pure plane structure, and a computing device, which overcome the above problems or at least partially solve them.
According to one aspect of the invention, there is provided a method of grabbing an object in a purely planar configuration, the method comprising:
acquiring point clouds corresponding to a plurality of objects in a current scene; wherein the plurality of objects have a pure planar structure;
performing edge extraction on the point cloud corresponding to each object to obtain an edge point cloud corresponding to the object, and calculating tangent vectors of all 3D points in the edge point cloud;
for any two 3D points in the edge point cloud, constructing a point pair comprising the two 3D points, and generating a point pair characteristic vector of the point pair according to tangent vectors of the two 3D points;
and matching the edge point cloud corresponding to the object with a preset template point cloud according to the point pair characteristic vector of each point pair of the object to obtain the pose information of the object so that the robot can execute grabbing operation according to the pose information of the object.
According to another aspect of the present invention, there is provided a gripping device for objects of a purely planar construction, the device comprising:
the first acquisition module is suitable for acquiring point clouds corresponding to a plurality of objects in a current scene; wherein the plurality of objects have a pure planar structure;
the edge extraction module is suitable for extracting the edge of the point cloud corresponding to each object to obtain the edge point cloud corresponding to the object and calculating tangent vectors of all 3D points in the edge point cloud;
the point pair construction module is suitable for constructing, for any two 3D points in the edge point cloud, a point pair comprising the two 3D points, and generating a point pair characteristic vector of the point pair according to tangent vectors of the two 3D points;
and the matching module is suitable for matching the edge point cloud corresponding to the object with the preset template point cloud according to the point pair characteristic vector of each point pair of the object to obtain the pose information of the object so that the robot can execute grabbing operation according to the pose information of the object.
According to yet another aspect of the present invention, there is provided a computing device comprising a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the above grabbing method for objects with a pure plane structure.
According to a further aspect of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the above-mentioned grabbing method for an object having a purely planar structure.
According to the technical solution provided by the invention, the characteristics of the pure plane structure are fully exploited: the edge point cloud is extracted from the point cloud corresponding to the object, the point pair feature vectors are generated according to the tangent vectors of the 3D points in the edge point cloud, and the shape and structural characteristics of the object are accurately reflected by these feature vectors. The edge point cloud is matched with the preset template point cloud according to the point pair feature vectors, so that the pose information of the object is accurately identified and the robot can grab the object accurately and firmly according to that pose information, avoiding grabbing errors such as the robot failing to grab the object or the object dropping after being grabbed. In addition, in this scheme only the edge point cloud corresponding to the object is extracted to participate in the matching process, while the non-edge points do not participate, which effectively reduces the matching workload and improves the efficiency of identifying the pose information of the object.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 shows a schematic flow diagram of a grabbing method for an object of a purely planar structure according to one embodiment of the present invention;
fig. 2 shows a schematic flow diagram of a method for grabbing an object in a purely planar configuration according to another embodiment of the present invention;
FIG. 3 shows a block diagram of a grasping apparatus for an object of a pure plane structure according to an embodiment of the present invention;
FIG. 4 shows a schematic structural diagram of a computing device according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 shows a schematic flow diagram of a grabbing method for an object with a pure plane structure according to an embodiment of the present invention, as shown in fig. 1, the grabbing method includes the following steps:
step S101, point clouds corresponding to a plurality of objects in the current scene are obtained.
The current scene contains a plurality of objects, and the plurality of objects have a pure plane structure; for example, the objects may be rectangular parallelepiped boxes. In step S101, the point clouds corresponding to the plurality of objects in the current scene, obtained by pre-processing, may be acquired. A point cloud includes the pose information of each 3D point, which may specifically include the coordinate values of the 3D point on the three XYZ axes of space and the information of the 3D point along the XYZ axis directions. After the point clouds corresponding to the plurality of objects are acquired, each object in the current scene is processed in the manner of steps S102 to S104.
Step S102, for each object, performing edge extraction on the point cloud corresponding to the object to obtain an edge point cloud corresponding to the object, and calculating tangent vectors of all 3D points in the edge point cloud.
For each object, the edge point cloud corresponding to the object can be extracted from the point cloud corresponding to the object based on a 3D approach, a 2D projection approach, or the like. During the invention process, the inventors carefully analyzed the edge point clouds corresponding to objects with a pure plane structure and found that, when objects of different shapes lie flat, the normal vectors of the 3D points in their edge point clouds are all the same (the normals point vertically upward), whereas their tangent vectors differ; the tangent vectors can therefore be used to reflect the shape and structural characteristics of the object. After the edge point cloud corresponding to the object is extracted, the tangent vector of each 3D point in the edge point cloud can be calculated using an existing vector calculation algorithm for computing tangent vectors in three-dimensional space, so that point pair feature vectors can later be generated from these tangent vectors. Those skilled in the art may select the vector calculation algorithm according to actual needs, which is not specifically limited here.
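The patent does not prescribe a particular tangent vector algorithm, so the following Python sketch is illustrative only: it estimates a tangent for each edge 3D point as the dominant direction of its nearest edge neighbours; the neighbour count k and the use of NumPy/SciPy are assumptions, not part of the disclosed method.

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_tangents(edge_points: np.ndarray, k: int = 10) -> np.ndarray:
    """Estimate a unit tangent vector for every 3D point in an (N, 3) edge point cloud."""
    tree = cKDTree(edge_points)
    tangents = np.zeros_like(edge_points)
    for i, p in enumerate(edge_points):
        # k + 1 neighbours because the query point is returned as its own neighbour.
        _, idx = tree.query(p, k=min(k + 1, len(edge_points)))
        neighbours = edge_points[idx]
        centred = neighbours - neighbours.mean(axis=0)
        # The principal direction of the local neighbourhood approximates the edge tangent.
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        tangents[i] = vt[0] / np.linalg.norm(vt[0])
    return tangents
```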
Step S103, for any two 3D points in the edge point cloud, a point pair comprising the two 3D points is constructed, and a point pair characteristic vector of the point pair is generated according to tangent vectors of the two 3D points.
The edge point cloud corresponding to the object comprises the pose information of each 3D point. For any two 3D points in the edge point cloud, a point pair comprising the two 3D points is constructed, and a point pair feature vector of the point pair is generated according to the pose information of the two 3D points and their tangent vectors; the object therefore corresponds to a plurality of point pairs. Specifically, the point pair feature vector may include the Euclidean distance of the connecting line vector between the two 3D points, the angle between the tangent vector of each of the two 3D points and the connecting line vector, and the angle between the tangent vectors of the two 3D points. Those skilled in the art may set the point pair feature vector to include other contents according to actual needs, which is not limited here.
And step S104, matching the edge point cloud corresponding to the object with a preset template point cloud according to the point pair feature vector of each point pair of the object to obtain the pose information of the object, so that the robot can execute grabbing operation according to the pose information of the object.
In order to identify the pose information of each object in the scene image conveniently and accurately, a template library containing a number of preset template point clouds is constructed in advance, where a preset template point cloud is a predetermined point cloud corresponding to a known object that serves as a matching reference. For each object in the current scene, the edge point cloud corresponding to the object is matched with a preset template point cloud according to the point pair feature vectors of the object's point pairs, thereby obtaining the pose information of the object; the pose information of the object may specifically include the coordinate values of the object's center on the three XYZ axes of space and the information of the object itself along the XYZ axis directions. After the pose information of the object is obtained, it can be transmitted to the robot, so that the robot can grab the object according to that pose information.
According to the grabbing method for objects with a pure plane structure provided by this embodiment, the characteristics of the pure plane structure are fully exploited: the edge point cloud is extracted from the point cloud corresponding to the object, the point pair feature vectors are generated according to the tangent vectors of the 3D points in the edge point cloud, and the shape and structural characteristics of the object are accurately reflected by these feature vectors. The edge point cloud is matched with the preset template point cloud according to the point pair feature vectors, so that the pose information of the object is accurately identified and the robot can grab the object accurately and firmly according to that pose information, avoiding grabbing errors such as the robot failing to grab the object or the object dropping after being grabbed. In addition, in this scheme only the edge point cloud corresponding to the object is extracted to participate in the matching process, while the non-edge points do not participate, which effectively reduces the matching workload and improves the efficiency of identifying the pose information of the object.
Fig. 2 shows a schematic flow diagram of a grabbing method for an object with a pure plane structure according to another embodiment of the present invention, as shown in fig. 2, the method includes the following steps:
step S201, obtaining a scene image of a current scene and a point cloud corresponding to the scene image, inputting the scene image into a trained deep learning segmentation model for instance segmentation processing, and obtaining segmentation results of each object in the scene image.
The current scene comprises a plurality of objects with a pure plane structure. A scene image and a depth image of the current scene are acquired by a camera arranged above the scene. The camera may specifically be a 3D camera, which may be arranged above the scene, for example directly above or obliquely above it, and is configured to capture the current scene within its field of view at one time to obtain the scene image and the depth image. Specifically, the 3D camera may include elements such as visible-light detectors (e.g., laser detectors and LEDs), infrared detectors and/or radar detectors, which detect the current scene to obtain the depth image. The scene image may specifically be an RGB image, and the pixels of the scene image and the depth image correspond one to one. By processing the scene image and the depth image, the point cloud corresponding to the scene image can be conveniently obtained. In step S201, the scene image of the current scene captured by the camera and the point cloud corresponding to the scene image, obtained by processing the scene image and the depth image, may be acquired.
In order to segment each object contained in a scene image conveniently and accurately, sample scene images can be collected in advance to construct a training sample set, and a deep learning algorithm is used to train on the sample scene images in the training sample set, finally yielding the deep learning segmentation model.
Step S202, determining the point clouds corresponding to the objects according to the point clouds corresponding to the scene images and the segmentation results of the objects.
The segmentation result of each object may include a binarized segmented image of each object, and for each object, an object region where the object is located and a non-object region other than the object region may be included in the binarized segmented image of the object, where the object region may be specifically represented by a white region and the non-object region may be specifically represented by a black region.
Then, for each object, the point cloud corresponding to the scene image may be projected into the binarized segmented image of that object, and the 3D points in the point cloud corresponding to the scene image that project into the object region of the binarized segmented image are taken as the 3D points corresponding to the object, thereby obtaining the point cloud corresponding to the object. Specifically, all 3D points in the point cloud corresponding to the scene image are projected; if a 3D point falls into the white object region after projection, it is considered to belong to the object, that is, it is a 3D point corresponding to the object. All 3D points corresponding to the object are collected to obtain the point cloud corresponding to the object. In this way, the point cloud corresponding to the object is determined accurately.
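A minimal Python sketch of this projection step is given below; it assumes a pinhole camera model with a known intrinsic matrix K, which the patent does not specify, and keeps only the 3D points whose projections fall on the white object region of the binarized segmented image.

```python
import numpy as np

def object_point_cloud(scene_points: np.ndarray,  # (N, 3) scene point cloud, camera frame
                       mask: np.ndarray,          # (H, W) binarized segmented image, >0 = object region
                       K: np.ndarray) -> np.ndarray:  # (3, 3) camera intrinsic matrix
    """Keep the scene 3D points whose projection falls on the object region of the mask."""
    z = scene_points[:, 2]
    pts = scene_points[z > 1e-6]                  # discard points with no usable depth
    # Pinhole projection: u = fx * X / Z + cx, v = fy * Y / Z + cy
    u = np.round(K[0, 0] * pts[:, 0] / pts[:, 2] + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts[:, 1] / pts[:, 2] + K[1, 2]).astype(int)
    h, w = mask.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    pts, u, v = pts[inside], u[inside], v[inside]
    return pts[mask[v, u] > 0]                    # white (object) region only
```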
Step S203, point clouds corresponding to a plurality of objects in the current scene are acquired.
Step S204, for each object, performing edge extraction on the point cloud corresponding to the object to obtain an edge point cloud corresponding to the object, and calculating tangent vectors of all 3D points in the edge point cloud.
In an alternative embodiment, the edge point cloud may be extracted based on a 3D approach. Specifically, for each 3D point in the point cloud corresponding to the object, adjacent 3D points located in a preset neighborhood region of the 3D point are searched for in the point cloud to obtain a point set comprising the 3D point and its adjacent 3D points. For example, a neighborhood radius (e.g., 1 cm or 0.3 cm) may be set, and the region centered on the 3D point with that radius is taken as the preset neighborhood region of the 3D point; 3D points located in this preset neighborhood region are then searched for in the point cloud corresponding to the object, the found points are called adjacent 3D points, and the 3D point together with all its adjacent 3D points forms the point set. After the point set is obtained, a normal vector can be calculated from the points in the point set using an existing normal vector calculation method, and the plane where the normal vector is located is set for that normal vector.
Then, connecting lines between the 3D point and each adjacent 3D point in the point set are constructed, and the first included angle between the projection line of each connecting line in the plane where the normal vector is located and a specified reference direction axis is calculated. For example, each connecting line is projected onto the plane where the normal vector is located to obtain the projection line corresponding to that connecting line (connecting lines and projection lines correspond one to one), and the included angle between each projection line and the reference direction axis specified in that plane (for example, the X axis) is calculated. In this embodiment, to distinguish these angles from the angle between the tangent vector of each of any two 3D points in the edge point cloud and the connecting line vector, and from the angle between the tangent vectors of any two 3D points in the edge point cloud, the angle between a projection line and the specified reference direction axis in the plane where the normal vector is located is referred to as the first included angle, the angle between the tangent vector of a 3D point and the connecting line vector is referred to as the second included angle, and the angle between the tangent vectors of two 3D points is referred to as the third included angle.
After the first included angles corresponding to the projection lines are obtained, the projection lines can be sorted according to their first included angles, for example clockwise, counterclockwise, in descending order of the first included angle, or in ascending order of the first included angle. The differences between the first included angles of adjacent projection lines in the sorted order are calculated, the maximum angle difference is selected from them, and it is judged whether this maximum angle difference is larger than a preset angle difference threshold. If yes, the 3D point is determined to be an edge point, which indicates that at least one region around the 3D point contains no other 3D points; if not, the 3D point is determined to be a non-edge point, which indicates that there are 3D points all around it. Each 3D point in the point cloud corresponding to the object is processed in this way, the maximum angle difference being compared with the preset angle difference threshold to determine whether the 3D point is an edge point, and all edge points of the object are collected to obtain the edge point cloud corresponding to the object. The preset angle difference threshold can be set by those skilled in the art according to actual needs and is not specifically limited here.
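The following Python sketch illustrates the 3D edge-extraction criterion just described; the neighborhood radius, the angle-gap threshold, and the PCA-based normal estimation are illustrative assumptions rather than values fixed by the patent.

```python
import numpy as np
from scipy.spatial import cKDTree

def extract_edge_points(points: np.ndarray, radius: float = 0.01,
                        gap_threshold_deg: float = 90.0) -> np.ndarray:
    """Mark a 3D point as an edge point when the largest angular gap between its
    neighbours (projected into the plane of the local normal vector) exceeds a threshold."""
    tree = cKDTree(points)
    is_edge = np.zeros(len(points), dtype=bool)
    for i, p in enumerate(points):
        idx = [j for j in tree.query_ball_point(p, radius) if j != i]
        if len(idx) < 3:
            is_edge[i] = True                      # too few neighbours: treat as an edge point
            continue
        neighbours = points[idx]
        point_set = np.vstack([neighbours, p])
        centred = point_set - point_set.mean(axis=0)
        # PCA of the point set: the last right singular vector is the normal vector,
        # the first two span the plane where the normal vector is located.
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        x_axis, y_axis, normal = vt[0], vt[1], vt[2]   # x_axis = assumed reference direction axis
        # Project the connecting lines onto that plane and compute the first included angles.
        lines = neighbours - p
        proj = lines - np.outer(lines @ normal, normal)
        angles = np.sort(np.arctan2(proj @ y_axis, proj @ x_axis))
        gaps = np.diff(angles)
        wrap_gap = 2.0 * np.pi - (angles[-1] - angles[0])  # gap across the angular wrap-around
        is_edge[i] = max(gaps.max(), wrap_gap) > np.deg2rad(gap_threshold_deg)
    return points[is_edge]
```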
In another alternative embodiment, the edge point cloud may also be extracted based on a 2D projection approach. Specifically, each 3D point in the point cloud corresponding to the object may be projected onto a preset plane (for example, the plane where the camera is located) to obtain a projected image; for example, the regions of the projected image onto which 3D points are projected are marked white, and the regions onto which no 3D point is projected are marked black. Then, contour pixel points in the projected image are identified, the 3D points corresponding to the contour pixel points are determined to be edge points, and all edge points of the object are collected to obtain the edge point cloud corresponding to the object. Specifically, an image recognition algorithm may be used to identify the contour boundary line formed in the projected image, the pixel points corresponding to the contour boundary line are determined to be contour pixel points, and the 3D points in the point cloud corresponding to the object that project onto the contour pixel points are determined to be the 3D points corresponding to those contour pixel points; that is, if a 3D point in the point cloud corresponding to the object falls onto the position of a contour pixel point after projection, the 3D point is considered to correspond to that contour pixel point and is therefore an edge point.
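A minimal Python sketch of this 2D-projection alternative is shown below; the choice of projection plane, the pixel size, and the use of OpenCV's contour finder are assumptions made for illustration.

```python
import numpy as np
import cv2

def extract_edge_points_2d(points: np.ndarray, pixel_size: float = 0.002) -> np.ndarray:
    """Project the object's 3D points onto a plane, find the contour pixels of the
    resulting projected image, and return the 3D points that project onto them."""
    xy = points[:, :2]                             # assumed projection plane: the XY plane
    origin = xy.min(axis=0)
    pix = np.floor((xy - origin) / pixel_size).astype(int)
    h, w = pix[:, 1].max() + 1, pix[:, 0].max() + 1
    image = np.zeros((h, w), dtype=np.uint8)
    image[pix[:, 1], pix[:, 0]] = 255              # white where a 3D point projects, black elsewhere
    contours, _ = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour_mask = np.zeros_like(image)
    cv2.drawContours(contour_mask, contours, -1, 255, thickness=1)
    on_contour = contour_mask[pix[:, 1], pix[:, 0]] > 0
    return points[on_contour]                      # 3D points that project onto contour pixels
```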
Through the 3D and 2D projection mode, accurate extraction of the edge point cloud corresponding to the object can be conveniently achieved. After the edge point cloud corresponding to the object is extracted, the tangent vector of each 3D point in the edge point cloud can be calculated by using a vector calculation algorithm for calculating tangent vectors in a three-dimensional space in the prior art, so as to generate a point pair feature vector of a point pair according to the tangent vectors.
Step S205, for any two 3D points in the edge point cloud, a point pair including the two 3D points is constructed, and a point pair feature vector of the point pair is generated according to tangent vectors of the two 3D points.
The edge point cloud corresponding to the object comprises the pose information of each 3D point, which includes information such as the coordinate values of the 3D point on the three XYZ axes of space. In step S205, a connecting line vector between the two 3D points may be constructed from the coordinate values of the two 3D points on the three XYZ axes of space, and the Euclidean distance of the connecting line vector may be calculated; then the second included angle between the tangent vector of each of the two 3D points and the connecting line vector, and the third included angle between the tangent vectors of the two 3D points, are calculated; and the point pair feature vector of the point pair is generated from the Euclidean distance of the connecting line vector, the second included angle and the third included angle.
Suppose that any two 3D points in the edge point cloud are denoted m1 and m2, with tangent vectors t1 and t2 respectively, and let the connecting line vector between them be d = m2 - m1. Then the point pair feature vector of the point pair comprising m1 and m2 may be expressed as F(m1, m2) = (||d||2, ∠(t1, d), ∠(t2, d), ∠(t1, t2)), where ||d||2 is the Euclidean distance of the connecting line vector d, ∠(t1, d) is the second included angle between the tangent vector t1 of m1 and the connecting line vector d, ∠(t2, d) is the second included angle between the tangent vector t2 of m2 and the connecting line vector d, and ∠(t1, t2) is the third included angle between the tangent vector t1 of m1 and the tangent vector t2 of m2.
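A direct Python implementation of this point pair feature vector is straightforward; angles are returned in radians.

```python
import numpy as np

def point_pair_feature(m1, m2, t1, t2):
    """Point pair feature vector (||d||2, angle(t1, d), angle(t2, d), angle(t1, t2)),
    with d = m2 - m1 and all angles in radians."""
    d = m2 - m1                                    # connecting line vector

    def angle(a, b):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        return np.arccos(np.clip(cos, -1.0, 1.0))

    return np.array([np.linalg.norm(d),            # Euclidean distance of d
                     angle(t1, d),                 # second included angle for m1
                     angle(t2, d),                 # second included angle for m2
                     angle(t1, t2)])               # third included angle
```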
Step S206, carrying out first matching on the point pair characteristic vector of each point pair in the edge point cloud corresponding to the object and the point pair characteristic vector of each point pair in the edge point cloud of the preset template point cloud to obtain a first matching result.
The preset template point cloud is a predetermined point cloud corresponding to a known object that serves as a matching reference, and the edge point cloud of the preset template point cloud is the edge point cloud corresponding to that known object; it comprises the pose information of each of its 3D points. For any two 3D points in the edge point cloud of the preset template point cloud, a point pair comprising the two 3D points is constructed, and a point pair feature vector of the point pair is generated according to the pose information of the two 3D points and their tangent vectors, so that the preset template point cloud also corresponds to a plurality of point pairs.
After the point pair feature vectors of the point pairs in the edge point cloud corresponding to the object are obtained through the processing of step S205, the first matching is performed between these point pair feature vectors and the point pair feature vectors of the point pairs in the edge point cloud of the preset template point cloud, so as to obtain a first matching result. The pose information of each 3D point is predefined in each preset template point cloud; during the first matching, each preset template point cloud needs to be transformed into the current scene so that, after the pose transformation, its edge point cloud coincides as much as possible with the edge point cloud corresponding to the object in the current scene. The first matching result obtained in this way may comprise a plurality of pose transformation relations of the matched preset template point clouds.
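The patent does not detail how the first matching is carried out, so the sketch below shows one common way to realize it (an assumption in the spirit of point-pair-feature matching): quantize the feature vectors and store the template point pairs in a hash table so that each point pair of the object can quickly retrieve candidate template correspondences, from which candidate pose transformation relations can then be derived; the quantization steps are illustrative.

```python
import numpy as np
from collections import defaultdict

def quantize(feature, dist_step=0.005, angle_step=np.deg2rad(10.0)):
    """Quantize a point pair feature vector into a discrete hash key."""
    d, a1, a2, a3 = feature
    return (int(d / dist_step), int(a1 / angle_step),
            int(a2 / angle_step), int(a3 / angle_step))

def build_template_table(template_features):
    """template_features: list of (feature_vector, template_pair_index)."""
    table = defaultdict(list)
    for feature, pair_index in template_features:
        table[quantize(feature)].append(pair_index)
    return table

def candidate_correspondences(object_features, table):
    """object_features: list of (object_pair_index, feature_vector). Every hit is a
    candidate correspondence from which a pose transformation relation can be derived."""
    return [(obj_index, tpl_index)
            for obj_index, feature in object_features
            for tpl_index in table.get(quantize(feature), [])]
```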
And step S207, performing secondary matching on the pose information of each 3D point in the edge point cloud corresponding to the object and a plurality of pose transformation relations of the matched preset template point clouds, and determining the pose information of the object according to the pose transformation relation of the matched preset template point cloud with the highest matching score in the secondary matching result.
Since the first matching result may include a plurality of pose transformation relations of the matched preset template point clouds, in order to determine the pose information of the object more accurately, in this embodiment the pose information of each 3D point in the edge point cloud corresponding to the object is subjected to the second matching against the plurality of pose transformation relations of the matched preset template point clouds. Specifically, a preset evaluation algorithm can be used to calculate matching scores between the pose information of each 3D point in the edge point cloud corresponding to the object and the pose transformation relations of the matched preset template point clouds, so as to obtain a second matching result. Those skilled in the art can select the preset evaluation algorithm according to actual needs, and it is not limited here; for example, it may be an ICP (Iterative Closest Point) algorithm, a GMM (Gaussian Mixture Model) algorithm, or the like. The second matching further optimizes and corrects the first matching result: the pose transformation relation of the matched preset template point cloud with the highest matching score in the second matching result is taken as the final match, and the pose information of the object is determined according to that pose transformation relation.
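As an illustrative stand-in for the preset evaluation algorithm (the patent names ICP or GMM as examples), the sketch below scores each candidate pose transformation relation by the fraction of transformed template edge points that land within an assumed tolerance of the object's edge point cloud, and keeps the highest-scoring pose.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_score(object_edge_points, template_edge_points, pose, tol=0.003):
    """Score a candidate pose (4x4 homogeneous transform, template -> scene) by the
    fraction of transformed template edge points within `tol` of the object's edge points."""
    R, t = pose[:3, :3], pose[:3, 3]
    transformed = template_edge_points @ R.T + t
    dists, _ = cKDTree(object_edge_points).query(transformed)
    return float(np.mean(dists < tol))

def best_pose(object_edge_points, template_edge_points, candidate_poses):
    """Return the candidate pose transformation relation with the highest matching score."""
    scores = [match_score(object_edge_points, template_edge_points, pose)
              for pose in candidate_poses]
    best = int(np.argmax(scores))
    return candidate_poses[best], scores[best]
```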
Each object in the current scene is processed in the manner of steps S204 to S207 to obtain the pose information of each object. Because the pose information of an object is determined in the camera coordinate system, in order for the robot to position the object conveniently, the pose information of the object needs to be converted into the robot coordinate system using a preset conversion algorithm, and the converted pose information is then transmitted to the robot, so that the robot can grab the object according to the converted pose information.
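A minimal sketch of the camera-to-robot conversion is shown below; it assumes the preset conversion algorithm is a fixed 4x4 hand-eye calibration transform T_robot_camera, which the patent does not specify.

```python
import numpy as np

def to_robot_frame(pose_in_camera: np.ndarray, T_robot_camera: np.ndarray) -> np.ndarray:
    """Re-express a 4x4 object pose from the camera coordinate system in the robot
    coordinate system using an assumed hand-eye calibration transform."""
    return T_robot_camera @ pose_in_camera
```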
According to the grabbing method for objects with a pure plane structure provided by this embodiment, the deep learning segmentation model is used to perform instance segmentation on the scene image, so that each object in the scene image is segmented accurately; the edge point cloud is extracted from the point cloud corresponding to the object based on a 3D or 2D projection approach, achieving accurate extraction of the edge point cloud; point pair feature vectors are generated according to the tangent vectors of the 3D points in the edge point cloud, so that the shape and structural characteristics of the object are accurately reflected; and the point clouds corresponding to the objects are matched in two stages (the first and second matching) with the preset template point clouds according to the point pair feature vectors, so that the pose information of each object is accurately identified and the robot can grab the object accurately and firmly according to that pose information. In addition, in this scheme only the edge point cloud corresponding to the object is extracted to participate in the matching process, while the non-edge points do not participate, which effectively reduces the matching workload and improves the efficiency of identifying the pose information of the object.
Fig. 3 shows a block diagram of a grasping apparatus for an object of a pure plane structure according to an embodiment of the present invention, as shown in fig. 3, the apparatus includes: a first obtaining module 301, an edge extracting module 302, a point pair constructing module 303 and a matching module 304.
The first obtaining module 301 is adapted to: acquiring point clouds corresponding to a plurality of objects in a current scene; wherein the plurality of objects have a purely planar structure.
The edge extraction module 302 is adapted to: and aiming at each object, performing edge extraction on the point cloud corresponding to the object to obtain an edge point cloud corresponding to the object, and calculating tangent vectors of all 3D points in the edge point cloud.
The point pair construction module 303 is adapted to: for any two 3D points in the edge point cloud, construct a point pair comprising the two 3D points, and generate a point pair characteristic vector of the point pair according to tangent vectors of the two 3D points.
The matching module 304 is adapted to: and matching the edge point cloud corresponding to the object with a preset template point cloud according to the point pair characteristic vector of each point pair of the object to obtain the pose information of the object so that the robot can execute grabbing operation according to the pose information of the object.
Optionally, the apparatus may further comprise: a second acquisition module 305, an instance segmentation module 306, and an object point cloud determination module 307. Wherein the second obtaining module 305 is adapted to: and acquiring a scene image of the current scene and a point cloud corresponding to the scene image. The instance segmentation module 306 is adapted to: and inputting the scene image into the trained deep learning segmentation model for instance segmentation processing to obtain the segmentation result of each object in the scene image. The object point cloud determination module 307 is adapted to: and determining the point clouds corresponding to the objects according to the point clouds corresponding to the scene images and the segmentation results of the objects.
Optionally, the edge extraction module 302 is further adapted to: aiming at each 3D point in the point cloud, searching adjacent 3D points in a preset neighborhood region of the 3D point in the point cloud to obtain a point set comprising the 3D point and the adjacent 3D points, and calculating a normal vector according to the point set; constructing a connecting line between the 3D point and each adjacent 3D point in the point set, and calculating a first included angle between a projection line of each connecting line in a plane where a normal vector is located and an appointed reference direction axis; sequencing the projection lines according to the first included angles, and calculating the included angle difference of the first included angles corresponding to two adjacent sequenced projection lines; judging whether the maximum value of the included angle difference is larger than a preset angle difference threshold value or not; if yes, determining the 3D point as an edge point; if not, determining the 3D point as a non-edge point; and summarizing all the edge points to obtain an edge point cloud corresponding to the object.
Optionally, the edge extraction module 302 is further adapted to: projecting each connecting line to a plane where the normal vector is located to obtain a projection line corresponding to each connecting line; and calculating a first included angle between each projection line and the specified reference direction axis in the plane of the normal vector.
Optionally, the edge extraction module 302 is further adapted to: projecting each 3D point in the point cloud corresponding to the object onto a preset plane to obtain a projected image; identifying contour pixel points in the projected image, and determining 3D points corresponding to the contour pixel points as edge points; and summarizing all the edge points to obtain an edge point cloud corresponding to the object.
Optionally, the edge extraction module 302 is further adapted to: and identifying contour boundary lines formed in the projected images, determining pixel points corresponding to the contour boundary lines as contour pixel points, and determining 3D points projected to the contour pixel points in the point cloud corresponding to the object as 3D points corresponding to the contour pixel points.
Optionally, the point pair construction module 303 is further adapted to: constructing a connecting line vector between the two 3D points, and calculating the Euclidean distance of the connecting line vector; calculating a second included angle between a tangent vector of each 3D point of the two 3D points and the connecting line vector and a third included angle between tangent vectors of the two 3D points; and generating a point pair characteristic vector of the point pair by using the Euclidean distance, the second included angle and the third included angle of the connecting line vector.
Optionally, the matching module 304 is further adapted to: performing first matching on the point pair characteristic vector of each point pair in the edge point cloud corresponding to the object and the point pair characteristic vector of each point pair in the edge point cloud of the preset template point cloud to obtain a first matching result; the first matching result comprises a plurality of pose transformation relations of matched preset template point clouds; and performing secondary matching on the pose information of each 3D point in the edge point cloud corresponding to the object and a plurality of pose transformation relations of the matched preset template point clouds, and determining the pose information of the object according to the pose transformation relation of the matched preset template point cloud with the highest matching score in a secondary matching result.
Optionally, the matching module 304 is further adapted to: and calculating matching scores between the pose information of each 3D point in the edge point cloud corresponding to the object and a plurality of pose transformation relations of the matched preset template point cloud by using a preset evaluation algorithm.
According to the grabbing apparatus for objects with a pure plane structure provided by this embodiment, the deep learning segmentation model is used to perform instance segmentation on the scene image, so that each object in the scene image is segmented accurately; the edge point cloud is extracted from the point cloud corresponding to the object based on a 3D or 2D projection approach, achieving accurate extraction of the edge point cloud; point pair feature vectors are generated according to the tangent vectors of the 3D points in the edge point cloud, so that the shape and structural characteristics of the object are accurately reflected; and the point clouds corresponding to the objects are matched in two stages (the first and second matching) with the preset template point clouds according to the point pair feature vectors, so that the pose information of each object is accurately identified and the robot can grab the object accurately and firmly according to that pose information. In addition, in this scheme only the edge point cloud corresponding to the object is extracted to participate in the matching process, while the non-edge points do not participate, which effectively reduces the matching workload and improves the efficiency of identifying the pose information of the object.
The invention further provides a nonvolatile computer storage medium, and the computer storage medium stores at least one executable instruction, and the executable instruction can execute the grabbing method for the object with the pure plane structure in any method embodiment.
Fig. 4 is a schematic structural diagram of a computing device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computing device.
As shown in fig. 4, the computing device may include: a processor 402, a communications interface 404, a memory 406, and a communications bus 408.
Wherein:
the processor 402, communication interface 404, and memory 406 communicate with each other via a communication bus 408.
A communication interface 404 for communicating with network elements of other devices, such as clients or other servers.
The processor 402, configured to execute the program 410, may specifically perform relevant steps in the above-described embodiment of the grabbing method for the object with a pure planar structure.
In particular, program 410 may include program code comprising computer operating instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device includes one or more processors, which may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
And a memory 406 for storing a program 410. The memory 406 may comprise high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
The program 410 may specifically be configured to enable the processor 402 to execute the grabbing method for the object with a pure planar structure in any of the above-described method embodiments. For specific implementation of each step in the program 410, reference may be made to the corresponding steps and corresponding descriptions in the units in the above-mentioned embodiment for grabbing an object with a pure planar structure, which are not described herein again. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Moreover, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments, not others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in accordance with embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.

Claims (20)

1. A method of grabbing an object in a purely planar configuration, the method comprising:
acquiring point clouds corresponding to a plurality of objects in a current scene; wherein the plurality of objects have a purely planar structure;
performing edge extraction on the point cloud corresponding to each object to obtain an edge point cloud corresponding to the object, and calculating tangent vectors of all 3D points in the edge point cloud;
for any two 3D points in the edge point cloud, constructing a point pair comprising the two 3D points, and generating a point pair characteristic vector of the point pair according to tangent vectors of the two 3D points;
and matching the edge point cloud corresponding to the object with a preset template point cloud according to the point pair characteristic vector of each point pair of the object to obtain the pose information of the object, so that the robot can execute grabbing operation according to the pose information of the object.
2. The method of claim 1, wherein prior to the acquiring point clouds corresponding to a plurality of objects in a current scene, the method further comprises:
acquiring a scene image of a current scene and a point cloud corresponding to the scene image, inputting the scene image into a trained deep learning segmentation model for instance segmentation processing, and obtaining segmentation results of objects in the scene image;
and determining the point clouds corresponding to the objects according to the point clouds corresponding to the scene images and the segmentation results of the objects.
3. The method of claim 1, wherein for each object, performing edge extraction on the point cloud corresponding to the object, and obtaining an edge point cloud corresponding to the object further comprises:
aiming at each 3D point in the point cloud, searching adjacent 3D points in a preset neighborhood region of the 3D point in the point cloud to obtain a point set comprising the 3D point and the adjacent 3D points, and calculating a normal vector according to the point set;
constructing a connecting line between the 3D point and each adjacent 3D point in the point set, and calculating a first included angle between a projection line of each connecting line in a plane where the normal vector is located and an appointed reference direction axis;
sequencing all the projection lines according to the first included angles, and calculating the included angle difference of the first included angles corresponding to two adjacent sequenced projection lines;
judging whether the maximum value of the included angle difference is larger than a preset angle difference threshold value or not; if yes, determining the 3D point as an edge point; if not, determining the 3D point as a non-edge point;
and summarizing all the edge points to obtain an edge point cloud corresponding to the object.
4. The method of claim 3, wherein the calculating of the first angle between the projection line of each connecting line in the plane of the normal vector and the designated reference direction axis further comprises:
projecting each connecting line to the plane where the normal vector is located to obtain a projection line corresponding to each connecting line;
and calculating a first included angle between each projection line and a specified reference direction axis in the plane of the normal vector.
5. The method of claim 1, wherein for each object, performing edge extraction on the point cloud corresponding to the object, and obtaining an edge point cloud corresponding to the object further comprises:
projecting each 3D point in the point cloud corresponding to the object onto a preset plane to obtain a projected image;
identifying contour pixel points in the projected image, and determining 3D points corresponding to the contour pixel points as edge points;
and summarizing all the edge points to obtain an edge point cloud corresponding to the object.
6. The method of claim 5, wherein said identifying contour pixel points in said projection image further comprises:
and identifying a contour boundary line formed in the projected image, determining a pixel point corresponding to the contour boundary line as a contour pixel point, and determining a 3D point projected to the contour pixel point in the point cloud corresponding to the object as a 3D point corresponding to the contour pixel point.
7. The method of any one of claims 1-6, wherein the generating a point pair feature vector for the point pair from tangent vectors of the two 3D points further comprises:
constructing a connecting line vector between the two 3D points, and calculating the Euclidean distance of the connecting line vector;
calculating a second included angle between a tangent vector of each of the two 3D points and the connecting line vector and a third included angle between tangent vectors of the two 3D points;
and generating a point pair feature vector of the point pair by using the Euclidean distance of the connecting line vector, the second included angle and the third included angle.
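The point pair feature vector of claim 7 thus packs four scalars: the Euclidean distance of the connecting line vector, the second included angle for each of the two tangent vectors, and the third included angle between the tangents. A direct sketch follows; the function and variable names are illustrative only.

```python
# Sketch of the four-component point pair feature of claim 7.
import numpy as np

def angle_between(a, b):
    cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cosang, -1.0, 1.0)))

def point_pair_feature(p1, t1, p2, t2):
    """p1, p2: 3D points; t1, t2: their tangent vectors. Returns a 4-D feature."""
    d = p2 - p1                                  # connecting line vector
    return np.array([
        np.linalg.norm(d),                       # Euclidean distance of the connecting line vector
        angle_between(t1, d),                    # second included angle (first point)
        angle_between(t2, d),                    # second included angle (second point)
        angle_between(t1, t2),                   # third included angle
    ])
```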
8. The method according to any one of claims 1 to 7, wherein the matching, according to the point pair feature vector of each point pair of the object, the edge point cloud corresponding to the object with a preset template point cloud to obtain the pose information of the object further comprises:
performing first matching between the point pair feature vector of each point pair in the edge point cloud corresponding to the object and the point pair feature vector of each point pair in the edge point cloud of the preset template point cloud to obtain a first matching result; the first matching result comprises a plurality of pose transformation relations of matched preset template point clouds;
and performing second matching between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point clouds, and determining the pose information of the object according to the pose transformation relation of the matched preset template point cloud with the highest matching score in the second matching result.
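A rough sketch of how the first matching of claim 8 could be organised: the template's point pair feature vectors are quantised into a hash table built offline, and every scene point pair is looked up in that table, yielding matched template pairs from which candidate pose transformation relations of the template are then estimated (the pose recovery itself is omitted). The bin sizes, the hashing scheme and all names below are assumptions, not the claimed procedure.

```python
# Hypothetical sketch of the first matching stage (claim 8): feature hashing and lookup.
import numpy as np
from collections import defaultdict

def quantise(feature, dist_step=0.005, angle_step=np.radians(10.0)):
    """Coarse binning of a 4-D point pair feature (distance + three angles)."""
    d, a1, a2, a3 = feature
    return (int(d / dist_step), int(a1 / angle_step),
            int(a2 / angle_step), int(a3 / angle_step))

def build_template_table(template_pair_features):
    """template_pair_features: iterable of (feature_vector, (i, j)) built offline
    from the edge point cloud of the preset template point cloud."""
    table = defaultdict(list)
    for feat, pair in template_pair_features:
        table[quantise(feat)].append(pair)
    return table

def first_matching(scene_pair_features, table):
    """For every scene point pair, collect the template point pairs with the same
    quantised feature; candidate pose transformation relations are then estimated
    from these correspondences (not shown here)."""
    matches = []
    for feat, scene_pair in scene_pair_features:
        for template_pair in table.get(quantise(feat), []):
            matches.append((scene_pair, template_pair))
    return matches
```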
9. The method of claim 8, wherein the second matching of the pose information of each 3D point in the edge point cloud corresponding to the object with the pose transformation relations of the matched preset template point clouds further comprises:
and calculating matching scores between the pose information of each 3D point in the edge point cloud corresponding to the object and a plurality of pose transformation relations of the matched preset template point cloud by using a preset evaluation algorithm.
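The claims leave the "preset evaluation algorithm" open. One plausible stand-in, sketched below, scores a candidate pose transformation by the fraction of scene edge points that fall within a distance tolerance of the transformed template edge cloud; the tolerance value and all names are assumptions.

```python
# Hypothetical evaluation algorithm for the second matching (claim 9): inlier-ratio score.
import numpy as np
from scipy.spatial import cKDTree

def pose_score(scene_edges, template_edges, R, t, tol=0.003):
    """R: (3, 3) rotation, t: (3,) translation of a candidate pose transformation."""
    transformed = template_edges @ R.T + t               # template edge cloud under the candidate pose
    dists, _ = cKDTree(transformed).query(scene_edges, k=1)
    return float(np.mean(dists < tol))                   # fraction of scene edge points explained

def best_pose(scene_edges, template_edges, candidate_poses):
    """candidate_poses: iterable of (R, t); returns the highest-scoring candidate."""
    return max(candidate_poses,
               key=lambda Rt: pose_score(scene_edges, template_edges, *Rt))
```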
10. A grasping apparatus for an object of a purely planar structure, the apparatus comprising:
the first acquisition module is suitable for acquiring point clouds corresponding to a plurality of objects in a current scene; wherein the plurality of objects have a purely planar structure;
the edge extraction module is suitable for extracting the edge of the point cloud corresponding to each object to obtain the edge point cloud corresponding to the object, and calculating tangent vectors of all 3D points in the edge point cloud;
the point pair construction module is suitable for constructing, for any two 3D points in the edge point cloud, a point pair comprising the two 3D points, and generating a point pair feature vector of the point pair according to tangent vectors of the two 3D points;
and the matching module is suitable for matching the edge point cloud corresponding to the object with a preset template point cloud according to the point pair feature vector of each point pair of the object to obtain the pose information of the object, so that the robot can execute a grabbing operation according to the pose information of the object.
11. The apparatus of claim 10, wherein the apparatus further comprises:
the second acquisition module is suitable for acquiring a scene image of a current scene and a point cloud corresponding to the scene image;
the example segmentation module is suitable for inputting the scene image into a trained deep learning segmentation model to perform example segmentation processing so as to obtain a segmentation result of each object in the scene image;
and the object point cloud determining module is suitable for determining the point clouds corresponding to the objects according to the point cloud corresponding to the scene image and the segmentation results of the objects.
12. The apparatus of claim 10, wherein the edge extraction module is further adapted to:
for each 3D point in the point cloud, searching for adjacent 3D points within a preset neighborhood of the 3D point in the point cloud to obtain a point set comprising the 3D point and the adjacent 3D points, and fitting a normal vector according to the point set;
constructing a connecting line between the 3D point and each adjacent 3D point in the point set, and calculating a first included angle between the projection line of each connecting line in the plane where the normal vector is located and a specified reference direction axis;
sorting all the projection lines according to the first included angles, and calculating the angle difference between the first included angles of each two adjacent sorted projection lines;
judging whether the maximum angle difference is greater than a preset angle difference threshold; if yes, determining the 3D point as an edge point; if not, determining the 3D point as a non-edge point;
and summarizing all the edge points to obtain an edge point cloud corresponding to the object.
13. The apparatus of claim 12, wherein the edge extraction module is further adapted to:
projecting each connecting line to a plane where the normal vector is located to obtain a projection line corresponding to each connecting line;
and calculating a first included angle between each projection line and a specified reference direction axis in the plane of the normal vector.
14. The apparatus of claim 10, wherein the edge extraction module is further adapted to:
projecting each 3D point in the point cloud corresponding to the object onto a preset plane to obtain a projected image;
identifying contour pixel points in the projected image, and determining 3D points corresponding to the contour pixel points as edge points;
and summarizing all the edge points to obtain an edge point cloud corresponding to the object.
15. The apparatus of claim 14, wherein the edge extraction module is further adapted to:
and identifying a contour boundary line formed in the projected image, determining a pixel point corresponding to the contour boundary line as a contour pixel point, and determining a 3D point projected to the contour pixel point in the point cloud corresponding to the object as a 3D point corresponding to the contour pixel point.
16. The apparatus of any one of claims 10-15, wherein the point pair construction module is further adapted to:
constructing a connecting line vector between the two 3D points, and calculating the Euclidean distance of the connecting line vector;
calculating a second included angle between a tangent vector of each of the two 3D points and the connecting line vector and a third included angle between tangent vectors of the two 3D points;
and generating a point pair feature vector of the point pair by using the Euclidean distance of the connecting line vector, the second included angle and the third included angle.
17. The apparatus of any one of claims 10-16, wherein the matching module is further adapted to:
performing first matching between the point pair feature vector of each point pair in the edge point cloud corresponding to the object and the point pair feature vector of each point pair in the edge point cloud of the preset template point cloud to obtain a first matching result; the first matching result comprises a plurality of pose transformation relations of matched preset template point clouds;
and performing second matching between the pose information of each 3D point in the edge point cloud corresponding to the object and the plurality of pose transformation relations of the matched preset template point clouds, and determining the pose information of the object according to the pose transformation relation of the matched preset template point cloud with the highest matching score in the second matching result.
18. The apparatus of claim 17, wherein the matching module is further adapted to:
and calculating matching scores between the pose information of each 3D point in the edge point cloud corresponding to the object and a plurality of pose transformation relations of the matched preset template point cloud by using a preset evaluation algorithm.
19. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the grabbing method for the object with the pure plane structure according to any one of claims 1-9.
20. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the method of grabbing a purely planar structured object according to any one of claims 1-9.
CN202110217389.5A 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment Pending CN114972495A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110217389.5A CN114972495A (en) 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110217389.5A CN114972495A (en) 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment

Publications (1)

Publication Number Publication Date
CN114972495A true CN114972495A (en) 2022-08-30

Family

ID=82972888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110217389.5A Pending CN114972495A (en) 2021-02-26 2021-02-26 Grabbing method and device for object with pure plane structure and computing equipment

Country Status (1)

Country Link
CN (1) CN114972495A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115228A (en) * 2023-10-23 2023-11-24 广东工业大学 SOP chip pin coplanarity detection method and device

Similar Documents

Publication Publication Date Title
CN112837371B (en) Object grabbing method and device based on 3D matching and computing equipment
JP6430064B2 (en) Method and system for aligning data
JP5328979B2 (en) Object recognition method, object recognition device, autonomous mobile robot
US9483707B2 (en) Method and device for recognizing a known object in a field of view of a three-dimensional machine vision system
JP7433609B2 (en) Method and computational system for object identification
Ückermann et al. Real-time 3D segmentation of cluttered scenes for robot grasping
JP2016181183A (en) Information processing apparatus, information processing method, and program
CN113191174B (en) Article positioning method and device, robot and computer readable storage medium
US10713530B2 (en) Image processing apparatus, image processing method, and image processing program
Ückermann et al. Realtime 3D segmentation for human-robot interaction
JP2011133273A (en) Estimation apparatus and control method thereof, and program
Zelener et al. Cnn-based object segmentation in urban lidar with missing points
US20150269778A1 (en) Identification device, identification method, and computer program product
JP2018036770A (en) Position attitude estimation device, position attitude estimation method, and position attitude estimation program
CN114310892B (en) Object grabbing method, device and equipment based on point cloud data collision detection
US11468609B2 (en) Methods and apparatus for generating point cloud histograms
CN110673607A (en) Feature point extraction method and device in dynamic scene and terminal equipment
CN114972495A (en) Grabbing method and device for object with pure plane structure and computing equipment
JPH07103715A (en) Method and apparatus for recognizing three-dimensional position and attitude based on visual sense
Wang et al. GraspFusionNet: a two-stage multi-parameter grasp detection network based on RGB–XYZ fusion in dense clutter
CN114897999B (en) Object pose recognition method, electronic device, storage medium, and program product
CN115984211A (en) Visual positioning method, robot and storage medium
CN112837370A (en) Object stacking judgment method and device based on 3D bounding box and computing equipment
Jørgensen et al. Geometric Edge Description and Classification in Point Cloud Data with Application to 3D Object Recognition.
Bhuyan et al. Structure‐aware multiple salient region detection and localization for autonomous robotic manipulation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination