CN111178250A - Object identification and positioning method and device, and terminal device


Info

Publication number
CN111178250A
CN111178250A
Authority
CN
China
Prior art keywords
dimensional
target object
geometric shape
target
positioning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911380815.6A
Other languages
Chinese (zh)
Other versions
CN111178250B (en)
Inventor
刘培超
徐培
郎需林
刘主福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yuejiang Technology Co Ltd
Original Assignee
Shenzhen Yuejiang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yuejiang Technology Co Ltd filed Critical Shenzhen Yuejiang Technology Co Ltd
Priority to CN201911380815.6A priority Critical patent/CN111178250B/en
Publication of CN111178250A publication Critical patent/CN111178250A/en
Application granted granted Critical
Publication of CN111178250B publication Critical patent/CN111178250B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V20/647 Scenes; Scene-specific elements; Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • G06T7/70 Image analysis; Determining position or orientation of objects or cameras
    • G06T2207/10028 Image acquisition modality; Range image; Depth image; 3D point clouds
    • G06T2207/20081 Special algorithmic details; Training; Learning
    • G06T2207/20084 Special algorithmic details; Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention is applicable to the technical field of machine vision, and provides an object identification and positioning method, an object identification and positioning device, and terminal equipment, wherein the method comprises the following steps: acquiring a two-dimensional image and point cloud data of a region to be detected; detecting the two-dimensional image through a pre-trained deep learning model, and identifying a two-dimensional target area and a geometric shape type corresponding to a target object in the two-dimensional image; mapping the two-dimensional target area to the point cloud data, and determining a first three-dimensional area of the target object according to the mapping result; and determining a second three-dimensional area of the target object and positioning the target object according to the geometric shape type and the first three-dimensional area. The embodiment of the invention can improve the efficiency and accuracy of 3D object identification and positioning.

Description

Object identification and positioning method and device, and terminal device
Technical Field
The invention belongs to the technical field of machine vision, and particularly relates to an object identification and positioning method, an object identification and positioning device and terminal equipment.
Background
In industrial production or in robotic applications, it is often necessary to identify and position objects by machine vision for subsequent grasping or other processing steps.
For identification of a three-dimensional (3D) object, existing approaches generally adopt a 3D model matching algorithm, that is, model matching is performed on the object to be detected against a pre-constructed 3D model of the target object, and the target object is identified from the matching result. However, the existing 3D model matching algorithm has poor robustness to occlusion and noisy backgrounds and is prone to mismatching, resulting in low efficiency and accuracy of three-dimensional object recognition.
Disclosure of Invention
In view of this, embodiments of the present invention provide an object identification and positioning method, an object identification and positioning device, and a terminal device, so as to solve the problem in the prior art of how to improve the efficiency and accuracy of 3D object identification and positioning.
A first aspect of an embodiment of the present invention provides an object identification and positioning method, including:
acquiring a two-dimensional image and point cloud data of a region to be detected;
detecting the two-dimensional image through a pre-trained deep learning model, and identifying a two-dimensional target area and a geometric shape type corresponding to a target object in the two-dimensional image;
mapping the two-dimensional target area to the point cloud data, and determining a first three-dimensional area of the target object according to a mapping result;
and determining a second three-dimensional area of the target object and positioning the target object according to the geometric shape type and the first three-dimensional area.
A second aspect of the embodiments of the present invention provides an object identifying and positioning apparatus, including:
the first acquisition unit is used for acquiring a two-dimensional image and point cloud data of a region to be detected;
the recognition unit is used for detecting the two-dimensional image through a pre-trained deep learning model and recognizing a two-dimensional target area and a geometric shape type corresponding to a target object in the two-dimensional image;
the rough segmentation unit is used for mapping the two-dimensional target area to the point cloud data and determining a first three-dimensional area of the target object according to a mapping result;
and the positioning unit is used for determining a second three-dimensional area of the target object according to the geometric shape type and the first three-dimensional area and positioning the target object.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the object identification and positioning method when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of the object identification and positioning method.
A fifth aspect of embodiments of the present invention provides a computer program product, which, when run on a terminal device, causes the terminal device to execute the object identification and positioning method as described in the first aspect.
Compared with the prior art, the embodiment of the invention has the following beneficial effects. Identifying a target object from a two-dimensional image is more robust than identifying it by 3D model matching; in the embodiment of the invention, the two-dimensional target area of the target object is therefore identified from the two-dimensional image first, and the two-dimensional target area is then mapped into the three-dimensional point cloud space to determine the first three-dimensional area of the target object, so that the target object is preliminarily identified and its corresponding geometric shape type is determined. Compared with object identification methods that directly perform 3D model matching and have poor robustness, this improves the efficiency of object identification and the probability of identifying the object accurately. Furthermore, after the target object is preliminarily identified by determining the first three-dimensional area, the second three-dimensional area of the target object can be further accurately determined according to the geometric shape type of the target object and the first three-dimensional area, so that the target object is positioned, and the accuracy of object identification and positioning is further improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed for describing the embodiments or the prior art will be briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a schematic flow chart of an implementation of a first object identification and positioning method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of an implementation of a second object identification and positioning method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an object recognition and positioning device provided by an embodiment of the present invention;
fig. 4 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
In addition, in the description of the present application, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
Embodiment one:
fig. 1 shows a schematic flow chart of a first object identification and location method provided in an embodiment of the present application, which is detailed as follows:
in S101, a two-dimensional image and point cloud data of a region to be measured are acquired.
The region to be detected is a region containing objects to be recognized, and a plurality of objects to be recognized may exist in the region at the same time. For convenience of description, an object to be recognized is hereinafter referred to as a target object. The two-dimensional image of the region to be detected may be acquired by a depth camera capable of capturing two-dimensional images (e.g., an RGBD camera) or by an ordinary camera, and the two-dimensional image contains the two-dimensional shape information of the objects in the region to be detected. The point cloud data of the region to be detected may be acquired directly by a depth camera, or may be obtained by converting a depth map acquired by the depth camera into the corresponding point cloud space. The point cloud data contains the three-dimensional information of the region to be detected, and the three-dimensional information of each object in the region (e.g., three-dimensional coordinate information and three-dimensional direction information) can be derived from the point cloud data. Preferably, the two-dimensional image of the region to be detected is a color image, because a color image contains the color information of the target object in addition to its two-dimensional shape information, so the target object can be identified more reliably. Alternatively, the two-dimensional image and the point cloud data may be acquired by one depth camera that captures the color image and the depth image simultaneously (e.g., an RGBD camera), or acquired separately by a color camera and a depth camera.
Optionally, the acquiring the two-dimensional image and the point cloud data of the region to be measured includes:
acquiring a color image and a depth image of a region to be detected, wherein the color image is a two-dimensional image of the region to be detected;
and generating point cloud data of the area to be detected according to the color image and the depth image.
An RGBD image of the region to be detected is collected by an RGBD depth camera. The RGBD image comprises a color map (RGB map) and a depth map (Depth map) with the same resolution; after an alignment operation, points on the color map correspond one-to-one to points on the depth map, and each point on the depth map can be converted into corresponding spatial position coordinates according to the camera parameters. Therefore, the color map can serve as the two-dimensional image carrying the two-dimensional information of the region to be detected, while the point cloud data reflecting the three-dimensional information of the region to be detected can be generated from the depth map.
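As an illustration of the back-projection described above, the following is a minimal sketch, assuming an aligned depth map and a pinhole camera model with intrinsics fx, fy, cx, cy (the function name and the example values are hypothetical, not taken from the patent):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project an aligned depth map (H x W, raw units) into an
    organized point cloud (H x W x 3, metres) using pinhole intrinsics."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float32) * depth_scale   # raw depth units -> metres
    x = (u - cx) * z / fx                        # pinhole back-projection
    y = (v - cy) * z / fy
    return np.dstack((x, y, z))                  # organized: same pixel layout as the color map

# cloud = depth_to_point_cloud(depth_map, fx=615.0, fy=615.0, cx=320.0, cy=240.0)  # illustrative intrinsics
```

Because the output keeps the pixel layout of the color map, each point of the cloud remains associated with its corresponding color-map pixel, which is what later makes the 2D-to-3D mapping straightforward.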
In S102, the two-dimensional image is detected through a pre-trained deep learning model, and a two-dimensional target region and a geometric shape type corresponding to a target object in the two-dimensional image are identified.
In the embodiments of the present application, for convenience of description, the image area in which a target object is located in the two-dimensional image is referred to as a two-dimensional target area. The pre-trained deep learning model is a target detection model trained in advance; the two-dimensional image is detected by this model, and the two-dimensional target area corresponding to the target object in the two-dimensional image and the geometric shape type corresponding to the target object are identified. The geometric shape type may include a plane, a sphere, a cylinder, a cuboid, an irregularly shaped body, and the like. Target objects with different three-dimensional shapes produce different feature information when projected into a two-dimensional image, so the three-dimensional geometric shape type of a target object can be preliminarily determined from the two-dimensional image. Specifically, the pre-trained target detection model contains geometric shape feature extraction parameters obtained through learning: the two-dimensional image is input into the model, all geometric shape feature information in the two-dimensional image is extracted through the geometric shape feature extraction parameters, and the two-dimensional target area is divided and the geometric shape type determined according to the geometric shape feature information.
Optionally, the target detection model includes a color feature information extraction network layer, a texture feature information extraction network layer, a geometric feature information extraction network layer, and a discriminator; correspondingly, the detecting the two-dimensional image through the pre-trained deep learning model, and identifying a two-dimensional target area and a geometric shape type corresponding to a target object in the two-dimensional image, including:
s1021: inputting the two-dimensional image into a target detection model, and extracting the color characteristic information of the two-dimensional image through a color characteristic information extraction network layer; extracting network layer through the texture feature information to obtain the texture feature information of the two-dimensional image; acquiring geometric characteristic information of the two-dimensional image through a geometric shape characteristic extraction network layer;
s1022: according to the color feature information, the texture feature information and the geometric feature information of the two-dimensional image, performing region division on the two-dimensional image to obtain a two-dimensional target region in the two-dimensional image;
s1023: and determining the geometric shape type of the target object according to the information of the two-dimensional target area and a discriminator.
In S1021, the two-dimensional image is subjected to preprocessing such as denoising or contrast enhancement, and then input to the target detection model, and the three types of feature information, i.e., color feature information, texture feature information, and geometric feature information of the two-dimensional image are extracted through the color feature information extraction network layer, the texture feature information extraction network layer, and the geometric feature extraction network layer, respectively.
In S1022, specifically, region division is performed according to the color feature information of the two-dimensional image, so as to obtain a first image; performing region division according to the texture feature information of the two-dimensional image to obtain a second image; performing region division according to the geometric characteristic information of the two-dimensional image to obtain a third image; and then, obtaining the intersection of the areas of the first image, the second image and the third image, and determining a two-dimensional target area in the two-dimensional image.
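The region intersection in S1022 can be pictured with the following minimal sketch, assuming the three per-feature segmentations are already available as boolean masks of the same resolution (the mask names are hypothetical):

```python
import numpy as np

def intersect_region_masks(color_mask, texture_mask, geometry_mask):
    """Combine the three per-feature segmentations (boolean H x W masks)
    by intersection to obtain the two-dimensional target region."""
    return color_mask & texture_mask & geometry_mask

# target_mask = intersect_region_masks(mask_from_color, mask_from_texture, mask_from_geometry)
```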
In S1023, the geometric feature information of the two-dimensional target area determined in step S1022 is acquired and input to the discriminator, and the geometric shape type of the target object corresponding to each two-dimensional target area in the two-dimensional image is determined. Wherein one two-dimensional target area corresponds to one target object.
In the embodiment of the application, when the two-dimensional target area is identified through the deep learning model, the color feature information, the texture feature information and the geometric feature information of the two-dimensional image can be respectively extracted, the three feature information are combined to divide the two-dimensional target area and finally determine the geometric shape type of the target object, so that the two-dimensional target area and the corresponding geometric shape type of the target object can be more accurately identified and obtained, and the accuracy of object identification and positioning is improved.
Optionally, when a two-dimensional target Region is identified, the two-dimensional target Region is framed in the two-dimensional image by a rectangular ROI (Region of interest), and geometric shape type information of a target object corresponding to the two-dimensional target Region is marked.
Optionally, before the step S102, the method further includes:
acquiring a two-dimensional sample image, wherein the two-dimensional sample image contains two-dimensional image information of a predetermined number of target objects;
framing a two-dimensional target area corresponding to the target object in the two-dimensional sample image, and identifying a corresponding geometric shape type label;
and taking the two-dimensional sample image as a training sample, and training through a target detection algorithm to obtain the pre-trained deep learning model.
A two-dimensional sample image is acquired; the two-dimensional sample image belongs to a sample set formed by one or more two-dimensional sample images, and the sample set needs to contain two-dimensional image information of the various target objects. A single two-dimensional sample image may contain two-dimensional image instances of multiple target objects to be identified, or only one or several instances of a single target object to be identified. In the sample set, the total number of two-dimensional image instances of each target object to be recognized needs to reach a predetermined number, for example, the number of two-dimensional image instances of each category of object is greater than or equal to 500; the larger the predetermined number, the higher the accuracy of the finally trained deep learning model.
In the two-dimensional sample image, a framing instruction of a user is received, two-dimensional target areas corresponding to all target objects are framed and selected, and corresponding category labels are marked. The category label may contain information such as the name, attribute characteristics, geometry type, etc. of the target object.
The two-dimensional sample images that have been framed and labeled are used as training samples, input into a convolutional neural network, the training parameters are adjusted, and the pre-trained deep learning model is obtained through training with a target detection algorithm. The target detection algorithm may be any one of the RCNN (Region-based Convolutional Neural Network) target detection algorithm, the Fast-RCNN target detection algorithm, the YOLO (You Only Look Once) target detection algorithm, and the SSD (Single Shot MultiBox Detector) target detection algorithm; specifically, the target detection algorithm may be selected according to the requirements on training speed and training precision. Preferably, the pre-trained deep learning model is obtained through training with the SSD target detection algorithm or the Fast-RCNN target detection algorithm, which improves training speed while ensuring training precision.
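As a hedged illustration of this training step, the sketch below uses torchvision's Faster R-CNN detector as a stand-in; the patent only names RCNN-family, YOLO and SSD algorithms without fixing a framework, and the class list, data loader and hyperparameters here are assumptions:

```python
import torch
import torchvision

def train_shape_detector(data_loader, num_classes=6, epochs=10):
    """Train a detector that outputs 2D target boxes plus a geometric-shape label.
    num_classes assumes background + plane, sphere, cylinder, cuboid, irregular body."""
    # Recent torchvision API; older releases take pretrained= instead of weights=.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None, num_classes=num_classes)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, targets in data_loader:
            # images: list of CxHxW float tensors; targets: list of dicts with
            # "boxes" (N x 4 float) and "labels" (N int64), i.e. the framed samples.
            loss_dict = model(images, targets)   # detection models return a loss dict in train mode
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```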
In S103, the two-dimensional target area is mapped to the point cloud data, and a first three-dimensional area of the target object is determined according to the mapping result.
There is a one-to-one mapping relationship between the two-dimensional image and the point cloud data. If the two-dimensional image and the point cloud data are acquired by the same depth camera, the coordinate mapping relationship between them is already calibrated, so the two-dimensional target area in the two-dimensional image can be mapped into the point cloud data through an algorithmic conversion, the corresponding three-dimensional area of the target object in the point cloud data is determined, and the coarse segmentation of the target object in the point cloud data is completed. For the sake of distinction, the three-dimensional region obtained by mapping the two-dimensional target region is referred to as the first three-dimensional region of the target object. If the two-dimensional image and the point cloud data are acquired by two different cameras, namely a color camera and a depth camera, the coordinate mapping relationship between them is calibrated according to the positional relationship of the two cameras, and the two-dimensional target area in the two-dimensional image is mapped into the point cloud data to obtain the first three-dimensional area of the target object.
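A minimal sketch of this coarse segmentation, assuming the point cloud is organized (pixel-aligned with the color image, as in the back-projection sketch above) and the two-dimensional target area is a rectangular ROI; names and the example ROI are illustrative:

```python
import numpy as np

def roi_to_first_region(organized_cloud, roi):
    """Map a rectangular 2D target area onto an organized point cloud
    (H x W x 3, pixel-aligned with the color image) and return the
    corresponding 3D points as the first three-dimensional region."""
    x, y, w, h = roi                                  # ROI top-left corner and size, in pixels
    block = organized_cloud[y:y + h, x:x + w].reshape(-1, 3)
    valid = block[:, 2] > 0                           # drop pixels without a depth reading
    return block[valid]

# first_region = roi_to_first_region(cloud, roi=(120, 80, 64, 64))  # hypothetical ROI
```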
In S104, according to the geometry type and the first three-dimensional region, a second three-dimensional region of the target object is determined and the target object is located.
According to the determined geometric shape type of the target object, a more accurate second region segmentation is performed within the first three-dimensional region by a fitting algorithm or a model matching method to obtain the three-dimensional region corresponding to the target object; for distinction, the three-dimensional region of the target object obtained by this second segmentation is referred to as the second three-dimensional region of the target object. After the second three-dimensional region is obtained, according to the second three-dimensional region in the point cloud data and the mapping relationship between the depth camera coordinate system and the world coordinate system (i.e., the coordinate system of the real world in which the object is located), the coordinates of each target object are expressed as position coordinates in the world coordinate system, which completes the positioning of the target object.
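The camera-to-world conversion mentioned above can be sketched as follows, assuming a calibrated rigid transform (rotation R and translation t) between the depth camera coordinate system and the world coordinate system; the extrinsic values themselves are assumed to come from a prior calibration:

```python
import numpy as np

def camera_to_world(points_cam, R, t):
    """Express points given in the depth-camera coordinate system (N x 3)
    in the world coordinate system, using extrinsics R (3 x 3) and t (3,)
    such that p_world = R @ p_cam + t."""
    return points_cam @ R.T + t

# world_region = camera_to_world(second_region, R_extrinsic, t_extrinsic)  # calibrated extrinsics assumed
```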
Specifically, the step S104 includes:
s1041: if the geometric shape type is a regular geometric shape, determining a second three-dimensional region of the target object from the first three-dimensional region through a fitting algorithm and positioning the target object;
s1042: and if the geometric shape type is an irregular geometric shape, determining a second three-dimensional region of the target object from the first three-dimensional region by a 3D model matching method and positioning the target object.
In the embodiment of the present application, the geometric shape types include at least a plane, a sphere, a cylinder, a cuboid, an irregularly shaped body, and the like, wherein the plane, sphere, cylinder, and cuboid are regular geometric shapes, and the irregularly shaped body is an irregular geometric shape.
In S1041, when the geometry type is a regular geometry, a second accurate three-dimensional region division may be performed in the first three-dimensional region through a fitting algorithm corresponding thereto, a second three-dimensional region of the target object is determined, and the actual position of the target object is located according to the second three-dimensional region. For example, if the geometric shape type of a target object in the region to be measured determined in step S102 is a sphere, a sphere fitting algorithm is established in the simulation software by combining a spherical equation and a least square method, a sphere is obtained by fitting in the first three-dimensional region of the target object, the sphere is the second three-dimensional region of the target object, and the target object is positioned according to the center coordinates and the radius of the sphere.
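One way to realize the sphere fitting mentioned above is a linear least-squares formulation; the patent does not prescribe a specific formulation, so the following is only an illustrative sketch:

```python
import numpy as np

def fit_sphere(points):
    """Least-squares sphere fit. Rewriting |p - c|^2 = r^2 as
    2*cx*x + 2*cy*y + 2*cz*z + (r^2 - |c|^2) = x^2 + y^2 + z^2
    gives a linear system A w = b in the unknowns (2c and r^2 - |c|^2)."""
    A = np.hstack((2.0 * points, np.ones((len(points), 1))))
    b = (points ** 2).sum(axis=1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = w[:3]
    radius = np.sqrt(w[3] + center @ center)
    return center, radius

# center, radius = fit_sphere(first_region)  # points of the coarse (first) three-dimensional region
```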
In S1042, since the second three-dimensional region of an irregularly shaped body cannot be determined through a fitting algorithm, in the embodiment of the present application a corresponding 3D template is established and pre-stored in advance for each irregularly shaped target object. When the geometric shape type of the target object is detected to be an irregular geometric shape, the first three-dimensional region is matched against the pre-stored 3D template through a 3D model matching method, the second three-dimensional region of the target object is accurately determined from the first three-dimensional region, and the actual position of the target object is located according to the second three-dimensional region.
In the embodiment of the application, when the target object is in a regular geometric shape, a corresponding second three-dimensional area can be accurately determined from the first three-dimensional area by adopting a simple fitting algorithm with low computational resource consumption; and when the target object is in an irregular geometric shape, accurately determining the second three-dimensional area by adopting a 3D template matching algorithm. The corresponding second three-dimensional region determining method can be selected according to the geometric shape type characteristics of the target object, so that the calculation resources can be saved on the premise of ensuring accurate region division, and the efficiency and the accuracy of object identification and positioning are improved.
Optionally, after the second three-dimensional regions are determined in the point cloud data, all the second three-dimensional regions are framed by a rectangular ROI or identified by different colors, and attribute tags are identified for the respective second three-dimensional regions, where the attribute tags may include information such as names of target objects, geometric shape attributes (e.g., planes, cylinders, spheres, rectangular solids, irregular shapes), and the like.
In the embodiment of the invention, identifying the target object from the two-dimensional image is more robust than identifying it by 3D model matching, so the two-dimensional target area of the target object is identified from the two-dimensional image and then mapped into the three-dimensional point cloud space to determine the first three-dimensional area of the target object. In this way the target object is preliminarily identified and its corresponding geometric shape type is determined, which, compared with object identification methods that directly perform 3D model matching and have poor robustness, improves the efficiency of object identification and the probability of identifying the object accurately. After the target object is preliminarily identified by determining the first three-dimensional area, the second three-dimensional area of the target object can be further accurately determined according to the geometric shape type of the target object and the first three-dimensional area, so that the target object is positioned, and the accuracy of object identification and positioning is further improved.
Embodiment two:
fig. 2 shows a schematic flow chart of a second object identification and location method provided in the embodiment of the present application, which is detailed as follows:
in S201, a two-dimensional image and point cloud data of a region to be measured are acquired.
In this embodiment, S201 is the same as S101 in the previous embodiment, and please refer to the related description of S101 in the previous embodiment, which is not repeated herein.
In S202, the two-dimensional image is detected through a pre-trained deep learning model, and a two-dimensional target region and a geometric shape type corresponding to a target object in the two-dimensional image are identified.
In this embodiment, S202 is the same as S102 in the previous embodiment, and please refer to the related description of S102 in the previous embodiment, which is not repeated herein.
In S203, the two-dimensional target area is mapped to the point cloud data, and a first three-dimensional area of the target object is determined according to the mapping result.
In this embodiment, S203 is the same as S103 in the previous embodiment, and please refer to the related description of S103 in the previous embodiment, which is not repeated herein.
In S204, according to the geometry type and the first three-dimensional region, a second three-dimensional region of the target object is determined and the target object is located.
In this embodiment, S204 is the same as S104 in the previous embodiment, and please refer to the related description of S104 in the previous embodiment, which is not repeated herein.
In S205, the target object is grabbed according to the geometric shape type and the position of the target object.
According to the geometric shape type information of the target object determined from the two-dimensional image and the position of the target object located from its second three-dimensional area, the grabbing point of the target object (which may include the position coordinate information and grabbing direction information of the grabbing point) is located through a fitting algorithm or a 3D matching algorithm, the grabbing instrument is moved, and the target object is grabbed.
Optionally, the step S205 includes:
and positioning a grabbing point and grabbing the target object according to the geometric shape type, the position of the target object and the type of a grabbing instrument.
In the embodiment of the application, the grabbing instrument may be a suction-cup type grabbing instrument or a manipulator type grabbing instrument. When the grabbing instrument is of the suction-cup type, the position of the central point of a plane of the target object is determined as the coordinate position of the grabbing point according to the position and geometric shape type of the target object (if the geometric shape of the target object is a sphere, any point on the sphere can be used as the grabbing point), and the normal direction of that plane through the grabbing point is taken as the grabbing direction. When the grabbing instrument is of the manipulator type, a plurality of edge contour coordinates of the target object are determined as the plurality of grabbing point positions corresponding to the manipulator according to the position and geometric shape type of the target object, and the grabbing direction is along the axis of the manipulator arm.
According to the embodiment of the application, the corresponding grabbing points can be determined according to the geometric shape type and the grabbing instrument type of the target object, so that the target object can be grabbed more stably and efficiently.
Specifically, the step S205 includes:
and if the geometric shape type is a regular geometric shape, calculating the grabbing point of the target object by adopting the position of the target object and the type of the grabbing instrument through a fitting algorithm, and grabbing the target object.
If the geometric shape type is a regular geometric shape, the position of the corresponding grabbing point is calculated through a fitting algorithm. For example, if the three-dimensional target object is a cylinder and the grabbing instrument is of the suction-cup type, a plane fitting algorithm is used to fit either of the two planes, namely the upper surface or the lower surface of the cylinder, from the cylindrical second three-dimensional region of the target object, and the coordinates of the center point of that plane are determined as the position of the grabbing point.
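A minimal sketch of such a plane-fitting step for a suction-cup grasp, using an SVD plane fit; this is an illustrative formulation, not the patent's prescribed algorithm, and the input is assumed to be the points of the fitted top (or bottom) face:

```python
import numpy as np

def plane_grasp_point(points):
    """Fit a plane to the points of a roughly planar patch (N x 3) by SVD and
    return its centroid as the grasp position and the plane normal as the
    grasp direction (suction-cup style grasp)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                      # direction of least variance = plane normal
    return centroid, normal / np.linalg.norm(normal)

# grasp_pos, grasp_dir = plane_grasp_point(top_face_points)  # e.g. points of the cylinder's upper surface
```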
In the embodiment of the application, when the geometric shape type of the target object is a regular geometric shape, the grabbing points of the target object are simply and accurately obtained through a fitting algorithm, so that the target object is accurately grabbed.
Specifically, the step S205 includes:
if the geometric shape type is an irregular geometric shape, matching the second three-dimensional area with a pre-stored 3D template to obtain 6D pose information of the target object, wherein the pre-stored 3D template comprises a preset grabbing point;
and grabbing the target object according to the 6D pose information and the matched pre-stored 3D template.
If the target object has an irregular geometric shape, the second three-dimensional area is matched against a pre-stored 3D template to obtain the 6D pose information of the target object. The 6D pose information consists of the three-dimensional position coordinate information and the three-dimensional direction information of the target object, both of which take a preset three-dimensional coordinate system XYZ as the reference frame. The three-dimensional direction information can be represented by Euler angles, which comprise a first included angle α, a second included angle β and a third included angle γ: the first included angle α represents the angle between the Z axis of the reference frame and the intrinsic Z axis of the target object, the second included angle β represents the angle between the Y axis of the reference frame and the intrinsic Y axis of the target object, and the third included angle γ represents the angle between the X axis of the reference frame and the intrinsic X axis of the target object. The pre-stored 3D template is point cloud data of the target object containing preset three-dimensional grabbing point information, including the grabbing position information and grabbing direction information of the grabbing points.
Optionally, after the second three-dimensional region is matched with the pre-stored 3D template to obtain the 6D pose information of the target object, the 6D pose information is refined through the Iterative Closest Point (ICP) algorithm. A rough initial pose p0 of the target object is obtained by matching against the pre-stored 3D template; then, with the initial pose p0 as the initial input, the pose is further refined using the ICP algorithm to obtain a more accurate result p, and the finally refined 6D pose information p is output.
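A hedged sketch of the ICP refinement, using Open3D's point-to-point ICP as one possible implementation (the patent names the ICP algorithm but no particular library); the correspondence distance and variable names are assumptions:

```python
import open3d as o3d

def refine_pose_icp(template_points, scene_points, init_pose, max_dist=0.005):
    """Refine a rough initial pose p0 (4 x 4 matrix mapping template -> scene)
    with point-to-point ICP and return the refined transformation p."""
    src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(template_points))
    dst = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(scene_points))
    result = o3d.pipelines.registration.registration_icp(
        src, dst, max_dist, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation

# p = refine_pose_icp(template_cloud, second_region, p0)  # p0 comes from the coarse template match
```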
According to the finally obtained 6D pose information of the target object and the position mapping relationship between the second three-dimensional region and the matched pre-stored 3D template, the position in the actual object space of the grabbing point corresponding to the preset grabbing point in the pre-stored 3D template is determined, the grabbing point is located, and the target object is grabbed.
In the embodiment of the application, when the geometric shape type of the target object is an irregular geometric shape, the grabbing point position and the grabbing direction of the target object can be accurately determined according to the preset grabbing point of the pre-stored 3D template, so that the target object can be accurately grabbed.
Optionally, before the grabbing the target object according to the second three-dimensional region in the point cloud data, the method further includes:
establishing a 3D template of the target object by adopting a point-to-feature PPF algorithm;
and determining a preset grabbing point in the 3D template and storing to obtain the pre-stored 3D template.
Generating a three-dimensional model of a target object through a three-dimensional Computer Aided Design (CAD) model or directly obtaining three-dimensional information of the target object through shooting of a depth camera, inputting the three-dimensional model or the three-dimensional information into a Point Pair Feature (PPF) detector, and establishing a 3D template of the target object through a PPF algorithm. Alternatively, before inputting the three-dimensional information of the target object into the PPF detector, a sampling step size of the PPF detector with respect to the model diameter, a discrete step size with respect to the model diameter, an angle discretization value, and the like may be set.
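As an illustration of building the PPF-based 3D template, the following sketch assumes OpenCV's opencv-contrib surface_matching module (ppf_match_3d); the model file path and the relative sampling and distance steps are illustrative assumptions, not values from the patent:

```python
import cv2

# Model point cloud with normals (N x 6, float32); the second argument of
# loadPLYSimple requests normals, which PPF training needs, so the PLY file
# is assumed to contain them.
model = cv2.ppf_match_3d.loadPLYSimple("target_object.ply", 1)   # path is hypothetical

# Sampling step and distance step are given relative to the model diameter,
# as described above; the values here are illustrative.
detector = cv2.ppf_match_3d_PPF3DDetector(0.025, 0.05)
detector.trainModel(model)   # builds the PPF hash table that serves as the 3D template
```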
A designation instruction is received on the established 3D template, a preset grabbing point is designated and stored, and the pre-stored 3D template containing the preset grabbing point is obtained. Specifically, the preset grabbing point includes the position information and the preset grabbing direction information of the preset grabbing point. The preset grabbing point and the preset grabbing direction are specified according to the characteristics of the grabbing instrument (which may be a suction head, a suction cup, a manipulator, or the like). For example, if the grabbing instrument is a suction cup, the position of the preset grabbing point is designated as the center point of a certain plane of the target object, and the direction of the preset grabbing point is the normal direction at the position of the preset grabbing point.
In the embodiment of the application, the pre-stored 3D template containing the preset grabbing point is accurately established through the point pair feature algorithm, which provides an accurate basis in advance for the subsequent positioning of the target object and determination of the grabbing point, and improves the accuracy and efficiency of the identification, positioning and grabbing of the target object.
In the embodiment of the invention, after the target object is positioned, the grabbing point of the target object can be determined according to the geometric shape type and the position of the target object, so that the grabbing of the target object can be accurately and efficiently realized.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Embodiment three:
fig. 3 is a schematic structural diagram of an object identification and positioning apparatus provided in an embodiment of the present application, and for convenience of description, only parts related to the embodiment of the present application are shown:
the object recognition positioning device comprises: a first acquisition unit 31, a recognition unit 32, a rough segmentation unit 33, and a positioning unit 34. Wherein:
the first acquiring unit 31 is configured to acquire a two-dimensional image and point cloud data of a region to be measured.
Optionally, the first obtaining unit includes a first obtaining module and a point cloud data generating module:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a color image and a depth image of a region to be detected, and the color image is a two-dimensional image of the region to be detected;
and the point cloud data generation module is used for generating point cloud data of the area to be detected according to the depth map.
The identifying unit 32 is configured to detect the two-dimensional image through a pre-trained deep learning model, and identify a two-dimensional target region corresponding to a target object in the two-dimensional image.
Optionally, the object recognition and positioning device further includes a deep learning model training unit, where the deep learning model training unit specifically includes a second obtaining module, a first identification module, and a training module:
a second acquisition module for acquiring a two-dimensional sample image, wherein the two-dimensional sample image contains a predetermined number of target objects;
the first identification module is used for selecting a two-dimensional target area corresponding to a target object in the two-dimensional sample image and identifying a corresponding geometric shape type label;
and the training module is used for training the two-dimensional sample image as a training sample through a target detection algorithm to obtain the pre-trained deep learning model.
And a rough segmentation unit 33, configured to map the two-dimensional target region to the point cloud data, and determine a first three-dimensional region of the target object according to a mapping result.
And a positioning unit 34, configured to determine a second three-dimensional region of the target object and position the target object according to the geometry type and the first three-dimensional region.
Optionally, the positioning unit 34 includes a first positioning module and a second positioning module:
the first positioning module is used for determining a second three-dimensional area of the target object from the first three-dimensional area through a fitting algorithm and positioning the target object if the geometric shape type is a regular geometric shape;
and the second positioning module is used for determining a second three-dimensional area of the target object from the first three-dimensional area through a 3D model matching method and positioning the target object if the geometric shape type is an irregular geometric shape.
Optionally, the object identification and positioning device further comprises:
and the grabbing unit is used for grabbing the target object according to the geometric shape type and the position of the target object.
Optionally, the grasping unit includes:
and the first grabbing module is used for calculating grabbing points of the target object by adopting a fitting algorithm according to the position of the target object and the type of the grabbing instrument and grabbing the target object if the geometric shape is a regular geometric shape.
Optionally, the grabbing unit includes a matching module and a second grabbing module:
the matching module is used for matching the second three-dimensional area with a pre-stored 3D template to obtain 6D pose information of the target object if the geometric shape type is an irregular geometric shape, wherein the pre-stored 3D template comprises a preset grabbing point;
and the second grabbing module is used for grabbing the target object according to the 6D pose information and the matched pre-stored 3D template.
Optionally, the object identification and positioning device further includes a template establishing unit, where the template establishing unit includes an establishing module and a preset grabbing point determining module:
the template establishing unit is used for establishing a 3D template of the target object by adopting a point-to-feature PPF algorithm;
and the preset grabbing point determining module is used for determining and storing preset grabbing points in the 3D template to obtain the pre-stored 3D template.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Embodiment four:
fig. 4 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 4, the terminal device 4 of this embodiment includes: a processor 40, a memory 41 and a computer program 42, such as an object identification and positioning program, stored in said memory 41 and executable on said processor 40. The processor 40 executes the computer program 42 to implement the steps in the above embodiments of the object identification and location method, such as the steps S101 to S104 shown in fig. 1. Alternatively, the processor 40, when executing the computer program 42, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the modules 31 to 34 shown in fig. 3.
Illustratively, the computer program 42 may be partitioned into one or more modules/units that are stored in the memory 41 and executed by the processor 40 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 42 in the terminal device 4. For example, the computer program 42 may be divided into a first acquiring unit, a recognizing unit, and a positioning unit, and each unit specifically functions as follows:
the first acquisition unit is used for acquiring a two-dimensional image and point cloud data of a region to be detected.
And the recognition unit is used for detecting the two-dimensional image through a pre-trained deep learning model and recognizing a two-dimensional target area corresponding to a target object in the two-dimensional image.
And the rough segmentation unit is used for mapping the two-dimensional target area to the point cloud data and determining a first three-dimensional area of the target object according to a mapping result.
And the positioning unit is used for determining a second three-dimensional area of the target object according to the geometric shape type and the first three-dimensional area and positioning the target object.
The terminal device 4 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The terminal device may include, but is not limited to, a processor 40, a memory 41. Those skilled in the art will appreciate that fig. 4 is merely an example of a terminal device 4 and does not constitute a limitation of terminal device 4 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the terminal device may also include input-output devices, network access devices, buses, etc.
The Processor 40 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or a memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used for storing the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
If the integrated modules/units are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods in the embodiments of the present invention may also be implemented by a computer program; the computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention and are intended to be included within the scope of the present invention.

Claims (10)

1. An object identification and positioning method is characterized by comprising the following steps:
acquiring a two-dimensional image and point cloud data of a region to be detected;
detecting the two-dimensional image through a pre-trained deep learning model, and identifying a two-dimensional target area and a geometric shape type corresponding to a target object in the two-dimensional image;
mapping the two-dimensional target area to the point cloud data, and determining a first three-dimensional area of the target object according to a mapping result;
and determining a second three-dimensional area of the target object and positioning the target object according to the geometric shape type and the first three-dimensional area.
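For illustration only, the pipeline of claim 1 might be sketched as follows in Python, assuming the point cloud is organized (an H x W x 3 array aligned pixel-for-pixel with the two-dimensional image) and assuming a hypothetical detector object whose detect() method returns an integer-pixel bounding box and a geometric shape type; neither assumption is taken from the claim itself:

```python
# Illustrative sketch only (not the patented implementation).
import numpy as np

def first_3d_region(image, organized_cloud, detector):
    # Steps 1-2 of claim 1: 2D detection yields a target box and a shape type.
    (x_min, y_min, x_max, y_max), shape_type = detector.detect(image)

    # Step 3 of claim 1: map the 2D target area onto the point cloud.
    roi = organized_cloud[y_min:y_max, x_min:x_max].reshape(-1, 3)
    roi = roi[np.isfinite(roi).all(axis=1)]   # discard pixels with invalid depth

    # Step 4 (refining the first 3D region into the second 3D region) depends
    # on the shape type; see the sketches after claims 3 and 7 below.
    return roi, shape_type
```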
2. The object identification and positioning method according to claim 1, wherein before detecting the two-dimensional image through the pre-trained deep learning model and identifying the two-dimensional target area and the geometric shape type corresponding to the target object in the two-dimensional image, the method further comprises:
acquiring a two-dimensional sample image, wherein the two-dimensional sample image contains two-dimensional image information of a predetermined number of target objects;
framing, in the two-dimensional sample image, a two-dimensional target area corresponding to the target object, and marking a corresponding geometric shape type label;
and taking the two-dimensional sample image as a training sample, and training through a target detection algorithm to obtain the pre-trained deep learning model.
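A minimal sketch of the sample preparation described in claim 2 is given below; the shape-type label set, file names and JSON layout are hypothetical, and the actual target detection algorithm (for example a YOLO- or Mask R-CNN-style detector) would consume these annotations in whatever format it requires:

```python
# Hypothetical labeling sketch for claim 2: each sample records the framed 2D
# target area plus a geometric shape type label; no specific training API is assumed.
import json

SHAPE_TYPES = ["sphere", "cylinder", "box", "irregular"]   # assumed label set

def make_annotation(image_path, box, shape_type):
    assert shape_type in SHAPE_TYPES
    x_min, y_min, x_max, y_max = box
    return {
        "image": image_path,
        "bbox": [x_min, y_min, x_max - x_min, y_max - y_min],  # x, y, w, h
        "category": SHAPE_TYPES.index(shape_type),
    }

annotations = [
    make_annotation("sample_0001.png", (120, 80, 260, 210), "cylinder"),
    make_annotation("sample_0002.png", (45, 60, 150, 190), "irregular"),
]
with open("train_annotations.json", "w") as f:
    json.dump(annotations, f, indent=2)
```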
3. The object identification and positioning method according to claim 1, wherein the determining a second three-dimensional area of the target object and positioning the target object according to the geometric shape type and the first three-dimensional area comprises:
if the geometric shape type is a regular geometric shape, determining a second three-dimensional area of the target object from the first three-dimensional area through a fitting algorithm and positioning the target object;
and if the geometric shape type is an irregular geometric shape, determining a second three-dimensional area of the target object from the first three-dimensional area through a 3D model matching method and positioning the target object.
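As one possible reading of the "regular geometric shape" branch of claim 3, the sketch below fits a sphere to the first three-dimensional area by linear least squares with NumPy; a practical system would usually wrap such a fit in RANSAC to reject background points, and other regular shapes (planes, cylinders) would use analogous parameterizations:

```python
# Sketch of a least-squares sphere fit; RANSAC outer loop omitted for brevity.
import numpy as np

def fit_sphere(points):
    """points: (N, 3) array from the first 3D area; returns (center, radius)."""
    A = np.hstack([2.0 * points, np.ones((points.shape[0], 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center.dot(center))
    return center, radius

# Self-check on synthetic points lying on a 0.05 m sphere centered at (0.1, 0.2, 0.5).
rng = np.random.default_rng(0)
d = rng.normal(size=(500, 3))
d = 0.05 * d / np.linalg.norm(d, axis=1, keepdims=True) + np.array([0.1, 0.2, 0.5])
print(fit_sphere(d))   # approximately ((0.1, 0.2, 0.5), 0.05)
```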
4. The object identification and positioning method according to claim 3, further comprising, after the determining a second three-dimensional area of the target object and positioning the target object according to the geometric shape type and the first three-dimensional area:
and grabbing the target object according to the geometric shape type and the position of the target object.
5. The object identification and positioning method according to claim 4, wherein the grabbing the target object according to the geometric shape type and the position of the target object comprises:
and positioning a grabbing point and grabbing the target object according to the geometric shape type, the position of the target object and the type of a grabbing instrument.
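Claim 5 ties the grabbing point to the geometric shape type, the located position and the type of grabbing instrument; the toy dispatch below only illustrates that idea, and its rules (suction on the top face, two-finger grippers at the centroid of round objects) are assumptions, not rules stated in the patent:

```python
# Hypothetical grabbing-point dispatch based on shape type, position and gripper type.
def pick_grasp_point(shape_type, center, size, gripper):
    if gripper == "suction":
        # A suction cup typically attaches to the topmost surface of the object.
        return (center[0], center[1], center[2] + size[2] / 2.0)
    if shape_type in ("sphere", "cylinder"):
        # A two-finger gripper can close around the waist of a round object.
        return center
    # Irregular objects fall back to a grabbing point predefined on the 3D template.
    return None

print(pick_grasp_point("cylinder", (0.1, 0.2, 0.5), (0.06, 0.06, 0.12), "two_finger"))
```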
6. The object identification and positioning method according to claim 4, wherein the grabbing the target object according to the geometric shape type and the position of the target object comprises:
if the geometric shape type is an irregular geometric shape, matching the second three-dimensional area with a pre-stored 3D template to obtain 6D posture information of the target object, wherein the pre-stored 3D template comprises a pre-set grabbing point;
and grabbing the target object according to the 6D posture information and the matched pre-stored 3D template.
7. The object identification and positioning method according to claim 6, wherein before the grabbing the target object according to the geometric shape type and the position of the target object, the method further comprises:
establishing a 3D template of the target object by adopting a point pair feature (PPF) algorithm;
and determining a preset grabbing point in the 3D template and storing the 3D template to obtain the pre-stored 3D template.
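Claims 6 and 7 describe matching against a pre-stored 3D template built with the point pair feature (PPF) algorithm. The sketch below uses OpenCV's contrib surface-matching module on that assumption; the PLY file names, sampling parameters and preset grabbing point are placeholders, and opencv-contrib-python must be installed for cv2.ppf_match_3d to be available:

```python
# Illustrative PPF template matching sketch (claims 6-7), not the patented implementation.
import cv2
import numpy as np

# Claim 7: build the pre-stored 3D template with the PPF algorithm.
model = cv2.ppf_match_3d.loadPLYSimple("object_template.ply", 1)   # Nx6: xyz + normals
detector = cv2.ppf_match_3d_PPF3DDetector(0.05, 0.05)
detector.trainModel(model)
grasp_point_model = np.array([0.0, 0.0, 0.02, 1.0])   # preset grabbing point (homogeneous)

# Claim 6: match the second 3D area against the template to obtain 6D posture information.
scene = cv2.ppf_match_3d.loadPLYSimple("second_region.ply", 1)
results = detector.match(scene, 1.0 / 40.0, 0.05)
pose = np.asarray(results[0].pose)        # 4x4 transform: template frame -> scene frame
grasp_point_scene = pose @ grasp_point_model
print(grasp_point_scene[:3])              # grabbing point expressed in the scene frame
```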
8. An object identification and positioning device, comprising:
the first acquisition unit is used for acquiring a two-dimensional image and point cloud data of a region to be detected;
the recognition unit is used for detecting the two-dimensional image through a pre-trained deep learning model and recognizing a two-dimensional target area and a geometric shape type corresponding to a target object in the two-dimensional image;
the rough segmentation unit is used for mapping the two-dimensional target area to the point cloud data and determining a first three-dimensional area of the target object according to a mapping result;
and the positioning unit is used for determining a second three-dimensional area of the target object according to the geometric shape type and the first three-dimensional area and positioning the target object.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN201911380815.6A 2019-12-27 2019-12-27 Object identification positioning method and device and terminal equipment Active CN111178250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911380815.6A CN111178250B (en) 2019-12-27 2019-12-27 Object identification positioning method and device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911380815.6A CN111178250B (en) 2019-12-27 2019-12-27 Object identification positioning method and device and terminal equipment

Publications (2)

Publication Number Publication Date
CN111178250A true CN111178250A (en) 2020-05-19
CN111178250B CN111178250B (en) 2024-01-12

Family

ID=70652185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911380815.6A Active CN111178250B (en) 2019-12-27 2019-12-27 Object identification positioning method and device and terminal equipment

Country Status (1)

Country Link
CN (1) CN111178250B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030137510A1 (en) * 2000-05-27 2003-07-24 Robert Massen Method and assembly for the photogrammetric detection of the 3-d shape of an object
US20160163057A1 (en) * 2014-12-04 2016-06-09 Google Inc. Three-Dimensional Shape Capture Using Non-Collinear Display Illumination
CN106530276A (en) * 2016-10-13 2017-03-22 中科金睛视觉科技(北京)有限公司 Manipulator positioning method and system for grabbing of non-standard component
WO2019227705A1 (en) * 2018-05-28 2019-12-05 平安科技(深圳)有限公司 Image entry method, server and computer storage medium
CN109102547A (en) * 2018-07-20 2018-12-28 上海节卡机器人科技有限公司 Robot based on object identification deep learning model grabs position and orientation estimation method
CN109176521A (en) * 2018-09-19 2019-01-11 北京因时机器人科技有限公司 A kind of mechanical arm and its crawl control method and system
CN110174056A (en) * 2019-06-18 2019-08-27 上海商米科技集团股份有限公司 A kind of object volume measurement method, device and mobile terminal
CN110472534A (en) * 2019-07-31 2019-11-19 厦门理工学院 3D object detection method, device, equipment and storage medium based on RGB-D data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PEI XU et al.: "Automatic and accurate full-view registration method for 3D scanning system", Optical Measurement Systems for Industrial Inspection XI, pages 1-10 *
PEI XU: "Preliminary study on a robotic-arm depalletizing device based on binocular vision positioning" (in Chinese), China Master's Theses Full-text Database, pages 1-76 *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111709923B (en) * 2020-06-10 2023-08-04 中国第一汽车股份有限公司 Three-dimensional object detection method, three-dimensional object detection device, computer equipment and storage medium
CN111709923A (en) * 2020-06-10 2020-09-25 中国第一汽车股份有限公司 Three-dimensional object detection method and device, computer equipment and storage medium
CN113290570A (en) * 2020-07-20 2021-08-24 阿里巴巴集团控股有限公司 Clamping device, data center operation and maintenance robot and assembly robot
CN113290570B (en) * 2020-07-20 2024-04-19 阿里巴巴集团控股有限公司 Clamping device, data center operation and maintenance robot and assembly robot
CN114078331A (en) * 2020-08-19 2022-02-22 北京万集科技股份有限公司 Overspeed detection method, overspeed detection device, visual sensor and storage medium
CN114078331B (en) * 2020-08-19 2023-02-17 北京万集科技股份有限公司 Overspeed detection method, overspeed detection device, visual sensor and storage medium
CN114252024A (en) * 2020-09-20 2022-03-29 浙江四点灵机器人股份有限公司 Single-measurement-module multi-working-mode workpiece three-dimensional measurement device and method
CN114252024B (en) * 2020-09-20 2023-09-01 浙江四点灵机器人股份有限公司 Single-measurement-module multi-working-mode workpiece three-dimensional measurement device and method
CN112215861A (en) * 2020-09-27 2021-01-12 深圳市优必选科技股份有限公司 Football detection method and device, computer readable storage medium and robot
CN112541428B (en) * 2020-12-11 2024-01-16 深圳市优必选科技股份有限公司 Football recognition method, football recognition device and robot
CN112541428A (en) * 2020-12-11 2021-03-23 深圳市优必选科技股份有限公司 Football recognition method and device and robot
CN112529948A (en) * 2020-12-25 2021-03-19 南京林业大学 Mature pomegranate positioning method based on Mask R-CNN and 3-dimensional sphere fitting
CN112669384A (en) * 2020-12-31 2021-04-16 苏州江奥光电科技有限公司 Three-dimensional positioning method, device and system combining industrial camera and depth camera
CN112802105A (en) * 2021-02-05 2021-05-14 梅卡曼德(北京)机器人科技有限公司 Object grabbing method and device
CN112802107A (en) * 2021-02-05 2021-05-14 梅卡曼德(北京)机器人科技有限公司 Robot-based control method and device for clamp group
CN113111712A (en) * 2021-03-11 2021-07-13 稳健医疗用品股份有限公司 AI identification positioning method, system and device for bagged product
CN112883494B (en) * 2021-03-17 2022-07-19 清华大学 Bicycle three-dimensional model reconstruction method and device
CN112883494A (en) * 2021-03-17 2021-06-01 清华大学 Bicycle three-dimensional model reconstruction method and device
CN113052839A (en) * 2021-04-28 2021-06-29 闫丹凤 Map detection method and device
CN113808096A (en) * 2021-09-14 2021-12-17 成都主导软件技术有限公司 Non-contact bolt looseness detection method and system
CN113808096B (en) * 2021-09-14 2024-01-30 成都主导软件技术有限公司 Non-contact bolt loosening detection method and system
CN113516660B (en) * 2021-09-15 2021-12-07 江苏中车数字科技有限公司 Visual positioning and defect detection method and device suitable for train
CN113516660A (en) * 2021-09-15 2021-10-19 江苏中车数字科技有限公司 Visual positioning and defect detection method and device suitable for train
CN114354618A (en) * 2021-12-16 2022-04-15 浙江大华技术股份有限公司 Method and device for detecting welding seam
CN114612536A (en) * 2022-03-22 2022-06-10 北京诺亦腾科技有限公司 Method, device and equipment for identifying three-dimensional model of object and readable storage medium
CN116309442B (en) * 2023-03-13 2023-10-24 北京百度网讯科技有限公司 Method for determining picking information and method for picking target object
CN116309442A (en) * 2023-03-13 2023-06-23 北京百度网讯科技有限公司 Method for determining picking information and method for picking target object
CN116985141B (en) * 2023-09-22 2023-11-24 深圳市协和传动器材有限公司 Industrial robot intelligent control method and system based on deep learning
CN116985141A (en) * 2023-09-22 2023-11-03 深圳市协和传动器材有限公司 Industrial robot intelligent control method and system based on deep learning

Also Published As

Publication number Publication date
CN111178250B (en) 2024-01-12

Similar Documents

Publication Publication Date Title
CN111178250B (en) Object identification positioning method and device and terminal equipment
CN108604301B (en) Keypoint-based point pair features for scalable automatic global registration for large RGB-D scans
JP2016161569A (en) Method and system for obtaining 3d pose of object and 3d location of landmark point of object
CN109859305B (en) Three-dimensional face modeling and recognizing method and device based on multi-angle two-dimensional face
EP2720171B1 (en) Recognition and pose determination of 3D objects in multimodal scenes
WO2018177337A1 (en) Method and apparatus for determining three-dimensional hand data, and electronic device
EP2385483B1 (en) Recognition and pose determination of 3D objects in 3D scenes using geometric point pair descriptors and the generalized Hough Transform
CN108052942B (en) Visual image recognition method for aircraft flight attitude
JP2019032830A (en) Systems and methods for detecting grasp poses for handling target objects
CN111368852A (en) Article identification and pre-sorting system and method based on deep learning and robot
CN107077735A (en) Three dimensional object is recognized
CN108573471B (en) Image processing apparatus, image processing method, and recording medium
CN111191582B (en) Three-dimensional target detection method, detection device, terminal device and computer readable storage medium
CN111860060A (en) Target detection method and device, terminal equipment and computer readable storage medium
CN112633084A (en) Face frame determination method and device, terminal equipment and storage medium
JP2016014954A (en) Method for detecting finger shape, program thereof, storage medium of program thereof, and system for detecting finger shape
CN110751097A (en) Semi-supervised three-dimensional point cloud gesture key point detection method
CN110832542A (en) Recognition processing device, recognition processing method, and program
CN114677435A (en) Point cloud panoramic fusion element extraction method and system
CN116134482A (en) Method and device for recognizing surface features in three-dimensional images
CN110942092B (en) Graphic image recognition method and recognition system
US11816857B2 (en) Methods and apparatus for generating point cloud histograms
CN110673607A (en) Feature point extraction method and device in dynamic scene and terminal equipment
CN114037987A (en) Intelligent identification method, device, medium and equipment for scrap steel
CN110458177B (en) Method for acquiring image depth information, image processing device and storage medium

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant