CN114952809A

CN114952809A - Workpiece identification and pose detection method and system and grabbing control method of mechanical arm

Info

Publication number: CN114952809A
Application number: CN202210732860.9A
Authority: CN
Inventors: 徐刚; 赵有港; 崔玥; 周翔; 许允款; 曾晶; 肖江剑
Original assignee: Ningbo Institute of Material Technology and Engineering of CAS
Current assignee: Ningbo Institute of Material Technology and Engineering of CAS
Priority date: 2022-06-24
Filing date: 2022-06-24
Publication date: 2022-08-30
Anticipated expiration: 2042-06-24
Also published as: CN114952809B

Abstract

The invention discloses a method and a system for workpiece identification and pose detection and a grabbing control method of a mechanical arm. The workpiece identification and pose detection method comprises the following steps: acquiring a 2D image and a 3D point cloud image in a scene to be identified; identifying a target workpiece in the scene to be identified based on the 2D image, and performing example segmentation based on a mapping relation to obtain a point cloud area corresponding to the target workpiece; and based on a deep learning algorithm, carrying out pose detection in the point cloud area to acquire pose information of the target workpiece. The workpiece identification and pose detection method provided by the invention avoids the problems of cross-modal data feature extraction and matching in the capture scene of small workpieces scattered and stacked, avoids excessively complex data processing calculation, and provides an optimized solution for the application scene of workpiece stack identification and capture in the direction of effectively improving the identification efficiency and the capture efficiency by combining the 2D image and the 3D point cloud image.

Description

Workpiece identification and pose detection method and system and mechanical arm grabbing control method

Technical Field

The invention relates to the technical field of image recognition and mechanical control, in particular to a method and a system for workpiece recognition and pose detection and a grabbing control method of a mechanical arm.

Background

The robot grabbing technology based on two-dimensional/three-dimensional vision is widely applied to simple scenes such as logistics express delivery, warehouse carrying and stacking, and the vision-guided robot has the enhanced perception capability facing complex environments. In an industrial grabbing scene, a two-dimensional image can provide dense and rich texture information, and the position (two-dimensional coordinate) of a grabbed workpiece is obtained through image processing and recognition, but depth information cannot be obtained; the three-dimensional image can provide distance information in a captured scene, but abundant detail information cannot be obtained, so that the capturing precision is reduced. The two types of data have better complementarity, and the data of the two modes are fused, so that the workpiece grabbing scene can be more comprehensively perceived. In recent years, as researches on a workpiece 6D attitude estimation algorithm are increasing, the computing capability of equipment is increasing, and a robot gripping system has made creative breakthroughs in related fields of disordered and scattered stacking of workpieces, disordered workpiece assembly, flexible gripping and the like.

The recognition and pose detection of the target object are key prerequisites of a robot grabbing task. Since the early days of computer vision, object 6D pose detection and estimation, which describes object pose information by a translation vector te R3 and a rotation matrix R e SO (3) with respect to a fixed coordinate system of a given reference frame, is a long-standing challenge and an open field of research.

Due to the real world objectsThe method comprises the following steps of firstly obtaining the coordinates (x, y and z) of the centroid position of a target object in a camera coordinate system through various algorithms, then matching a model to the centroid position, and obtaining the rotation pose (R) of the current target object in the camera coordinate system _x ，R _y ，R _z ) The position of the target object under the base coordinates of the mechanical arm is obtained based on hand-eye calibration matrix conversion, and finally the mechanical arm is controlled to move to carry out grabbing operation, so that certain challenges are achieved. From the technical point of view, how to skillfully fuse the three-dimensional point cloud data and the two-dimensional image data belonging to different modals, analyze reliable geometric characteristics according to the random stacking condition of workpieces in a captured scene, finally identify the workpieces which can be captured in the scene, and acquire pose information is a research direction for domestic and foreign science and technology workers.

The existing workpiece identification and pose detection methods can be divided into the following two types according to different input image data types: based on 2D visual data (with RGB or RGBD data as input), based on 3D visual data (with point cloud data as input). The simple identification method based on 2D data is lack of scene depth information, so that only plane object grabbing can be performed frequently, and stacked scenes cannot be processed, so that the workpiece identification and pose detection method based on 3D visual data gradually becomes the mainstream. Detection methods based on 3D visual data can be roughly classified into the following two categories according to differences in the implementation principle: template matching method, deep learning method. The first type of template matching method is generally based on a PPF (point Pair future) algorithm (e.g., dry B, Ulrich M, Navab n.et al. model globalley, match localization: Efficient and debug 3D object recognition [ C ]. IEEE computer science conference on component vision and pattern recognition. piscataway: IEEE Press 2010: 998-1005.), which is a description method based on point-to-point features, extracts point-to-point features and trains a model of an object according to 3D model data of the object, detects and matches 3D feature points in the object scene based on PPF feature descriptors, finds an initial estimate of the pose and iterates the vote, and finally performs an ICP algorithm reference operation on the result to obtain a more accurate pose result and outputs the result. The template matching method has the biggest defect that mismatching can occur, however, when the workpiece is too simple and the characteristics are not obvious, the method can often obtain wrong recognition results. The second type of deep learning method is to generate a simulation data set in a simulation scene, then learn data characteristics in a network, and finally obtain a pose detection result in a test data set. For example, a novel Point-by-Point posture Regression Network PPR-Net (Point-by-Point Regression Network) is proposed in the literature (Dong Z, Liu S, Zhou T.et al. PPR-Net: Point-by-Point position Regression Network for instance segmentation and 6d position estimation in bin-packing strategies [ C ]. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway: IEEE Press, 2019: 1773-. The method is a winner in IROS2019 'bin-pick attitude estimation challenge', and the method takes PointNet + + as a backbone network to carry out 6D attitude estimation on each point in point cloud of an object example to which the method belongs, and then averages the predicted attitude of each recognition in space based on a clustering method to obtain a final attitude hypothesis. However, this method has disadvantages in that: the processing efficiency of the workpiece scene global 3D point cloud image is low, and the analysis and detection time is long.

Therefore, how to provide a high-efficiency target object identification and pose detection method suitable for a scene blocked by small workpiece stacking is an urgent problem to be solved.

Disclosure of Invention

Aiming at the defects of the prior art, the invention aims to provide a workpiece identification and pose detection method, a workpiece identification and pose detection system and a grabbing control method of a mechanical arm.

In order to achieve the purpose, the technical scheme adopted by the invention comprises the following steps:

in a first aspect, the present invention provides a workpiece identifying and pose detecting method, including:

s1, collecting a 2D image and a 3D point cloud image in a scene to be identified;

s2, identifying a target workpiece in the scene to be identified based on the 2D image, and performing example segmentation on an area where the target workpiece is located in the 3D point cloud image based on a mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud area corresponding to the target workpiece;

and S3, detecting the position and pose in the point cloud area based on a deep learning algorithm, and acquiring the position and pose information of the target workpiece.

In a second aspect, the present invention also provides a workpiece recognition and pose detection system, including:

the image acquisition module is used for acquiring a 2D image and a 3D point cloud image in a scene to be identified;

the area acquisition module is used for identifying a target workpiece in the scene to be identified based on the 2D image, and performing example segmentation on an area where the target workpiece is located in the 3D point cloud image based on a mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud area corresponding to the target workpiece;

and the pose acquisition module is used for detecting the pose in the point cloud area based on a deep learning algorithm and acquiring the pose information of the target workpiece.

In a third aspect, the present invention further provides a method for controlling grabbing of a robot arm, including:

acquiring a target workpiece and pose information thereof in a scene to be identified based on the workpiece identification and pose detection method;

and selecting the target workpiece to be grabbed, and controlling a mechanical arm to grab based on the pose information.

Based on the technical scheme, compared with the prior art, the invention has the beneficial effects that at least:

the workpiece identification and pose detection method provided by the invention avoids the problems of cross-modal data feature extraction and matching in the capture scene of small workpieces scattered and stacked, avoids excessively complex data processing calculation, and provides an optimized solution for the application scene of workpiece stack identification and capture in the direction of effectively improving the identification efficiency and the capture efficiency by combining the 2D image and the 3D point cloud image.

The foregoing description is only an overview of the technical solutions of the present invention, and in order to enable those skilled in the art to more clearly understand the technical solutions of the present invention and to implement them according to the content of the description, the following description is made with reference to the preferred embodiments of the present invention and the detailed drawings.

Drawings

Fig. 1 is a schematic flow chart of a workpiece recognition and pose detection method according to an exemplary embodiment of the present invention;

fig. 2 is a partial schematic flow chart of a workpiece identifying and pose detecting method according to an exemplary embodiment of the present invention;

fig. 3 is a partial schematic flow chart of a workpiece identifying and pose detecting method according to an exemplary embodiment of the present invention;

fig. 4 is a partial schematic flow chart of a workpiece recognition and pose detection method according to an exemplary embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a workpiece recognition and pose detection system provided in an exemplary embodiment of the present invention

FIG. 6 is a schematic structural diagram of a simulation data set generation system provided in an exemplary embodiment of the present invention

FIG. 7 is a schematic structural diagram of a 2D/3D deep learning network provided in an exemplary embodiment of the present invention;

fig. 8 is an exemplary diagram of the recognition and detection effects of the workpiece recognition and pose detection method according to an exemplary embodiment of the present invention.

Detailed Description

In view of the deficiencies in the prior art, the inventors of the present invention have made extensive studies and extensive practices to provide technical solutions of the present invention. The technical solution, its implementation and principles, etc. will be further explained as follows.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein, and thus the scope of the present invention is not limited by the specific embodiments disclosed below.

Referring to fig. 1 to 4, an embodiment of the present invention provides a workpiece identifying and pose detecting method, which specifically includes the following steps S1 to S3:

and S1, acquiring a 2D image and a 3D point cloud image in the scene to be identified.

Specifically, in this embodiment, the adopted image information includes a 2D image and a 3D image, and specifically, a 2D visible light camera is adopted to photograph the workpiece in the stacked scene to obtain the 2D image, and a 3D visible light camera is adopted to photograph the workpiece in the stacked scene to obtain the 3D image.

Thus, in some embodiments, step Sl may specifically comprise: and acquiring the 2D image in the scene to be recognized by using a 2D camera, and acquiring the 3D image information in the scene to be recognized by using a 3D camera.

In some embodiments, the scene to be identified may preferably include a stacked scene of artifacts, and further preferably an randomly stacked scene of artifacts. The method is particularly suitable for application scenes of small workpieces scattered stacking, such as scenes that some small workpieces are scattered randomly on a tray or in a container, for example, during industrial production, common semi-finished products are stacked randomly on the tray, or during material transfer, workpieces such as a wrench are scattered and stacked in a material frame, and the like.

S2, identifying a target workpiece in the scene to be identified based on the 2D image, and performing example segmentation on the area where the target workpiece is located in the 3D point cloud image based on the mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud area corresponding to the target workpiece. The method comprises the steps of identifying a target workpiece in a scene based on a 2D image, and performing example segmentation on an identified target workpiece area by utilizing a mapping relation between the 2D image and a 3D point cloud image.

Specifically, as shown in fig. 2, the step S2 may include steps S21-S24:

and S21, collecting a plurality of 2D images, marking the workpiece outline closed area, and making into a training data set.

Specifically, in this embodiment, the contour data of the workpiece may be acquired from the 2D image, the contours of the workpiece facing upward in the same direction are labeled as the same type, the position of the labeled workpiece region in the global image is recorded, a plurality of images are collected and labeled, and the training data set is manufactured. The labeling can be performed manually, or can be performed by methods such as manual labeling and machine labeling in machine confrontation learning.

And S22, building a deep learning 2D image target segmentation network model, and training the network model based on the training data set.

Specifically, in this embodiment, a 2D image target segmentation network model based on a Mask R-CNN convolutional neural network may be built, and the network model may be trained and learned based on the manufactured training data set.

And S23, importing the 2D image acquired in real time into the trained network model for recognition, and acquiring the region position of the recognized workpiece.

Specifically, in this embodiment, the learned 2D image target segmentation network model may be used to introduce the 2D image acquired in real time into the network model, and Mask R-CNN may be used to implement pixel-level segmentation on the object identified in the 2D image, so as to obtain the region position in the segmented 2D image.

S24, mapping the identified 2D workpiece area position to a 3D point cloud scene area, and performing example segmentation on a target workpiece area in the 3D point cloud scene.

Specifically, in this embodiment, point cloud example segmentation may be performed on the same position region in a 3D point cloud data scene based on the target workpiece region after the example segmentation by using a mapping relationship between the 2D image and the 3D point cloud to obtain a point cloud set for identifying the workpiece.

Therefore, in some embodiments, step S2 specifically includes the following steps:

and importing the 2D image into a target segmentation model for identification, and acquiring the region position of the target workpiece.

And mapping the region position to a corresponding region in the 3D point cloud image, and performing example segmentation on the region where the target workpiece is located in the 3D point cloud image.

In some embodiments, the training method of the target segmentation model specifically includes the following steps:

providing a 2D training data set, the 2D training data set comprising a plurality of 2D images for training and corresponding labeling information thereof, the labeling information being indicative of at least a contour enclosing region of a workpiece in the 2D images.

And constructing a target segmentation initial model, and training the target segmentation initial model based on the 2D training data set to obtain the target segmentation model.

Specifically, in this embodiment, the point cloud set of the identified workpiece can be obtained by performing 2D image segmentation and mapping to a 3D point cloud, and the pose of the workpiece is detected based on the built and trained PPR-Net deep learning network, so as to obtain the pose information of the workpiece.

With continued reference to fig. 3, step S3 may specifically include the following steps S31-S34:

and S31, constructing a deep learning training simulation data set generation system based on the V-REP.

Specifically, in this embodiment, a deep learning training simulation data set generation system may be built based on V-REP simulation software, where the deep learning training simulation data set generation system includes building a Kinect simulation visual sensor, importing a workpiece 3D model, importing a material frame 3D model, and compiling a workpiece drop and image data acquisition program, and the built simulation system is as shown in fig. 6.

And S32, creating and generating a simulation 3D training data set.

And S33, building a deep learning 3D pose detection neural network model, and training the network model based on the training data set.

Specifically, in this embodiment, a 3D pose detection neural network model based on a PPR-Net deep learning network may be built, and based on the created simulation training data set, the network model is trained and learned.

And S34, introducing the 3D point cloud image obtained after the example segmentation into the trained network model for workpiece pose detection, and acquiring workpiece pose information.

Specifically, as shown in fig. 4, in the present embodiment, the step S32 may include the following steps S321 to S326:

s321, it is set that n workpieces exist in the scene.

In a preferred embodiment, the number of workpieces may be, for example, n-27.

And S322, based on the field randomization concept, dropping the i workpieces from a certain position randomly in the working area (setting the initial integer i to be 0), and giving different color information to different workpieces.

And S323, acquiring and saving a depth image and an rgb image in the scene based on the simulation vision sensor.

In a preferred embodiment, the simulated visual sensor may be a Kinect depth camera.

And S324, recording and storing the falling pose information of each workpiece acquired in the V-REP.

And S325, carrying out visualization degree analysis on each workpiece based on the acquired rgb image information, and recording visualization degree data.

In a preferred embodiment, the visualization degree analyzing method may be: introducing the visibility degree v epsilon [0, 1 ] of the workpiece]This parameter reflects the degree of occlusion of the predicted object, and is completely invisible when v is 0, completely free of occlusion when v is 1, and so on. The visualization degree of a certain workpiece in a scene is as follows: v is N/N _max 。

Wherein N is the area value of the color area of a certain example workpiece, N _max The color area is the largest value in the global workpiece.

S326, if the integer i is less than n, repeating the steps from S322 to S326; and if the integer i is equal to n, stopping dropping the workpiece, and making a generated data set label file.

More specifically, in this embodiment, the point cloud set of the identified workpiece may be obtained by segmenting and mapping the 2D image to the 3D point cloud, the pose of the workpiece may be detected based on the built and trained PPR-Net deep learning network, so as to obtain the pose information of the workpiece, a schematic diagram of a 2D/3D deep learning network structure based on the Mask R-CNN convolutional neural network and the PPR-Net deep learning network is shown in fig. 7, and a schematic diagram of a segmentation effect of a 2D image detection example and a 3D point cloud pose detection effect is shown in fig. 8.

As shown in FIG. 5, a deep learning training simulation data set generation system disclosed in the present invention may include:

and the image acquisition device is used for acquiring the image information of the workpieces in the stacking scene.

The image acquisition device comprises a 2D image acquisition unit and a 3D image acquisition unit, wherein the 2D image acquisition unit is used for photographing a workpiece in a stacked scene by adopting a 2D camera to obtain a 2D image. The 3D image acquisition unit is used for photographing the workpieces in the stacked scene by adopting the 3D camera to obtain a 3D image.

And the area limiting device is used for physically limiting the falling range of the workpiece according to the size of the visual field of the image acquisition device.

The region limiting device mainly comprises a material frame setting unit and a camera setting unit, the material frame setting unit is used for drawing and leading in a material frame 3D model, the proper position is adjusted, the workpiece falling range is conveniently subjected to physical limitation, the camera setting unit is used for adjusting internal parameters and external parameters of a camera, the camera parameters are guaranteed to be consistent with those of a real scene, and therefore an effective data set is generated.

And the workpiece pose acquisition device is used for recording the workpiece pose information at that time after the workpiece randomly falls off.

And the workpiece visualization degree analysis device is used for calculating and analyzing the visualization degree of each workpiece in the simulation scene.

The workpiece visualization degree analyzing device mainly comprises a color pixel collecting unit and a visualization degree calculating unit, wherein the color pixel collecting unit is used for counting the area value of each segmented example color pixel in a scene, and the visualization degree calculating unit performs visualization degree calculation by utilizing the statistical value provided by the color pixel collecting unit and outputs a visualization degree value.

And the data set label integrating device is used for integrating information such as image information, workpiece pose information, visualization degree and the like to manufacture a data set label.

Thus, in some embodiments, step S3 may specifically include the following steps:

and guiding the point cloud area into a pose detection model to detect the pose of the target workpiece, and acquiring the pose information of the target workpiece.

In some embodiments, the method for training the pose detection model may include the steps of:

and providing a 3D training data set, wherein the 3D training data set at least comprises a 3D training image and a workpiece pose label and a visualization degree label corresponding to the 3D training image.

And constructing a pose detection initial model, training the pose detection initial model based on the 3D training data set, and obtaining the pose detection model.

In some embodiments, the 3D training data set is generated by a simulation data set generation system simulation.

In some embodiments, the simulation generation may specifically include the following steps:

and constructing a simulation scene, and setting that n virtual workpieces exist in the simulation scene.

Based on a domain randomization method, i virtual artifacts are randomly dropped from selected positions within a working area of the simulation scenario, and different color information is given to the different virtual artifacts, wherein i is incrementally iterated from zero. For example, the values are sequentially increased from 0 to + 1.

Depth images and rgb images in the scene are acquired and saved based on the emulated vision sensors.

And recording and storing the falling pose information of each virtual workpiece in the simulation scene as the workpiece pose tag.

And analyzing the visualization degree of each virtual workpiece based on the acquired depth image and/or rgb image, and recording visualization degree data as the visualization degree label.

And when iteration is carried out until the integer i is not less than n, stopping dropping the virtual workpiece, and generating the 3D training data set based on the workpiece pose label and the visualization degree label.

In some embodiments, the simulation data set generation system may include:

and the image acquisition device is used for acquiring the 3D training image of the virtual workpiece in the simulation scene.

And the area limiting device is used for limiting the falling range of the virtual workpiece according to the size of the visual field acquired by the 3D training image.

And the workpiece pose acquisition device is used for recording pose information of the target workpiece at that time after the virtual workpiece randomly falls off.

And the workpiece visualization degree analysis device is used for calculating and analyzing the visualization degree of each virtual workpiece in the simulation scene.

And the data set label integration device is used for integrating the 3D training image, the workpiece pose label and the visualization degree label to generate the 3D training data set.

Based on the above method, another embodiment of the present invention further provides a workpiece identifying and pose detecting system, which includes:

and the image acquisition module is used for acquiring the 2D image and the 3D point cloud image in the scene to be identified.

And the area acquisition module is used for identifying a target workpiece in the scene to be identified based on the 2D image, and carrying out example segmentation on an area where the target workpiece is located in the 3D point cloud image based on the mapping relation between the 2D image and the 3D point cloud image to obtain a point cloud area corresponding to the target workpiece.

Similarly, an embodiment of the present invention further provides an electronic device that may be applied to the system, and the electronic device includes a processor and a memory, where the memory stores a computer program, and the computer program is executed to perform the steps of the workpiece recognition and pose detection method.

Meanwhile, the embodiment of the invention also provides a readable storage medium, wherein a computer program is stored, and the computer program is executed to execute the steps of the workpiece identification and pose detection method.

The above embodiment provides a workpiece recognition and pose detection method and system, and a simulation data set generation system to which the method and system are applied, as a further application of the method and system, another embodiment of the present invention further provides a robot grasping control method, including the following steps:

the target workpiece and the pose information thereof in the scene to be identified are acquired based on the workpiece identification and pose detection method in any one of the above embodiments.

Namely: and acquiring the pose information of the workpiece, and controlling the mechanical arm to perform grabbing operation.

It should be noted that, the main technical idea of the present invention is how to efficiently and accurately acquire pose information of a workpiece, and how to plan a path and/or an action of a mechanical arm according to the pose information, which is not the key point of the present invention.

According to the workpiece identification and pose detection method and the deep learning training simulation data set generation system applied by the method, the problems of cross-modal data feature extraction and matching are avoided in a capture scene in which small workpieces are randomly stacked, and meanwhile, too complex data processing calculation is avoided. Through combining 2D image and 3D point cloud data, for the application scene that the work piece piled up discernment and snatched, in effectively promoting discernment efficiency and promoting this orientation of snatching efficiency, provided the solution of optimization, simultaneously, avoided conventional sample manual marking, automatic preparation and generation training data set have greatly promoted work efficiency.

While the invention has been described with reference to illustrative embodiments, it will be understood by those skilled in the art that various other changes, omissions and/or additions may be made and substantial equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from its scope. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims.

It should be understood that the above-mentioned embodiments are merely illustrative of the technical concepts and features of the present invention, which are intended to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and therefore, the protection scope of the present invention is not limited thereby. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims

1. A workpiece recognition and pose detection method is characterized by comprising the following steps:

2. The workpiece recognition and pose detection method according to claim 1, wherein the step S1 specifically comprises:

acquiring the 2D image in the scene to be recognized by using a 2D camera, and acquiring the 3D image information in the scene to be recognized by using a 3D camera;

preferably, the scene to be identified comprises a scene of stacking workpieces, and further preferably a scene of stacking workpieces out of order.

3. The workpiece recognition and pose detection method according to claim 1, wherein the step S2 specifically comprises:

importing the 2D image into a target segmentation model for recognition, and acquiring the region position of the target workpiece;

and mapping the region position to a corresponding region in the 3D point cloud image, and performing example segmentation on the region of the target workpiece in the 3D point cloud image.

4. The workpiece recognition and pose detection method according to claim 1, wherein the training method of the object segmentation model comprises:

providing a 2D training data set, wherein the 2D training data set comprises a plurality of 2D images for training and corresponding marking information thereof, and the marking information at least indicates a contour closed area of a workpiece in the 2D images;

5. The workpiece recognition and pose detection method according to claim 1, wherein step S3 specifically comprises:

6. The workpiece recognition and pose detection method according to claim 5, wherein the pose detection model training method comprises:

providing a 3D training data set, wherein the 3D training data set at least comprises a 3D training image and a workpiece pose label and a visualization degree label corresponding to the 3D training image;

7. The workpiece recognition and pose detection method of claim 6, wherein the 3D training data set is generated by a simulation data set generation system simulation;

preferably, the simulation generation specifically includes:

constructing a simulation scene, and setting n virtual workpieces in the simulation scene;

based on a domain randomization method, i virtual workpieces are randomly dropped from selected positions in a working area of the simulation scene, and different color information is given to different virtual workpieces;

acquiring and saving a depth image and an rgb image in a scene based on a simulation vision sensor;

recording and storing the falling pose information of each virtual workpiece in the simulation scene as a workpiece pose tag;

performing visualization degree analysis on each virtual workpiece based on the depth image and/or the rgb image, and recording visualization degree data as the visualization degree label;

and when iteration is carried out until the integer i is not less than n, stopping the virtual workpiece from falling off, and generating the 3D training data set based on the workpiece pose tag and the visualization degree tag.

8. The workpiece recognition and pose detection method according to claim 7, wherein the simulation data set generation system comprises:

the image acquisition device is used for acquiring a 3D training image of a virtual workpiece in a simulation scene;

the area limiting device is used for acquiring the size of a visual field according to the 3D training image and limiting the falling range of the virtual workpiece;

the workpiece pose acquisition device is used for recording pose information of the target workpiece at that time after the virtual workpiece randomly falls off;

the workpiece visualization degree analysis device is used for calculating and analyzing the visualization degree of each virtual workpiece in the simulation scene;

9. A workpiece recognition and pose detection system characterized by comprising:

10. A method for controlling the grasping of a robot arm, comprising:

acquiring a target workpiece and pose information thereof in a scene to be identified based on the workpiece identification and pose detection method of any one of claims 1-8;