CN111360862A - Method for generating optimal grabbing pose based on convolutional neural network - Google Patents

Method for generating optimal grabbing pose based on convolutional neural network

Info

Publication number
CN111360862A
Authority
CN
China
Prior art keywords
neural network
grabbing
network model
pose
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010134999.4A
Other languages
Chinese (zh)
Other versions
CN111360862B (en)
Inventor
庞剑坤
魏武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202010134999.4A
Publication of CN111360862A
Application granted
Publication of CN111360862B
Active legal-status Current
Anticipated expiration legal-status

Links

Images

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J15/00 Gripping heads and other end effectors
    • B25J15/08 Gripping heads and other end effectors having finger members
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J19/00 Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
    • B25J19/02 Sensing devices
    • B25J19/021 Optical sensing devices
    • B25J19/023 Optical sensing devices including video camera means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Image Analysis (AREA)
  • Manipulator (AREA)

Abstract

The invention discloses a convolutional neural network-based method for generating an optimal grabbing pose, which comprises the following steps: S1, setting parameters for representing the grabbing quality in the grabbing process; S2, constructing a convolutional neural network model; S3, training the neural network model on the Cornell Grasping dataset; and S4, inputting the object depth map acquired by the camera into the trained neural network model and calculating the grabbing parameters, which are used to drive the mechanical arm to grab. The algorithm for generating the optimal grabbing pose based on the convolutional neural network model can quickly obtain the optimal grabbing pose of an object from only the depth information of the object, is simple, and can be widely popularized in fields such as mechanical arm visual grabbing and dynamic tracking.

Description

Method for generating optimal grabbing pose based on convolutional neural network
Technical Field
The invention relates to the field of mechanical arm visual grabbing, in particular to a method for generating an optimal grabbing pose based on a convolutional neural network.
Background
In recent years, with the rapid development of computer vision, combining a mechanical arm with vision to give it greater environment-perception capability has gradually become a research hotspot. If a mechanical arm is to grab an object, it must first obtain the specific position of the object through a camera (sensor) and then find the optimal grabbing pose for that object through an internal evaluation algorithm. Two processes are involved: confirming the type of the object, and screening out the optimal grabbing pose according to the state of the object. If the computer (algorithm) has not seen such an object before, it is much harder to generate the optimal grabbing pose for the unseen object. To address this problem, the University of California, Berkeley proposed a convolutional neural network algorithm in "Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics". The algorithm achieves a high grabbing success rate on general objects; unfortunately, it contains too many network parameters (on the order of millions), its inference speed is relatively low, and it is difficult to reproduce on ordinary machines, so it faces challenges in practical popularization.
Disclosure of Invention
The invention provides a convolutional neural network-based method for generating an optimal grabbing pose. It mainly addresses the lack, in current grabbing algorithms, of a way to quickly generate the optimal grabbing pose for unseen objects, and realizes a series of work on dataset processing, network training and model optimization. The method can quickly obtain the optimal grabbing pose of an object from only the object's depth information; the model is simple and its training parameters are far fewer than those of other networks. It has excellent generalization capability for recognizing and generating grabbing poses for objects in daily life, including unseen objects, and can be widely popularized in fields such as mechanical arm visual grabbing and dynamic tracking, while the success rate of recognizing and generating the grabbing pose exceeds 90%.
The invention is realized by at least one of the following technical schemes.
A convolutional neural network-based method for generating an optimal grabbing pose comprises the following steps:
S1, setting parameters for representing the grabbing quality in the grabbing process;
S2, constructing a convolutional neural network model;
S3, training the neural network model on the Cornell Grasping dataset;
and S4, inputting the object depth map acquired by the camera into the trained neural network model and calculating the grabbing pose parameters, wherein the grabbing pose parameters are used for driving the mechanical arm to grab.
Further, the parameters in step S1 include G, Q, Φ and W, where G represents the set of grabbing parameters in each grab, with one value at each pixel:
G = (Φ, W, Q) ∈ ℝ^(3×H×W)
for a given 2.5D depth map I ∈ ℝ^(H×W), where H represents the height of the depth map, W represents the width of the depth map, the H and W parameters are obtained from the camera, and ℝ denotes the real-valued space of the corresponding dimension;
Q represents the quality of each grab and is a scalar in (0, 1); the closer Q is to 1, the higher the grabbing quality;
Φ denotes the rotation angle required for the jaw to reach the ideal position during each grab, with Φ ∈ [-π/2, π/2]; the ideal position is the position of the optimal grabbing rectangle set in the dataset, and the rotation angle refers to the rotation of the grabbing rectangle relative to the horizontal;
W indicates the width to which the jaws need to be opened during grabbing to ensure the object is completely grasped.
Further, the Cornell Grasping dataset of step S3 provides 1035 pictures of 280 different objects, each with an RGB image, depth information, and annotations of the best grabbing rectangle used to grab the object, including the size of the rectangle and the three-dimensional position of its center point.
Further, the structure of the neural network model comprises the following network layers: the first layer contains 9 × 9 convolution kernels and 32 filters with a stride of 3; the second layer contains 5 × 5 convolution kernels and 16 filters with a stride of 2; the third layer contains 3 × 3 convolution kernels and 8 filters with a stride of 2; the fourth, fifth and sixth layers are deconvolution layers whose purpose is to keep the resolution of the input and output consistent: the fourth layer is a deconvolution layer containing 3 × 3 convolution kernels and 8 filters with a stride of 2, the fifth layer is a deconvolution layer containing 3 × 3 convolution kernels and 16 filters with a stride of 2, and the sixth layer is a deconvolution layer containing 9 × 9 convolution kernels and 32 filters with a stride of 3.
Further, the loss function of the neural network model adopts the L2 loss, which serves as a measure for evaluating the performance of the network; the neural network is used to approximate the complex mapping M: I → G, and the calculation of the parameters of the neural network model includes:
M(I) = (Q, Φ, W)
G = (Φ, W, Q)
M_θ(I) = (Q_θ, Φ_θ, W_θ) ≈ M(I)
where M(I) denotes the mapping formed by the theoretically optimal grabbing pose parameters for the input depth image I, M_θ(I) denotes the mapping formed by the actual grabbing pose parameters obtained from the neural network model, and Q_θ, Φ_θ, W_θ are estimates of Q_T, Φ_T, W_T, which represent the grabbing parameters of all objects in the whole network; the depth information I_T in the dataset is input into the neural network model for training to obtain the optimal grabbing pose G_T, and thus the loss function is defined as:
λ(G_T, M_θ(I_T)) = ||G_T - M_θ(I_T)||²
Further, the grabbing pose parameters of step S4 include the grabbing quality Q_T, the rotation angle Φ_T and the jaw opening width W_T, which are calculated as follows:
Grabbing quality Q_T: when an object is to be grabbed, the depth information of the object acquired by the Intel RealSense SR300 camera is input into the model trained in step S3 and compared with the information in the model; pixels whose depth information is consistent are set to 1 and inconsistent pixels are set to 0, the numbers of 1s and 0s over all pixels are then counted, and the grabbing quality value Q_T of the object is calculated.
Rotation angle Φ_T: the range of values is [-π/2, π/2], and the unique true value Φ_T is obtained from sin(2Φ_T) and cos(2Φ_T):
Φ_T = (1/2) arctan(sin(2Φ_T) / cos(2Φ_T))
Jaw opening width W_T: obtained by adding 1 cm to 2 cm to the width of the object; the width of the object is obtained from the depth information of the object, which is acquired by the Intel RealSense SR300 camera.
Further, the equation used by the neural network model to calculate the grabbing pose is:
M_θ(I) = (Q_θ, Φ_θ, W_θ)
where I is the input depth image and Q_θ, Φ_θ, W_θ are estimates of Q_T, Φ_T, W_T, which represent the grabbing parameters of all objects in the neural network model.
Further, the training process of the convolutional neural network model comprises the following steps:
(1) initializing a weight value by the convolutional neural network model;
(2) selecting 80% of the Cornell Grasping dataset as the training set of the network model, inputting the depth information data of the training set into the convolutional neural network model, and obtaining an output value through propagation through the convolution layers and deconvolution layers;
(3) solving the error between the output value of the network model and the target value, namely the value of the loss function;
(4) when the error is larger than the expected value, propagating the error back into the network model and obtaining the error of each network layer in turn; the errors of the network layers together make up the total error of the network;
(5) updating the weights according to the obtained errors and returning to step (2); when the error is equal to or less than the expected value, the training ends.
Compared with the prior art, the invention has the advantages and beneficial effects that:
1. The method solves the problem that current grabbing algorithms lack a way to quickly generate the optimal grabbing pose for unseen objects, and realizes a series of work on dataset processing, network training and model optimization. It can quickly obtain the optimal grabbing pose of an object from only the object's depth information; the model is simple, runs efficiently, and its trained parameters are far fewer than those of other networks.
2. For objects in daily life, including unseen objects, the success rate of recognizing and generating the grabbing pose exceeds 90%; the method has excellent generalization capability, is simple to deploy, and can be widely popularized in fields such as mechanical arm visual grabbing and dynamic tracking.
Drawings
FIG. 1 is a schematic diagram of a neural network model structure according to the present embodiment;
FIG. 2 is a flowchart illustrating a method for generating an optimal capture pose based on a convolutional neural network according to an embodiment;
FIG. 3 is a schematic diagram illustrating calculation of the rotation angle and the opening width of the clamping jaw in the process of grabbing a target object according to the embodiment;
FIG. 4 is a hierarchy diagram of a network architecture in an embodiment of the present invention;
In the figure: 1 is grabbing rectangle a; 2 is grabbing rectangle b; 3 is the target object to be grabbed.
Detailed Description
The working principle and working process of the present invention will be further explained in detail with reference to the accompanying drawings.
A convolutional neural network-based method for generating an optimal grabbing pose comprises the following steps:
S1, as shown in FIG. 2, setting the parameters for representing the grabbing quality in the grabbing process, including G, Q, Φ and W, where G represents the set of grabbing parameters in each grab, with one value at each pixel:
G = (Φ, W, Q) ∈ ℝ^(3×H×W)
for a given 2.5D depth map I ∈ ℝ^(H×W), where H represents the height of the depth map, W represents the width of the depth map, the H and W parameters are obtained from the camera, and ℝ denotes the real-valued space of the corresponding dimension;
Q represents the quality of each grab and is a scalar in (0, 1); the closer Q is to 1, the higher the grabbing quality;
Φ denotes the rotation angle required for the jaw to reach the ideal position during each grab, with Φ ∈ [-π/2, π/2]; the ideal position is the position of the optimal grabbing rectangle set in the dataset, the rotation angle refers to the rotation of the rectangle relative to the horizontal, and the specific numerical values are also included in the dataset;
W indicates the width to which the jaws need to be opened during grabbing to ensure the object is completely grasped.
S2, constructing the convolutional neural network model. As shown in FIG. 4, the structure of the neural network model comprises the following network layers: the first layer contains 9 × 9 convolution kernels and 32 filters with a stride of 3; the second layer contains 5 × 5 convolution kernels and 16 filters with a stride of 2; the third layer contains 3 × 3 convolution kernels and 8 filters with a stride of 2; the fourth to sixth layers are deconvolution layers whose purpose is to keep the resolution of the input and output consistent: the fourth layer is a deconvolution layer containing 3 × 3 convolution kernels and 8 filters with a stride of 2, the fifth layer is a deconvolution layer containing 3 × 3 convolution kernels and 16 filters with a stride of 2, and the sixth layer is a deconvolution layer containing 9 × 9 convolution kernels and 32 filters with a stride of 3.
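The following is a minimal PyTorch sketch of this six-layer conv/deconv structure. The kernel sizes, filter counts and strides follow the text above; the padding values, ReLU activations and the 1 × 1 output heads producing the Q, cos(2Φ), sin(2Φ) and W maps are assumptions added so that a 300 × 300 depth image maps to per-pixel grasp maps of the same resolution, and are not stated in the patent.

```python
import torch
import torch.nn as nn

class GraspNet(nn.Module):
    """Sketch of the six-layer conv/deconv model of step S2 (padding and output heads assumed)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=9, stride=3, padding=3), nn.ReLU(),   # layer 1: 9x9, 32 filters, stride 3
            nn.Conv2d(32, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),  # layer 2: 5x5, 16 filters, stride 2
            nn.Conv2d(16, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),   # layer 3: 3x3, 8 filters, stride 2
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 8, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),   # layer 4
            nn.ConvTranspose2d(8, 16, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),  # layer 5
            nn.ConvTranspose2d(16, 32, kernel_size=9, stride=3, padding=3), nn.ReLU(),                   # layer 6
        )
        # Assumed output heads: one map each for Q, cos(2*Phi), sin(2*Phi) and the jaw width W.
        self.q_head = nn.Conv2d(32, 1, kernel_size=1)
        self.cos_head = nn.Conv2d(32, 1, kernel_size=1)
        self.sin_head = nn.Conv2d(32, 1, kernel_size=1)
        self.w_head = nn.Conv2d(32, 1, kernel_size=1)

    def forward(self, depth):                       # depth: (N, 1, 300, 300)
        x = self.decoder(self.encoder(depth))       # resolution restored to 300 x 300
        return self.q_head(x), self.cos_head(x), self.sin_head(x), self.w_head(x)
```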
The loss function of the neural network model adopts the L2 loss, which serves as a measure for evaluating network performance; the neural network model is used to approximate the complex mapping M: I → G, and the calculation of the parameters of the neural network model includes:
M(I) = (Q, Φ, W)
G = (Φ, W, Q)
M_θ(I) = (Q_θ, Φ_θ, W_θ) ≈ M(I)
M(I) denotes the mapping formed by the theoretically optimal grabbing pose parameters for the input depth image I; M_θ(I) denotes the mapping formed by the actual grabbing pose parameters obtained from the neural network model; Q_θ, Φ_θ, W_θ are estimates of Q_T, Φ_T, W_T, which represent the grabbing parameters of all objects in the whole network. The depth information I_T in the dataset is used as the network input for training to obtain the ideal optimal grabbing pose G_T, where G_T corresponds to I_T; the loss function is thus defined as:
λ(G_T, M_θ(I_T)) = ||G_T - M_θ(I_T)||²
where λ(G_T, M_θ(I_T)) denotes the L2-norm loss function, i.e., the least-squares error.
S3, the Cornell Grasping dataset (the Cornell University open-source grasping dataset, http://pr.cs.cornell.edu/grasping/rect_data/data.php) is used to train the model. This open-source dataset, built for research at Cornell University, contains 1035 pictures (RGB images and depth information) of 280 different objects, each taken from a different direction or at a different angle. Each picture is accompanied by manually annotated data for the best grabbing rectangle, including the size of the rectangle, the three-dimensional position of its center point, and the rotation angle of the rectangle relative to the horizontal. This dataset provides the important parameters regarding grabbing. The depth information in the dataset is input into the neural network model for training; the trained model has good generalization capability, the model is compact, and the number of network parameters to be trained is greatly reduced. 80% of the Cornell Grasping dataset is selected as the training set for the network model and 20% is kept as the evaluation set.
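The patent does not spell out how an annotated grabbing rectangle is turned into per-pixel training targets; the sketch below shows one plausible rasterization, assumed purely for illustration, in which pixels near the rectangle centre receive Q = 1 together with the annotated angle and width, and the angle is encoded as cos(2Φ) and sin(2Φ) to match the decoding used in step S4.

```python
import numpy as np

def rectangle_to_target_maps(center_xy, angle, width_px, h=300, w=300, radius=5):
    """Hedged sketch: rasterize one annotated Cornell grabbing rectangle into the
    per-pixel targets (Q_T, Phi_T, W_T). Filling a small disc of `radius` pixels
    around the rectangle centre is an assumption made for illustration only."""
    q_map = np.zeros((h, w), np.float32)
    ang_map = np.zeros((h, w), np.float32)
    wid_map = np.zeros((h, w), np.float32)

    cx, cy = center_xy
    ys, xs = np.ogrid[:h, :w]
    near_center = (xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2

    q_map[near_center] = 1.0          # best grabbing quality at the annotated grasp
    ang_map[near_center] = angle      # rotation relative to the horizontal, in [-pi/2, pi/2]
    wid_map[near_center] = width_px   # required jaw opening (kept in pixels here)
    return q_map, ang_map, wid_map

def encode_angle(ang_map):
    """Encode the angle map as cos(2*Phi) and sin(2*Phi), matching the decoding in step S4."""
    return np.cos(2.0 * ang_map), np.sin(2.0 * ang_map)
```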
The training process comprises the following steps:
(1) initializing a weight value by the convolutional neural network model;
(2) inputting the depth information data of the training set and obtaining an output value through propagation through the convolution layers and deconvolution layers;
(3) solving the error between the output value of the network model and the target value, namely the value of the loss function;
(4) when the error is larger than the expected value, propagating the error back into the network model and obtaining the error of each network layer in turn; the errors of the network layers together make up the total error of the network;
(5) updating the weights according to the obtained errors and returning to step (2); when the error is equal to or less than the expected value, the training ends.
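A hedged sketch of training steps (1) to (5) is given below, assuming the PyTorch model sketched earlier and a data loader yielding depth images with their target maps: the forward pass produces the output maps, the L2 (least-squares) loss λ(G_T, M_θ(I_T)) is computed against the targets, the error is back-propagated, and the weights are updated. The Adam optimizer, learning rate and fixed epoch count are assumptions; the patent instead stops training once the error falls to the expected value.

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=50, lr=1e-3, device="cpu"):
    """Hedged sketch of the training procedure in steps (1)-(5)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)    # assumed optimizer and learning rate
    model.to(device).train()                                   # step (1): weights already initialized
    for _ in range(epochs):
        for depth, q_t, cos_t, sin_t, w_t in loader:           # 80% training split of the dataset
            depth = depth.to(device)
            q, cos2p, sin2p, wid = model(depth)                # step (2): forward propagation
            loss = (F.mse_loss(q, q_t.to(device))              # step (3): L2 loss lambda(G_T, M_theta(I_T))
                    + F.mse_loss(cos2p, cos_t.to(device))
                    + F.mse_loss(sin2p, sin_t.to(device))
                    + F.mse_loss(wid, w_t.to(device)))
            optimizer.zero_grad()
            loss.backward()                                    # step (4): propagate the error back through the layers
            optimizer.step()                                   # step (5): update the weights
    return model
```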
S4, inputting the object depth map acquired by the camera into the trained neural network model and calculating the grabbing pose parameters, which are used to drive a mechanical arm to grab; the mechanical arm may be a UR5 arm. As shown in FIG. 1 and FIG. 4, a 300 × 300 depth image is input into the trained neural network model, and the optimal grabbing pose G_θ is obtained.
The model obtained after step S3 has extracted the G, Q, Φ information in the Cornell Grasping dataset; for a newly input depth image, its grabbing quality Q_T, rotation angle Φ_T and jaw opening width W_T are then calculated as follows:
Grabbing quality Q_T: when an object is to be grabbed, the depth information of the object acquired from the Intel RealSense SR300 camera is input into the trained model and compared with the information in the model; pixels whose depth information is consistent are set to 1 and inconsistent pixels are set to 0, the numbers of 1s and 0s over all pixels are then counted, and the grabbing quality value Q_T of the object is calculated.
Rotation angle Φ_T: the range of values is [-π/2, π/2], and the unique true value Φ_T is obtained from sin(2Φ_T) and cos(2Φ_T):
Φ_T = (1/2) arctan(sin(2Φ_T) / cos(2Φ_T))
Jaw opening width W_T: obtained by adding 1 cm to 2 cm to the width of the object; the width of the object is obtained from its depth information, which is acquired by the Intel RealSense SR300 camera. As shown in FIG. 3, 3 is an irregular target object with a thin top and a thick bottom, 1 is grabbing rectangle a, and 2 is grabbing rectangle b.
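A hedged sketch of this inference step follows, assuming the network sketched earlier outputs per-pixel Q, cos(2Φ), sin(2Φ) and W maps: the pixel with the highest quality is selected (an assumption, since the patent only states that Q_T, Φ_T and W_T are computed from the output), the angle is recovered from sin(2Φ) and cos(2Φ) as in the formula above, and the predicted width would still need conversion to metres plus the 1 cm to 2 cm margin described in the text.

```python
import torch

def best_grasp_from_depth(model, depth_300x300):
    """Hedged sketch of step S4: run one 300 x 300 depth image through the trained
    model and read out the highest-quality grasp hypothesis."""
    model.eval()
    with torch.no_grad():
        q, cos2p, sin2p, wid = model(depth_300x300.view(1, 1, 300, 300))

    q = q.squeeze()
    flat_idx = torch.argmax(q)                       # pixel with the best grabbing quality Q_T
    row, col = divmod(flat_idx.item(), q.shape[1])

    # Unique angle in [-pi/2, pi/2] recovered from sin(2*Phi) and cos(2*Phi),
    # equivalent to Phi_T = (1/2) * arctan(sin(2*Phi_T) / cos(2*Phi_T)).
    phi_t = 0.5 * torch.atan2(sin2p.squeeze()[row, col],
                              cos2p.squeeze()[row, col]).item()
    w_t = wid.squeeze()[row, col].item()             # predicted jaw opening width W_T

    return (row, col), q[row, col].item(), phi_t, w_t
```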
The method can be applied to industrial mechanical arms or to mechanical arms used in scientific research. In industry, where a mechanical arm needs to grab materials or goods, the method can quickly generate the optimal grabbing pose of the materials or goods and send the pose information to the mechanical arm, so that it can quickly grab the specified items. In scientific research, the convolutional neural network model provided by the method has few parameters, the trained model is simple, and its generalization capability is strong, so it has reference value for research on mechanical arm visual grabbing.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents or improvements made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A method for generating an optimal grabbing pose based on a convolutional neural network is characterized by comprising the following steps:
S1, setting parameters for representing the grabbing quality in the grabbing process;
S2, constructing a convolutional neural network model;
S3, training the neural network model on the Cornell Grasping dataset;
and S4, inputting the object depth map acquired by the camera into the trained neural network model and calculating grabbing pose parameters, wherein the grabbing pose parameters are used for driving the mechanical arm to grab.
2. The convolutional neural network-based method of generating an optimal grab pose of claim 1, wherein: the parameters in step S1 include G, Q, Φ and W, where G represents the set of grabbing parameters in each grab, with one value at each pixel:
G = (Φ, W, Q) ∈ ℝ^(3×H×W)
for a given 2.5D depth map I ∈ ℝ^(H×W), where H represents the height of the depth map, W represents the width of the depth map, the H and W parameters are obtained from the camera, and ℝ denotes the real-valued space of the corresponding dimension;
Q represents the quality of each grab and is a scalar in (0, 1); the closer Q is to 1, the higher the grabbing quality;
Φ denotes the rotation angle required for the jaw to reach the ideal position during each grab, with Φ ∈ [-π/2, π/2]; the ideal position is the position of the optimal grabbing rectangle set in the dataset, and the rotation angle refers to the rotation of the grabbing rectangle relative to the horizontal;
W indicates the width to which the jaws need to be opened during grabbing to ensure the object is completely grasped.
3. The convolutional neural network-based method of generating an optimal grab pose of claim 1, wherein: the Cornell Grasping dataset of step S3 provides 1035 pictures of 280 different objects, each with an RGB image, depth information, and annotations of the best grabbing rectangle used to grab the object, including the size of the rectangle and the three-dimensional position of its center point.
4. The convolutional neural network-based method of generating an optimal grab pose according to claim 1, wherein: the structure of the neural network model comprises the following network layers: the first layer contains 9 × 9 convolution kernels and 32 filters with a stride of 3; the second layer contains 5 × 5 convolution kernels and 16 filters with a stride of 2; the third layer contains 3 × 3 convolution kernels and 8 filters with a stride of 2; the fourth, fifth and sixth layers are deconvolution layers whose purpose is to keep the resolution of the input and output consistent: the fourth layer is a deconvolution layer containing 3 × 3 convolution kernels and 8 filters with a stride of 2, the fifth layer is a deconvolution layer containing 3 × 3 convolution kernels and 16 filters with a stride of 2, and the sixth layer is a deconvolution layer containing 9 × 9 convolution kernels and 32 filters with a stride of 3.
5. The convolutional neural network-based method of generating an optimal grab pose of claim 1, wherein: the loss function of the neural network model adopts the L2 loss, which serves as a measure for evaluating network performance; the neural network is used to approximate the complex mapping M: I → G, and the calculation includes:
M(I) = (Q, Φ, W)
G = (Φ, W, Q)
M_θ(I) = (Q_θ, Φ_θ, W_θ) ≈ M(I)
where M(I) denotes the mapping formed by the theoretically optimal grabbing pose parameters for the input depth image I, M_θ(I) denotes the mapping formed by the actual grabbing pose parameters obtained from the neural network model, and Q_θ, Φ_θ, W_θ are estimates of Q_T, Φ_T, W_T, which represent the grabbing parameters of all objects in the whole network; the depth information I_T in the dataset is input into the neural network model for training to obtain the optimal grabbing pose G_T, and thus the loss function is defined as:
λ(G_T, M_θ(I_T)) = ||G_T - M_θ(I_T)||²
6. The convolutional neural network-based method of generating an optimal grab pose of claim 1, wherein: the grabbing pose parameters of step S4 include the grabbing quality Q_T, the rotation angle Φ_T and the jaw opening width W_T, which are calculated as follows:
Grabbing quality Q_T: when an object is to be grabbed, the depth information of the object acquired by the Intel RealSense SR300 camera is input into the model trained in step S3 and compared with the information in the model; pixels whose depth information is consistent are set to 1 and inconsistent pixels are set to 0, the numbers of 1s and 0s over all pixels are then counted, and the grabbing quality value Q_T of the object is calculated.
Rotation angle Φ_T: the range of values is [-π/2, π/2], and the unique true value Φ_T is obtained from sin(2Φ_T) and cos(2Φ_T):
Φ_T = (1/2) arctan(sin(2Φ_T) / cos(2Φ_T))
Jaw opening width W_T: obtained by adding 1 cm to 2 cm to the width of the object; the width of the object is obtained from the depth information of the object, which is acquired by the Intel RealSense SR300 camera.
7. The convolutional neural network-based method of generating an optimal grab pose of claim 1, wherein: the equation used by the neural network model to calculate the grabbing pose is:
M_θ(I) = (Q_θ, Φ_θ, W_θ)
where I is the input depth image and Q_θ, Φ_θ, W_θ are estimates of Q_T, Φ_T, W_T, which represent the grabbing parameters of all objects in the neural network model.
8. The convolutional neural network-based method of generating an optimal grab pose of claim 1, wherein: the training process of the convolutional neural network model comprises the following steps:
(1) initializing a weight value by the convolutional neural network model;
(2) selecting 80% of the Cornell Grasping dataset as the training set of the network model, inputting the depth information data of the training set into the convolutional neural network model, and obtaining an output value through propagation through the convolution layers and deconvolution layers;
(3) solving the error between the output value of the network model and the target value, namely the value of the loss function;
(4) when the error is larger than the expected value, propagating the error back into the network model and obtaining the error of each network layer in turn; the errors of the network layers together make up the total error of the network;
(5) updating the weights according to the obtained errors and returning to step (2); when the error is equal to or less than the expected value, the training ends.
CN202010134999.4A 2020-02-29 2020-02-29 Method for generating optimal grabbing pose based on convolutional neural network Active CN111360862B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010134999.4A CN111360862B (en) 2020-02-29 2020-02-29 Method for generating optimal grabbing pose based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010134999.4A CN111360862B (en) 2020-02-29 2020-02-29 Method for generating optimal grabbing pose based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111360862A true CN111360862A (en) 2020-07-03
CN111360862B CN111360862B (en) 2023-03-24

Family

ID=71200241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010134999.4A Active CN111360862B (en) 2020-02-29 2020-02-29 Method for generating optimal grabbing pose based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111360862B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112297013A (en) * 2020-11-11 2021-02-02 浙江大学 Robot intelligent grabbing method based on digital twin and deep neural network
CN113268055A (en) * 2021-04-07 2021-08-17 北京拓疆者智能科技有限公司 Obstacle avoidance control method and device for engineering vehicle and mechanical equipment
CN113327295A (en) * 2021-06-18 2021-08-31 华南理工大学 Robot rapid grabbing method based on cascade full convolution neural network
CN113326666A (en) * 2021-07-15 2021-08-31 浙江大学 Robot intelligent grabbing method based on convolutional neural network differentiable structure searching
CN113799138A (en) * 2021-10-09 2021-12-17 中山大学 Mechanical arm grabbing method for generating convolutional neural network based on grabbing
CN115631401A (en) * 2022-12-22 2023-01-20 广东省科学院智能制造研究所 Robot autonomous grabbing skill learning system and method based on visual perception
CN117621145A (en) * 2023-12-01 2024-03-01 安徽大学 Fruit maturity detects flexible arm system based on FPGA

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170252924A1 (en) * 2016-03-03 2017-09-07 Google Inc. Deep machine learning methods and apparatus for robotic grasping
US10089575B1 (en) * 2015-05-27 2018-10-02 X Development Llc Determining grasping parameters for grasping of an object by a robot grasping end effector
CN109702741A (en) * 2018-12-26 2019-05-03 中国科学院电子学研究所 Mechanical arm visual grasping system and method based on self-supervisory learning neural network
CN110298886A (en) * 2019-07-01 2019-10-01 中国科学技术大学 A kind of Dextrous Hand Grasp Planning method based on level Four convolutional neural networks
CN110378325A (en) * 2019-06-20 2019-10-25 西北工业大学 A kind of object pose recognition methods during robot crawl
CN110509273A (en) * 2019-08-16 2019-11-29 天津职业技术师范大学(中国职业培训指导教师进修中心) The robot mechanical arm of view-based access control model deep learning feature detects and grasping means

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10089575B1 (en) * 2015-05-27 2018-10-02 X Development Llc Determining grasping parameters for grasping of an object by a robot grasping end effector
US20170252924A1 (en) * 2016-03-03 2017-09-07 Google Inc. Deep machine learning methods and apparatus for robotic grasping
CN109702741A (en) * 2018-12-26 2019-05-03 中国科学院电子学研究所 Mechanical arm visual grasping system and method based on self-supervisory learning neural network
CN110378325A (en) * 2019-06-20 2019-10-25 西北工业大学 A kind of object pose recognition methods during robot crawl
CN110298886A (en) * 2019-07-01 2019-10-01 中国科学技术大学 A kind of Dextrous Hand Grasp Planning method based on level Four convolutional neural networks
CN110509273A (en) * 2019-08-16 2019-11-29 天津职业技术师范大学(中国职业培训指导教师进修中心) The robot mechanical arm of view-based access control model deep learning feature detects and grasping means

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112297013A (en) * 2020-11-11 2021-02-02 浙江大学 Robot intelligent grabbing method based on digital twin and deep neural network
CN113268055A (en) * 2021-04-07 2021-08-17 北京拓疆者智能科技有限公司 Obstacle avoidance control method and device for engineering vehicle and mechanical equipment
CN113327295A (en) * 2021-06-18 2021-08-31 华南理工大学 Robot rapid grabbing method based on cascade full convolution neural network
CN113326666A (en) * 2021-07-15 2021-08-31 浙江大学 Robot intelligent grabbing method based on convolutional neural network differentiable structure searching
CN113326666B (en) * 2021-07-15 2022-05-03 浙江大学 Robot intelligent grabbing method based on convolutional neural network differentiable structure searching
CN113799138A (en) * 2021-10-09 2021-12-17 中山大学 Mechanical arm grabbing method for generating convolutional neural network based on grabbing
CN115631401A (en) * 2022-12-22 2023-01-20 广东省科学院智能制造研究所 Robot autonomous grabbing skill learning system and method based on visual perception
CN117621145A (en) * 2023-12-01 2024-03-01 安徽大学 Fruit maturity detects flexible arm system based on FPGA

Also Published As

Publication number Publication date
CN111360862B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN111360862B (en) Method for generating optimal grabbing pose based on convolutional neural network
CN113524194B (en) Target grabbing method of robot vision grabbing system based on multi-mode feature deep learning
CN109934864B (en) Residual error network deep learning method for mechanical arm grabbing pose estimation
CN108010078B (en) Object grabbing detection method based on three-level convolutional neural network
CN107953329B (en) Object recognition and attitude estimation method and device and mechanical arm grabbing system
CN112297013B (en) Robot intelligent grabbing method based on digital twin and deep neural network
CN111251295B (en) Visual mechanical arm grabbing method and device applied to parameterized parts
CN111553949B (en) Positioning and grabbing method for irregular workpiece based on single-frame RGB-D image deep learning
CN113409384B (en) Pose estimation method and system of target object and robot
JP2019056966A (en) Information processing device, image recognition method and image recognition program
CN109584298B (en) Robot-oriented autonomous object picking task online self-learning method
CN112605983B (en) Mechanical arm pushing and grabbing system suitable for intensive environment
CN110378325B (en) Target pose identification method in robot grabbing process
CN111151463A (en) Mechanical arm sorting and grabbing system and method based on 3D vision
CN108748149B (en) Non-calibration mechanical arm grabbing method based on deep learning in complex environment
CN111331607B (en) Automatic grabbing and stacking method and system based on mechanical arm
WO2019059343A1 (en) Workpiece information processing device and recognition method of workpiece
CN109508707B (en) Monocular vision-based grabbing point acquisition method for stably grabbing object by robot
CN114882109A (en) Robot grabbing detection method and system for sheltering and disordered scenes
Mittrapiyanumic et al. Calculating the 3d-pose of rigid-objects using active appearance models
Inoue et al. Transfer learning from synthetic to real images using variational autoencoders for robotic applications
CN113327295A (en) Robot rapid grabbing method based on cascade full convolution neural network
CN112288809B (en) Robot grabbing detection method for multi-object complex scene
CN113538576A (en) Grabbing method and device based on double-arm robot and double-arm robot
CN113436293B (en) Intelligent captured image generation method based on condition generation type countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant