CN117644511A - Robot grabbing method, system, equipment and medium based on implicit neural representation - Google Patents

Robot grabbing method, system, equipment and medium based on implicit neural representation

Info

Publication number
CN117644511A
Authority
CN
China
Prior art keywords
grabbing
availability
implicit
target object
gripping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311718061.7A
Other languages
Chinese (zh)
Inventor
王栋
李学龙
张学超
赵斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai AI Innovation Center
Original Assignee
Shanghai AI Innovation Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai AI Innovation Center filed Critical Shanghai AI Innovation Center
Priority to CN202311718061.7A
Publication of CN117644511A
Legal status: Pending


Abstract

The invention relates to a robot grabbing control method, system, equipment and medium based on implicit neural representation. The method comprises: predicting the grabbing availability in directions that have not yet been observed by using the novel-view synthesis capability of the implicit neural representation, and selecting the next observation viewpoint according to the potentially optimal grabbing availability of the target object; and visually modeling the target scene through closed-loop continuous observation, stopping observation and executing the grab once the grabbing availability of the target object exceeds a target threshold. Compared with the prior art, the invention offers high grabbing accuracy and high efficiency.

Description

Robot grabbing method, system, equipment and medium based on implicit neural representation
Technical Field
The invention relates to the field of robot grabbing control, and in particular to a robot grabbing method, system, equipment and medium based on implicit neural representation.
Background
Existing robot grabbing methods are generally based on visual input, for example capturing environmental features with a depth camera. Most adopt a fixed depth-acquisition scheme, modeling the scene from a single view or from a fixed set of views. Such methods struggle to grab a specified object in complex, occluded, stacked scenes.
Active perception planning aims to recursively plan the sensor's next observation position. Compared with the passive-observation paradigm, active perception with recursive view planning acquires environmental information more flexibly, and it is currently applied in fields such as object reconstruction, object recognition and grasp detection.
Active perception planning is generally divided into two categories: synthesis-based methods and search-based methods. Synthesis-based methods directly compute the next observation position from the current observation and the task constraints, whereas search-based methods first generate a number of candidate viewpoints and then select the optimal observation position according to hand-crafted criteria.
The active perception planning method has the following defects:
1) Synthesis-based methods struggle with complex scenes when computing the next observation pose, and their algorithmic complexity is high.
2) Search-based methods evaluate the next observation pose using geometric information, so the correlation between geometric-reconstruction quality and grabbing quality is hard to guarantee, and complex occluded scenes and target objects with complex appearance are difficult to handle. Moreover, recent methods that evaluate observation-position planning with grabbing availability often require a large number of observations to obtain a good grabbing-quality estimate, which lowers the overall grabbing efficiency.
Therefore, there is a need to design a robot gripping method with high gripping accuracy and high efficiency.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a robot grabbing method, a system, equipment and a medium with high grabbing accuracy and high efficiency based on implicit neural representation.
The aim of the invention can be achieved by the following technical scheme:
according to a first aspect of the present invention, there is provided a robot gripping control method based on implicit neural representation, the method comprising:
predicting the grabbing availability in directions that have not yet been observed by using the novel-view synthesis capability of the implicit neural representation, and selecting the next observation viewpoint according to the potentially optimal grabbing availability of the target object;
and visually modeling the target scene through closed-loop continuous observation, stopping observation and executing the grab once the grabbing availability of the target object exceeds a target threshold.
Preferably, the method comprises the steps of:
s1, fusing a depth map acquired by a depth sensor to a truncated symbol distance function TSDF, inputting fused data to a neural network model feature to extract implicit neural expression of a target scene, and observing grasping availability and grasping gesture of a target object;
s2, when the grabbing availability of the target object reaches a target threshold value, driving a mechanical arm of the robot to run to a designated position to execute grabbing, otherwise turning to S3;
s3, sampling a potential next observation pose, selecting a potential optimal grabbing availability direction as a next observation target pose by predicting grabbing availability corresponding to a new view angle direction, driving a mechanical arm of a robot to move to a designated position to execute grabbing, and circulating.
Preferably, the neural network model is a three-dimensional convolutional neural network.
Preferably, a multi-layer perceptron neural network is adopted to predict the grabbing availability and grabbing pose of the target object.
Preferably, predicting the grabbing availability and grabbing pose of the target object with the multi-layer perceptron neural network comprises:
inputting, for each point of the bounding box of the target object, the features extracted from the implicit neural expression of the target scene according to that point's spatial position and direction into the multi-layer perceptron neural network, and predicting the grabbing availability and grabbing pose of the target object.
Preferably, the grabbing pose is a 6DoF grabbing pose.
Preferably, the depth sensor is a depth camera.
According to a second aspect of the present invention, there is provided a robot gripping control system based on implicit neural representation, the system performing target object gripping control using any of the above methods.
According to a third aspect of the present invention there is provided an electronic device comprising a memory and a processor, the memory having stored thereon a computer program, the processor implementing the method of any one of the above when executing the program.
According to a fourth aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method of any one of the above.
Compared with the prior art, the invention has the following beneficial effects:
1) The robot grabbing control method based on implicit neural representation exploits the fact that grabbing quality can be predicted more accurately when the observation direction coincides with the grabbing direction: the depth-sensor input is compressed and interpreted as an implicit neural expression, and grabbing availability is used to evaluate the next best observation viewpoint, improving the robot's grabbing accuracy.
2) The invention compresses and interprets the depth-sensor input as an implicit neural expression via a convolutional neural network model, and uses novel-view depth-map synthesis for multi-task training, which improves the efficiency of the whole observe-and-grab workflow of the mechanical arm.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the neural network model related to the implicit neural representation according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
Examples
This embodiment provides a robot grabbing control method based on implicit neural representation. Considering that the corresponding grabbing quality is predicted more accurately when the observation direction coincides with the grabbing direction, the method uses the novel-view synthesis capability of the implicit neural representation to predict the grabbing availability in directions that have not yet been observed, and selects the next view according to the potentially optimal grabbing availability of the target object; the target scene is visually modeled through closed-loop continuous observation, and observation stops and the grab is executed once the grabbing quality of the target object reaches the target threshold.
The method of this embodiment will be described in detail with reference to fig. 1 and 2.
S1, fusing the depth map acquired by the depth sensor into a truncated signed distance function (TSDF), inputting the fused data into the neural network model, which compresses and interprets the TSDF, extracting the implicit neural expression of the target scene through feature extraction, and predicting the grabbing availability and grabbing pose of the target object;
S2, when the grabbing availability of the target object reaches the target threshold, driving the mechanical arm of the robot to the designated position to execute the grab; otherwise, turning to S3;
S3, sampling potential next observation poses, predicting the grabbing availability corresponding to each new view direction, selecting the direction with the potentially optimal grabbing availability as the next observation pose, driving the mechanical arm of the robot to that pose for a new observation, and repeating the cycle from S1.
The truncated signed distance function (Truncated Signed Distance Function, TSDF) in this embodiment is a data structure for representing the surface of an object in three dimensions, dividing the space into a regular grid of voxels, and storing a signed distance value for each voxel, the distance value representing the distance of the voxel center from the object surface.
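For concreteness, the following is a minimal NumPy sketch of fusing one depth map into such a TSDF voxel grid; the grid layout, truncation margin and pinhole camera model (intrinsics K, extrinsics T_cam_world) are assumptions introduced for illustration only.

```python
import numpy as np

def integrate_depth_into_tsdf(tsdf, weights, depth, K, T_cam_world,
                              voxel_origin, voxel_size, trunc=0.04):
    """Fuse one depth image (metres) into a TSDF grid; returns updated copies."""
    nx, ny, nz = tsdf.shape
    # World coordinates of every voxel centre.
    ix, iy, iz = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
    pts_world = voxel_origin + voxel_size * np.stack([ix, iy, iz], axis=-1).reshape(-1, 3)
    # Transform voxel centres into the camera frame and project with intrinsics K.
    pts_h = np.concatenate([pts_world, np.ones((pts_world.shape[0], 1))], axis=1)
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]
    z = pts_cam[:, 2]
    eps = 1e-6
    u = np.round(K[0, 0] * pts_cam[:, 0] / np.maximum(z, eps) + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_cam[:, 1] / np.maximum(z, eps) + K[1, 2]).astype(int)
    h, w = depth.shape
    valid = (z > eps) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d_meas = np.zeros_like(z)
    d_meas[valid] = depth[v[valid], u[valid]]
    valid &= d_meas > 0
    # Signed distance along the ray, truncated and normalised to [-1, 1].
    sdf = np.clip(d_meas - z, -trunc, trunc) / trunc
    valid &= (d_meas - z) >= -trunc          # skip voxels far behind the surface
    # Weighted running average per voxel.
    tsdf_flat = tsdf.reshape(-1).copy()
    w_flat = weights.reshape(-1).copy()
    tsdf_flat[valid] = (tsdf_flat[valid] * w_flat[valid] + sdf[valid]) / (w_flat[valid] + 1.0)
    w_flat[valid] += 1.0
    return tsdf_flat.reshape(tsdf.shape), w_flat.reshape(weights.shape)
```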
The target threshold in this embodiment may be set according to actual requirements; for example, it may be set to 0.95.
As another preferred embodiment, the depth sensor may be a depth camera.
As another preferred embodiment, the neural network model is a multi-layer 3D convolutional neural network (3D CNN), and the implicit neural expression of the target scene is obtained through its feature extraction.
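As a non-limiting illustration, a minimal PyTorch sketch of such a multi-layer 3D CNN encoder is given below; the number of layers, channel widths and the 40^3 grid size are assumptions made for the example, not details fixed by the disclosure.

```python
import torch
import torch.nn as nn

class TSDFEncoder3D(nn.Module):
    """Compress a TSDF volume into a dense feature grid (the implicit neural expression)."""
    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, feat_dim, kernel_size=3, padding=1),
        )

    def forward(self, tsdf: torch.Tensor) -> torch.Tensor:
        # tsdf: (B, 1, D, H, W) -> feature grid C: (B, feat_dim, D, H, W)
        return self.net(tsdf)

# Example: a 40^3 TSDF grid (the resolution is assumed for illustration).
encoder = TSDFEncoder3D()
features = encoder(torch.randn(1, 1, 40, 40, 40))   # shape (1, 32, 40, 40, 40)
```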
As another preferred embodiment, a multi-layer perceptron neural network is employed: by traversing each direction G_v of each point in the bounding box of the target object, the grabbing availability and grabbing pose of each spatial point in each direction are predicted. The corresponding expression is:
F(G_v, C_geo) -> (G_r, G_q, G_w)
where F denotes the computation of the multi-layer perceptron neural network, G_v is the direction corresponding to a point in the bounding box of the target object, C_geo is the feature information extracted from the implicit neural expression according to the spatial position, G_r is the grabbing pose of the target object, G_q is the grabbing availability of the target object, and G_w is the grabbing width of the target object.
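A minimal PyTorch sketch of the above expression is given below for illustration; the hidden sizes, the quaternion parameterisation of G_r and the sigmoid output for G_q are assumptions, and the per-point feature C_geo is assumed to be queried (e.g. by trilinear interpolation) from the feature grid produced by the encoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as nnf

class GraspHead(nn.Module):
    """MLP F: (G_v, C_geo) -> (G_r, G_q, G_w); layer sizes are assumptions."""
    def __init__(self, feat_dim: int = 32, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
        )
        self.rot_out = nn.Linear(hidden, 4)      # G_r: grasp rotation as a quaternion
        self.quality_out = nn.Linear(hidden, 1)  # G_q: grabbing availability in [0, 1]
        self.width_out = nn.Linear(hidden, 1)    # G_w: gripper opening width

    def forward(self, c_geo: torch.Tensor, g_v: torch.Tensor):
        h = self.mlp(torch.cat([c_geo, g_v], dim=-1))
        g_r = nnf.normalize(self.rot_out(h), dim=-1)   # unit quaternion
        g_q = torch.sigmoid(self.quality_out(h))
        g_w = self.width_out(h)
        return g_r, g_q, g_w

# Query a batch of 8 bounding-box points, each with one candidate direction.
head = GraspHead()
g_r, g_q, g_w = head(torch.randn(8, 32), nnf.normalize(torch.randn(8, 3), dim=-1))
```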
Given a cluttered scene on a desktop and the bounding box T_bbox of the target object, the 6DoF grabbing pose of the target object is predicted. The process is implemented as follows (an illustrative sketch of this loop is given after the list):
1) Judging whether the current time t exceeds the maximum running time T_max; if not, turning to 2), otherwise ending;
2) Fusing the depth map D_t at the current moment into the truncated signed distance function value M_t, inputting M_t into the three-dimensional convolutional neural network, taking the network output as the implicit neural expression C_t at the current moment, and turning to 3);
3) Predicting the optimal grabbing availability from the bounding box T_bbox of the target object, the implicit neural expression C_t and the observation direction O_{v,t} at the current moment, and turning to 4);
4) Judging whether the optimal grabbing availability at the current moment is greater than the target threshold; if so, directly executing the predicted optimal grab, otherwise turning to 5);
5) Traversing all possible directions G_v, searching for the optimal next observation direction O_{v,t+1}, driving the robotic arm to the corresponding direction, and returning to 1).
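The following Python sketch illustrates the closed-loop structure of steps 1) to 5); every callable passed in (observe, predict_grasp, predict_view_score, execute_grasp, move_to_view) is a placeholder for the corresponding module described above and not a concrete API of any library, and the default threshold and time budget are only the example values used in this embodiment.

```python
import time

def closed_loop_grasp(observe, predict_grasp, predict_view_score, candidate_views,
                      execute_grasp, move_to_view, threshold=0.95, t_max=60.0):
    """Closed-loop active-perception grasping, following steps 1)-5) above.
    Placeholder callables (assumed, not part of the disclosure):
      observe()                      -> fuses the current depth map, returns features C_t
      predict_grasp(C_t)             -> (best grasp pose, its availability) inside T_bbox
      predict_view_score(C_t, view)  -> predicted availability if observed from `view`
      execute_grasp / move_to_view   -> robot-arm commands
    """
    start = time.time()
    while time.time() - start < t_max:            # 1) stop once the time budget T_max is spent
        features = observe()                      # 2) depth map -> TSDF -> implicit expression C_t
        grasp, quality = predict_grasp(features)  # 3) optimal grabbing availability in the bbox
        if quality > threshold:                   # 4) confident enough: grasp immediately
            execute_grasp(grasp)
            return True
        # 5) otherwise score every candidate view and move to the most promising one
        best_view = max(candidate_views, key=lambda v: predict_view_score(features, v))
        move_to_view(best_view)
    return False
```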
The invention was verified in both simulation and physical experiments. The results show that, compared with existing methods, the method improves the grabbing success rate by 2% while using only 69% of the observations, thus improving the overall grabbing efficiency. In experiments that limit the maximum number of observations, the method improves the grabbing success rate by 3% over existing methods. In addition, the method transfers well from simulation training to physical experiments.
The electronic device of the present invention includes a Central Processing Unit (CPU) that can perform various appropriate actions and processes according to computer program instructions stored in a Read Only Memory (ROM) or computer program instructions loaded from a storage unit into a Random Access Memory (RAM). In the RAM, various programs and data required for the operation of the device can also be stored. The CPU, ROM and RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
A plurality of components in a device are connected to an I/O interface, comprising: an input unit such as a keyboard, a mouse, etc.; an output unit such as various types of displays, speakers, and the like; a storage unit such as a magnetic disk, an optical disk, or the like; and communication units such as network cards, modems, wireless communication transceivers, and the like. The communication unit allows the device to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processing unit performs the various methods and processes described above. For example, in some embodiments, the method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device via the ROM and/or the communication unit. One or more steps of the methods described above may be performed when the computer program is loaded into RAM and executed by a CPU. Alternatively, in other embodiments, the CPU may be configured to perform the method by any other suitable means (e.g., by means of firmware).
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), etc.
Program code for carrying out methods of the present invention may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and substitutions of equivalents may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is defined by the protection scope of the claims.

Claims (10)

1. A robot gripping method based on implicit neural representation, the method comprising:
predicting the grabbing availability in directions that have not yet been observed by using the novel-view synthesis capability of the implicit neural representation, and selecting the next observation viewpoint according to the potentially optimal grabbing availability of the target object;
and visually modeling the target scene through closed-loop continuous observation, stopping observation and executing the grab once the grabbing availability of the target object exceeds a target threshold.
2. The robot gripping method based on implicit neural representation according to claim 1, characterized in that it comprises the steps of:
s1, fusing a depth map acquired by a depth sensor to a truncated symbol distance function TSDF, inputting fused data to a neural network model feature to extract implicit neural expression of a target scene, and observing grasping availability and grasping gesture of a target object;
s2, when the grabbing availability of the target object reaches a target threshold value, driving a mechanical arm of the robot to run to a designated position to execute grabbing, otherwise turning to S3;
s3, sampling a potential next observation pose, selecting a potential optimal grabbing availability direction as a next observation target pose by predicting grabbing availability corresponding to a new view angle direction, driving a mechanical arm of a robot to move to a designated position to execute grabbing, and circulating.
3. The robot gripping method based on implicit neural representation of claim 2, wherein the neural network model is a three-dimensional convolutional neural network.
4. The robot gripping method based on implicit neural representation according to claim 2, wherein a multi-layer perceptron neural network is used to predict the gripping availability and gripping pose of the target object.
5. The robot gripping method based on implicit neural representation according to claim 4, wherein predicting the gripping availability and gripping pose of the target object with the multi-layer perceptron neural network comprises:
inputting, for each point of the bounding box of the target object, the features extracted from the implicit neural expression of the target scene according to that point's spatial position and direction into the multi-layer perceptron neural network, and predicting the grabbing availability and grabbing pose of the target object.
6. The robot gripping method based on implicit neural representation of claim 2, wherein the gripping pose is a 6DoF gripping pose.
7. The implicit neural representation-based robotic grasping method of claim 2, wherein the depth sensor is a depth camera.
8. A robot gripping system based on implicit neural representation, characterized in that the system performs target object gripping control using the method of any one of claims 1 to 7.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, characterized in that the processor, when executing the program, implements the method according to any of claims 1-7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1-7.
CN202311718061.7A 2023-12-14 2023-12-14 Robot grabbing method, system, equipment and medium based on implicit neural representation Pending CN117644511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311718061.7A CN117644511A (en) 2023-12-14 2023-12-14 Robot grabbing method, system, equipment and medium based on implicit neural representation


Publications (1)

Publication Number Publication Date
CN117644511A 2024-03-05

Family

ID=90049379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311718061.7A Pending CN117644511A (en) 2023-12-14 2023-12-14 Robot grabbing method, system, equipment and medium based on implicit neural representation

Country Status (1)

Country Link
CN (1) CN117644511A (en)


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination