CN116416307B - Prefabricated part hoisting splicing 3D visual guiding method based on deep learning - Google Patents

Prefabricated part hoisting splicing 3D visual guiding method based on deep learning

Info

Publication number
CN116416307B
Authority
CN
China
Prior art keywords
prefabricated
hoisting
robot
splicing
prefabricated part
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310074563.4A
Other languages
Chinese (zh)
Other versions
CN116416307A (en)
Inventor
舒江鹏
高一帆
张晓武
肖文楷
夏哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202310074563.4A priority Critical patent/CN116416307B/en
Publication of CN116416307A publication Critical patent/CN116416307A/en
Application granted granted Critical
Publication of CN116416307B publication Critical patent/CN116416307B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J5/00Manipulators mounted on wheels or on carriages
    • B25J5/02Manipulators mounted on wheels or on carriages travelling along a guideway
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1694Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697Vision controlled systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep-learning-based 3D visual guidance method for hoisting and splicing prefabricated parts, which comprises the following steps: first, features of the prefabricated part are extracted from RGB-D images and mask processing is applied; then, the masked prefabricated-part pixel blocks are processed by a CNN architecture to iteratively estimate the 6D pose of the prefabricated part; finally, given the 6D pose to which the robot end effector is to be moved, 3D visual guidance of the hoisting and splicing is performed based on RRT-Star path planning, guiding the hoisting robot to automatically plan the motion path of its end effector so that the grasped prefabricated part is moved to the splicing position. Through computer 3D vision for intelligent perception, recognition and feature extraction, the invention enables automatic robotic assembly of prefabricated components.

Description

Prefabricated part hoisting splicing 3D visual guiding method based on deep learning
Technical Field
The invention relates to the application of computer vision, in particular to its application in the field of intelligent robotic construction, and specifically to a deep-learning-based 3D visual guidance method for hoisting and splicing prefabricated parts.
Background
At present, the fabricated structural system still faces a development bottleneck: its cost remains higher than that of the traditional cast-in-situ system, which to some extent limits its popularization. The higher cost is mainly due to the low degree of automation in assembling the prefabricated components; operations must be carried out under the guidance of on-site workers, so labor costs are not significantly reduced. In the field of intelligent construction, technologies such as deep learning and computer-vision perception are beginning to play an increasingly important role and provide a new route toward automated assembly of prefabricated components. With computer-vision technology, a construction robot gains 3D vision capability and can position, hoist and assemble prefabricated components under visual guidance. However, the existing methods, which mainly use algorithms based on the convolutional neural network (Convolutional Neural Network, CNN) architecture, consider only the position difference of the prefabricated part along the x and y coordinate axes when planning and predicting the hoisting path; they lack a comparison, along the third (z) coordinate axis, of the position and orientation (the included angles with the x, y and z axes) of the front/top ends of the overhanging steel bars of the part to be hoisted and of the existing part at the connecting location to be spliced. This may result in the prefabricated part being lifted to the assembly point (x, y) yet failing to match, along the z coordinate axis, the position and orientation of the front/top ends of the overhanging steel bars of the existing prefabricated part at the splicing location, so that high-precision assembly cannot be achieved. The invention therefore further explores a 3D visual guidance method for hoisting and splicing prefabricated parts on the basis of the prior art.
Disclosure of Invention
The invention aims to provide a deep-learning-based 3D visual guidance method for hoisting and splicing prefabricated parts that addresses the shortcomings of the prior art.
The aim of the invention is achieved by the following technical scheme. A deep-learning-based 3D visual guidance method for hoisting and splicing prefabricated parts comprises the following steps:
(1) Based on the RGB-D image, extracting the features of the prefabricated part and carrying out mask processing;
(2) Processing the prefabricated-part pixel blocks obtained by the mask processing in step (1) based on a CNN architecture to iteratively estimate the 6D pose of the prefabricated part;
(3) Acquiring the 6D pose to which the robot end effector is to be moved, and performing 3D visual guidance of the hoisting and splicing of the prefabricated part based on RRT-Star path planning, so as to guide the hoisting robot to automatically plan the motion path of its end effector and move the grasped prefabricated part to the splicing position.
Optionally, the RGB-D image in the step (1) is acquired by an Intel RealSense depth camera to obtain a color image and a depth image.
Optionally, the color image comprises surface color information and texture information of objects in the simulated hoisting scene, and the depth image comprises spatial shape information of objects in the simulated hoisting scene.
Optionally, the mask processing in step (1) is implemented based on a Faster R-CNN architecture, which includes a feature extraction network, a region candidate network, region-of-interest pooling, and a classification-regression network.
Optionally, the CNN architecture in step (2) includes:
a full convolutional network for processing color information, which maps each pixel point in the prefabricated-part pixel block obtained by the mask processing in step (1) into a color feature space as a color feature embedding;
a coordinate conversion unit, which converts the depth-channel information in the prefabricated-part pixel block obtained by the mask processing in step (1) into point cloud data and maps each data point into a geometric feature space as a geometric feature embedding; and
a CNN network for pixel-level image fusion, which combines the color feature embedding and the geometric feature embedding and iteratively outputs the 6D pose of the masked prefabricated-part pixel block based on an unsupervised confidence score.
Optionally, the iterative estimation of the prefabricated-part 6D pose in step (2) is performed on a multi-frame dense video stream.
Optionally, in step (3), given the 6D pose to which the robot end effector is to be moved, the motion position and joint angle of each robot joint are solved, so as to provide 3D visual guidance for the hoisting and splicing of the prefabricated parts and to control the mechanical arm.
The invention has the following beneficial effects. Taking robotic automated construction as the core and integrating computer vision with intelligent construction technology, the invention gives the construction robot 3D vision capability through real-time feature extraction and mask processing of the RGB-D image stream, and realizes intelligent, high-precision positioning and assembly of prefabricated components. The whole pipeline, from RGB-D image processing of the prefabricated part to digital control of the mechanical arm, runs automatically and offers the following advantages in practical engineering applications: an RGB-D image database of common prefabricated components is built, together with methods and procedures for extracting their color, shape and spatial features, so that feature extraction of prefabricated components based on computer 3D vision is more accurate and efficient; the method can replace manual assembly of prefabricated components, with a worker needed only to assist the data transmission inside the robot platform; compared with traditional methods, the number of workers required during construction is significantly reduced; and the fabricated structural system achieves a high degree of assembly automation and high construction efficiency.
Drawings
FIG. 1 is a flow chart of a pre-fabricated part hoisting splicing 3D visual guiding method based on deep learning according to an embodiment of the invention;
FIG. 2 is a schematic diagram of recording an RGB-D image sequence of a simulated hoist scene including a precast element using an Intel RealSense depth camera;
FIG. 3 is a schematic diagram of an RGB-HSV color space transform;
fig. 4 shows the component hoisting actuator (a KUKA industrial robot mounted on a ground linear slide rail).
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
Referring to fig. 1, which shows the flow chart of the deep-learning-based prefabricated part hoisting and splicing 3D visual guidance method of the present invention, the method includes the following steps:
(1) Extracting and masking features of the prefabricated parts: based on the RGB-D image, the prefabricated part features are extracted and masking processing is performed.
In this embodiment, the RGB-D image sequence is acquired as follows: color and depth image (RGB-D) sequences of simulated hoisting scenes of common prefabricated components (such as prefabricated beams, prefabricated columns, prefabricated floor slabs, prefabricated wall panels, etc.) are recorded with an Intel RealSense depth camera, as shown in fig. 2. The color images contain the surface color and texture information of objects in the simulated hoisting scene, and the depth images contain their spatial shape information.
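As an illustration of this acquisition step, the following minimal Python sketch records an aligned color/depth sequence with the pyrealsense2 SDK; the stream resolution, frame rate, recording length and file names are assumptions chosen for the example, not values specified by the invention.

```python
# Illustrative sketch: capturing an aligned RGB-D sequence from an Intel RealSense camera.
# Stream settings and file naming are placeholders for demonstration only.
import numpy as np
import pyrealsense2 as rs
import cv2

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)   # depth stream
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)  # color stream
pipeline.start(config)
align = rs.align(rs.stream.color)  # align depth pixels to the color frame

try:
    for i in range(300):  # record roughly 10 s at 30 fps
        frames = align.process(pipeline.wait_for_frames())
        depth = np.asanyarray(frames.get_depth_frame().get_data())   # uint16 depth map
        color = np.asanyarray(frames.get_color_frame().get_data())   # uint8 BGR image
        cv2.imwrite(f"color_{i:04d}.png", color)
        np.save(f"depth_{i:04d}.npy", depth)
finally:
    pipeline.stop()
```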
It will be appreciated that, for identification of the prefabricated part in an actual scene, the image contains a broad background with no annotation of the target part's location, and the prefabricated part may appear anywhere in the image, in a variety of sizes and shapes.
RGB-D image noise reduction: when a continuous video stream is acquired from the RGB-D image sequence, noise may be introduced by mutual occlusion of objects in the scene, dense object distribution, a cluttered environment and other factors, for example high-brightness pixels or pixel blocks, blurred object pixels and geometric distortion, all of which produce strong visual artifacts. Noise is interference present in the image data and adversely affects subsequent processing such as feature extraction, so the invention applies noise reduction to the acquired RGB-D image sequence. The RGB color space is defined by the red, green and blue chromaticities, with other colors generated from the corresponding color triangle of each pixel. Images acquired under uneven illumination or low light exhibit chrominance shift, affected by saturation and brightness. Unlike the RGB color space, the HSV color space separates luminance, saturation and chrominance information and consists of three mutually independent channels: hue, saturation and value. In this method, the RGB image is first passed through Gaussian filtering (Gaussian binary mask) and then transformed into HSV space, as shown in fig. 3; the hue channel is kept unchanged, while the brightness and saturation channels are denoised with a Bayesian-estimation threshold, which reduces the number of Gaussian convolutions and improves the efficiency of noise reduction; finally, the new HSV components are transformed back through the inverse color-space transform to obtain the noise-reduced RGB-D image. This avoids the color distortion that easily arises when denoising directly in RGB space.
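The route described above can be sketched as follows with OpenCV; the median filter and non-local-means step applied to the saturation and brightness channels are simple stand-ins for the Bayesian-estimation-threshold denoising, whose exact formulation the text does not give, and the kernel sizes are illustrative.

```python
# Illustrative sketch of the noise-reduction route: Gaussian filtering, RGB->HSV,
# denoising only the S and V channels, then the inverse transform back to RGB/BGR.
# The median / non-local-means filters below stand in for the Bayesian-threshold
# denoising mentioned in the text.
import cv2
import numpy as np

def denoise_color_image(bgr: np.ndarray) -> np.ndarray:
    blurred = cv2.GaussianBlur(bgr, (5, 5), 0)            # suppress high-frequency speckle
    hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    # Hue channel kept unchanged; saturation and value channels denoised.
    s = cv2.medianBlur(s, 5)
    v = cv2.fastNlMeansDenoising(v, None, 10, 7, 21)
    hsv_clean = cv2.merge([h, s, v])
    return cv2.cvtColor(hsv_clean, cv2.COLOR_HSV2BGR)     # inverse color-space transform
```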
Mask processing based on local feature extraction: local feature extraction extracts features of a region of interest in the image so that the region can be highly discriminative. In this embodiment, a CNN is used to extract the local color, shape and spatial features of the prefabricated part in the noise-reduced RGB-D image sequence. The color, shape and spatial features describe, respectively, the RGB composition of the object's surface pixels, the object outline, and the objects' relative spatial positions in the image region. First, the CNN is trained with a set of RGB images containing prefabricated-part color, shape and spatial features. Then, the trained CNN is used to identify the prefabricated part in a new image sequence, give its boundary, and separate it from the rest of the image. Finally, the value of every pixel in the image is recomputed with a mask kernel, pixels outside the target prefabricated-part region are shielded, and the frame-cropping of the target prefabricated-part pixel region is achieved.
In this embodiment, a Faster R-CNN architecture is adopted to build a CNN that can be trained on multi-frame dense video streams, so as to shorten the training/learning time of the network and increase the speed of extracting the color, shape and spatial features of the prefabricated part, thereby realizing mask processing of the prefabricated-part image.
The Faster R-CNN architecture contains four modules: a feature extraction network, a region candidate network, region-of-interest pooling, and a classification-regression network. Specifically, the feature extraction network takes a picture as input and outputs its red, green and blue channel features, which serve as input to the subsequent region candidate network. The region candidate network takes those channel features as input and outputs a number of regions of interest. Each region of interest is represented by a probability value (used to judge whether it is foreground or background) and four coordinate values; the probability value, obtained by two-class classification of each region with a normalized exponential (softmax) function, indicates how likely the region contains an object, and the coordinate values give the predicted object position, which is regressed against the ground-truth coordinates during training so that the predicted position is more accurate at test time. Region-of-interest pooling takes the regions of interest output by the region candidate network and the channel features output by the feature extraction network, combines the two into fixed-size region feature maps, and passes them to the following fully connected network for classification. The classification-regression network takes the region feature maps from the previous layer and outputs the category of the object in each region of interest and its precise position in the image; this layer classifies with a normalized exponential (softmax) function and refines the object position by bounding-box regression.
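As an illustration of the detect-then-mask step, the following sketch runs a pretrained torchvision Faster R-CNN and zeroes out all pixels outside the best-scoring detection box; the COCO pretrained weights, the 0.7 score threshold and the function name crop_target_region are placeholders, since the invention trains the network on its own prefabricated-component RGB-D dataset.

```python
# Illustrative sketch: Faster R-CNN detection followed by masking of all pixels
# outside the highest-scoring box (the "frame-cropping" of the target region).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

@torch.no_grad()
def crop_target_region(image_np):
    """image_np: HxWx3 uint8 RGB array. Returns the masked image and the detection box."""
    img = to_tensor(image_np)                    # CHW float tensor in [0, 1]
    out = model([img])[0]                        # dict with "boxes", "labels", "scores"
    keep = out["scores"] > 0.7                   # placeholder confidence threshold
    if keep.sum() == 0:
        return None, None
    x1, y1, x2, y2 = out["boxes"][keep][0].int().tolist()   # best-scoring box
    masked = image_np.copy()
    masked[:y1, :] = 0                           # shield pixels outside the target region
    masked[y2:, :] = 0
    masked[:, :x1] = 0
    masked[:, x2:] = 0
    return masked, (x1, y1, x2, y2)
```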
(2) Iterative estimation of the prefabricated-part 6D pose: the prefabricated-part pixel blocks obtained by the mask processing in step (1) are processed with the CNN architecture to iteratively estimate the 6D pose of the prefabricated part.
In this embodiment, 6D pose iterative estimation means iteratively estimating the spatial position and orientation of the target object in the camera coordinate system. Hoisting and splicing between prefabricated components is currently achieved by docking the transverse/longitudinal steel bars extending from one component into the reserved bar-insertion holes of the component on the other side. For example, when hoisting a prefabricated column, the steel bars extending from the top of the lower column are inserted into the reserved grouting-sleeve openings at the bottom of the upper column, and concrete is poured into the sleeve openings to connect the upper and lower columns; such hoisting and splicing therefore requires accurate positioning and alignment.
It should be appreciated that the 6D poses of the target prefabricated parts in the camera coordinate system, including the part to be hoisted and the existing part at the splicing location, can be obtained by the prefabricated-part 6D pose iterative estimation algorithm.
Specifically, after the target prefabricated-part pixel block has been cropped out of the RGB-D image by mask processing, iterative 6D pose estimation of the target part makes full use of the two complementary data sources, the color (RGB) and depth image channels. The CNN architecture comprises the following three parts:
(1) A full convolutional network (Fully Convolutional Network, FCN) for processing color information, which maps each pixel point in the masked prefabricated-part pixel block into a color feature space as a color feature embedding.
(2) A coordinate conversion unit, which converts the depth-channel information in the masked prefabricated-part pixel block into point cloud data based on the camera intrinsics and maps each data point into a geometric feature space as a geometric feature embedding (see the back-projection sketch after this list).
(3) A CNN network for pixel-level image fusion, which combines the two embeddings (color and depth, i.e. the color feature embedding of (1) and the geometric feature embedding of (2)) and iteratively outputs the 6D pose of the masked prefabricated-part pixel block based on an unsupervised confidence score.
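As a concrete illustration of part (2), the following numpy sketch back-projects a masked depth patch into a point cloud with the pinhole camera model; the focal lengths, principal point and depth scale in the usage comment are placeholder intrinsics, not values from the invention.

```python
# Illustrative sketch of part (2): back-projecting a masked depth patch into a point
# cloud with the pinhole camera model. fx, fy, cx, cy and depth_scale are camera-specific
# and shown here only as placeholders.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy, depth_scale=0.001):
    """depth: HxW uint16 depth image (0 = no measurement). Returns Nx3 points in metres."""
    v, u = np.nonzero(depth)                      # pixel rows (v) and columns (u) with valid depth
    z = depth[v, u].astype(np.float64) * depth_scale
    x = (u - cx) * z / fx                         # X = (u - cx) * Z / fx
    y = (v - cy) * z / fy                         # Y = (v - cy) * Z / fy
    return np.stack([x, y, z], axis=1)

# Example with placeholder intrinsics:
# cloud = depth_to_point_cloud(depth_patch, fx=615.0, fy=615.0, cx=320.0, cy=240.0)
```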
This embodiment builds a Faster R-CNN (Region-based CNN) that can be trained on multi-frame dense video streams, so as to shorten the training/learning time of the network and increase the speed of estimating the prefabricated-part 6D pose. It should be appreciated that the 6D pose estimation of the prefabricated part is performed on a multi-frame dense video stream.
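To make the structure of parts (1)-(3) more tangible, the following PyTorch skeleton shows one plausible way the per-pixel color embeddings and per-point geometric embeddings could be fused and regressed to a pose hypothesis plus confidence per point; the class name, layer sizes and output parameterization (quaternion plus translation) are assumptions, as the text does not specify the exact architecture.

```python
# Illustrative skeleton of the fusion CNN: per-pixel colour embeddings and per-point
# geometric embeddings are concatenated and regressed to a pose hypothesis and an
# unsupervised confidence for each point. Layer sizes are placeholders.
import torch
import torch.nn as nn

class PoseFusionNet(nn.Module):
    def __init__(self, color_dim=64, geo_dim=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv1d(color_dim + geo_dim, 256, 1), nn.ReLU(),
            nn.Conv1d(256, 128, 1), nn.ReLU(),
        )
        self.head = nn.Conv1d(128, 7 + 1, 1)    # quaternion (4) + translation (3) + confidence (1)

    def forward(self, color_emb, geo_emb):
        # color_emb, geo_emb: (B, C, N) per-point feature embeddings
        x = self.fuse(torch.cat([color_emb, geo_emb], dim=1))
        out = self.head(x)                       # (B, 8, N): one pose hypothesis per point
        pose, conf = out[:, :7], torch.sigmoid(out[:, 7:])
        return pose, conf                        # the highest-confidence hypothesis is kept
```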
(3) 3D visual guidance for automatic robotic hoisting and splicing of the prefabricated part: the 6D pose to which the robot end effector is to be moved is acquired from step (2), and 3D visual guidance of the hoisting and splicing is performed based on RRT-Star path planning, guiding the hoisting robot to automatically plan the motion path of its end effector so that the grasped prefabricated part is moved to the splicing position.
Hand-eye coordinate conversion of the hoisting robot: a KUKA industrial robot (model KR1000 titan, payload 750-1300 kg, reach 3202-3601 mm) mounted on a ground linear slide rail is used as the prefabricated-part hoisting actuator, as shown in fig. 4. To establish the coordinate conversion relationship between the camera (the "eye" of the hoisting robot) and the hoisting end effector (its "hand"), a calibration plate is fixed at the hoisting end effector. By recording the 6D pose of a point on the calibration plate relative to the camera coordinate system, the transformation matrix from the hoisting end effector (i.e. the point on the calibration plate) to the camera, that is, the hand-eye calibration matrix of the hoisting robot, is solved. Through this hand-eye calibration matrix, the 6D pose of the splicing location in the camera coordinate system is converted into the 6D pose in the hoisting end-effector coordinate system.
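A hedged sketch of this calibration step using OpenCV's calibrateHandEye is given below; the pose lists would be recorded by moving the robot through several configurations while the camera observes the calibration plate fixed at the end effector, and the helper name solve_hand_eye is hypothetical. OpenCV's convention is the eye-in-hand form (AX = XB); an eye-to-hand setup can be handled by feeding the inverted robot poses, a detail omitted here.

```python
# Illustrative sketch: solving the hand-eye transform with OpenCV's calibrateHandEye.
# The pose lists must be filled with rotations (3x3) and translations (3x1) recorded
# at several (>= 3) distinct robot poses.
import cv2
import numpy as np

def solve_hand_eye(R_gripper2base, t_gripper2base, R_target2cam, t_target2cam):
    """Returns the 4x4 homogeneous transform from the camera frame to the end-effector frame."""
    R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
        R_gripper2base, t_gripper2base, R_target2cam, t_target2cam,
        method=cv2.CALIB_HAND_EYE_TSAI)
    T = np.eye(4)
    T[:3, :3] = R_cam2gripper
    T[:3, 3] = t_cam2gripper.ravel()
    return T   # use T to map 6D poses from the camera frame into the end-effector frame
```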
RRT-Star path planning solution: RRT-Star is a sampling-based path planning algorithm suited to mobile-robot path planning in high-dimensional spaces under complex constraints. Its basic idea is to search toward the target point step by step through randomly generated points; it avoids obstacles effectively, keeps the path from falling into local minima, and converges quickly. Prior path-planning designs for mechanical arms treat the component as a particle, whereas building components are rigid bodies with specific shapes in motion; for example, while the mechanical arm grasps and moves a prefabricated column, the column is a rigid body rather than a mass point. Planning motion with the grasped building component treated only as a mass point deviates from the real scene, so the obstacle-avoidance function is not fully considered. The invention therefore establishes a path-planning and obstacle-avoidance algorithm for automatic installation by the mechanical arm based on a rigid-body model of the component to be grasped. The specific method is to embed a Minkowski-difference operator in the RRT-Star path-planning architecture. The Minkowski difference is the difference set of two point sets in Euclidean space; when the difference set contains the zero vector, the two point sets overlap and the collision detection result is "collision expected". The combined RRT-Star and Minkowski-difference solution is innovative: it adds multi-dimensional kinematic constraints to the redundant system of equations in the independent variables, yielding the optimal motion path of the robot along the linear slide rail and the optimal motion trajectory of each robot joint, which are then used to control the work.
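The Minkowski-difference collision test at the heart of this step can be sketched as follows for two convex 2D vertex sets; the RRT-Star sampling loop that would invoke this test at each candidate node, and the extension to 3D rigid bodies, are not reproduced here.

```python
# Illustrative sketch of the Minkowski-difference collision test: two convex point
# sets overlap if and only if the origin (zero vector) lies inside the convex hull
# of their Minkowski difference A - B.
import numpy as np
from scipy.spatial import Delaunay

def minkowski_difference(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """A: Nx2, B: Mx2 vertex sets. Returns the (N*M)x2 pairwise difference set."""
    return (A[:, None, :] - B[None, :, :]).reshape(-1, 2)

def convex_sets_collide(A: np.ndarray, B: np.ndarray) -> bool:
    diff = minkowski_difference(A, B)
    hull = Delaunay(diff)                                  # triangulation of the difference set
    return bool(hull.find_simplex(np.zeros((1, 2)))[0] >= 0)   # origin inside => overlap

# Example: two unit squares, one shifted by (0.5, 0.5) -> collision expected.
sq = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
print(convex_sets_collide(sq, sq + 0.5))   # True  (overlapping)
print(convex_sets_collide(sq, sq + 2.0))   # False (disjoint)
```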
The invention develops algorithms for prefabricated-part feature extraction and mask processing, iterative 6D pose estimation of prefabricated parts, and 3D visual guidance for automatic robotic hoisting and splicing. The method finally issues control signals to the digital processor of the mechanical arm to guide it in automatically installing the prefabricated component, effectively addressing the low assembly automation and low construction efficiency of the traditional fabricated structural system.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A 3D visual guidance method for hoisting and splicing prefabricated parts based on deep learning, characterized by comprising the following steps:
(1) Based on the RGB-D image, extracting features of the prefabricated part and performing mask processing; wherein, for identification of the prefabricated part, the image contains a large-scale background with no annotation of the position of the target prefabricated part, and the prefabricated part may appear at any position in the image and in various sizes and shapes;
(2) Processing the prefabricated-part pixel blocks obtained by the mask processing in step (1) based on a CNN architecture to iteratively estimate the 6D pose of the prefabricated part;
the CNN architecture in the step (2) includes:
a full convolutional network for processing color information, which maps each pixel point in the prefabricated-part pixel block obtained by the mask processing in step (1) into a color feature space as a color feature embedding;
a coordinate conversion unit, which converts the depth-channel information in the prefabricated-part pixel block obtained by the mask processing in step (1) into point cloud data and maps each data point into a geometric feature space as a geometric feature embedding; and
a CNN network for pixel-level image fusion, which combines the color feature embedding and the geometric feature embedding and iteratively outputs the 6D pose of the masked prefabricated-part pixel block based on an unsupervised confidence score;
(3) Acquiring, through step (2), the 6D pose to which the robot end effector is to be moved, and performing 3D visual guidance of the hoisting and splicing of the prefabricated part based on RRT-Star path planning, so as to guide the hoisting robot to automatically plan the motion path of its end effector and move the grasped prefabricated part to the splicing position;
in step (3), given the 6D pose to which the robot end effector is to be moved, the motion position and joint angle of each robot joint are solved, so as to provide 3D visual guidance for the hoisting and splicing of the prefabricated parts and to control the mechanical arm;
a path-planning and obstacle-avoidance algorithm for automatic installation by the mechanical arm is established based on a rigid-body model of the component to be grasped; the specific method is to embed a Minkowski-difference operator in the RRT-Star path-planning architecture, the Minkowski difference being the difference set of two point sets in Euclidean space; the RRT-Star and Minkowski-difference solutions are superposed, and multi-dimensional kinematic constraints are added to the redundant system of equations in the independent variables, so as to obtain the optimal motion path of the robot on the linear slide rail and the optimal motion trajectory of each robot joint and thereby control the mechanical arm.
2. The 3D visual guidance method for hoisting and splicing prefabricated components based on deep learning according to claim 1, wherein the RGB-D images in the step (1) are acquired by an Intel RealSense depth camera to obtain color images and depth images.
3. The deep-learning-based prefabricated part hoisting and splicing 3D visual guidance method of claim 2, wherein the color image comprises surface color information and texture information of objects in the simulated hoisting scene, and the depth image comprises spatial shape information of objects in the simulated hoisting scene.
4. The deep-learning-based prefabricated part hoisting and splicing 3D visual guidance method according to claim 1, wherein the mask processing in step (1) is implemented based on a Faster R-CNN architecture, which includes a feature extraction network, a region candidate network, region-of-interest pooling, and a classification-regression network.
5. The 3D visual guidance method for hoisting and splicing prefabricated parts based on deep learning according to claim 1, wherein the iterative estimation of the 6D pose of the prefabricated parts in the step (2) is performed on a multi-frame dense video stream.
CN202310074563.4A 2023-02-07 2023-02-07 Prefabricated part hoisting splicing 3D visual guiding method based on deep learning Active CN116416307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310074563.4A CN116416307B (en) 2023-02-07 2023-02-07 Prefabricated part hoisting splicing 3D visual guiding method based on deep learning


Publications (2)

Publication Number Publication Date
CN116416307A CN116416307A (en) 2023-07-11
CN116416307B true CN116416307B (en) 2024-04-02

Family

ID=87053943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310074563.4A Active CN116416307B (en) 2023-02-07 2023-02-07 Prefabricated part hoisting splicing 3D visual guiding method based on deep learning

Country Status (1)

Country Link
CN (1) CN116416307B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215080A (en) * 2018-09-25 2019-01-15 清华大学 6D Attitude estimation network training method and device based on deep learning Iterative matching
CN110376195A (en) * 2019-07-11 2019-10-25 中国人民解放军国防科技大学 Explosive detection method
CN110962130A (en) * 2019-12-24 2020-04-07 中国人民解放军海军工程大学 Heuristic RRT mechanical arm motion planning method based on target deviation optimization
CN111145253A (en) * 2019-12-12 2020-05-12 深圳先进技术研究院 Efficient object 6D attitude estimation algorithm
CN114742888A (en) * 2022-03-12 2022-07-12 北京工业大学 6D attitude estimation method based on deep learning
CN114912287A (en) * 2022-05-26 2022-08-16 四川大学 Robot autonomous grabbing simulation system and method based on target 6D pose estimation


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Rui Zeng et al. View planning in robot active vision: A survey of systems, algorithms, and applications. 2020, 225-245. *
Zou Dewei. Intelligent grasping by collaborative robots based on machine vision. CNKI Master's Electronic Journals, Information Science and Technology Series; pp. 1-72. *
Zou Dewei. Intelligent grasping by collaborative robots based on machine vision. CNKI Master's Electronic Journals, Information Science and Technology Series. 2022, pp. 1-72. *

Also Published As

Publication number Publication date
CN116416307A (en) 2023-07-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant