CN113129370B - Semi-supervised object pose estimation method combining generated data and label-free data - Google Patents

Semi-supervised object pose estimation method combining generated data and label-free data

Info

Publication number
CN113129370B
CN113129370B CN202110241227.5A CN202110241227A
Authority
CN
China
Prior art keywords
data
training
point cloud
pose
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110241227.5A
Other languages
Chinese (zh)
Other versions
CN113129370A (en)
Inventor
陈启军
周光亮
颜熠
王德明
刘成菊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202110241227.5A priority Critical patent/CN113129370B/en
Publication of CN113129370A publication Critical patent/CN113129370A/en
Application granted granted Critical
Publication of CN113129370B publication Critical patent/CN113129370B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a semi-supervised object pose estimation method combining generated data and label-free data, which comprises the following steps: 1) generating point cloud data with pose labels, i.e. the generated data; 2) acquiring a color image and a depth image of a target object without labels, inputting the color image into a trained instance segmentation network to obtain an instance segmentation result, and obtaining the point cloud of the target object, i.e. the label-free real data, from the depth image according to the segmentation result; 3) in each training period, performing supervised training on the pose estimation network model with the generated data and self-supervised training on the pose estimation network model with the unlabeled real data; 4) after each training period is finished, calculating the accuracy of the pose estimation network model with part of the real data. Compared with the prior art, the method mainly solves the problem that 6D pose labels are difficult to acquire, and can accurately estimate the pose of an object using only synthetic data and unlabeled real data.

Description

Semi-supervised object pose estimation method combining generated data and label-free data
Technical Field
The invention relates to the field of robot vision, in particular to a semi-supervised object pose estimation method combining generated data and label-free data.
Background
Object pose estimation based on computer vision is a key technology for robotic grasping and dexterous manipulation. It is of great significance for improving a robot's adaptability to environments and tasks, widening the application fields of robots, and improving their flexibility and efficiency in scenarios such as intelligent manufacturing, warehouse logistics and home service. In addition, the technology has broad application prospects in fields such as autonomous driving, augmented reality and virtual reality.
In recent years, with the rapid development of deep learning, object pose estimation based on deep learning has achieved good results. In unstructured scenes with background clutter, stacked and occluded objects, and illumination changes, deep learning methods outperform traditional pose estimation methods in robustness, accuracy and real-time performance. However, deep learning is data-driven and requires a large amount of labeled training data to achieve satisfactory results, while in the field of object pose estimation 6D labels are difficult, time-consuming and labor-intensive to obtain.
To address the data acquisition problem, two main approaches exist at present. The first is to synthesize data artificially from a CAD model of the object. However, there is a domain gap between directly synthesized data and real data, so a model trained on synthetic data performs poorly in real scenes. To eliminate the domain gap, methods such as domain randomization, domain adaptation and photo-realistic image generation have been developed; although they achieve some improvement, they still do not reach the performance of models trained on real data. The second category consists of methods based on self-supervised and semi-supervised learning. Self-supervised and semi-supervised learning have been research hotspots in recent years and have been studied extensively in fields such as image classification and human pose estimation, but only a few attempts exist in object pose estimation. The existing method renders corresponding mask, color and depth images from the object model according to the pose predicted by the network, and uses visual and geometric alignment between these renderings and the actual input as the supervisory target, thereby realizing self-supervised training of the network. Although pose labels are not needed, this method still relies on generated color images for supervision, so the influence of the domain gap is not eliminated and its accuracy cannot meet the requirements of practical applications.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a semi-supervised object pose estimation method combining generated data and label-free data.
The purpose of the invention can be realized by the following technical scheme:
a semi-supervised object pose estimation method combining generated data and label-free data comprises the following steps:
1) generating point cloud data with pose labels from a CAD model of the target object, i.e. the generated data;
2) acquiring a color image and a depth image of the target object without labels, inputting the color image into a trained instance segmentation network to obtain an instance segmentation result, and obtaining the point cloud of the target object, i.e. the label-free real data, from the depth image according to the segmentation result;
3) in each training period, performing supervised training on the pose estimation network model with the generated data, and performing self-supervised training on the pose estimation network model with the unlabeled real data;
4) after each training period is finished, calculating the accuracy of the pose estimation network model with part of the real data, selecting the final pose estimation network model according to the accuracy, and estimating the object pose with the final pose estimation network model.
In step 3), within each training period, supervised training is performed first, followed by self-supervised training.
During supervised training of the pose estimation network model with the generated data, the input point cloud is transformed according to the pose predicted by the pose estimation network model and according to the real pose in the pose label, respectively, and the average distance between the two transformed point clouds is calculated and used as the loss function of the supervised training.
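The following is a minimal Python/PyTorch sketch of such a supervised loss (an ADD-style average distance); the function and variable names are illustrative assumptions rather than the patent's implementation, and the batch dimension is omitted for clarity.

```python
import torch

def supervised_add_loss(points, R_pred, t_pred, R_gt, t_gt):
    """Average distance between the point cloud transformed by the predicted
    pose and the same cloud transformed by the ground-truth pose.

    points: (N, 3) point cloud to transform
    R_pred, R_gt: (3, 3) rotation matrices; t_pred, t_gt: (3,) translations
    """
    pred = points @ R_pred.T + t_pred   # points under the predicted pose
    gt = points @ R_gt.T + t_gt         # points under the ground-truth pose
    return (pred - gt).norm(dim=1).mean()
```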
During self-supervised training of the pose estimation network model with the label-free real data, the model point cloud is transformed according to the pose predicted by the pose estimation network model, and the average distance between the transformed model point cloud and the actually input real data is calculated to form the loss function of the self-supervised training.
The average distance between the transformed model point cloud and the actually input real data is calculated as follows:
for each point in the actually input real point cloud, the spatially closest point in the transformed model point cloud is found, forming a closest-point set; the average distance between the actual point cloud and the corresponding points in the closest-point set is then calculated, which is the average distance between the transformed model point cloud and the actually input real data.
The expression of the loss function L of the self-supervised training is:

$$L = \frac{1}{N} \sum_{i=1}^{N} \min_{j \in \{1,\dots,M\}} \left\| P_i - \left(\hat{R} Q_j + \hat{t}\right) \right\|$$

where $P_i$ is the i-th point in the actually input point cloud, $Q_j$ is the j-th point in the object model point cloud, $\hat{R}$ and $\hat{t}$ are the rotation component and translation component predicted by the pose estimation network model, i.e. the predicted pose, N is the number of points in the input point cloud, and M is the number of points in the model point cloud.
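Below is a hedged Python/PyTorch sketch of this closest-point average distance; `torch.cdist` computes the pairwise distances, and all names are illustrative, not the patent's code.

```python
import torch

def self_supervised_loss(input_points, model_points, R_pred, t_pred):
    """Average distance from each observed point to its nearest neighbour in
    the model point cloud transformed by the predicted pose.

    input_points: (N, 3) segmented real object point cloud (no pose label)
    model_points: (M, 3) points sampled from the object CAD model
    R_pred: (3, 3) predicted rotation; t_pred: (3,) predicted translation
    """
    transformed = model_points @ R_pred.T + t_pred   # (M, 3) transformed model points
    dists = torch.cdist(input_points, transformed)   # (N, M) pairwise distances
    nearest = dists.min(dim=1).values                # distances to the closest-point set
    return nearest.mean()                            # average distance, i.e. loss L
```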
During self-supervised training of the pose estimation network model with label-free real data, a random homogeneous transformation is applied to the input real data to obtain a new point cloud; the two point clouds before and after the random homogeneous transformation are then input separately into the pose estimation network model, two self-supervised loss functions are calculated from the respective predicted poses, and the network is trained with both losses together.
In the self-supervised training, the expression of the complete self-supervised loss function is:

$$L_{\mathrm{self}} = L_1 + L_2$$

where the subscripts 1 and 2 correspond to the two point clouds before and after the random homogeneous transformation, respectively.
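The sketch below illustrates this point cloud transformation strategy. It assumes a `pose_net` that maps a point cloud to a predicted rotation and translation and reuses the `self_supervised_loss` sketch above; the random-transform magnitudes are assumed values not specified by the patent.

```python
import torch

def random_homogeneous_transform(points, trans_scale=0.05):
    """Apply a random rigid (homogeneous) transform to a point cloud;
    trans_scale is an assumed translation magnitude."""
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    R = q * torch.sign(torch.det(q))        # force a proper rotation (det = +1)
    t = torch.randn(3) * trans_scale
    return points @ R.T + t

def dual_self_supervised_loss(pose_net, input_points, model_points):
    """Predict poses for the original and the randomly transformed cloud and
    sum the two self-supervised losses, training the network with both."""
    transformed_points = random_homogeneous_transform(input_points)
    R1, t1 = pose_net(input_points)          # predicted pose for cloud 1
    R2, t2 = pose_net(transformed_points)    # predicted pose for cloud 2
    loss1 = self_supervised_loss(input_points, model_points, R1, t1)
    loss2 = self_supervised_loss(transformed_points, model_points, R2, t2)
    return loss1 + loss2
```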
In step 4), after each training period is finished, the average distance between the actual point cloud and the transformed model point cloud is calculated on a test set and used as the evaluation index of the accuracy of the pose estimation network model; the smaller the average distance, the more accurate the model.
After the first training period, the calculated average distance is taken as the optimal distance. In each following training period, if the calculated average distance is smaller than the optimal distance, the optimal distance is updated; if the average distance is larger than the optimal distance and the difference exceeds a set threshold, the model trained in that period is discarded, and the next period continues training from the model of the previous period.
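A small sketch of this model-selection rule follows; the default threshold value and the way network states are passed around are assumptions, since the patent only requires "a set threshold".

```python
def keep_or_revert(avg_dist, best_dist, period_state, prev_state, threshold=0.01):
    """Decide what to do with the model after one training period.

    avg_dist:     average point distance on the test set for this period
    best_dist:    smallest average distance seen so far (the optimal distance)
    period_state: network weights at the end of this period
    prev_state:   network weights at the end of the previous period
    Returns (updated optimal distance, weights to continue training from).
    """
    if avg_dist < best_dist:
        return avg_dist, period_state        # better model: update the optimum
    if avg_dist - best_dist > threshold:
        return best_dist, prev_state         # much worse: discard this period's model
    return best_dist, period_state           # slightly worse: keep training from it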
Compared with the prior art, the invention has the following advantages:
the semi-supervised training method using the generated data and the unmarked real data solves the problem that the existing object pose estimation method based on deep learning depends on large-scale marked real data, and greatly improves the flexibility of pose estimation application.
In the process of self-supervision training, the invention adopts a point cloud transformation strategy and calculates two self-supervision losses to train the network simultaneously, thereby effectively preventing the influence of the mis-alignment between the point clouds on the network training.
In each training period, the method and the device sequentially utilize the generated data to perform supervised training and utilize the real data to perform self-supervised training.
Drawings
Fig. 1 is a flowchart of a semi-supervised pose estimation method of the present invention.
Fig. 2 is a partial pose estimation result diagram.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
This embodiment provides a semi-supervised object pose estimation method based on generated data and unlabeled real data; a schematic diagram of the framework is shown in Fig. 1, and the method comprises the following steps:
S1, generating object point cloud data with pose labels using the CAD model of the object;
S2, acquiring a color image and a depth image of the target object without labels, inputting the color image into a trained instance segmentation network to obtain an instance segmentation result, and acquiring a point cloud of the target object from the depth image according to the segmentation result;
S3, in each training period, performing supervised training on the pose estimation network with the labeled generated data, and self-supervised training on the network with the unlabeled real data;
S4, after each training period is finished, calculating the accuracy of the pose estimation model with part of the real data, and selecting the final network model according to the accuracy.
In step S1, OpenGL is used to render the object CAD model so as to obtain labeled object point cloud data at different poses.
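The sketch below is a deliberately simplified stand-in for this step: it skips the OpenGL rendering (and therefore self-occlusion) and directly transforms CAD-model points with random poses to produce labeled pairs; the pose ranges and noise level are assumptions.

```python
import numpy as np

def generate_labeled_point_clouds(model_points, num_samples, rng=None):
    """Produce (point cloud, pose label) pairs by transforming CAD-model
    points with random poses; rendering and occlusion are intentionally
    omitted in this simplification.

    model_points: (M, 3) array of points sampled from the object CAD model
    """
    rng = rng or np.random.default_rng()
    samples = []
    for _ in range(num_samples):
        q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        R = q * np.sign(np.linalg.det(q))            # proper random rotation
        t = rng.uniform(-0.1, 0.1, size=3)           # assumed translation range
        cloud = model_points @ R.T + t
        cloud = cloud + rng.normal(scale=0.001, size=cloud.shape)  # mild noise (assumed)
        samples.append((cloud, (R, t)))              # point cloud with its pose label
    return samples
```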
In step S2, the instance segmentation network is first trained with real color images annotated with 2D mask labels; the trained segmentation network is then used to segment real color images, and according to the segmentation result the object region of the depth map is converted into an object point cloud using the camera intrinsics, which serves as the input of the subsequent pose estimation network.
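A short sketch of the depth-to-point-cloud conversion under a pinhole camera model is given below; the intrinsic parameters and depth scale are placeholders for whatever the actual camera provides.

```python
import numpy as np

def depth_mask_to_point_cloud(depth, mask, fx, fy, cx, cy, depth_scale=1000.0):
    """Back-project the masked region of a depth image into an object point
    cloud using the pinhole camera model.

    depth: (H, W) raw depth image; mask: (H, W) boolean instance mask
    fx, fy, cx, cy: camera intrinsics; depth_scale converts raw depth units
    to metres (1000.0 assumes millimetre depth).
    """
    v, u = np.nonzero(mask)                       # pixel coordinates inside the mask
    z = depth[v, u].astype(np.float64) / depth_scale
    valid = z > 0                                 # drop pixels with missing depth
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)            # (N, 3) object point cloud
```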
In step S3, in each training period the pose estimation network is first trained in a supervised manner with the generated data and then in a self-supervised manner with the unlabeled real data, as shown in Fig. 1. In supervised training with the labeled generated data, the object model point cloud is transformed with the pose predicted by the network and with the real pose, respectively, and the average distance between the two transformed model point clouds is calculated and used as the loss function of the supervised training. In self-supervised training with the real data, the object model point cloud is transformed with the pose predicted by the network, and the average distance between the transformed model point cloud and the actual input point cloud is calculated to form the loss function of the self-supervised training. The average distance between the transformed model point cloud and the actual input point cloud is calculated as follows:
for each point in the actual input point cloud, the spatially closest point in the transformed model point cloud is found, forming a closest-point set; the average distance between the actual point cloud and the corresponding points in the closest-point set is then calculated as the average distance between the transformed model point cloud and the actual input point cloud. The calculation formula is as follows:
$$L = \frac{1}{N} \sum_{i=1}^{N} \min_{j \in \{1,\dots,M\}} \left\| P_i - \left(\hat{R} Q_j + \hat{t}\right) \right\|$$

where $P_i$ is the i-th point in the input object point cloud, $Q_j$ is the j-th point in the object model point cloud, $\hat{R}$ and $\hat{t}$ are the rotation and translation components predicted by the network, N is the number of points in the input point cloud, and M is the number of points in the model point cloud.
In addition, during self-supervised training, the input point cloud is first subjected to a random homogeneous transformation to obtain a new point cloud; the two point clouds are then input into the network separately, two self-supervised losses are calculated from their respective predicted poses, and the network is trained with both. The complete self-supervised loss function is:

$$L_{\mathrm{self}} = L_1 + L_2$$

where the subscripts 1 and 2 denote the point clouds before and after the random homogeneous transformation, respectively.
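Putting the pieces together, one training period might look like the following sketch, which reuses the loss functions sketched earlier; `pose_net`, the data loaders and the optimizer are assumed objects, and batching details are omitted.

```python
def train_one_period(pose_net, optimizer, synthetic_loader, real_loader, model_points):
    """One training period: a supervised pass over the generated data followed
    by a self-supervised pass over the unlabeled real data."""
    pose_net.train()
    # 1) supervised training with pose-labeled generated data
    for points, R_gt, t_gt in synthetic_loader:
        R_pred, t_pred = pose_net(points)
        loss = supervised_add_loss(model_points, R_pred, t_pred, R_gt, t_gt)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # 2) self-supervised training with unlabeled real point clouds
    for real_points in real_loader:
        loss = dual_self_supervised_loss(pose_net, real_points, model_points)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```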
in the implementation process of step S4, after each training period is finished, the average distance between the actual point cloud and the converted model point cloud is calculated on a test set, and the calculated average distance is used as an evaluation index of model accuracy. The smaller the average distance, the more accurate the model is considered. And in the first training period, taking the calculated average distance as the optimal distance. In the subsequent training period, if the average distance is smaller than the optimal distance, the optimal distance value is updated; if the average distance is larger than the optimal average distance and the difference value is larger than the set threshold value, the model trained in the period is abandoned, and the next period continues to train on the basis of the model in the previous period.
In summary, compared with pose estimation methods in the prior art, the main innovations of the invention are the following three points:
according to the invention, the pose estimation network can be trained only by generating data and unmarked real data, so that the problem that the 6D pose tag is difficult to acquire in the existing method is solved, and the flexibility of pose estimation application is greatly improved.
The invention provides a point-cloud-based self-supervised training method for object pose estimation, which supervises the network by aligning the transformed model point cloud with the actual object point cloud; it further provides a point cloud transformation strategy in which two self-supervised losses are calculated simultaneously to train the network, effectively preventing incorrect alignment between point clouds from affecting network training.
The invention provides a semi-supervised training method in which, within each training period, supervised training with the generated data and self-supervised training with the real data are carried out in sequence. Part of the pose estimation results are shown in Fig. 2.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (4)

1. A semi-supervised object pose estimation method combining generated data and label-free data, characterized by comprising the following steps:
1) generating point cloud data with pose labels from a CAD model of a target object, i.e. the generated data;
2) acquiring a color image and a depth image of the target object without labels, inputting the color image into a trained instance segmentation network to obtain an instance segmentation result, and obtaining the point cloud of the target object, i.e. the label-free real data, from the depth image according to the segmentation result;
3) in each training period, performing supervised training on the pose estimation network model with the generated data and self-supervised training on the pose estimation network model with the unlabeled real data, wherein in each training period the supervised training is performed first and the self-supervised training afterwards; during the self-supervised training of the pose estimation network model with the unlabeled real data, the model point cloud is transformed according to the pose predicted by the pose estimation network model, and the average distance between the transformed model point cloud and the actually input real data is calculated to form the loss function of the self-supervised training, the average distance between the transformed model point cloud and the actually input real data being calculated as follows:
for each point in the actually input real point cloud, the spatially closest point in the transformed model point cloud is found, forming a closest-point set, and the average distance between the actual point cloud and the corresponding points in the closest-point set is then calculated, i.e. the average distance between the transformed model point cloud and the actually input real data;
the expression of the loss function L of the self-supervised training is:
Figure FDA0003592516590000011
wherein,
Figure FDA0003592516590000012
for the ith point in the actually input point cloud,
Figure FDA0003592516590000013
is the j-th point in the object model point cloud,
Figure FDA0003592516590000014
and
Figure FDA0003592516590000015
respectively predicting a rotary component and a translational component, namely the pose, which are obtained by the pose estimation network model, wherein N is the point number of the input point cloud, and M is the point number of the model point cloud;
during the self-supervised training of the pose estimation network model with the unlabeled real data, a random homogeneous transformation is applied to the input real data to obtain a new point cloud, the two point clouds before and after the random homogeneous transformation are input separately into the pose estimation network model, two self-supervised loss functions are calculated from the respective predicted poses, and the network is trained with both together, the complete self-supervised loss function in the self-supervised training being:

$$L_{\mathrm{self}} = L_1 + L_2$$

where the subscripts 1 and 2 correspond to the two point clouds before and after the random homogeneous transformation, respectively;
4) after each training period is finished, calculating the accuracy of the pose estimation network model with part of the real data, selecting the final pose estimation network model according to the accuracy, and estimating the object pose with the final pose estimation network model.
2. The semi-supervised object pose estimation method combining generated data and unlabeled data according to claim 1, wherein during the supervised training of the pose estimation network model with the generated data, the input point cloud data is transformed according to the pose predicted by the pose estimation network model and according to the real pose in the pose label, respectively, and the average distance between the two transformed point clouds is calculated as the loss function of the supervised training.
3. The semi-supervised object pose estimation method combining generated data and label-free data according to claim 1, wherein in step 4), after each training period is finished, the average distance between the actual point cloud and the transformed model point cloud is calculated on a test set and used as the evaluation index of the accuracy of the pose estimation network model, and the smaller the average distance, the more accurate the model.
4. The semi-supervised object pose estimation method combining generated data and unlabeled data according to claim 3, wherein after the first training period, the calculated average distance is taken as the optimal distance; in each subsequent training period, if the calculated average distance is smaller than the optimal distance, the optimal distance is updated, and if the average distance is larger than the optimal distance and the difference exceeds a set threshold, the model trained in that period is discarded and the next period continues training from the model of the previous period.
CN202110241227.5A 2021-03-04 2021-03-04 Semi-supervised object pose estimation method combining generated data and label-free data Active CN113129370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110241227.5A CN113129370B (en) 2021-03-04 2021-03-04 Semi-supervised object pose estimation method combining generated data and label-free data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110241227.5A CN113129370B (en) 2021-03-04 2021-03-04 Semi-supervised object pose estimation method combining generated data and label-free data

Publications (2)

Publication Number Publication Date
CN113129370A CN113129370A (en) 2021-07-16
CN113129370B true CN113129370B (en) 2022-08-19

Family

ID=76772511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110241227.5A Active CN113129370B (en) 2021-03-04 2021-03-04 Semi-supervised object pose estimation method combining generated data and label-free data

Country Status (1)

Country Link
CN (1) CN113129370B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023206268A1 (en) * 2022-04-28 2023-11-02 西门子股份公司 Method and apparatus for generating training data set, and electronic device and readable medium
CN115953410B (en) * 2023-03-15 2023-05-12 安格利(成都)仪器设备有限公司 Corrosion pit automatic detection method based on target detection supervised learning

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845515A (en) * 2016-12-06 2017-06-13 上海交通大学 Robot target identification and pose reconstructing method based on virtual sample deep learning
CN108491880A (en) * 2018-03-23 2018-09-04 西安电子科技大学 Object classification based on neural network and position and orientation estimation method
CN109816725A (en) * 2019-01-17 2019-05-28 哈工大机器人(合肥)国际创新研究院 A kind of monocular camera object pose estimation method and device based on deep learning
EP3500897A1 (en) * 2016-08-29 2019-06-26 Siemens Aktiengesellschaft Method and system for anomaly detection in a manufacturing system
CN110490928A (en) * 2019-07-05 2019-11-22 天津大学 A kind of camera Attitude estimation method based on deep neural network
CN110503680A (en) * 2019-08-29 2019-11-26 大连海事大学 Monocular scene depth estimation method based on unsupervised convolutional neural networks
CN110910451A (en) * 2019-10-23 2020-03-24 同济大学 Object pose estimation method and system based on deformed convolution network
CN111797692A (en) * 2020-06-05 2020-10-20 武汉大学 Depth image gesture estimation method based on semi-supervised learning
CN111931591A (en) * 2020-07-15 2020-11-13 北京百度网讯科技有限公司 Method and device for constructing key point learning model, electronic equipment and readable storage medium

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3500897A1 (en) * 2016-08-29 2019-06-26 Siemens Aktiengesellschaft Method and system for anomaly detection in a manufacturing system
CN106845515A (en) * 2016-12-06 2017-06-13 上海交通大学 Robot target identification and pose reconstructing method based on virtual sample deep learning
CN108491880A (en) * 2018-03-23 2018-09-04 西安电子科技大学 Object classification based on neural network and position and orientation estimation method
CN109816725A (en) * 2019-01-17 2019-05-28 哈工大机器人(合肥)国际创新研究院 A kind of monocular camera object pose estimation method and device based on deep learning
CN110490928A (en) * 2019-07-05 2019-11-22 天津大学 A kind of camera Attitude estimation method based on deep neural network
CN110503680A (en) * 2019-08-29 2019-11-26 大连海事大学 Monocular scene depth estimation method based on unsupervised convolutional neural networks
CN110910451A (en) * 2019-10-23 2020-03-24 同济大学 Object pose estimation method and system based on deformed convolution network
CN111797692A (en) * 2020-06-05 2020-10-20 武汉大学 Depth image gesture estimation method based on semi-supervised learning
CN111931591A (en) * 2020-07-15 2020-11-13 北京百度网讯科技有限公司 Method and device for constructing key point learning model, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN113129370A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
CN111179324B (en) Object six-degree-of-freedom pose estimation method based on color and depth information fusion
CN108416840B (en) Three-dimensional scene dense reconstruction method based on monocular camera
CN108648274B (en) Cognitive point cloud map creating system of visual SLAM
CN110243370A (en) A kind of three-dimensional semantic map constructing method of the indoor environment based on deep learning
CN111145253B (en) Efficient object 6D attitude estimation algorithm
CN113129370B (en) Semi-supervised object pose estimation method combining generated data and label-free data
CN113012122B (en) Category-level 6D pose and size estimation method and device
CN110188835A (en) Data based on production confrontation network model enhance pedestrian's recognition methods again
CN109815847A (en) A kind of vision SLAM method based on semantic constraint
CN111797692B (en) Depth image gesture estimation method based on semi-supervised learning
CN113297988B (en) Object attitude estimation method based on domain migration and depth completion
CN114613013A (en) End-to-end human behavior recognition method and model based on skeleton nodes
CN113221647A (en) 6D pose estimation method fusing point cloud local features
CN113408584A (en) RGB-D multi-modal feature fusion 3D target detection method
Du et al. Stereo-matching network for structured light
CN115661246A (en) Attitude estimation method based on self-supervision learning
Tao et al. Indoor 3D semantic robot VSLAM based on mask regional convolutional neural network
CN112465836B (en) Thermal infrared semantic segmentation unsupervised field self-adaption method based on contour information
CN112308893B (en) Monocular depth estimation method based on iterative search strategy
Qiu et al. HGG-CNN: the generation of the optimal robotic grasp pose based on vision
Wada et al. Dataset Generation for Semantic Segmentation from 3D Scanned Data Considering Domain Gap
CN114266900B (en) Monocular 3D target detection method based on dynamic convolution
CN117523547B (en) Three-dimensional scene semantic perception method, system, equipment and medium
CN116740820B (en) Single-view point cloud three-dimensional human body posture and shape estimation method based on automatic augmentation
Yang et al. Visual Semantic SLAM Based on Examination of Moving Consistency in Dynamic Scenes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant