Improving 6D Pose Estimation of Objects in Clutter via Physics-aware Monte Carlo Tree Search

This work proposes a process for efficiently searching over combinations of individual 6D object pose hypotheses in cluttered scenes, especially in cases involving occlusions and objects resting on each other. The initial set of candidate object poses is generated with state-of-the-art object detection and global point cloud registration techniques. Due to overlaps and occlusions, however, the top-scored pose these techniques return for each object may not be accurate. Experiments in this work indicate that lower-ranked candidate poses are often closer to the true poses than the top-ranked ones. This motivates a global optimization process that improves the candidate poses by taking into account scene-level physical interactions between objects, and it implies that the Cartesian product of candidate poses for interacting objects must be searched to identify the best scene-level hypothesis. To make this search tractable, the candidate poses for each object are first clustered, reducing their number while preserving diversity. Searching over the combinations of candidate object poses is then performed through a Monte Carlo Tree Search (MCTS) process, guided by a score that measures the similarity between the observed depth image of the scene and a rendering of the scene under the hypothesized poses. MCTS handles in a principled way the tradeoff between fine-tuning the most promising poses and exploring new ones by using the Upper Confidence Bound (UCB) technique. Experimental results indicate that this process quickly identifies physically consistent object poses in cluttered scenes that are significantly closer to ground truth than poses found by point cloud registration methods alone.
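The exploration/exploitation tradeoff mentioned above is the standard UCB1 rule used in bandit-based MCTS. A minimal sketch of how UCB could select among a node's candidate object poses is shown below; the node layout, exploration constant, and reward bookkeeping are illustrative assumptions, not the paper's exact implementation (in the actual system the reward would come from comparing the observed depth image against a rendering of the hypothesized scene).

```python
import math

# Assumed exploration constant; UCB1 commonly uses sqrt(2).
C = math.sqrt(2)

def ucb_score(total_value, visits, parent_visits, c=C):
    """UCB1: average reward plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited candidates first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_candidate(children):
    """Pick the candidate pose maximizing UCB among a node's children.

    `children` maps a candidate pose id to (total_value, visit_count),
    where total_value accumulates rendering-similarity rewards.
    """
    parent_visits = sum(v for _, v in children.values()) or 1
    return max(children,
               key=lambda k: ucb_score(children[k][0], children[k][1],
                                       parent_visits))
```

An unvisited candidate always wins the selection (infinite bonus), which is what lets lower-ranked registration hypotheses get evaluated at all before the search starts committing to the highest-reward branch.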
