Multi-view 6D Object Pose Estimation and Camera Motion Planning Using RGBD Images

Recovering object pose in a crowded scene is challenging because of severe occlusion and clutter. In an active scenario, when an observer fails to recover the poses of objects from the current viewpoint, it can determine the next view position and capture a new scene from another viewpoint to improve its knowledge of the environment, which may reduce the uncertainty of the 6D pose estimate. We propose a complete active multi-view framework to recognize the 6DOF poses of multiple object instances in a crowded scene. The framework includes several components in the active vision setting to increase accuracy: hypothesis accumulation and verification combines the single-shot hypotheses estimated from previous views and extracts the most likely set of hypotheses; an entropy-based Next-Best-View prediction generates the next camera position from which to capture new data; and camera motion planning plans the trajectory of the camera based on the view entropy and the cost of movement. Different approaches for each component are implemented and evaluated to show the increase in performance.
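The trade-off between view entropy and cost of movement can be illustrated with a minimal sketch. It is not the method of the paper: it assumes that each candidate camera position comes with a predicted distribution over pose hypotheses, that lower predicted entropy is better, that movement cost is the straight-line distance from the current position, and that the two terms are combined linearly. The names `shannon_entropy`, `select_next_view`, and `move_weight` are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution.

    The input is normalized first, so unnormalized scores are accepted.
    """
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)


def select_next_view(current_pos, candidates, move_weight=0.5):
    """Return the candidate position with the lowest combined cost.

    Assumed cost (illustrative only):
        predicted entropy + move_weight * Euclidean distance to the candidate.

    candidates: list of (position, predicted_probs) pairs, where position
    is a tuple of coordinates and predicted_probs is a list of
    hypothesis confidences expected from that view (hypothetical format).
    """
    def cost(candidate):
        position, probs = candidate
        return shannon_entropy(probs) + move_weight * math.dist(current_pos, position)

    return min(candidates, key=cost)[0]


if __name__ == "__main__":
    views = [
        ((1.0, 0.0, 0.0), [0.5, 0.5]),   # close, but ambiguous
        ((0.0, 2.0, 0.0), [0.9, 0.1]),   # farther, fairly certain
        ((5.0, 0.0, 0.0), [1.0]),        # far away, fully certain
    ]
    print(select_next_view((0.0, 0.0, 0.0), views, move_weight=0.2))
```

Raising `move_weight` in this sketch favors nearby views even when they are more ambiguous, while setting it to zero reduces the choice to a purely entropy-based selection.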
