Benchmarking pose estimation for robot manipulation

Abstract: Robot grasping and manipulation require estimates of 3D object poses. A number of methods and datasets for vision-based pose estimation have recently been proposed. However, it is unclear how well the performance measures developed for visual pose estimation predict success in robot manipulation. In this work, we introduce an approach that connects pose error to success in robot manipulation, and propose a probabilistic performance measure of the task success rate. A physical setup is needed to estimate the probability densities from real-world samples, but pose estimation methods are then evaluated offline using captured test images, ground truth poses and the estimated densities. We validate the approach with four industrial manipulation tasks and evaluate a number of publicly available pose estimation methods. The popular pose estimation performance measure, Average Distance of Corresponding model points (ADC), does not offer any quantitatively meaningful indication of the frequency of success in robot manipulation. Our measure, in contrast, is quantitatively informative: e.g., a score of 0.24 corresponds to an average success probability of 24%.
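To make the idea of a score that reads directly as an average success probability concrete, the following is a minimal sketch and not the method of the paper. It assumes, purely for illustration, that pose error is reduced to a single scalar and that the success probability is estimated from real-world (error, outcome) samples with Gaussian kernel regression. The names `success_probability`, `benchmark_score` and `bandwidth` are hypothetical.

```python
import math


def success_probability(error, samples, bandwidth=1.0):
    """Kernel-regression estimate of P(success | pose error).

    Assumption: `samples` is a list of (scalar_pose_error, succeeded) pairs
    collected on a physical setup; the paper's actual error representation
    and density estimator may differ.
    """
    weighted_successes = 0.0
    total_weight = 0.0
    for sample_error, succeeded in samples:
        # Gaussian kernel weight: nearby samples count more.
        weight = math.exp(-0.5 * ((error - sample_error) / bandwidth) ** 2)
        weighted_successes += weight * (1.0 if succeeded else 0.0)
        total_weight += weight
    # With no sample support near `error`, fall back to zero.
    return weighted_successes / total_weight if total_weight > 0 else 0.0


def benchmark_score(test_errors, samples, bandwidth=1.0):
    """Average predicted success probability over offline test pose errors."""
    probabilities = [success_probability(e, samples, bandwidth) for e in test_errors]
    return sum(probabilities) / len(probabilities)


# Hypothetical data: two successful attempts at zero error, one failure at a large error.
samples = [(0.0, True), (0.0, True), (10.0, False)]
score = benchmark_score([0.0, 10.0], samples)
```

Under these assumptions the score is a mean of probabilities, so it always lies in [0, 1] and can be read as an expected success rate, which is the property the abstract attributes to the proposed measure.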
