Paper information - Global Hypothesis Generation for 6D Object Pose Estimation

This paper addresses the task of estimating the 6D pose of a known 3D object from a single RGB-D image. Most modern approaches solve this task in three steps: i) compute local features, ii) generate a pool of pose hypotheses, iii) select and refine a pose from the pool. This work focuses on the second step. While all existing approaches generate the hypothesis pool via local reasoning, e.g. RANSAC or Hough voting, we are the first to show that global reasoning is beneficial at this stage. In particular, we formulate a novel fully-connected Conditional Random Field (CRF) that outputs a very small number of pose hypotheses. Although the potential functions of the CRF are non-Gaussian, we give a new, efficient two-step optimization procedure with some optimality guarantees. We use our global hypothesis generation procedure to produce results that exceed the state of the art on the challenging Occluded Object Dataset.
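
For orientation, the following is a minimal sketch of the conventional local approach to step ii) that the abstract contrasts against: RANSAC-style sampling of minimal correspondence sets, each producing one pose hypothesis. It is not the CRF-based method proposed in the paper. It assumes that step i) has already produced 3D-3D correspondences between object coordinates and camera coordinates; the function names `kabsch` and `generate_hypotheses`, the parameters `n_hyp` and `inlier_thresh`, and the inlier-count scoring are illustrative choices, not taken from the paper.

```python
import numpy as np


def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def generate_hypotheses(obj_pts, cam_pts, n_hyp=50, inlier_thresh=0.01, seed=0):
    """Local (RANSAC-style) hypothesis pool, sorted by inlier count.

    obj_pts[i] and cam_pts[i] are assumed to be a putative correspondence
    between a point on the object model and a point in the camera frame.
    """
    rng = np.random.default_rng(seed)
    pool = []
    for _ in range(n_hyp):
        # Three correspondences are the minimal set for a rigid 6D pose.
        idx = rng.choice(len(obj_pts), size=3, replace=False)
        R, t = kabsch(obj_pts[idx], cam_pts[idx])
        # Score the hypothesis by how many correspondences it explains.
        residuals = np.linalg.norm(obj_pts @ R.T + t - cam_pts, axis=1)
        pool.append((int((residuals < inlier_thresh).sum()), R, t))
    pool.sort(key=lambda h: h[0], reverse=True)
    return pool
```

Each hypothesis here depends on only three sampled correspondences, which is why such pools usually need many samples before one is free of outliers. The abstract's claim is that reasoning over all correspondences jointly lets the pool be much smaller.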
