Multimodal cue integration through Hypotheses Verification for RGB-D object recognition and 6DOF pose estimation

This paper proposes an effective algorithm for recognizing objects and accurately estimating their 6DOF pose in scenes acquired by an RGB-D sensor. The proposed method combines several recognition pipelines, each exploiting the data in a different manner and generating object hypotheses. These hypotheses are ultimately fused in a Hypothesis Verification stage that globally enforces geometrical consistency between model hypotheses and the scene. Such a scheme boosts overall recognition performance, as it builds on the strengths of the individual pipelines while diminishing the impact of their specific weaknesses. The proposed method outperforms the state of the art on two challenging benchmark datasets for object recognition, both comprising 35 object models and containing 176 and 353 scenes, respectively.
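The fusion scheme described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Hypothesis` fields, the `pool_hypotheses` and `verify` helpers, the weights, and above all the greedy selection loop, which is only a simplified stand-in for the global verification stage the abstract refers to and is not the paper's method. Each hypothetical pipeline is modeled as a callable that returns posed-model hypotheses together with the scene points they explain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A posed model proposed by one pipeline (all fields are hypothetical)."""
    model: str            # object model identifier
    pipeline: str         # name of the pipeline that proposed it
    explained: frozenset  # indices of scene points the posed model fits
    outliers: int         # model points with no support in the scene


def pool_hypotheses(pipelines, scene):
    """Run every pipeline on the scene and merge all hypotheses into one list."""
    pooled = []
    for run in pipelines:
        pooled.extend(run(scene))
    return pooled


def _gain(h, covered, outlier_weight, conflict_weight):
    """Score a hypothesis: newly explained points minus penalties."""
    new_points = len(h.explained - covered)
    conflicts = len(h.explained & covered)  # points already claimed by others
    return new_points - outlier_weight * h.outliers - conflict_weight * conflicts


def verify(hypotheses, outlier_weight=1.0, conflict_weight=1.0):
    """Greedy stand-in for verification: keep hypotheses with positive gain."""
    accepted, covered = [], set()
    remaining = list(hypotheses)
    while remaining:
        best = max(
            remaining,
            key=lambda h: _gain(h, covered, outlier_weight, conflict_weight),
        )
        if _gain(best, covered, outlier_weight, conflict_weight) <= 0:
            break  # nothing left improves the explanation of the scene
        accepted.append(best)
        covered |= best.explained
        remaining.remove(best)
    return accepted
```

In this toy setting, a duplicate detection of the same object from a second pipeline is rejected because the points it explains are already claimed, and a hypothesis with many unsupported model points is rejected because its outlier penalty outweighs the few scene points it explains.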
