Object Recognition Robust to Imperfect Depth Data

In this paper, we present an adaptive data fusion model that robustly integrates depth and image only perception. Combining dense depth measurements with images can greatly enhance the performance of many computer vision algorithms, yet degraded depth measurements (e.g., missing data) can also cause dramatic performance losses to levels below image-only algorithms. We propose a generic fusion model based on maximum likelihood estimates of fused image-depth functions for both available and missing depth data. We demonstrate its application to each step of a state-of-the-art image-only object instance recognition pipeline. The resulting approach shows increased recognition performance over alternative data fusion approaches.

[1]  Siddhartha S. Srinivasa,et al.  Object recognition and full pose registration from a single image for robotic manipulation , 2009, 2009 IEEE International Conference on Robotics and Automation.

[2]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[3]  Siddhartha S. Srinivasa,et al.  The MOPED framework: Object recognition and pose estimation for manipulation , 2011, Int. J. Robotics Res..

[4]  Mario Fritz,et al.  Improving the Kinect by Cross-Modal Stereo , 2011, BMVC.

[5]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[6]  John J. Leonard,et al.  Place Recognition using Near and Far Visual Information , 2011 .

[7]  Gert Kootstra,et al.  Fast and bottom-up object detection, segmentation, and evaluation using Gestalt principles , 2011, 2011 IEEE International Conference on Robotics and Automation.

[8]  David G. Lowe,et al.  Using stereo for object recognition , 2010, 2010 IEEE International Conference on Robotics and Automation.

[9]  Dieter Fox,et al.  A large-scale hierarchical multi-view RGB-D object dataset , 2011, 2011 IEEE International Conference on Robotics and Automation.

[10]  Jonathan T. Barron,et al.  A category-level 3-D object dataset: Putting the Kinect to work , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[11]  Richard Szeliski,et al.  Computer Vision - Algorithms and Applications , 2011, Texts in Computer Science.

[12]  Mubarak Shah,et al.  Multi-sensor fusion: a perspective , 1990, Proceedings., IEEE International Conference on Robotics and Automation.

[13]  Dieter Fox,et al.  A Scalable Tree-Based Approach for Joint Object and Pose Recognition , 2011, AAAI.