Depth-assisted Real-time 3D Object Detection for Augmented Reality

In this paper, we propose a novel real-time object detection method that recognizes three-dimensional (3D) target objects regardless of their texture and of changes in lighting conditions. Our method computes a set of reference templates of a target object from both RGB and depth images, which describe the texture and geometry of the object respectively, and fuses them for robust detection. Combining both sources of information has three advantages over using RGB images alone: 1) the ability to detect 3D objects with insufficient texture and complex shapes; 2) robust detection under varying lighting conditions; 3) better identification of a target based on its size. Our approach is inspired by recent work on template-based detection, and we show how to extend it with depth information, which improves detection performance under varying lighting conditions. Computationally intensive steps are parallelized on a GPU to achieve real-time speed: detection and pose estimation together take only about 33 milliseconds. The proposed method can be used for marker-less AR applications that use real-world 3D objects, going beyond conventional planar targets.
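The idea of fusing an RGB template with a depth template can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the fusion rule, so everything below is an assumption made for illustration: the similarity is the fraction of pixels whose quantized gradient orientation agrees, the fusion is a weighted sum, and the names `quantize_orientations`, `template_similarity`, `fused_score`, and `w_depth` are hypothetical.

```python
import numpy as np

def quantize_orientations(image, n_bins=8):
    """Quantize gradient orientations (sign ignored) of a 2D array into n_bins bins."""
    gy, gx = np.gradient(image.astype(np.float64))
    angle = np.arctan2(gy, gx) % np.pi            # orientation in [0, pi)
    bins = np.floor(angle / np.pi * n_bins).astype(int)
    bins = np.clip(bins, 0, n_bins - 1)           # guard against rounding at the upper edge
    magnitude = np.hypot(gx, gy)
    return bins, magnitude

def template_similarity(template, patch, min_magnitude=1e-3):
    """Fraction of template pixels with a usable gradient whose orientation bin
    matches the patch (assumed measure, not taken from the paper)."""
    t_bins, t_mag = quantize_orientations(template)
    p_bins, p_mag = quantize_orientations(patch)
    valid = t_mag > min_magnitude
    if not valid.any():
        return 0.0
    matches = valid & (p_mag > min_magnitude) & (t_bins == p_bins)
    return float(matches.sum()) / float(valid.sum())

def fused_score(rgb_template, depth_template, gray_patch, depth_patch, w_depth=0.5):
    """Weighted sum of intensity and depth similarities (assumed fusion rule)."""
    s_rgb = template_similarity(rgb_template, gray_patch)
    s_depth = template_similarity(depth_template, depth_patch)
    return (1.0 - w_depth) * s_rgb + w_depth * s_depth

# Synthetic data: a horizontal intensity ramp and a vertical depth ramp.
gray = np.tile(np.arange(16.0), (16, 1))
depth = np.tile(np.arange(16.0), (16, 1)).T

score_same = fused_score(gray, depth, gray, depth)
score_dimmed = fused_score(gray, depth, 0.3 * gray + 10.0, depth)    # gain and offset change
score_washed_out = fused_score(gray, depth, np.zeros_like(gray), depth)  # intensity cue lost
```

In this toy setting a gain and offset change leaves the orientation-based score unchanged, and when the intensity cue vanishes entirely the depth term still contributes to the fused score. How the actual method weights or combines the two cues is not specified in the abstract.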