N3M: Natural 3D Markers for Real-Time Object Detection and Pose Estimation

This paper introduces a new approach to object detection and pose estimation. Its contribution is the design of entities that permit stable detection and reliable pose estimation of a given object. In a well-defined off-line learning phase, we design local, minimal subsets of feature points that have both distinctive photometric and distinctive geometric properties. We call these entities Natural 3D Markers (N3Ms). Constraints on the selection and distribution of the subsets, coupled with a multi-level validation approach, yield detection at high frame rates and allow us to determine the precise pose of the object. The method is robust against noise, partial occlusions, background clutter and illumination changes. Experiments show its superiority to existing standard methods. The validation was carried out using simulated ground truth data, and excellent results on real data demonstrate the usefulness of the approach for many computer vision applications.
