Automatic Visual Bag-of-Words for Online Robot Navigation and Mapping

Detecting already-visited regions based on their visual appearance helps reduce drift and position uncertainty in robot navigation and mapping. Inspired by content-based image retrieval, an efficient approach is to use visual vocabularies to measure the similarity between images, so that images corresponding to the same scene region can be associated. State-of-the-art proposals that address this topic use prebuilt vocabularies, which generally require a priori knowledge of the environment. We propose a novel method for appearance-based navigation and mapping in which the visual vocabularies are built online, thus eliminating the need for prebuilt data. We also show that the proposed technique allows efficient loop-closure detection even at small vocabulary sizes, resulting in higher computational efficiency.
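To make the idea of measuring image similarity through a visual vocabulary concrete, the following is a minimal sketch under stated assumptions, not the method proposed here. It assumes a small, fixed vocabulary of cluster centres (the online vocabulary construction is not shown), toy 2-D descriptors in place of real feature descriptors, and cosine similarity between normalized word histograms. All function names and data are hypothetical.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word (cluster centre) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs in an image, then L2-normalize."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1.0
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist

def similarity(hist_a, hist_b):
    """Cosine similarity of two L2-normalized histograms (an assumed choice of measure)."""
    return sum(a * b for a, b in zip(hist_a, hist_b))

# Hypothetical toy data: three visual words and three "images" of 2-D descriptors.
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
image_a = [(1.0, 0.0), (0.0, 1.0), (9.0, 1.0)]
image_b = [(0.5, 0.5), (1.0, 1.0), (10.0, 1.0)]   # same region as image_a
image_c = [(0.0, 9.0), (1.0, 10.0), (0.0, 11.0)]  # a different region

hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
hist_c = bow_histogram(image_c, vocabulary)

print(similarity(hist_a, hist_b))  # high: candidate revisit of the same region
print(similarity(hist_a, hist_c))  # low: no shared visual words
```

In a loop-closure setting, a high similarity between the current image and an earlier one would flag a candidate revisit; the threshold for accepting it is application-dependent and is not specified here.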
