LandmarkBoost: Efficient visualContext Classifiers for Robust Localization

The growing popularity of autonomous systems creates a need for reliable and efficient metric pose retrieval algorithms. Currently used approaches tend to rely on nearest neighbor search of binary descriptors to perform the 2D-3D matching and guarantee realtime capabilities on mobile platforms. These methods struggle, however, with the growing size of the map, changes in viewpoint or appearance, and visual aliasing present in the environment. The rigidly defined descriptor patterns only capture a limited neighborhood of the keypoint and completely ignore the overall visual context. We propose LandmarkBoost - an approach that, in contrast to the conventional 2D-3D matching methods, casts the search problem as a landmark classification task. We use a boosted classifier to classify landmark observations and directly obtain correspondences as classifier scores. We also introduce a formulation of visual context that is flexible, efficient to compute, and can capture relationships in the entire image plane. The original binary descriptors are augmented with contextual information and informative features are selected by the boosting framework. Through detailed experiments, we evaluate the retrieval quality and performance of Landmark-Boost, demonstrating that it outperforms common state-of-the-art descriptor matching methods.

[1]  Tomaso A. Poggio,et al.  Example-Based Learning for View-Based Human Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Antonio Torralba,et al.  Sharing features: efficient boosting procedures for multiclass object detection , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[3]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[4]  Niko Sünderhauf,et al.  Are We There Yet? Challenging SeqSLAM on a 3000 km Journey Across All Four Seasons , 2013 .

[5]  Torsten Sattler,et al.  Scalable 6-DOF Localization on Mobile Devices , 2014, ECCV.

[6]  Vladlen Koltun,et al.  Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials , 2011, NIPS.

[7]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Antonio Torralba,et al.  Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Roland Siegwart,et al.  Comparison of nearest-neighbor-search strategies and implementations for efficient shape registration , 2012 .

[10]  Roland Siegwart,et al.  Collaborative navigation for flying and walking robots , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[11]  Roland Siegwart,et al.  Robust Visual Place Recognition with Graph Kernels , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Michael Isard,et al.  Descriptor Learning for Efficient Retrieval , 2010, ECCV.

[13]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[14]  Torsten Sattler,et al.  Camera Pose Voting for Large-Scale Image-Based Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[15]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[16]  Michael Bosse,et al.  The gist of maps - summarizing experience for lifelong localization , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[17]  Pietro Perona,et al.  Pedestrian Detection: An Evaluation of the State of the Art , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Michael Bosse,et al.  Get Out of My Lab: Large-scale, Real-Time Visual-Inertial Localization , 2015, Robotics: Science and Systems.

[19]  Henrik Andreasson,et al.  LOGOS: Local Geometric Support for High-Outlier Spatial Verification , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[20]  Ming Yang,et al.  Discovery of Collocation Patterns: from Visual Words to Visual Phrases , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Simon Lacroix,et al.  Building Location Models for Visual Place Recognition , 2016, Int. J. Robotics Res..

[22]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Roland Siegwart,et al.  Visual place recognition with probabilistic voting , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[24]  Paul Newman,et al.  Made to measure: Bespoke landmarks for 24-hour, all-weather localisation with a camera , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[25]  Albert Gordo,et al.  Deep Image Retrieval: Learning Global Representations for Image Search , 2016, ECCV.

[26]  David J. Fleet,et al.  Fast search in Hamming space with multi-index hashing , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Roland Siegwart,et al.  Efficient descriptor learning for large scale localization , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[28]  Paul Newman,et al.  Scene Signatures: Localised and Point-less Features for Localisation , 2014, Robotics: Science and Systems.

[29]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[30]  Torsten Sattler,et al.  Fast image-based localization using direct 2D-to-3D matching , 2011, 2011 International Conference on Computer Vision.

[31]  Sen Wang,et al.  VidLoc: A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Roland Siegwart,et al.  Maplab: An Open Framework for Research in Visual-Inertial Mapping and Localization , 2017, IEEE Robotics and Automation Letters.

[33]  Horst Bischof,et al.  From structure-from-motion point clouds to fast location recognition , 2009, CVPR.

[34]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[35]  Laurent Kneip,et al.  OpenGV: A unified and generalized approach to real-time calibrated geometric vision , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[36]  Michael Bosse,et al.  Keep it brief: Scalable creation of compressed localization maps , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[37]  Paul Newman,et al.  FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance , 2008, Int. J. Robotics Res..

[38]  Ilya Kostrikov,et al.  PlaNet - Photo Geolocation with Convolutional Neural Networks , 2016, ECCV.

[39]  Gordon Wyeth,et al.  SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights , 2012, 2012 IEEE International Conference on Robotics and Automation.

[40]  J. Friedman Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .

[41]  Antonio Criminisi,et al.  TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation , 2006, ECCV.

[42]  Paul Newman,et al.  Closing loops without places , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.