Global Robot Ego-localization Combining Image Retrieval and HMM-based Filtering

This paper addresses the problem of global visual ego-localization for a robot equipped with a monocular camera that must navigate autonomously in an urban environment. The robot has access to a database of geo-referenced images of its environment and to the outputs of an odometric system (an Inertial Measurement Unit or visual odometry). We assume that no GPS information is available. The approach described and evaluated in this paper uses a Hidden Markov Model (HMM) to combine the localization estimates provided by the odometric system with the visual similarities between acquired images and the geo-referenced image database. Compared with an image-based method alone, these spatial and temporal constraints reduce the mean localization error from 16 m to 4 m over an 11 km path of the Google Pittsburgh dataset.
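The fusion described above can be illustrated with a minimal HMM forward filter. This is only a sketch under stated assumptions, not the paper's implementation: the hidden states are taken to be database places laid out along a one-dimensional route, the transition model is a Gaussian centred on the odometry displacement, and the emission likelihood is taken to be proportional to a strictly positive image similarity score. The function names, the noise parameter `sigma`, and the toy scores are all hypothetical.

```python
import numpy as np


def transition_matrix(positions, displacement, sigma):
    # Entry (i, j): probability of reaching place j from place i when the
    # odometry reports `displacement` metres of travel (assumed Gaussian noise).
    expected = positions[:, None] + displacement
    diff = positions[None, :] - expected
    weights = np.exp(-0.5 * (diff / sigma) ** 2) + 1e-12  # floor avoids all-zero rows
    return weights / weights.sum(axis=1, keepdims=True)


def forward_update(belief, transition, similarity):
    # Predict with the odometry model, then reweight by the visual similarity
    # of the current image to each database place and renormalize.
    predicted = belief @ transition
    posterior = predicted * similarity
    return posterior / posterior.sum()


def localize(positions, displacements, similarities, sigma=3.0):
    # Global localization: start from a uniform belief over all places.
    belief = np.full(len(positions), 1.0 / len(positions))
    estimates = []
    for displacement, similarity in zip(displacements, similarities):
        transition = transition_matrix(positions, displacement, sigma)
        belief = forward_update(belief, transition, similarity)
        estimates.append(int(np.argmax(belief)))  # most probable place index
    return estimates


def scores(n_places, peaks, floor=0.05):
    # Hypothetical similarity vector: a small floor everywhere, with peaks
    # at the given place indices.
    s = np.full(n_places, floor)
    for index, value in peaks.items():
        s[index] = value
    return s


# Toy route: ten places spaced 10 m apart; the robot moves 20 m -> 30 m -> 40 m.
positions = np.arange(0.0, 100.0, 10.0)
displacements = [0.0, 10.0, 10.0]
similarities = [
    scores(10, {2: 1.0, 7: 1.0}),  # first image is ambiguous between two places
    scores(10, {3: 0.8, 6: 1.0}),  # a distractor at place 6 scores highest
    scores(10, {4: 1.0}),
]

estimates = localize(positions, displacements, similarities)
image_only = [int(np.argmax(s)) for s in similarities]
print("HMM filter:", estimates)
print("image only:", image_only)
```

In this toy run, retrieval alone follows the highest-scoring distractor at the second step, while the filtered estimate stays on the place that is consistent with the odometry. That is the kind of spatial and temporal constraint the abstract refers to, shown here on made-up scores rather than on the paper's data.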
