A Robust Indoor Localization System Integrating Visual Localization Aided by CNN-Based Image Retrieval with Monte Carlo Localization

This paper proposes a novel multi-sensor-based indoor global localization system integrating visual localization aided by CNN-based image retrieval with a probabilistic localization approach. The global localization system consists of three parts: coarse place recognition, fine localization and re-localization from kidnapping. Coarse place recognition exploits a monocular camera to realize the initial localization based on image retrieval, in which off-the-shelf features extracted from a pre-trained Convolutional Neural Network (CNN) are adopted to determine the candidate locations of the robot. In the fine localization, a laser range finder is equipped to estimate the accurate pose of a mobile robot by means of an adaptive Monte Carlo localization, in which the candidate locations obtained by image retrieval are considered as seeds for initial random sampling. Additionally, to address the problem of robot kidnapping, we present a closed-loop localization mechanism to monitor the state of the robot in real time and make adaptive adjustments when the robot is kidnapped. The closed-loop mechanism effectively exploits the correlation of image sequences to realize the re-localization based on Long-Short Term Memory (LSTM) network. Extensive experiments were conducted and the results indicate that the proposed method not only exhibits great improvement on accuracy and speed, but also can recover from localization failures compared to two conventional localization methods.

[1]  Sung Wook Baik,et al.  Action Recognition in Video Sequences using Deep Bi-Directional LSTM With CNN Features , 2018, IEEE Access.

[2]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[3]  Quoc V. Le,et al.  Addressing the Rare Word Problem in Neural Machine Translation , 2014, ACL.

[4]  Tiejun Huang,et al.  Deep Relative Distance Learning: Tell the Difference between Similar Vehicles , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Shin'ichi Satoh,et al.  Large vocabulary quantization for searching instances from videos , 2012, ICMR '12.

[6]  Gordon Wyeth,et al.  Place categorization and semantic mapping on a mobile robot , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[7]  Edward M. Riseman,et al.  TextFinder: An Automatic System to Detect and Recognize Text In Images , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Gang Wang,et al.  A Siamese Long Short-Term Memory Architecture for Human Re-identification , 2016, ECCV.

[9]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Bo Zhang,et al.  A unified framework for image retrieval using keyword and visual features , 2005, IEEE Transactions on Image Processing.

[11]  Stergios I. Roumeliotis,et al.  Bayesian estimation and Kalman filtering: a unified framework for mobile robot localization , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[12]  Mubarak Shah,et al.  Accurate Image Localization Based on Google Maps Street View , 2010, ECCV.

[13]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Torsten Sattler,et al.  Camera Pose Voting for Large-Scale Image-Based Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[15]  David Stutz,et al.  Neural Codes for Image Retrieval , 2015 .

[16]  Anil K. Jain,et al.  Text information extraction in images and video: a survey , 2004, Pattern Recognit..

[17]  Wolfram Burgard,et al.  Monte Carlo Localization with Mixture Proposal Distribution , 2000, AAAI/IAAI.

[18]  Torsten Sattler,et al.  Large-Scale Location Recognition and the Geometric Burstiness Problem , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Xin Pan,et al.  ARIEL: automatic wi-fi based room fingerprinting for indoor localization , 2012, UbiComp.

[20]  Sebastian Thrun,et al.  Probabilistic robotics , 2002, CACM.

[21]  Tomás Pajdla,et al.  NetVLAD: CNN Architecture for Weakly Supervised Place Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Lei Zhang,et al.  Self-adaptive Monte Carlo localization for mobile robots using range sensors , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[23]  Albert Gordo,et al.  Deep Image Retrieval: Learning Global Representations for Image Search , 2016, ECCV.

[24]  Torsten Sattler,et al.  Efficient & Effective Prioritized Matching for Large-Scale Image-Based Localization , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Torsten Sattler,et al.  Are Large-Scale 3D Models Really Necessary for Accurate Visual Localization? , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Wolfram Burgard,et al.  Monte Carlo localization for mobile robots , 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).

[28]  Mubarak Shah,et al.  Image Geo-Localization Based on MultipleNearest Neighbor Feature Matching UsingGeneralized Graphs , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Svetlana Lazebnik,et al.  Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.

[31]  Jing Dong,et al.  What Is the Best Practice for CNNs Applied to Visual Instance Retrieval? , 2016, ArXiv.

[32]  Shin'ichi Satoh,et al.  Faster R-CNN Features for Instance Search , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[33]  Shih-Fu Chang,et al.  VisualSEEk: a fully automated content-based image query system , 1997, MULTIMEDIA '96.

[34]  Hongbin Zha,et al.  Coarse-to-fine vision-based localization by indexing scale-Invariant features , 2006, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[35]  K. Srinivasan,et al.  Multiple Sensor Fusion in Mobile Robot Localization , 2007, 2007 Canadian Conference on Electrical and Computer Engineering.

[36]  Kyung Shik Roh,et al.  Coarse-to-Fine Localization for a Mobile Robot Based on Place Learning With a 2-D Range Scan , 2016, IEEE Transactions on Robotics.

[37]  Ondrej Chum,et al.  CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples , 2016, ECCV.

[38]  Thomas S. Huang,et al.  Relevance feedback in image retrieval: A comprehensive review , 2003, Multimedia Systems.

[39]  In-So Kweon,et al.  Rao-Blackwellized particle filtering with Gaussian mixture models for robust visual tracking , 2014, Comput. Vis. Image Underst..

[40]  Larry S. Davis,et al.  Exploiting local features from deep networks for image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[41]  Torsten Sattler,et al.  Image Retrieval for Image-Based Localization Revisited , 2012, BMVC.

[42]  Wolfram Burgard,et al.  Robust vision-based localization by combining an image-retrieval system with Monte Carlo localization , 2005, IEEE Transactions on Robotics.

[43]  W. Burgard,et al.  Markov Localization for Mobile Robots in Dynamic Environments , 1999, J. Artif. Intell. Res..

[44]  Hyun Myung,et al.  A Probabilistic Feature Map-Based Localization System Using a Monocular Camera , 2015, Sensors.

[45]  Trevor Darrell,et al.  Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Wolfram Burgard,et al.  Robust Monte Carlo localization for mobile robots , 2001, Artif. Intell..

[47]  Wojciech Zaremba,et al.  Recurrent Neural Network Regularization , 2014, ArXiv.

[48]  Hao Wu,et al.  A multi-sensor-based mobile robot localization framework , 2014, 2014 IEEE International Conference on Robotics and Biomimetics (ROBIO 2014).

[49]  Manuela M. Veloso,et al.  Multi-sensor Mobile Robot Localization for Diverse Environments , 2013, RoboCup.

[50]  Nando de Freitas,et al.  The Unscented Particle Filter , 2000, NIPS.

[51]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[52]  Jing Liu,et al.  Survey of Wireless Indoor Positioning Techniques and Systems , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).