Visual place recognition with CNNs: From global to partial

Visual place recognition is one of the most challenging problems in computer vision, due to the large diversities that real-world places can represent. Recently, visual place recognition has become a key part of loop closure detection and topological localization in long-term mobile robot autonomy. In this work, we build up a novel visual place recognition pipeline composed of a first filtering stage followed by a partial reranking process. In the filtering stage, image-wise features are utilized to find a small set of potential places. Afterwards, stable region-wise landmarks are extracted for more accurate matching in the partial reranking process. All global and partial image representations are derived from pre-trained Convolutional Neural Networks (CNNs), and the landmarks are extracted by object proposal techniques. Moreover, a new similarity measurement is provided by considering both spatial and scale distribution of landmarks. Compared with current methods only considering scale distribution, the presented similarity measurement can benefit recognition precision and robustness effectively. Experiments with varied viewpoints and environmental conditions demonstrate that the proposed method achieves superior performance against state-of-the-art methods.

[1]  Peter I. Corke,et al.  Visual Place Recognition: A Survey , 2016, IEEE Transactions on Robotics.

[2]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[3]  Niko Sünderhauf,et al.  On the performance of ConvNet features for place recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[4]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[5]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[6]  Josef Sivic,et al.  NetVLAD: CNN Architecture for Weakly Supervised Place Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[8]  C. Lawrence Zitnick,et al.  Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.

[9]  Luis Miguel Bergasa,et al.  Towards life-long visual localization using an efficient matching of binary sequences from images , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[10]  Robert Pless,et al.  Consistent Temporal Variations in Many Outdoor Scenes , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Xue Yang,et al.  SRAL: Shared Representative Appearance Learning for Long-Term Visual Place Recognition , 2017, IEEE Robotics and Automation Letters.

[12]  Ananth Ranganathan,et al.  Towards illumination invariance for visual localization , 2013, 2013 IEEE International Conference on Robotics and Automation.

[13]  Paul Newman,et al.  FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance , 2008, Int. J. Robotics Res..

[14]  Gordon Wyeth,et al.  SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights , 2012, 2012 IEEE International Conference on Robotics and Automation.

[15]  Andrew Calway,et al.  Visual Place Recognition Using Landmark Distribution Descriptors , 2016, ACCV.

[16]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[17]  Dorian Gálvez-López,et al.  Bags of Binary Words for Fast Place Recognition in Image Sequences , 2012, IEEE Transactions on Robotics.

[18]  D. Zhang,et al.  Principle Component Analysis , 2004 .

[19]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[20]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[21]  Xin Yang,et al.  Local difference binary for ultrafast and distinctive feature description. , 2013, IEEE transactions on pattern analysis and machine intelligence.

[22]  Michael Milford,et al.  Place Recognition with ConvNet Landmarks: Viewpoint-Robust, Condition-Robust, Training-Free , 2015, Robotics: Science and Systems.

[23]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[24]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).