Omnidirectional CNN for Visual Place Recognition and Navigation

Visual place recognition is challenging, especially when only a few place exemplars are given. To mitigate the challenge, we consider place recognition method using omnidirectional cameras and propose a novel Omnidirectional Convolutional Neural Network (O-CNN) to handle severe camera pose variation. Given a visual input, the task of the O-CNN is not to retrieve the matched place exemplar, but to retrieve the closest place exemplar and estimate the relative distance between the input and the closest place. With the ability to estimate relative distance, a heuristic policy is proposed to navigate a robot to the retrieved closest place. Note that the network is designed to take advantage of the omnidirectional view by incorporating circular padding and rotation invariance. To train a powerful O-CNN, we build a virtual world for training on a large scale. We also propose a continuous lifted structured feature embedding loss to learn the concept of distance efficiently. Finally, our experimental results confirm that our method achieves state-of-the-art accuracy and speed with both the virtual world and real-world datasets.

[1]  Roberto Cipolla,et al.  Geometric Loss Functions for Camera Pose Regression with Deep Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  James J. Little,et al.  Vision-based SLAM using the Rao-Blackwellised Particle Filter , 2005 .

[3]  Gordon Wyeth,et al.  SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights , 2012, 2012 IEEE International Conference on Robotics and Automation.

[4]  Myung Jin Chung,et al.  SLAM with omni-directional stereo vision sensor , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).

[5]  Michael Milford,et al.  Convolutional Neural Network-based Place Recognition , 2014, ICRA 2014.

[6]  Eric Brachmann,et al.  DSAC — Differentiable RANSAC for Camera Localization , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Peter I. Corke,et al.  Visual Place Recognition: A Survey , 2016, IEEE Transactions on Robotics.

[9]  Yiannis Andreopoulos,et al.  Vectors of locally aggregated centers for compact video representation , 2015, 2015 IEEE International Conference on Multimedia and Expo (ICME).

[10]  Richard Szeliski,et al.  City-Scale Location Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Juan D. Tardós,et al.  ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras , 2016, IEEE Transactions on Robotics.

[12]  Tom Duckett,et al.  Localization for Mobile Robots using Panoramic Vision, Local Features and Particle Filter , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.

[13]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Parvaneh Saeedi,et al.  Vision-based 3-D trajectory tracking for unknown environments , 2006, IEEE Transactions on Robotics.

[15]  Christopher Hunt,et al.  Notes on the OpenSURF Library , 2009 .

[16]  Roberto Cipolla,et al.  PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[17]  Paul Newman,et al.  FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance , 2008, Int. J. Robotics Res..

[18]  Stephen R. Marsland,et al.  Learning globally consistent maps by relaxation , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[19]  Ben Glocker,et al.  Real-time RGB-D camera relocalization , 2013, 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[20]  Peter I. Corke,et al.  All-environment visual place recognition with SMART , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[21]  Nir Ailon,et al.  Deep Metric Learning Using Triplet Network , 2014, SIMBAD.

[22]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[24]  David Filliat,et al.  Incremental topo-metric SLAM using vision and robot odometry , 2011, 2011 IEEE International Conference on Robotics and Automation.

[25]  Ali Farhadi,et al.  Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[26]  Tomás Pajdla,et al.  NetVLAD: CNN Architecture for Weakly Supervised Place Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Roland Siegwart,et al.  Appearance-Guided Monocular Omnidirectional Visual Odometry for Outdoor Ground Vehicles , 2008, IEEE Transactions on Robotics.

[28]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[29]  Samy Bengio,et al.  Large Scale Online Learning of Image Similarity Through Ranking , 2009, J. Mach. Learn. Res..

[30]  Jean-Arcady Meyer,et al.  Visual topological SLAM and global localization , 2009, 2009 IEEE International Conference on Robotics and Automation.

[31]  Wolfram Burgard,et al.  Robust Visual Robot Localization Across Seasons Using Network Flows , 2014, AAAI.

[32]  Niko Sünderhauf,et al.  On the performance of ConvNet features for place recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[33]  Rahul Sukthankar,et al.  Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.

[34]  Takeo Kanade,et al.  Real-time topometric localization , 2012, 2012 IEEE International Conference on Robotics and Automation.

[35]  Adrien Bartoli,et al.  KAZE Features , 2012, ECCV.

[36]  Patrick Gros,et al.  Robot motion control from a visual memory , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.

[37]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[38]  Roberto Cipolla,et al.  Modelling uncertainty in deep learning for camera relocalization , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[39]  Illah R. Nourbakhsh,et al.  Appearance-based place recognition for topological localization , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[40]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[41]  Jiwen Lu,et al.  Discriminative Deep Metric Learning for Face and Kinship Verification , 2017, IEEE Transactions on Image Processing.

[42]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[43]  Andrew J. Davison,et al.  DTAM: Dense tracking and mapping in real-time , 2011, 2011 International Conference on Computer Vision.

[44]  Andrew Zisserman,et al.  DisLocation: Scalable Descriptor Distinctiveness for Location Recognition , 2014, ACCV.

[45]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Daniel Cremers,et al.  Image-Based Localization Using LSTMs for Structured Feature Correlation , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[47]  Laurent Itti,et al.  Biologically Inspired Mobile Robot Vision Localization , 2009, IEEE Transactions on Robotics.

[48]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[49]  Torsten Sattler,et al.  Large-Scale Location Recognition and the Geometric Burstiness Problem , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).