Robust Place Categorization With Deep Domain Generalization

Traditional place categorization approaches in robot vision assume that training and test images have similar visual appearance. Consequently, seasonal, illumination, and environmental changes typically lead to severe degradation in performance. To cope with this problem, recent works have adopted domain adaptation techniques. While effective, these methods assume that some prior information about the scenario where the robot will operate is available at training time. Unfortunately, in many cases this assumption does not hold, as we often do not know where a robot will be deployed. To overcome this issue, we present an approach that aims at learning classification models able to generalize to unseen scenarios. Specifically, we propose a novel deep learning framework for domain generalization. Our method builds on the intuition that, given a set of different classification models associated with known domains (e.g., corresponding to multiple environments or robots), the best model for a new sample from a novel domain can be computed directly at test time by optimally combining the known models. To implement this idea, we exploit recent advances in deep domain adaptation and design a convolutional neural network architecture with novel layers that perform a weighted version of batch normalization. Our experiments, conducted on three common datasets for robot place categorization, confirm the validity of our contribution.
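As a rough illustration of the idea of combining known-domain models through a weighted batch normalization, the following NumPy sketch may help. It is not the architecture from the paper. The function names, the array shapes, and the choice to mix per-domain normalized features linearly by domain-membership weights are all assumptions made for illustration.

```python
import numpy as np


def weighted_stats(x, w):
    """Weighted per-channel mean and variance (illustrative helper).

    x: (N, C) features for a batch; w: (N,) non-negative sample weights,
    e.g. how strongly each sample is assumed to belong to one domain.
    """
    w = w / w.sum()                                  # normalize weights to sum to 1
    mean = (w[:, None] * x).sum(axis=0)              # weighted mean per channel
    var = (w[:, None] * (x - mean) ** 2).sum(axis=0) # weighted (biased) variance
    return mean, var


def combine_normalizations(x, domain_probs, means, variances, eps=1e-5):
    """Normalize one test sample with every known domain's statistics and
    mix the results by its (assumed) domain-membership probabilities.

    x: (C,), domain_probs: (D,), means and variances: (D, C).
    """
    normalized = (x[None, :] - means) / np.sqrt(variances + eps)  # (D, C)
    return domain_probs @ normalized                              # (C,)
```

With uniform weights, `weighted_stats` reduces to the usual batch mean and variance, and with a one-hot `domain_probs` the combination reduces to normalizing with that single domain's statistics. In this sketch, how the membership weights are obtained is left open.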
