Kitting in the Wild through Online Domain Adaptation

Technological developments call for robots with ever greater perception and action capabilities. Among other skills, robots need vision systems that can adapt to changes in their working conditions. Since these conditions are unpredictable, we need benchmarks that allow us to assess the generalization and robustness of visual recognition algorithms. In this work we focus on robotic kitting in unconstrained scenarios. As a first contribution, we present a new visual dataset for the kitting task. Unlike standard object recognition datasets, it provides images of the same objects acquired under various conditions in which the camera, illumination and background are changed. This dataset makes it possible to test the robustness of robot visual recognition algorithms to a series of different domain shifts, both in isolation and in combination. Our second contribution is a novel online adaptation algorithm for deep models, based on batch-normalization layers, which continuously adapts a model to the current working conditions. Unlike standard domain adaptation algorithms, it does not require any image from the target domain at training time. We benchmark the algorithm on the proposed dataset, showing that it can close the gap between the performance of a standard architecture and that of its counterpart adapted offline to the given target domain.
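To make the idea of online adaptation through batch-normalization layers more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the algorithm from the paper: the abstract does not give the update rule, so the exponential moving average, the `OnlineBatchNorm` class and the `momentum` and `eps` parameters are all hypothetical. The layer starts from statistics estimated on the source domain and, at deployment time, moves them towards the statistics of each incoming unlabeled batch, so no target image is needed during training.

```python
from typing import List, Sequence


class OnlineBatchNorm:
    """Normalization layer whose statistics keep updating at test time.

    Assumption: statistics are updated with an exponential moving average
    controlled by a hypothetical `momentum`; the source does not specify this.
    """

    def __init__(self, source_mean: Sequence[float], source_var: Sequence[float],
                 momentum: float = 0.1, eps: float = 1e-5) -> None:
        # Start from per-feature statistics estimated on the source domain.
        self.mean: List[float] = list(source_mean)
        self.var: List[float] = list(source_var)
        self.momentum = momentum
        self.eps = eps

    def adapt(self, batch: Sequence[Sequence[float]]) -> None:
        """Move the stored statistics towards those of an unlabeled batch."""
        n = len(batch)
        m = self.momentum
        for j in range(len(self.mean)):
            column = [row[j] for row in batch]
            batch_mean = sum(column) / n
            batch_var = sum((x - batch_mean) ** 2 for x in column) / n
            self.mean[j] = (1 - m) * self.mean[j] + m * batch_mean
            self.var[j] = (1 - m) * self.var[j] + m * batch_var

    def normalize(self, batch: Sequence[Sequence[float]]) -> List[List[float]]:
        """Normalize features with the current (adapted) statistics."""
        return [
            [(x - mu) / (var + self.eps) ** 0.5
             for x, mu, var in zip(row, self.mean, self.var)]
            for row in batch
        ]


# Usage: source statistics are mean 0, variance 1, but the deployment
# stream is shifted (for example, by a change in illumination).
layer = OnlineBatchNorm(source_mean=[0.0], source_var=[1.0])
for _ in range(200):
    target_batch = [[4.0], [6.0]]  # toy target-domain features
    layer.adapt(target_batch)
    outputs = layer.normalize(target_batch)
```

In this toy stream the stored mean drifts from the source value towards the target batch mean, so the normalized features become centered again without any label or retraining. A smaller `momentum` would adapt more slowly but be less sensitive to a single unrepresentative batch.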
