Data-efficient Weakly-supervised Learning for On-line Object Detection under Domain Shift in Robotics

Several object detection methods have recently been proposed in the literature, the vast majority based on Deep Convolutional Neural Networks (DCNNs). Such architectures have been shown to achieve remarkable performance, at the cost of computationally expensive batch training and extensive labeling. These methods have important limitations for robotics: learning solely on off-line data may introduce biases (the so-called domain shift) and prevents adaptation to novel tasks. In this work, we investigate how weakly-supervised learning can cope with these problems. We compare several techniques for weakly-supervised learning in detection pipelines to reduce model (re)training costs without compromising accuracy. In particular, we show that diversity sampling for constructing active learning queries and strong positives selection for self-supervised learning enable significant annotation savings and improve domain shift adaptation. By integrating our strategies into a hybrid DCNN/FALKON on-line detection pipeline [1], our method can be trained and updated efficiently with few labels, overcoming limitations of previous work. We experimentally validate and benchmark our method on challenging robotic object detection tasks under domain shift.
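The two selection strategies named above can be illustrated with a minimal sketch. The abstract does not specify how either is implemented, so everything here is an assumption made for illustration: diversity sampling is approximated with greedy farthest-point selection over feature vectors of the most uncertain candidates, and strong positives are taken to be detections whose confidence exceeds a fixed threshold. The function names, the `pool_size` pre-filter, and the `threshold` value are hypothetical.

```python
import numpy as np


def select_diverse_queries(features, uncertainty, budget, pool_size):
    """Pick up to `budget` mutually distant samples among the `pool_size`
    most uncertain ones (greedy farthest-point selection; an assumed
    stand-in for the paper's diversity sampling)."""
    # Keep only the most uncertain candidates as the query pool.
    pool = np.argsort(-uncertainty)[:pool_size]
    feats = features[pool]
    # Seed with the single most uncertain candidate.
    chosen = [0]
    # Distance from every candidate to its nearest already-chosen sample.
    min_dist = np.linalg.norm(feats - feats[0], axis=1)
    while len(chosen) < min(budget, len(pool)):
        # Add the candidate farthest from everything chosen so far.
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(feats - feats[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return pool[chosen]


def select_strong_positives(scores, threshold=0.9):
    """Return indices of detections confident enough to be used as
    pseudo-labels (the threshold is an assumed value)."""
    return np.flatnonzero(scores >= threshold)
```

In such a scheme, the indices returned by `select_diverse_queries` would be sent to a human annotator, while those returned by `select_strong_positives` would be added to the training set with the detector's own predictions as labels.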

[1] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.

[2] Lorenzo Rosasco, et al. Are we Done with Object Recognition? The iCub robot's Perspective, 2017, Robotics Auton. Syst.

[3] Lorenzo Rosasco, et al. Interactive data collection for deep learning object detectors on humanoid robots, 2017, 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids).

[4] Iñaki Inza, et al. Weak supervision and other non-standard classification problems: A taxonomy, 2016, Pattern Recognit. Lett.

[5] Joost van de Weijer, et al. Active Learning for Deep Detection Neural Networks, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[6] Ross B. Girshick, et al. Mask R-CNN, 2017, arXiv:1703.06870.

[7] Elisa Maiettini, et al. A Weakly Supervised Strategy for Learning Object Detection on a Humanoid Robot, 2019, 2019 IEEE-RAS 19th International Conference on Humanoid Robots (Humanoids).

[8] Yarin Gal, et al. BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning, 2019, NeurIPS.

[9] David G. Lowe, et al. Object recognition from local scale-invariant features, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[10] Lorenzo Rosasco, et al. Fast Region Proposal Learning for Object Detection for Robotics, 2020, ArXiv.

[11] Fedor Zhdanov, et al. Diverse mini-batch Active Learning, 2019, ArXiv.

[12] Trevor Darrell, et al. Learning Detection with Diverse Proposals, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13] Lorenzo Rosasco, et al. Fast Object Segmentation Learning with Kernel-based Methods for Robotics, 2020, ArXiv.

[14] Lei Zhang, et al. Cost-Effective Object Detection: Active Sample Mining With Switchable Selection Criteria, 2018, IEEE Transactions on Neural Networks and Learning Systems.

[15] Di Huang, et al. Improving Object Detection with Selective Self-supervised Self-training, 2020, ECCV.

[16] John Langford, et al. Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds, 2019, ICLR.

[17] Huajun Feng, et al. Libra R-CNN: Towards Balanced Learning for Object Detection, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18] Lei Zhang, et al. Towards Human-Machine Cooperation: Self-Supervised Sample Mining for Object Detection, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19] Tomaso A. Poggio, et al. A general framework for object detection, 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[20] R. French. Catastrophic forgetting in connectionist networks, 1999, Trends in Cognitive Sciences.

[21] Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[22] Andreas Krause, et al. Active Detection via Adaptive Submodularity, 2014, ICML.

[23] Lorenzo Rosasco, et al. FALKON: An Optimal Large Scale Kernel Method, 2017, NIPS.

[24] Ming-Yu Liu, et al. Localization-Aware Active Learning for Object Detection, 2018, ACCV.

[25] Zhi-Hua Zhou, et al. A brief introduction to weakly supervised learning, 2018.

[26] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27] Silvio Savarese, et al. Active Learning for Convolutional Neural Networks: A Core-Set Approach, 2017, ICLR.

[28] Yi Li, et al. Deformable Convolutional Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.

[30] Lorenzo Rosasco, et al. Less is More: Nyström Computational Regularization, 2015, NIPS.

[31] Paul A. Viola, et al. Rapid object detection using a boosted cascade of simple features, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[32] Luc Van Gool, et al. The Pascal Visual Object Classes (VOC) Challenge, 2010, International Journal of Computer Vision.

[33] Yi Li, et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks, 2016, NIPS.

[34] Lorenzo Rosasco, et al. On-line object detection: a robotics challenge, 2019, Autonomous Robots.

[35] Lorenzo Rosasco, et al. Speeding-Up Object Detection Training for Robotics with FALKON, 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[36] Ross B. Girshick, et al. Focal Loss for Dense Object Detection, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37] Lorenzo Rosasco, et al. Kernel methods through the roof: handling billions of points efficiently, 2020, NeurIPS.

[38] Yoshua Bengio, et al. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks, 2013, ICLR.

[39] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2004.

[40] Wei Guo, et al. An Adaptive Supervision Framework for Active Learning in Object Detection, 2019, BMVC.

[41] Michele Fenzi, et al. Scalable Active Learning for Object Detection, 2020, 2020 IEEE Intelligent Vehicles Symposium (IV).

[42] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).