Training Object Detectors from Few Weakly-Labeled and Many Unlabeled Images

Weakly-supervised object detection attempts to limit the amount of supervision by dispensing the need for bounding boxes, but still assumes image-level labels on the entire training set are available. In this work, we study the problem of training an object detector from one or few clean images with image-level labels and a larger set of completely unlabeled images. This is an extreme case of semi-supervised learning where the labeled data are not enough to bootstrap the learning of a classifier or detector. Our solution is to use a standard weakly-supervised pipeline to train a student model from image-level pseudo-labels generated on the unlabeled set by a teacher model, bootstrapped by region-level similarities to clean labeled images. By using the recent pipeline of PCL [47] and more unlabeled images, we achieve performance competitive or superior to many state of the art weakly-supervised detection solutions.

[1]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[2]  Wenyu Liu,et al.  PCL: Proposal Cluster Learning for Weakly Supervised Object Detection , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Khoi Nguyen,et al.  Feature Weighting and Boosting for Few-Shot Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[4]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[5]  Martin Jägersand,et al.  AMP: Adaptive Masked Proxies for Few-Shot Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[6]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[7]  Yi Yang,et al.  SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation , 2018, IEEE Transactions on Cybernetics.

[8]  Tao Xiang,et al.  Transfer Learning by Ranking for Weakly Supervised Object Annotation , 2017, BMVC.

[9]  C. Lawrence Zitnick,et al.  Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.

[10]  Bharath Hariharan,et al.  Low-Shot Visual Recognition by Shrinking and Hallucinating Features , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[11]  Christoph H. Lampert,et al.  Unsupervised Object Discovery: A Comparison , 2010, International Journal of Computer Vision.

[12]  Harri Valpola,et al.  Weight-averaged consistency targets improve semi-supervised deep learning results , 2017, ArXiv.

[13]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jin-Hee Lee,et al.  Co-Occurrence Matrix Analysis-Based Semi-Supervised Training for Object Detection , 2018, 2018 25th IEEE International Conference on Image Processing (ICIP).

[15]  Miaojing Shi,et al.  Weakly Supervised Object Localization Using Size Estimates , 2016, ECCV.

[16]  Wenyu Liu,et al.  Weakly Supervised Region Proposal Network and Object Detection , 2018, ECCV.

[17]  Vittorio Ferrari,et al.  Revisiting Knowledge Transfer for Training Object Class Detectors , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[18]  T. Tuytelaars,et al.  Weakly Supervised Object Detection with Posterior Regularization , 2014 .

[19]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Xin Wang,et al.  Few-Shot Object Detection via Feature Reweighting , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[21]  Yong Jae Lee,et al.  You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Frank Keller,et al.  Training Object Class Detectors with Click Supervision , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Wenyu Liu,et al.  Multiple Instance Detection Network with Online Instance Classifier Refinement , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Nenghai Yu,et al.  Learning Spatial Regularization with Image-Level Supervisions for Multi-label Image Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Yunchao Wei,et al.  Towards Computational Baby Learning: A Weakly-Supervised Approach for Object Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[26]  Dong-Hyun Lee,et al.  Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks , 2013 .

[27]  Mohammad H. Poursaeidi,et al.  Robust support vector machines for multiple instance learning , 2012, Annals of Operations Research.

[28]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[29]  Yale Song,et al.  Learning from Noisy Labels with Distillation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  Weilin Huang,et al.  CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images , 2018, ECCV.

[31]  Bingbing Ni,et al.  HCP: A Flexible CNN Framework for Multi-Label Image Classification , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Larry S. Davis,et al.  C-WSL: Count-guided Weakly Supervised Localization , 2017, ECCV.

[33]  Changshui Zhang,et al.  Weakly- and Semi-Supervised Object Detection with Expectation-Maximization Algorithm , 2017, ArXiv.

[34]  Alexei A. Efros,et al.  Discovering objects and their location in images , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[35]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[36]  Xiaojin Zhu,et al.  Semi-Supervised Learning , 2010, Encyclopedia of Machine Learning.

[37]  Yang Wang,et al.  Weakly supervised localization of novel objects using appearance transfer , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Yuxing Tang,et al.  Large Scale Semi-Supervised Object Detection Using Visual and Semantic Knowledge Transfer , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Trevor Darrell,et al.  Detector discovery in the wild: Joint multiple instance and representation learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Liujuan Cao,et al.  Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Kaiming He,et al.  Data Distillation: Towards Omni-Supervised Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[42]  Yu-Wing Tai,et al.  Few-Shot Object Detection With Attention-RPN and Multi-Relation Detector , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Jianfei Cai,et al.  Exploiting Web Images for Weakly Supervised Object Detection , 2019, IEEE Transactions on Multimedia.

[44]  Qi Tian,et al.  Zigzag Learning for Weakly Supervised Object Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  Trevor Darrell,et al.  LSDA: Large Scale Detection through Adaptation , 2014, NIPS.

[46]  Thomas Hofmann,et al.  Support Vector Machines for Multiple-Instance Learning , 2002, NIPS.

[47]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[48]  Kan Chen,et al.  Billion-scale semi-supervised learning for image classification , 2019, ArXiv.

[49]  Andrea Vedaldi,et al.  Weakly Supervised Deep Detection Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Smit Marvaniya,et al.  Drawing an Automatic Sketch of Deformable Objects Using Only a Few Images , 2012, ECCV Workshops.

[51]  Yong Jae Lee,et al.  Weakly-supervised Discovery of Visual Pattern Configurations , 2014, NIPS.

[52]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[53]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[54]  Ananthram Swami,et al.  Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks , 2015, 2016 IEEE Symposium on Security and Privacy (SP).

[55]  Derek Hoiem,et al.  Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[56]  Alexei A. Efros,et al.  Few-Shot Segmentation Propagation with Guided Networks , 2018, ArXiv.

[57]  Deyu Meng,et al.  Few-Example Object Detection with Model Communication , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[58]  Shiguang Shan,et al.  Weakly Supervised Object Detection With Segmentation Collaboration , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[59]  Cordelia Schmid,et al.  Multi-fold MIL Training for Weakly Supervised Object Localization , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[60]  Miaojing Shi,et al.  Weakly Supervised Object Localization Using Things and Stuff Transfer , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[61]  Kiyoharu Aizawa,et al.  Object-Aware Instance Labeling for Weakly Supervised Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[62]  Terrance E. Boult,et al.  Towards Open Set Deep Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Yannis Avrithis,et al.  Label Propagation for Deep Semi-Supervised Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[65]  Fang Wan,et al.  Min-Entropy Latent Model for Weakly Supervised Object Detection , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[66]  David Berthelot,et al.  MixMatch: A Holistic Approach to Semi-Supervised Learning , 2019, NeurIPS.

[67]  Tapani Raiko,et al.  Semi-supervised Learning with Ladder Networks , 2015, NIPS.

[68]  Jason Weston,et al.  Deep learning via semi-supervised embedding , 2008, ICML '08.

[69]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[70]  Davide Modolo,et al.  Learning Semantic Part-Based Models from Google Images , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[71]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[72]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Tao Xiang,et al.  Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[74]  Hongyang Chao,et al.  WSOD2: Learning Bottom-Up and Top-Down Objectness Distillation for Weakly-Supervised Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[75]  Nir Ailon,et al.  Semi-supervised deep learning by metric embedding , 2016, ICLR.