Semi-supervised Object Detection with Unlabeled Data

Besides fully supervised object detection, many approaches have explored other training settings, such as weakly-supervised learning, which uses only weak (image-level) labels, and mix-supervised learning, which uses a few strong (instance-level) labels together with many weak labels. In this work, we investigate semi-supervised learning with a few instance-level labeled images and many unlabeled images. Treating training on unlabeled images as a latent variable model, we propose an Expectation-Maximization method for semi-supervised object detection with unlabeled images. We estimate the latent labels and optimize the model for both the classification and the localization parts of object detection. Implementing our method on the one-stage object detection model YOLO, we show through experiments on the Pascal VOC dataset that unlabeled images, like weakly labeled images, can also boost the performance of the detector.
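
The alternation between estimating latent labels and optimizing the model can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how latent labels are estimated or how the losses are combined, so the confidence threshold, the hard (argmax) assignment, the `Detection` type, and the `unlabeled_weight` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One predicted box on an unlabeled image (hypothetical structure)."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    class_probs: List[float]                # per-class probabilities
    objectness: float                       # confidence that a box contains an object


def e_step(detections: List[Detection], threshold: float = 0.5):
    """Hard E-step (assumed): keep confident predictions as estimated latent labels.

    Returns a list of (box, class_index, score) tuples that would serve as
    both the classification target and the localization target.
    """
    pseudo_labels = []
    for det in detections:
        # Most likely class for this box.
        cls = max(range(len(det.class_probs)), key=det.class_probs.__getitem__)
        score = det.objectness * det.class_probs[cls]
        if score >= threshold:
            pseudo_labels.append((det.box, cls, score))
    return pseudo_labels


def m_step_loss(labeled_loss: float, unlabeled_loss: float,
                unlabeled_weight: float = 1.0) -> float:
    """M-step objective (assumed): supervised loss plus weighted pseudo-label loss."""
    return labeled_loss + unlabeled_weight * unlabeled_loss


# Example: two predictions on one unlabeled image.
detections = [
    Detection(box=(0, 0, 10, 10), class_probs=[0.1, 0.9], objectness=0.8),
    Detection(box=(5, 5, 20, 20), class_probs=[0.6, 0.4], objectness=0.3),
]
pseudo_labels = e_step(detections, threshold=0.5)
total_loss = m_step_loss(labeled_loss=1.5, unlabeled_loss=0.5, unlabeled_weight=0.5)
```

In this sketch only the first prediction survives the threshold, so the second box contributes no target in the following M-step. A full training loop would repeat the two steps, re-estimating the latent labels with the updated detector each round.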
