Discrepant multiple instance learning for weakly supervised object detection

Abstract: Multiple instance learning (MIL) is a fundamental method for weakly supervised object detection (WSOD), but it has difficulty excluding locally optimal solutions and may miss objects or falsely localize object parts. In this paper, we introduce discrepantly collaborative modules into MIL and thereby create discrepant multiple instance learning (D-MIL), which pursues optimal solutions in a simple yet effective way. D-MIL adopts multiple MIL learners to pursue discrepant yet complementary solutions indicating object parts, which are fused by a collaboration module for precise object localization. D-MIL implements a new “teachers-students” model, where MIL learners act as “teachers” and object detectors as “students”. Multiple teachers provide rich yet complementary information, which is absorbed by the students and transferred back to reinforce the performance of the teachers. Experiments show that D-MIL significantly improves on the baseline and achieves state-of-the-art performance on the challenging MS-COCO object detection benchmark.
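
The idea of discrepant yet complementary MIL learners can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the discrepancy measure or the collaboration module, so the sketch assumes a cosine-based discrepancy and a simple average as the fusion step, and all function names and logit values are hypothetical.

```python
import math

def softmax(logits):
    """Turn per-proposal logits into a distribution over proposals."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def discrepancy(p, q):
    """Assumed measure: 1 - cosine similarity of two proposal distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)

def fuse(p, q):
    """Assumed stand-in for the collaboration module: average the distributions."""
    return [(a + b) / 2.0 for a, b in zip(p, q)]

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical logits for three proposals of one image and one class.
# Proposals 0 and 1 cover different object parts; proposal 2 covers the whole object.
logits_a = [2.0, 0.0, 1.5]  # learner A prefers part 0
logits_b = [0.0, 2.0, 1.5]  # learner B prefers part 1

scores_a = softmax(logits_a)
scores_b = softmax(logits_b)
fused = fuse(scores_a, scores_b)

top_a = argmax(scores_a)
top_b = argmax(scores_b)
top_fused = argmax(fused)
gap = discrepancy(scores_a, scores_b)
```

In this toy setting each learner alone selects a different object part, while the fused scores favor the proposal that both learners rate moderately well. In training, a discrepancy term of this kind would be encouraged to stay large so that the learners do not collapse onto the same part; how D-MIL actually does this is described in the paper body, not in the abstract.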
