DID: Disentangling-Imprinting-Distilling for Continuous Low-Shot Detection

Practical applications often face a challenging continuous low-shot detection scenario, where each target detection task has only a few annotated training images and a number of such new tasks arrive in sequence. To address this challenge, we propose a generic detection scheme, Disentangling-Imprinting-Distilling (DID), which incorporates carefully designed transfer insights into the main stages of deep learning development, i.e., architecture design (Disentangling), model initialization (Imprinting), and training methodology (Distilling). This makes DID a simple yet effective solution for continuous low-shot detection. In addition, DID integrates the supervision from different detection tasks into a progressive learning procedure, so that one can efficiently adapt the previous detector to a new low-shot task while preserving the detection knowledge learned from earlier tasks. Finally, we evaluate DID on a number of challenging continuous/incremental low-shot detection settings, where it consistently outperforms recent state-of-the-art approaches. The code and models are available at https://github.com/chenxy99/DID.
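
To give a concrete sense of the Imprinting and Distilling components, the sketch below illustrates the two generic mechanisms they build on: classifier weight imprinting from low-shot examples and soft-label knowledge distillation. This is a minimal PyTorch-style sketch under our own assumptions, not the authors' released implementation; the function names, the cosine-classifier head, and the temperature value are hypothetical.

```python
# Minimal illustrative sketch (not the authors' released code) of the two
# transfer mechanisms named in the abstract. Assumes a detector whose
# classification head scores L2-normalized ROI features against per-class
# weight vectors; all names and the temperature are hypothetical.
import torch
import torch.nn.functional as F


def imprint_new_class_weights(classifier_weight: torch.Tensor,
                              new_class_features: torch.Tensor) -> torch.Tensor:
    """Imprinting: initialize the new class's weight vector as the
    L2-normalized mean of its few-shot example embeddings."""
    proto = F.normalize(new_class_features, dim=1).mean(dim=0)
    proto = F.normalize(proto, dim=0)  # unit-length class prototype
    return torch.cat([classifier_weight, proto.unsqueeze(0)], dim=0)


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Distilling: soften the previous detector's outputs and penalize the
    new detector for drifting away from them, which helps retain the
    detection knowledge learned on earlier tasks."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=1)
    p_teacher = F.softmax(teacher_logits / t, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)
```

In such a setup, the imprinted weights give the new classes a sensible starting point from only a few examples, while the distillation term is added to the standard detection losses so that the adapted model keeps matching the previous detector on the old classes.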
