Discriminative Distillation to Reduce Class Confusion in Continual Learning

Successful continual learning of new knowledge would enable intelligent systems to recognize an ever-growing number of object classes. However, current intelligent systems often fail to correctly recognize previously learned classes when updated to learn new ones. It is widely believed that this degraded performance is due solely to the catastrophic forgetting of previously learned knowledge. In this study, we argue that class confusion may also play a role in degrading classification performance during continual learning: high similarity between new classes and previously learned classes can cause the classifier to misrecognize those old classes, even when the knowledge of the old classes has not been forgotten. To alleviate this class confusion issue, we propose a discriminative distillation strategy that helps the classifier learn discriminative features between confusing classes during continual learning. Experiments on multiple natural image classification tasks show that the proposed distillation strategy, when combined with existing methods, further improves continual learning.
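The abstract does not spell out the training objective, but one plausible reading is a loss that combines standard cross-entropy on new-class data, the usual distillation from the frozen old model to preserve old-class knowledge, and an extra distillation term from an auxiliary "expert" restricted to the confusing old/new classes. Below is a minimal PyTorch sketch under that assumption; the function names (`distillation_loss`, `continual_step_loss`), the arguments `expert_logits`, `confusing_class_idx`, and the weights `lambda_old`, `lambda_expert` are illustrative and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation (Hinton et al., 2015): KL divergence between
    temperature-softened teacher and student output distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

def continual_step_loss(student_logits, labels, old_model_logits,
                        expert_logits, confusing_class_idx,
                        lambda_old=1.0, lambda_expert=1.0):
    # 1) Cross-entropy on the current incremental-step data.
    ce = F.cross_entropy(student_logits, labels)
    # 2) Standard distillation from the frozen previous-step model,
    #    applied to the logits of the previously learned classes.
    kd_old = distillation_loss(student_logits[:, :old_model_logits.size(1)],
                               old_model_logits)
    # 3) Hypothetical "discriminative" distillation term: distill from an
    #    auxiliary expert trained only on the confusing old/new classes,
    #    restricted to those logit positions.
    kd_expert = distillation_loss(student_logits[:, confusing_class_idx],
                                  expert_logits)
    return ce + lambda_old * kd_old + lambda_expert * kd_expert
```

In this sketch the third term is what distinguishes the strategy from plain old-model distillation: it only constrains the logits of classes identified as mutually confusing, so the student is pushed to separate exactly the classes it tends to mix up.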
