Self-Supervised Training Enhances Online Continual Learning

In continual learning, a system must incrementally learn from a non-stationary data stream without catastrophic forgetting. Recently, multiple methods have been devised for incrementally learning classes on large-scale image classification tasks, such as ImageNet. State-of-the-art continual learning methods use an initial supervised pre-training phase, in which the first 10%–50% of the classes in a dataset are used to learn representations in an offline manner before continual learning of new classes begins. We hypothesize that self-supervised pre-training could yield features that generalize better than supervised learning, especially when the number of samples used for pre-training is small. We test this hypothesis using the self-supervised MoCo-V2 and SwAV algorithms. On ImageNet, we find that both outperform supervised pre-training considerably for online continual learning, and the gains are larger when fewer samples are available. Our findings are consistent across three continual learning algorithms. Our best system achieves a 14.95% relative increase in top-1 accuracy on class-incremental ImageNet over the prior state of the art for online continual learning.
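
To make the two-phase protocol concrete, below is a minimal sketch: self-supervised pre-training on a base subset of classes, then online continual learning over the remaining classes with the backbone frozen. This is an illustration under stated assumptions, not the paper's implementation: a SimCLR-style InfoNCE loss stands in for MoCo-V2/SwAV, a tiny MLP stands in for the ResNet encoder, random tensors stand in for ImageNet batches, and a single streaming linear head stands in for the continual learners. All names are illustrative.

```python
# Sketch of the abstract's protocol: (1) offline self-supervised pre-training
# on base classes, (2) one-sample-at-a-time continual learning on the rest.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.07):
    """SimCLR-style contrastive loss over two augmented views of a batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (B, B) pairwise similarities
    targets = torch.arange(z1.size(0))   # matching views lie on the diagonal
    return F.cross_entropy(logits, targets)

# Stand-in encoder; the paper uses a ResNet backbone on ImageNet.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128),
                         nn.ReLU(), nn.Linear(128, 64))

# Phase 1: offline self-supervised pre-training on the base classes.
opt = torch.optim.SGD(backbone.parameters(), lr=0.1)
for _ in range(100):
    x = torch.randn(32, 3, 32, 32)       # placeholder base-class batch
    v1 = x + 0.1 * torch.randn_like(x)   # placeholder "augmented" views
    v2 = x + 0.1 * torch.randn_like(x)
    loss = info_nce(backbone(v1), backbone(v2))
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: online continual learning over a non-stationary stream, with the
# pre-trained backbone frozen (as in streaming learners built on fixed features).
for p in backbone.parameters():
    p.requires_grad_(False)
classifier = nn.Linear(64, 1000)         # incrementally trained class head
clf_opt = torch.optim.SGD(classifier.parameters(), lr=0.01)
for _ in range(1000):
    x = torch.randn(1, 3, 32, 32)                 # one stream sample
    y = torch.randint(0, 1000, (1,))              # its class label
    loss = F.cross_entropy(classifier(backbone(x)), y)
    clf_opt.zero_grad(); loss.backward(); clf_opt.step()
```

The design point the sketch captures is that only the pre-training objective changes between the supervised and self-supervised conditions; the downstream continual learner sees the same frozen features either way, so any accuracy difference is attributable to the learned representation.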
