Online Boundary-Free Continual Learning by Scheduled Data Prior

The typical continual learning setup assumes that the dataset is split into multiple discrete tasks. We argue that this is less realistic, as streamed data in the real world carries no notion of task boundary. Here, we take a step toward more realistic online continual learning – learning a continuously changing data distribution without explicit task boundaries, which we call the boundary-free setup. Without boundaries, it is not obvious when and what past information should be preserved to better resolve the stability-plasticity dilemma. To this end, we propose a scheduled transfer of previously learned knowledge. In addition, we propose a data-driven balancing between past and present knowledge in the learning objective. Moreover, since previously proposed forgetting measures are not straightforward to use without task boundaries, we further propose novel forgetting and knowledge-gain measures based on information theory. We empirically evaluate our method on a Gaussian data stream and its periodic extension, which is frequently observed in real-life data, as well as on the conventional disjoint task split. Our method outperforms prior arts by large margins in various setups on four benchmark datasets from the continual learning literature – CIFAR-10, CIFAR-100, TinyImageNet, and ImageNet. Code is available at https://github.com/yonseivnl/sdp.
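To make the boundary-free setup concrete, below is a minimal sketch of how a Gaussian data stream (and its periodic extension) could be generated: each class's samples arrive at times drawn from a class-specific Gaussian, so class distributions overlap in time and no explicit task boundary exists. This is an illustrative assumption of the setup described in the abstract, not the authors' released implementation; the function and parameter names (make_gaussian_stream, sigma, periodic) are hypothetical.

```python
# Hypothetical sketch of a boundary-free (Gaussian-scheduled) data stream.
# Each class c has samples whose arrival times follow a Gaussian centered at
# a class-specific point on [0, 1]; sorting all samples by arrival time yields
# a stream with gradually shifting class distributions and no task boundaries.
import numpy as np

def make_gaussian_stream(n_classes=10, samples_per_class=500,
                         sigma=0.05, periodic=False, seed=0):
    rng = np.random.default_rng(seed)
    stream = []  # list of (arrival_time, class_id, sample_index)
    for c in range(n_classes):
        mu = (c + 0.5) / n_classes              # class-specific center on [0, 1]
        t = rng.normal(mu, sigma, size=samples_per_class)
        if periodic:                            # periodic extension: wrap times around
            t = np.mod(t, 1.0)
        else:                                   # plain Gaussian stream: clip to [0, 1]
            t = np.clip(t, 0.0, 1.0)
        for i, ti in enumerate(t):
            stream.append((float(ti), c, i))
    stream.sort(key=lambda x: x[0])             # order samples purely by arrival time
    return stream

if __name__ == "__main__":
    stream = make_gaussian_stream(periodic=True)
    print(stream[:3])  # earliest (arrival_time, class_id, sample_index) triples
```

With small sigma the stream approaches the conventional disjoint task split; larger sigma increases class overlap, and the periodic variant makes earlier classes reappear later in the stream.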
