From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning

Continual learning (CL) is one of the most promising trends in recent machine learning research. Its goal is to move beyond the classical assumptions of machine learning and to develop models and learning strategies that remain robust in dynamic environments. The landscape of CL research is fragmented across several evaluation protocols, comprising different learning tasks, datasets, and evaluation metrics. Moreover, the benchmarks adopted so far are still far from the complexity of real-world scenarios and are usually tailored to highlight capabilities specific to particular strategies. In such a landscape, it is hard to assess strategies objectively. In this work, we fill this gap for CL on image data by introducing two novel CL benchmarks that involve multiple heterogeneous tasks from six image datasets with varying levels of complexity and quality. Our aim is to fairly evaluate current state-of-the-art CL strategies on a common ground that is closer to complex real-world scenarios. We additionally structure our benchmarks so that tasks are presented in increasing and decreasing order of complexity -- according to a curriculum -- in order to evaluate whether current CL models can exploit structure across tasks. We place particular emphasis on providing the CL community with a rigorous and reproducible evaluation protocol for measuring a model's ability to generalize while not forgetting. Finally, we provide an extensive experimental evaluation showing that popular CL strategies, when challenged with our benchmarks, yield sub-par performance and high levels of forgetting, and show a limited ability to exploit curriculum task ordering. We believe these results highlight the need for rigorous comparisons in future CL works, and that they pave the way for designing new CL strategies able to deal with more complex scenarios.
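
To make the benchmark construction concrete, here is a minimal sketch of how a curriculum-ordered stream of heterogeneous image tasks can be assembled with torchvision. The dataset selection, the complexity ordering, and the Tiny ImageNet `ImageFolder` layout are illustrative assumptions, not the paper's exact composition:

```python
# Sketch: a curriculum-ordered stream of image classification tasks.
# Dataset choice and ordering are illustrative; the paper defines its own.
from torchvision import datasets, transforms

# Resize all inputs to a common shape and channel count so one model
# can be trained across tasks with different native resolutions.
common_tf = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    # Replicate single-channel images (e.g., MNIST) to three channels.
    transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.shape[0] == 1 else x),
])

def build_stream(root="./data", increasing=True):
    """Return a list of (name, train_set) tasks ordered by complexity."""
    tasks = [
        ("MNIST", datasets.MNIST(root, train=True, download=True,
                                 transform=common_tf)),
        ("FashionMNIST", datasets.FashionMNIST(root, train=True, download=True,
                                               transform=common_tf)),
        ("SVHN", datasets.SVHN(root, split="train", download=True,
                               transform=common_tf)),
        ("CIFAR10", datasets.CIFAR10(root, train=True, download=True,
                                     transform=common_tf)),
        ("CIFAR100", datasets.CIFAR100(root, train=True, download=True,
                                       transform=common_tf)),
        # Tiny ImageNet is not bundled with torchvision; here we assume
        # the standard extracted directory layout and use ImageFolder.
        ("TinyImageNet", datasets.ImageFolder(f"{root}/tiny-imagenet-200/train",
                                              transform=common_tf)),
    ]
    return tasks if increasing else list(reversed(tasks))
```

Reversing the list yields the decreasing-complexity curriculum, so the same training loop can be run in both directions without further changes.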
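
For the evaluation protocol, the standard way to quantify "generalize without forgetting" is to record a task-by-task accuracy matrix and derive aggregate metrics from it. The sketch below computes the commonly used average accuracy, forgetting, and backward transfer (as defined by Lopez-Paz & Ranzato, 2017); the matrix values are made up for illustration:

```python
import numpy as np

def cl_metrics(acc):
    """Compute standard CL metrics from a T x T accuracy matrix.

    acc[i, j] is the test accuracy on task j after training on task i,
    with tasks indexed 0..T-1 in the order they were learned.
    """
    acc = np.asarray(acc, dtype=float)
    T = acc.shape[0]
    # Average accuracy over all tasks once the final task is learned.
    avg_acc = acc[-1].mean()
    # Forgetting of task j: best accuracy ever reached on j minus the
    # final accuracy on j; averaged over all tasks but the last.
    forgetting = np.mean([acc[:-1, j].max() - acc[-1, j] for j in range(T - 1)])
    # Backward transfer: how later training changes earlier-task accuracy;
    # negative values indicate forgetting.
    bwt = np.mean([acc[-1, j] - acc[j, j] for j in range(T - 1)])
    return {"avg_acc": avg_acc, "forgetting": forgetting, "bwt": bwt}

# Example: three tasks where later training erodes earlier-task accuracy.
A = [[0.95, 0.10, 0.10],
     [0.70, 0.90, 0.10],
     [0.55, 0.65, 0.88]]
print(cl_metrics(A))  # low avg_acc, high forgetting, negative bwt
```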
