More Is Better: An Analysis of Instance Quantity/Quality Trade-off in Rehearsal-based Continual Learning

The design of machines and algorithms capable of learning in a dynamically changing environment has become an increasingly topical problem as the size and heterogeneity of the data available to learning systems grow. Consequently, the key issue in Continual Learning has become that of addressing the stability-plasticity dilemma of connectionist systems, which must adapt their model without forgetting previously acquired knowledge. Within this context, rehearsal-based methods, i.e., solutions in which the learner exploits memory to revisit past data, have proven very effective, achieving state-of-the-art performance. In our study, we analyze the memory quantity/quality trade-off by adopting various data reduction approaches to increase the number of instances storable in memory. In particular, we investigate complex instance compression techniques such as deep encoders, as well as trivial approaches such as image resizing and linear dimensionality reduction. Our findings suggest that the optimal trade-off is severely skewed toward instance quantity: rehearsal approaches storing many heavily compressed instances easily outperform state-of-the-art approaches with the same amount of memory at their disposal. Furthermore, in high-memory configurations, deep approaches that extract spatial structure combined with extreme resizing (on the order of 8×8 images) yield the best results, while in memory-constrained configurations, where deep approaches cannot be used due to their training memory requirements, Extreme Learning Machines (ELM) offer a clear advantage.
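To make the quantity/quality trade-off concrete, the sketch below (our illustration, not the paper's code) shows how extreme resizing inflates the instance count of a rehearsal buffer under a fixed byte budget: average-pooling 32×32×3 images down to 8×8×3 lets 16× as many examples fit in the same memory. The budget size and pooling-based resize are hypothetical choices for illustration.

```python
import numpy as np

MEMORY_BUDGET = 2**20  # hypothetical 1 MiB rehearsal buffer (uint8 pixels)

def downsample(img: np.ndarray, out: int = 8) -> np.ndarray:
    """Average-pool an HxWxC uint8 image down to out x out x C."""
    h, w, c = img.shape
    fh, fw = h // out, w // out
    # Group pixels into out*out blocks of size fh*fw and average each block.
    pooled = img[:fh * out, :fw * out].reshape(out, fh, out, fw, c).mean(axis=(1, 3))
    return pooled.astype(np.uint8)

full = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)  # e.g. one CIFAR image
tiny = downsample(full, out=8)

print(MEMORY_BUDGET // full.nbytes)  # 341 instances at full resolution
print(MEMORY_BUDGET // tiny.nbytes)  # 5461 instances after 8x8 resizing
```

The same accounting applies to the other reduction schemes the abstract mentions: a linear projection to d dimensions or a deep encoder's latent code simply replaces `tiny.nbytes` with the size of the compressed representation.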
