Regularization Shortcomings for Continual Learning

Most machine learning algorithms assume that training data are independent and identically distributed (iid); when this assumption is violated, their performance degrades. A well-known failure mode under non-iid data distributions is \say{catastrophic forgetting}, and the algorithms designed to cope with it form the \textit{Continual Learning} research field. In this article, we study \textit{regularization}-based approaches to continual learning. We show that these approaches cannot learn to discriminate classes from different tasks in an elementary continual learning benchmark: the class-incremental setting. We give a theoretical argument for this shortcoming and illustrate it with examples and experiments.
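For readers unfamiliar with the regularization-based family studied here, the sketch below illustrates the typical structure of such methods: the task loss is augmented with an EWC-style quadratic penalty that anchors parameters to their values after the previous task, weighted by a per-parameter importance estimate (often the diagonal Fisher information). This is a minimal sketch under stated assumptions, not the paper's implementation; the names `model`, `old_params`, `importance`, and `lam` are illustrative placeholders.

```python
import torch

def ewc_penalty(model, old_params, importance):
    """Quadratic penalty keeping parameters close to their previous-task values.

    old_params and importance are dicts keyed by parameter name, holding the
    parameter values saved after the previous task and their importance weights
    (e.g. a diagonal Fisher estimate), respectively.
    """
    penalty = torch.tensor(0.0)
    for name, param in model.named_parameters():
        penalty = penalty + (importance[name] * (param - old_params[name]) ** 2).sum()
    return penalty

def continual_loss(task_loss, model, old_params, importance, lam=1.0):
    """Current-task loss plus the regularization term used by EWC-like methods."""
    return task_loss + (lam / 2.0) * ewc_penalty(model, old_params, importance)
```

Note that the penalty only constrains parameters toward a previous solution; it does not by itself provide any signal for separating classes of the new task from classes of earlier tasks, which is the shortcoming analyzed in this article.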
