Target Layer Regularization for Continual Learning Using Cramer-Wold Generator

We propose an effective regularization strategy, CW-TaLaR, for continual learning. It combines a penalty term given by the Cramer-Wold distance between two probability distributions defined on a target layer of a neural network shared by all tasks with the simple architecture of a Cramer-Wold generator that models the data representation at that layer. The strategy preserves the target-layer distribution while a new task is being learned, yet does not require storing data from previous tasks. Experiments on several common supervised benchmarks demonstrate that CW-TaLaR is competitive with existing state-of-the-art continual learning models.
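
The penalty can be illustrated with a minimal sketch. The PyTorch code below is not the authors' implementation: it assumes the closed-form approximation of the Cramer-Wold kernel used in the Cramer-Wold Auto-Encoder line of work, and the names `model.target_layer`, `cw_generator`, the weight `lam`, and the Silverman-style bandwidth default are hypothetical placeholders for how such a regularizer could be wired into a training loss.

```python
import math
import torch

def cramer_wold_distance(x, y, gamma=None):
    """Squared Cramer-Wold distance between two samples x, y of shape (n, D),
    using the approximation phi_D(s) ~ (1 + 4s / (2D - 3))^(-1/2) of the CW
    kernel (valid for moderately large D). Constants follow the CWAE-style
    closed form; the paper's exact normalization may differ."""
    n, d = x.shape
    if gamma is None:
        # Silverman-style rule-of-thumb bandwidth (a common default, assumed here).
        gamma = (4.0 / (3.0 * n)) ** 0.4

    def phi(s):
        return torch.rsqrt(1.0 + 4.0 * s / (2.0 * d - 3.0))

    def mean_phi(a, b):
        sq_dists = torch.cdist(a, b) ** 2          # pairwise squared distances
        return phi(sq_dists / (4.0 * gamma)).mean()

    return (mean_phi(x, x) + mean_phi(y, y) - 2.0 * mean_phi(x, y)) / (
        2.0 * math.sqrt(math.pi * gamma)
    )

def regularized_loss(task_loss, model, cw_generator, batch, lam=1.0):
    """Hypothetical training step: penalize drift of the shared target layer
    away from the distribution modeled by a frozen Cramer-Wold generator."""
    current = model.target_layer(batch)            # activations under the new task
    with torch.no_grad():
        reference = cw_generator(batch.size(0))    # samples of the previously learned distribution
    return task_loss + lam * cramer_wold_distance(current, reference)
```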
