Does the Adam Optimizer Exacerbate Catastrophic Forgetting?
[1] Razvan Pascanu et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.
[2] Yoshua Bengio et al. On Catastrophic Interference in Atari 2600 Games, 2020, ArXiv.
[3] Shane Legg et al. Human-level control through deep reinforcement learning, 2015, Nature.
[4] Yoshua Bengio et al. Toward Training Recurrent Neural Networks for Lifelong Learning, 2018, Neural Computation.
[5] Yoshua Bengio et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[6] Geoffrey E. Hinton et al. Similarity of Neural Network Representations Revisited, 2019, ICML.
[7] Surya Ganguli et al. Continual Learning Through Synaptic Intelligence, 2017, ICML.
[8] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[9] Seyed Iman Mirzadeh et al. Understanding the Role of Training Regimes in Continual Learning, 2020, NeurIPS.
[10] Jian Sun et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, IEEE International Conference on Computer Vision (ICCV).
[11] Gerald Tesauro et al. Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference, 2018, ICLR.
[12] Byoung-Tak Zhang et al. Overcoming Catastrophic Forgetting by Incremental Moment Matching, 2017, NIPS.
[13] Ming Yang et al. DeepFace: Closing the Gap to Human-Level Performance in Face Verification, 2014, IEEE Conference on Computer Vision and Pattern Recognition.
[14] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] Wojciech Zaremba et al. OpenAI Gym, 2016, ArXiv.
[16] Yoshua Bengio et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[17] Ronald Kemker et al. Measuring Catastrophic Forgetting in Neural Networks, 2017, AAAI.
[18] Richard S. Sutton et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[19] B. Underwood et al. Fate of first-list associations in transfer theory, 1959, Journal of Experimental Psychology.
[20] Mark W. Spong et al. Robot Dynamics and Control, 1989.
[21] Lukasz Kaiser et al. Attention Is All You Need, 2017, NIPS.
[22] Adam White et al. Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks, 2020, AAMAS.
[23] Yarin Gal et al. Towards Robust Evaluations of Continual Learning, 2018, ArXiv.
[24] Mark W. Spong et al. Swinging up the Acrobot: an example of intelligent control, 1994, Proceedings of the 1994 American Control Conference (ACC '94).
[25] Yoshua Bengio et al. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks, 2013, ICLR.
[26] R. Ratcliff et al. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions, 1990, Psychological Review.
[27] Vincent Liu et al. Sparse Representation Neural Networks for Online Reinforcement Learning, 2019.
[28] Ning Qian et al. On the momentum term in gradient descent learning algorithms, 1999, Neural Networks.
[29] Hermann Ebbinghaus. Memory: A Contribution to Experimental Psychology (1885), 2013, Annals of Neurosciences.
[30] Robert M. French et al. Using Semi-Distributed Representations to Overcome Catastrophic Forgetting in Connectionist Networks, 1991.
[31] Nicolas Y. Masse et al. Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization, 2018, Proceedings of the National Academy of Sciences.
[32] Bing Liu et al. Lifelong machine learning: a paradigm for continuous learning, 2017, Frontiers of Computer Science.
[33] Michael McCloskey et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, 1989.
[34] Demis Hassabis et al. Improved protein structure prediction using potentials from deep learning, 2020, Nature.
[35] Andrew W. Moore et al. Efficient memory-based learning for robot control, 1990.
[36] Geoffrey E. Hinton et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[37] Demis Hassabis et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[38] Roland Vollgraf et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms, 2017, ArXiv.
[39] Yoshua Bengio et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[40] Richard S. Sutton et al. A First Empirical Study of Emphatic Temporal Difference Learning, 2017, ArXiv.
[41] Leon A. Gatys et al. Image Style Transfer Using Convolutional Neural Networks, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Yann LeCun et al. What is the best multi-stage architecture for object recognition?, 2009, IEEE 12th International Conference on Computer Vision.
[43] Ilya Sutskever et al. Language Models are Unsupervised Multitask Learners, 2019.