Maintaining Plasticity in Deep Continual Learning
Shibhansh Dohare | J. F. Hernandez-Garcia | Parash Rahman | A. R. Mahmood | Richard S. Sutton