Yee Whye Teh | Razvan Pascanu | Peter E. Latham | Jonathan Schwarz | Siddhant M. Jayakumar