GateON: an unsupervised method for large scale continual learning
[1] C. Harvey, et al. Representational drift: Emerging theories for continual learning and experimental future directions, 2022, Current Opinion in Neurobiology.
[2] Andrew M. Saxe, et al. Orthogonal representations for robust context-dependent task performance in brains and neural networks, 2022, Neuron.
[3] Lucas O. Souza, et al. Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments, 2021, Frontiers in Neurorobotics.
[4] Bing Liu, et al. Continual Learning with Knowledge Transfer for Sentiment Classification, 2021, ECML/PKDD.
[5] Bing Liu, et al. Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning, 2021, NeurIPS.
[6] P. Chaudhari, et al. Model Zoo: A Growing Brain That Learns Continually, 2021, ICLR.
[7] Bing Liu, et al. Adapting BERT for Continual Learning of a Sequence of Aspect Sentiment Classification Tasks, 2021, NAACL.
[8] Alan Yuille, et al. Understanding Catastrophic Forgetting and Remembering in Continual Learning with Optimal Relevance Mapping, 2021, arXiv.
[9] Wulfram Gerstner, et al. Learning in Volatile Environments With the Bayes Factor Surprise, 2021, Neural Computation.
[10] Seyed Iman Mirzadeh, et al. Linear Mode Connectivity in Multitask and Continual Learning, 2020, ICLR.
[11] Matthias De Lange, et al. Continual Prototype Evolution: Learning Online from Non-Stationary Data Streams, 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Seyed Iman Mirzadeh, et al. Understanding the Role of Training Regimes in Continual Learning, 2020, NeurIPS.
[13] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[14] Mehrdad Farajtabar, et al. Orthogonal Gradient Descent for Continual Learning, 2019, AISTATS.
[15] Tinne Tuytelaars, et al. A Continual Learning Survey: Defying Forgetting in Classification Tasks, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[16] Hung-yi Lee, et al. LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning, 2019, ICLR.
[17] David Filliat, et al. Continual Learning for Robotics, 2019, Information Fusion.
[18] Philip S. Yu, et al. BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis, 2019, NAACL.
[19] David Rolnick, et al. Experience Replay for Continual Learning, 2018, NeurIPS.
[20] Christopher Summerfield, et al. Comparing continual task learning in minds and machines, 2018, Proceedings of the National Academy of Sciences.
[21] Zhanxing Zhu, et al. Reinforced Continual Learning, 2018, NeurIPS.
[22] Murray Shanahan, et al. Continual Reinforcement Learning with Complex Synapses, 2018, ICML.
[23] Alexandros Karatzoglou, et al. Overcoming catastrophic forgetting with hard attention to the task, 2018, ICML.
[24] Richard E. Turner, et al. Variational Continual Learning, 2017, ICLR.
[25] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[26] Marc'Aurelio Ranzato, et al. Gradient Episodic Memory for Continual Learning, 2017, NIPS.
[27] Surya Ganguli, et al. Continual Learning Through Synaptic Intelligence, 2017, ICML.
[28] Andrei A. Rusu, et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.
[29] Christoph H. Lampert, et al. iCaRL: Incremental Classifier and Representation Learning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Tinne Tuytelaars, et al. Expert Gate: Lifelong Learning with a Network of Experts, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Razvan Pascanu, et al. Progressive Neural Networks, 2016, arXiv.
[32] S. Arun, et al. Selective IT neurons are selective along many dimensions, 2016, Journal of Neurophysiology.
[33] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] D. Leopold, et al. Face-selective neurons maintain consistent visual responses across months, 2014, Proceedings of the National Academy of Sciences.
[35] Yoshua Bengio, et al. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks, 2013, ICLR.
[36] Jürgen Schmidhuber, et al. Compete to Compute, 2013, NIPS.
[37] Qiang Yang, et al. A Survey on Transfer Learning, 2010, IEEE Transactions on Knowledge and Data Engineering.
[38] Robert C. Wilson, et al. An Approximately Bayesian Delta-Rule Model Explains the Dynamics of Belief Updating in a Changing Environment, 2010, The Journal of Neuroscience.
[39] Ryan P. Adams, et al. Bayesian Online Changepoint Detection, 2007, arXiv:0710.3742.
[40] P. Fearnhead, et al. On-line inference for multiple changepoint problems, 2007.
[41] T. Tanaka, et al. Adaptive resonance theory, 1997, Scholarpedia.
[42] W. Abraham, et al. Memory retention – the synaptic stability versus plasticity dilemma, 2005, Trends in Neurosciences.
[43] ZhaoHong Han, et al. Effects of the Second Language on the First, 2004, Studies in Second Language Acquisition.
[44] Robert E. Mercer, et al. The Task Rehearsal Method of Life-Long Learning: Overcoming Impoverished Data, 2002, Canadian Conference on AI.
[45] Zoltán Dienes, et al. Transfer of implicit knowledge across domains? How implicit and how abstract?, 1997.
[46] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[47] Ken Lang, et al. NewsWeeder: Learning to Filter Netnews, 1995, ICML.
[48] Anthony V. Robins, et al. Catastrophic forgetting in neural networks: the role of rehearsal mechanisms, 1993, Proceedings of the 1993 First New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems.
[49] R. Ratcliff, et al. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions, 1990, Psychological Review.
[50] James L. McClelland, David Rumelhart, and the PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1: Foundations; Vol. 2: Psychological and Biological Models, 1987, Cambridge, MA: MIT Press.
[51] Stephen Grossberg, et al. Competitive Learning: From Interactive Activation to Adaptive Resonance, 1987, Cognitive Science.
[52] James L. McClelland, et al. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations, 1986.
[53] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[54] Eric T. Nalisnick, et al. Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality, 2019.
[55] Benjamin Frederick Goodrich, et al. Neuron Clustering for Mitigating Catastrophic Forgetting in Supervised and Reinforcement Learning, 2015.
[56] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proceedings of the IEEE.
[57] Michael McCloskey, et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, 1989.
[58] K. Pearson. VII. Note on regression and inheritance in the case of two parents, 1895, Proceedings of the Royal Society of London.