Exact learning dynamics of deep linear networks with prior knowledge
[1] Andrew M. Saxe,et al. Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation , 2022, ArXiv.
[2] Andrew M. Saxe,et al. Orthogonal representations for robust context-dependent task performance in brains and neural networks , 2022, Neuron.
[3] C. Pehlevan,et al. Neural Networks as Kernel Learners: The Silent Alignment Effect , 2021, ICLR.
[4] Andrew M. Saxe,et al. Continual Learning in the Teacher-Student Setup: Impact of Task Similarity , 2021, ICML.
[5] Amir Globerson,et al. A Theoretical Analysis of Fine-tuning with Linear Teachers , 2021, NeurIPS.
[6] Andrew M. Saxe,et al. Probing transfer learning with a model of synthetic correlated datasets , 2021, Mach. Learn. Sci. Technol..
[7] Masato Okada,et al. Statistical Mechanical Analysis of Catastrophic Forgetting in Continual Learning with Teacher and Student Networks , 2021, Journal of the Physical Society of Japan.
[8] Pierre Alquier,et al. A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix , 2020, AISTATS.
[9] Dongsung Huh,et al. Curvature-corrected learning dynamics in deep neural networks , 2020, ICML.
[10] Michael I. Jordan,et al. On the Theory of Transfer Learning: The Importance of Task Diversity , 2020, NeurIPS.
[11] Shun-ichi Amari. Understand it in 5 minutes!? Skimming a famous paper: Jacot, Arthur, Gabriel, Franck and Hongler, Clément: Neural Tangent Kernel: Convergence and Generalization in Neural Networks , 2020 .
[12] Surya Ganguli,et al. Statistical Mechanics of Deep Learning , 2020, Annual Review of Condensed Matter Physics.
[13] Florent Krzakala,et al. Dynamics of stochastic gradient descent for two-layer neural networks in the teacher–student setup , 2019, NeurIPS.
[14] Sanjeev Arora,et al. Implicit Regularization in Deep Matrix Factorization , 2019, NeurIPS.
[15] Martha White,et al. Meta-Learning Representations for Continual Learning , 2019, NeurIPS.
[16] Ruosong Wang,et al. On Exact Computation with an Infinitely Wide Neural Net , 2019, NeurIPS.
[17] Naftali Tishby,et al. Machine learning and the physical sciences , 2019, Reviews of Modern Physics.
[18] Jaehoon Lee,et al. Wide neural networks of any depth evolve as linear models under gradient descent , 2019, NeurIPS.
[19] Jon Kleinberg,et al. Transfusion: Understanding Transfer Learning for Medical Imaging , 2019, NeurIPS.
[20] Wei Hu,et al. Width Provably Matters in Optimization for Deep Linear Neural Networks , 2019, ICML.
[21] Francis Bach,et al. On Lazy Training in Differentiable Programming , 2018, NeurIPS.
[22] Surya Ganguli,et al. A mathematical theory of semantic development in deep neural networks , 2018, Proceedings of the National Academy of Sciences.
[23] Christopher Summerfield,et al. Comparing continual task learning in minds and machines , 2018, Proceedings of the National Academy of Sciences.
[24] Matus Telgarsky,et al. Gradient descent aligns the layers of deep linear networks , 2018, ICLR.
[25] Wei Hu,et al. A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks , 2018, ICLR.
[26] Surya Ganguli,et al. An analytic theory of generalization dynamics and transfer learning in deep linear networks , 2018, ICLR.
[27] Justin A. Sirignano,et al. Mean field analysis of neural networks: A central limit theorem , 2018, Stochastic Processes and their Applications.
[28] Thomas Laurent,et al. Deep Linear Networks with Arbitrary Loss: All Local Minima Are Global , 2017, ICML.
[29] Tomaso A. Poggio,et al. Theory IIIb: Generalization in Deep Networks , 2018, ArXiv.
[30] Jascha Sohl-Dickstein,et al. Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks , 2018, ICML.
[31] Nathan Srebro,et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks , 2018, NeurIPS.
[32] Andrea Montanari,et al. A mean field view of the landscape of two-layer neural networks , 2018, Proceedings of the National Academy of Sciences.
[33] Stefan Wermter,et al. Continual Lifelong Learning with Neural Networks: A Review , 2018, Neural Networks.
[34] Sanjeev Arora,et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization , 2018, ICML.
[35] Surya Ganguli,et al. Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice , 2017, NIPS.
[36] Surya Ganguli,et al. Continual Learning Through Synaptic Intelligence , 2017, ICML.
[37] Andrei A. Rusu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[38] Jiri Matas,et al. All you need is a good init , 2015, ICLR.
[39] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[40] Surya Ganguli,et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks , 2013, ICLR.
[41] James L. McClelland. Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory. , 2013, Journal of experimental psychology. General.
[42] Nart Bedin Atalay,et al. Simulating probability learning and probabilistic reversal learning using the attention-gated reinforcement learning (AGREL) model , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[43] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[44] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[45] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[46] Nikolaus Kriegeskorte,et al. Representational similarity analysis: connecting the branches of systems neuroscience , 2008, Frontiers in Systems Neuroscience.
[47] G. Murphy,et al. The Big Book of Concepts , 2002 .
[48] R. French. Catastrophic forgetting in connectionist networks , 1999, Trends in Cognitive Sciences.
[49] James L. McClelland,et al. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. , 1995, Psychological review.
[50] D. Saad,et al. Exact solution for on-line learning in multilayer neural networks , 1995, Physical Review Letters.
[51] Michael Biehl,et al. Learning by on-line gradient descent , 1995 .
[52] John B. Moore,et al. Global analysis of Oja's flow for neural networks , 1994, IEEE Trans. Neural Networks.
[53] R. Ratcliff. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions , 1990, Psychological Review.
[54] Guilherme França,et al. Understanding the Dynamics of Gradient Flow in Overparameterized Linear Models , 2021, ICML.
[55] Grant M. Rotskoff,et al. Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks , 2018, NeurIPS.
[56] C. A. Nelson,et al. Learning to Learn , 2017, Encyclopedia of Machine Learning and Data Mining.
[57] Kenji Fukumizu,et al. Effect of Batch Learning in Multilayer Neural Networks , 1998, ICONIP.
[58] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[59] Michael McCloskey,et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .
[60] S. Carey. Conceptual Change in Childhood , 1985 .