Florent Krzakala | Lenka Zdeborová | Andrew M. Saxe | Sebastian Goldt | Madhu S. Advani
[1] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[2] Ohad Shamir, et al. Size-Independent Sample Complexity of Neural Networks, 2017, COLT.
[3] Hong Hu, et al. A Solvable High-Dimensional Model of GAN, 2018, NeurIPS.
[4] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[5] Yann LeCun, et al. Comparing dynamics: deep neural networks versus glassy systems, 2018, ICML.
[6] Christian Van den Broeck, et al. Statistical Mechanics of Learning, 2001.
[7] Grant M. Rotskoff, et al. Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks, 2018, NeurIPS.
[8] Nicolas Macris, et al. The committee machine: computational to statistical gaps in learning a two-layers neural network, 2018, NeurIPS.
[9] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[10] Gintare Karolina Dziugaite, et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, 2017, UAI.
[11] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1989, Math. Control. Signals Syst.
[12] T. Watkin, et al. The statistical mechanics of learning a rule, 1993.
[13] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[14] Shai Shalev-Shwartz, et al. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data, 2017, ICLR.
[15] Stefano Soatto, et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[16] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[17] Adel Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.
[18] Wolfgang Kinzel, et al. Improving a Network Generalization Ability by Selecting Examples, 1990.
[19] Michael Biehl, et al. Learning by on-line gradient descent, 1995.
[20] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[21] Anthony C. C. Coolen, et al. Statistical mechanical analysis of the dynamics of learning in perceptrons, 1997, Stat. Comput.
[22] Andrew M. Saxe, et al. High-dimensional dynamics of generalization error in neural networks, 2017, Neural Networks.
[23] E. Gardner, et al. Three unfinished works on the optimal storage capacity of networks, 1989.
[24] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[25] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[26] D. Saad, et al. On-line learning in soft committee machines, 1995, Physical Review E.
[27] Florent Krzakala, et al. Statistical physics of inference: thresholds and algorithms, 2015, arXiv.
[28] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[29] D. Saad, et al. Exact solution for on-line learning in multilayer neural networks, 1995, Physical Review Letters.
[30] Yue M. Lu, et al. Scaling Limit: Exact and Tractable Analysis of Online Learning Algorithms with Applications to Regularized Regression and PCA, 2017, arXiv.
[31] Ameet Talwalkar, et al. Foundations of Machine Learning, 2012, Adaptive Computation and Machine Learning.
[32] H. Schwarze. Learning a rule in a multilayer neural network, 1993.
[33] Surya Ganguli, et al. Statistical Mechanics of Optimal Convex Inference in High Dimensions, 2016.
[34] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[35] David Saad, et al. Learning with Noise and Regularizers in Multilayer Neural Networks, 1996, NIPS.
[36] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.
[37] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[38] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[39] J. Hertz, et al. Generalization in a linear perceptron in the presence of noise, 1992.
[40] E. Oja, et al. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix, 1985.
[41] Yi Zhang, et al. Stronger generalization bounds for deep nets via a compression approach, 2018, ICML.
[42] Surya Ganguli, et al. An analytic theory of generalization dynamics and transfer learning in deep linear networks, 2018, ICLR.
[43] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[44] H. Sompolinsky, et al. Statistical mechanics of learning from examples, 1992, Physical Review A.