Thiparat Chotibut | Paolo E. Trevisanutto | Mirco Milletarí