Fisher-Rao Metric, Geometry, and Complexity of Neural Networks
Tomaso A. Poggio | Tengyuan Liang | Alexander Rakhlin | James Stokes
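For orientation, a brief sketch of the central quantity named in the title, as proposed in the paper (the notation below is a paraphrase rather than the paper's verbatim statement, so treat the details as an assumption): for a network with parameter vector \theta and Fisher information matrix I(\theta), the Fisher-Rao norm is the capacity measure

    % Fisher-Rao norm (sketch; notation paraphrased from the paper's abstract)
    \|\theta\|_{\mathrm{fr}}^{2} \;=\; \langle \theta,\; I(\theta)\,\theta \rangle,

which is invariant under reparametrizations, such as layer-wise positive rescalings of a ReLU network, that leave the network function unchanged, and which the paper relates to the information geometry of the model and to generalization.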
[1] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[2] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[3] Peter L. Bartlett, et al. Neural Network Learning: Theoretical Foundations, 1999.
[4] V. Koltchinskii, et al. Empirical margin distributions and bounding the generalization error of combined classifiers, 2002, arXiv:math/0405343.
[5] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[6] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[7] Martin Bauer, et al. Uniqueness of the Fisher–Rao metric on the space of smooth densities, 2014, arXiv:1411.5577.
[8] Ruslan Salakhutdinov, et al. Path-SGD: Path-Normalized Optimization in Deep Neural Networks, 2015, NIPS.
[9] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[10] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[11] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[12] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[13] Ohad Shamir, et al. Failures of Gradient-Based Deep Learning, 2017, ICML.
[14] Nathan Srebro, et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[15] Shun-ichi Amari, et al. Universal statistics of Fisher information in deep neural networks: mean field approach, 2018, AISTATS.