Nathan Srebro | Suriya Gunasekar | Blake E. Woodworth
[1] A. Nemirovsky and D. Yudin. Problem Complexity and Method Efficiency in Optimization, 1983.
[2] Shun-ichi Amari, et al. Differential-geometrical methods in statistics, 1985.
[3] D. Kinderlehrer, et al. The Variational Formulation of the Fokker-Planck Equation, 1996.
[4] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[5] J. Duistermaat. On Hessian Riemannian structures, 1999.
[6] Marc Teboulle, et al. Mirror descent and nonlinear projected subgradient methods for convex optimization, 2003, Oper. Res. Lett.
[7] Ambuj Tewari, et al. On the Universality of Online Mirror Descent, 2011, NIPS.
[8] Maxim Raginsky, et al. Continuous-time stochastic Mirror Descent on a network: Variance reduction, consensus, convergence, 2012, 51st IEEE Conference on Decision and Control (CDC).
[9] Matus Telgarsky, et al. Margins, Shrinkage, and Boosting, 2013, ICML.
[10] G. Anastassiou, et al. Differential Geometry of Curves and Surfaces, 2014.
[11] S. Amari, et al. Curvature of Hessian manifolds, 2014.
[12] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.
[13] Sayan Mukherjee, et al. The Information Geometry of Mirror Descent, 2013, IEEE Transactions on Information Theory.
[14] Ruslan Salakhutdinov, et al. Geometry of Optimization and Implicit Regularization in Deep Learning, 2017, arXiv.
[15] Nathan Srebro, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[16] Nathan Srebro, et al. Implicit Regularization in Matrix Factorization, 2017, 2018 Information Theory and Applications Workshop (ITA).
[17] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[18] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[19] Nathan Srebro, et al. Kernel and Rich Regimes in Overparametrized Models, 2019, COLT.
[20] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.