Sharp Minima Can Generalize For Deep Nets Supplementary Material
[1] Shai Shalev-Shwartz et al. Fast Rates for Empirical Risk Minimization of Strict Saddle Problems, 2017, COLT.
[2] Stefano Soatto et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[3] Yann LeCun et al. Singularity of the Hessian in Deep Learning, 2016, arXiv.
[4] Yoram Singer et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[5] Shakir Mohamed et al. Variational Inference with Normalizing Flows, 2015, ICML.
[6] A. Klyachko. Random walks on symmetric spaces and inequalities for matrix spectra, 2000.