Understanding generalization error of SGD in nonconvex optimization
[1] Massimiliano Pontil,et al. Stability of Randomized Learning Algorithms , 2005, J. Mach. Learn. Res..
[2] Ohad Shamir,et al. Learnability, Stability and Uniform Convergence , 2010, J. Mach. Learn. Res..
[3] Guillermo Sapiro,et al. Robust Large Margin Deep Neural Networks , 2017, IEEE Transactions on Signal Processing.
[4] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[5] Marc Teboulle,et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..
[6] Shie Mannor,et al. Robustness and generalization , 2010, Machine Learning.
[7] Heinz H. Bauschke,et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces , 2011, CMS Books in Mathematics.
[8] David A. McAllester,et al. A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks , 2017, ICLR.
[9] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Leslie G. Valiant,et al. A theory of the learnable , 1984, CACM.
[11] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[12] Alexander Shapiro,et al. On Complexity of Stochastic Programming Problems , 2005 .
[13] Tsuyoshi Murata,et al. {m , 1934, ACML.
[14] Stephen P. Boyd,et al. Proximal Algorithms , 2013, Found. Trends Optim..
[15] Benar Fux Svaiter,et al. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods , 2013, Math. Program..
[16] Gebräuchliche Fertigarzneimittel,et al. V , 1893, Therapielexikon Neurologie.
[17] David A. McAllester. PAC-Bayesian model averaging , 1999, COLT '99.
[18] Mark W. Schmidt,et al. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition , 2016, ECML/PKDD.
[19] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[20] Hongchao Zhang,et al. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization , 2016 .
[21] Vladimir Naumovich Vapnik. The Nature of Statistical Learning Theory , 1995 .
[22] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[23] Saeed Ghadimi,et al. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization , 2013, Mathematical Programming.
[24] Gorjan Alagic,et al. #p , 2019, Quantum information & computation.
[25] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[26] Boris Polyak. Gradient methods for the minimisation of functionals , 1963 .
[27] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[28] Danna Zhou,et al. d. , 1840, Microbial pathogenesis.