Optimal Learning for Multi-pass Stochastic Gradient Methods
[1] A. Caponnetto,et al. Optimal Rates for the Regularized Least-Squares Algorithm , 2007, Found. Comput. Math..
[2] J. Tropp. User-Friendly Tools for Random Matrices: An Introduction , 2012 .
[3] S. Smale,et al. Learning Theory Estimates via Integral Operators and Their Approximations , 2007 .
[4] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program..
[5] Jean-Yves Audibert. Optimization for Machine Learning , 2011 .
[6] Francesco Orabona,et al. Simultaneous Model Selection and Optimization through Parameter-free Stochastic Learning , 2014, NIPS.
[7] Y. Yao,et al. On Early Stopping in Gradient Descent Learning , 2007 .
[8] Massimiliano Pontil,et al. Online Gradient Descent Learning Algorithms , 2008, Found. Comput. Math..
[9] Yuan Yao,et al. Online Learning as Stochastic Approximation of Regularization Paths: Optimality and Almost-Sure Convergence , 2011, IEEE Transactions on Information Theory.
[10] Claudio Gentile,et al. On the generalization ability of on-line learning algorithms , 2004, IEEE Transactions on Information Theory.
[11] Andreas Christmann,et al. Support Vector Machines , 2008, Springer.
[12] Felipe Cucker,et al. Learning Theory: An Approximation Theory Viewpoint , 2007 .
[13] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2016, ICML.
[14] Alexander Shapiro,et al. Robust Stochastic Approximation Approach to Stochastic Programming , 2009, SIAM J. Optim..
[15] Lorenzo Rosasco,et al. Less is More: Nyström Computational Regularization , 2015, NIPS.
[16] Ohad Shamir,et al. Better Mini-Batch Algorithms via Accelerated Gradient Methods , 2011, NIPS.
[17] Lorenzo Rosasco,et al. Generalization Properties and Implicit Regularization for Multiple Passes SGM , 2016, ICML.
[18] Lorenzo Rosasco,et al. On regularization algorithms in learning theory , 2007, J. Complex..
[19] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[20] Ohad Shamir,et al. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes , 2013, ICML.
[21] Stanislav Minsker. On Some Extensions of Bernstein's Inequality for Self-adjoint Operators , 2011, 1112.5448.
[22] Stephen P. Boyd,et al. Stochastic Subgradient Methods , 2007 .
[23] I. Pinelis,et al. Remarks on Inequalities for Large Deviation Probabilities , 1986 .
[24] F. Bach,et al. Non-parametric Stochastic Approximation with Large Step sizes , 2014, 1408.0361.
[25] Lorenzo Rosasco,et al. Learning with Incremental Iterative Regularization , 2015, NIPS.
[26] Ohad Shamir,et al. Optimal Distributed Online Prediction Using Mini-Batches , 2010, J. Mach. Learn. Res..
[27] Lorenzo Rosasco,et al. Iterative Regularization for Learning with Convex Loss Functions , 2015, J. Mach. Learn. Res..
[28] Tong Zhang,et al. Learning Bounds for Kernel Regression Using Effective Data Dimensionality , 2005, Neural Computation.
[29] E. K. P. Chong,et al. An Introduction to Optimization , 1996, Wiley.