[1] Yuan Yao, et al. Online Learning as Stochastic Approximation of Regularization Paths: Optimality and Almost-Sure Convergence, 2011, IEEE Transactions on Information Theory.
[2] Y. Yao, et al. Cross-validation based adaptation for regularization operators in learning theory, 2010.
[3] Lorenzo Rosasco, et al. Iterative Regularization for Learning with Convex Loss Functions, 2015, J. Mach. Learn. Res.
[4] Tong Zhang, et al. Learning Bounds for Kernel Regression Using Effective Data Dimensionality, 2005, Neural Computation.
[5] Tong Zhang, et al. Solving large scale linear prediction problems using stochastic gradient descent algorithms, 2004, ICML.
[6] Rong Jin, et al. Improved Bounds for the Nyström Method With Application to Kernel Classification, 2011, IEEE Transactions on Information Theory.
[7] S. Smale, et al. A Dynamic Theory of Learning, 2008.
[8] Massimiliano Pontil, et al. Online Gradient Descent Learning Algorithms, 2008, Found. Comput. Math.
[9] S. Smale, et al. Learning Theory Estimates via Integral Operators and Their Approximations, 2007.
[10] Stanislav Minsker. On Some Extensions of Bernstein's Inequality for Self-adjoint Operators, 2011, arXiv:1112.5448.
[11] Lorenzo Rosasco, et al. Learning with Incremental Iterative Regularization, 2014, NIPS.
[12] Lorenzo Rosasco, et al. NYTRO: When Subsampling Meets Early Stopping, 2015, AISTATS.
[13] Gilles Blanchard, et al. Optimal learning rates for Kernel Conjugate Gradient regression, 2010, NIPS.
[14] Michael W. Mahoney, et al. Revisiting the Nyström Method for Improved Large-scale Machine Learning, 2013, J. Mach. Learn. Res.
[15] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[16] Claudio Gentile, et al. On the generalization ability of on-line learning algorithms, 2001, IEEE Transactions on Information Theory.
[17] F. Bach, et al. Non-parametric Stochastic Approximation with Large Step-sizes, 2014, arXiv:1408.0361.
[18] H. Engl, et al. Regularization of Inverse Problems, 1996.
[19] H. Robbins. A Stochastic Approximation Method, 1951.
[20] Alexandre B. Tsybakov. Introduction to Nonparametric Estimation, 2008, Springer Series in Statistics.
[21] Steven C. H. Hoi, et al. Large Scale Online Kernel Learning, 2016, J. Mach. Learn. Res.
[22] Felipe Cucker, et al. Learning Theory: An Approximation Theory Viewpoint, 2007.
[23] Lorenzo Rosasco, et al. Optimal Rates for Multi-pass Stochastic Gradient Methods, 2016, J. Mach. Learn. Res.
[24] Matthias W. Seeger, et al. Using the Nyström Method to Speed Up Kernel Machines, 2000, NIPS.
[25] Michael W. Mahoney, et al. Fast Randomized Kernel Ridge Regression with Statistical Guarantees, 2015, NIPS.
[26] David P. Woodruff, et al. Fast approximation of matrix coherence and statistical leverage, 2011, ICML.
[27] I. Pinelis, et al. Remarks on Inequalities for Large Deviation Probabilities, 1986.
[28] Y. Yao, et al. On Early Stopping in Gradient Descent Learning, 2007.
[29] Nello Cristianini, et al. Kernel Methods for Pattern Analysis, 2004.
[30] Takayuki Furuta. Norm Inequalities Equivalent to Löwner-Heinz Theorem, 1989.
[31] Richard Peng, et al. Uniform Sampling for Matrix Approximation, 2014, ITCS.
[32] Bernhard Schölkopf, et al. Sparse Greedy Matrix Approximation for Machine Learning, 2000, ICML.
[33] Lorenzo Rosasco, et al. Less is More: Nyström Computational Regularization, 2015, NIPS.
[34] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[35] P. Mathé, et al. Moduli of Continuity for Operator Valued Functions, 2002.
[36] Andreas Christmann, et al. Support Vector Machines, 2008, Data Mining and Knowledge Discovery Handbook.
[37] Martin J. Wainwright, et al. Randomized sketches for kernels: Fast and optimal non-parametric regression, 2015, arXiv.
[38] A. Caponnetto, et al. Optimal Rates for the Regularized Least-Squares Algorithm, 2007, Found. Comput. Math.
[39] Ameet Talwalkar, et al. On the Impact of Kernel Approximation on Learning Accuracy, 2010, AISTATS.
[40] Mark W. Schmidt, et al. Minimizing finite sums with the stochastic average gradient, 2013, Mathematical Programming.
[41] J. Tropp. User-Friendly Tools for Random Matrices: An Introduction, 2012.
[42] Martin J. Wainwright, et al. Early stopping for non-parametric regression: An optimal data-dependent stopping rule, 2011, 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[43] Francis R. Bach, et al. Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression, 2016, J. Mach. Learn. Res.
[44] Ameet Talwalkar, et al. Sampling Methods for the Nyström Method, 2012, J. Mach. Learn. Res.
[45] Lorenzo Rosasco, et al. On regularization algorithms in learning theory, 2007, J. Complex.
[46] Ohad Shamir, et al. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes, 2012, ICML.
[47] Mikhail Belkin, et al. Diving into the shallows: a computational perspective on large-scale shallow learning, 2017, NIPS.