Learning with SGD and Random Features
Luigi Carratino | Alessandro Rudi | Lorenzo Rosasco