David W. Jacobs | Tom Goldstein | Soham De | Abhay Kumar Yadav
[1] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[2] Shiqian Ma,et al. Barzilai-Borwein Step Size for Stochastic Gradient Descent , 2016, NIPS.
[3] Denis J. Dean,et al. Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables , 1999 .
[4] Tom Schaul,et al. No more pesky learning rates , 2012, ICML.
[5] Peter Richtárik,et al. Importance Sampling for Minibatches , 2016, J. Mach. Learn. Res.
[6] J. Borwein,et al. Two-Point Step Size Gradient Methods , 1988 .
[7] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[8] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[9] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[10] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[11] Thierry Bertin-Mahieux,et al. The Million Song Dataset , 2011, ISMIR.
[12] Guillaume Bouchard,et al. Online Learning to Sample , 2015, arXiv:1506.09016.
[13] Philipp Hennig,et al. Probabilistic Line Searches for Stochastic Optimization , 2015, NIPS.
[14] Mark W. Schmidt,et al. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition , 2016, ECML/PKDD.
[15] Justin Domke,et al. Finito: A faster, permutable incremental gradient method for big data problems , 2014, ICML.
[16] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[17] Andrew Y. Ng,et al. Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .
[18] Mark W. Schmidt,et al. Stop Wasting My Gradients: Practical SVRG , 2015, NIPS.
[19] David W. Jacobs,et al. Automated Inference with Adaptive Batches , 2017, AISTATS.
[20] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Donald E. Knuth,et al. The Art of Computer Programming: Volume 3: Sorting and Searching , 1998 .
[22] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[23] Deanna Needell,et al. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm , 2013, Mathematical Programming.
[24] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[25] Alexander J. Smola,et al. On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants , 2015, NIPS.
[26] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[27] Tom Goldstein,et al. Efficient Distributed SGD with Variance Reduction , 2015, 2016 IEEE 16th International Conference on Data Mining (ICDM).
[28] Ohad Shamir,et al. Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization , 2011, ICML.
[29] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[30] Guillaume Bouchard,et al. Accelerating Stochastic Gradient Descent via Online Learning to Sample , 2015, ArXiv.
[31] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[32] Mark W. Schmidt,et al. Hybrid Deterministic-Stochastic Methods for Data Fitting , 2011, SIAM J. Sci. Comput.
[33] H. Robbins. A Stochastic Approximation Method , 1951 .
[34] Jorge Nocedal,et al. Sample size selection in optimization methods for machine learning , 2012, Math. Program.
[35] Mark W. Schmidt,et al. Minimizing finite sums with the stochastic average gradient , 2013, Mathematical Programming.
[36] Richard G. Baraniuk,et al. A Field Guide to Forward-Backward Splitting with a FASTA Implementation , 2014, ArXiv.
[37] Donald E. Knuth,et al. The art of computer programming: sorting and searching (volume 3) , 1973 .