Second-Order Stochastic Optimization for Machine Learning in Linear Time
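The title refers to stochastic second-order methods whose per-iteration cost is linear in the problem dimension. As a rough illustration only (not the paper's reference implementation), the sketch below estimates the Newton direction H(w)^{-1} g(w) with a truncated Neumann series driven by single-sample Hessian-vector products, shown for ridge-regularized logistic regression; all names and parameters (hvp_single, lissa_direction, S1, S2, scale, lam) are assumptions made for this sketch, which also assumes the Hessian satisfies ||H|| <= scale.

```python
import numpy as np

# Illustrative sketch: a Neumann-series (LiSSA-style) estimator of the Newton
# direction H(w)^{-1} g(w) using single-sample Hessian-vector products, for
# ridge-regularized logistic regression with labels in {0, 1}. Names and
# parameters are assumptions for this sketch, not the paper's reference code.

def hvp_single(w, x, v, lam):
    """Hessian-vector product of one regularized logistic-loss sample: O(d).
    (For {0,1}-label logistic loss the per-sample Hessian does not depend on y.)"""
    s = 1.0 / (1.0 + np.exp(-(x @ w)))           # sigmoid of the margin
    return s * (1.0 - s) * (x @ v) * x + lam * v

def full_gradient(w, X, Y, lam):
    """Exact gradient of the average regularized logistic loss."""
    s = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (s - Y) / len(Y) + lam * w

def lissa_direction(w, X, Y, lam, S1=5, S2=100, scale=1.0, seed=None):
    """Average S1 truncated Neumann-series estimates of H^{-1} g.

    Uses the recursion u_{j+1} = g + (I - H_i/scale) u_j, whose fixed point is
    scale * H^{-1} g when 0 < H/scale <= I; each step samples one data point.
    """
    rng = np.random.default_rng(seed)
    g = full_gradient(w, X, Y, lam)
    estimates = []
    for _ in range(S1):
        u = g.copy()                              # u_0 = g
        for _ in range(S2):
            i = rng.integers(len(Y))              # single-sample Hessian estimate
            u = g + u - hvp_single(w, X[i], u, lam) / scale
        estimates.append(u / scale)               # undo the spectral scaling
    return np.mean(estimates, axis=0)             # approx. H(w)^{-1} g(w)

# Usage sketch: one Newton-type update (hypothetical data X, Y).
# w_next = w - lissa_direction(w, X, Y, lam=1e-3)
```

The point of this construction is that every inner iteration touches a single example, so each step of the Hessian-inverse estimate costs about as much as one stochastic gradient, keeping the overall per-iteration cost linear in the dimension.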
[1] R. Fletcher, et al. A New Approach to Variable Metric Algorithms, 1970, Comput. J.
[2] C. G. Broyden. The Convergence of a Class of Double-rank Minimization Algorithms 2. The New Algorithm, 1970.
[3] D. Shanno. Conditioning of Quasi-Newton Methods for Function Minimization, 1970.
[4] D. Goldfarb. A family of variable-metric methods derived by variational means, 1970.
[5] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[6] Denis J. Dean, et al. Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables, 1999.
[7] D. K. Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.
[8] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[9] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.
[10] H. Robbins. A Stochastic Approximation Method, 1951.
[11] Simon Günter, et al. A Stochastic Quasi-Newton Method for Online Convex Optimization, 2007, AISTATS.
[12] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[13] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[14] Jorge Nocedal, et al. On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning, 2011, SIAM J. Optim.
[15] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.
[16] Joel A. Tropp, et al. User-Friendly Tail Bounds for Sums of Random Matrices, 2010, Found. Comput. Math.
[17] Mark W. Schmidt, et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets, 2012, NIPS.
[18] Shai Shalev-Shwartz, et al. Stochastic dual coordinate ascent methods for regularized loss, 2012, J. Mach. Learn. Res.
[19] Gary L. Miller, et al. Iterative Row Sampling, 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.
[20] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[21] Rong Jin, et al. Linear Convergence with Condition Number Independent Access of Full Gradients, 2013, NIPS.
[22] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[23] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[24] Aryan Mokhtari, et al. RES: Regularized Stochastic BFGS Algorithm, 2014, IEEE Transactions on Signal Processing.
[25] Lin Xiao, et al. An Accelerated Proximal Coordinate Gradient Method, 2014, NIPS.
[26] Andrea Montanari, et al. Convergence rates of sub-sampled Newton methods, 2015, NIPS.
[27] Richard Peng, et al. Uniform Sampling for Matrix Approximation, 2014, ITCS.
[28] Zaïd Harchaoui, et al. A Universal Catalyst for First-Order Optimization, 2015, NIPS.
[29] Elad Hazan, et al. Fast and Simple PCA via Convex Optimization, 2015, ArXiv.
[30] Michael I. Jordan, et al. A Linearly-Convergent Stochastic L-BFGS Algorithm, 2015, AISTATS.
[31] Zeyuan Allen-Zhu. Katyusha: Accelerated Variance Reduction for Faster SGD, 2016.
[32] Ohad Shamir, et al. Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity, 2015, ICML.
[33] Haipeng Luo, et al. Efficient Second Order Online Learning by Sketching, 2016, NIPS.
[34] Michael B. Cohen, et al. Nearly Tight Oblivious Subspace Embeddings by Trace Inequalities, 2016, SODA.
[35] J. Nocedal, et al. Exact and Inexact Subsampled Newton Methods for Optimization, 2016, arXiv:1609.08502.
[36] Zeyuan Allen-Zhu. Katyusha: Accelerated Variance Reduction for Faster SGD, 2016, ArXiv.
[37] Jorge Nocedal, et al. A Stochastic Quasi-Newton Method for Large-Scale Optimization, 2014, SIAM J. Optim.
[38] Peng Xu, et al. Sub-sampled Newton Methods with Non-uniform Sampling, 2016, NIPS.
[39] Tong Zhang, et al. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, 2013, Mathematical Programming.
[40] Zeyuan Allen-Zhu, et al. Katyusha: the first direct acceleration of stochastic gradient methods, 2016, J. Mach. Learn. Res.
[41] Ohad Shamir, et al. Oracle Complexity of Second-Order Methods for Finite-Sum Problems, 2016, ICML.
[42] Martin J. Wainwright, et al. Newton Sketch: A Near Linear-Time Optimization Algorithm with Linear-Quadratic Convergence, 2015, SIAM J. Optim.
[43] Haishan Ye, et al. A Unifying Framework for Convergence Analysis of Approximate Newton Methods, 2017, ArXiv.
[44] Tengyu Ma, et al. Finding approximate local minima faster than gradient descent, 2016, STOC.