Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence