Exact and Inexact Subsampled Newton Methods for Optimization

The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian so as to yield a superlinear rate of convergence in expectation. The second part of the paper analyzes an inexact Newton method that solves its linear systems approximately with the conjugate gradient (CG) method and that subsamples only the Hessian (the gradient is assumed exact). We provide a complexity analysis of this method based on the properties of the CG iteration and the quality of the Hessian approximation, and compare it with a variant that replaces the CG iteration by a stochastic gradient iteration. We report preliminary numerical results illustrating the performance of inexact subsampled Newton methods on machine learning applications based on logistic regression.
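To make the second method concrete, here is a minimal sketch in Python/NumPy of an inexact Newton-CG iteration for l2-regularized logistic regression, in which the gradient is computed exactly and Hessian-vector products inside CG use only a random subsample of the data. This is an illustration under stated assumptions, not the paper's exact algorithm: the function names, the sampling fraction `sample_frac`, the CG tolerance `cg_tol`, and the unit step length are all illustrative choices.

```python
# A minimal sketch (not the paper's exact algorithm): inexact Newton-CG for
# l2-regularized logistic regression, with an exact gradient and a Hessian
# subsampled from a random subset of the data.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_gradient(w, X, y, lam):
    # Exact gradient of the logistic loss: (1/n) X^T (sigma(Xw) - y) + lam * w,
    # with labels y in {0, 1}.
    n = X.shape[0]
    return X.T @ (sigmoid(X @ w) - y) / n + lam * w

def subsampled_hessian_vec(w, v, X_S, lam):
    # Hessian-vector product using only the subsample S:
    # H_S v = (1/|S|) X_S^T D X_S v + lam * v, where D = diag(p_i (1 - p_i)).
    p = sigmoid(X_S @ w)
    d = p * (1.0 - p)
    return X_S.T @ (d * (X_S @ v)) / X_S.shape[0] + lam * v

def cg_solve(hess_vec, b, tol, max_iter):
    # Standard conjugate gradient on H_S x = b, stopped early once the
    # residual is small relative to ||b|| (the "inexact" part of Newton-CG).
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Hp = hess_vec(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def subsampled_newton_cg(X, y, lam=1e-3, sample_frac=0.1,
                         cg_tol=1e-2, max_newton=20, rng=None):
    rng = np.random.default_rng(rng)
    n, d = X.shape
    w = np.zeros(d)
    m = max(1, int(sample_frac * n))
    for _ in range(max_newton):
        g = full_gradient(w, X, y, lam)           # exact gradient
        S = rng.choice(n, size=m, replace=False)  # fresh Hessian subsample
        hv = lambda v: subsampled_hessian_vec(w, v, X[S], lam)
        step = cg_solve(hv, -g, cg_tol, max_iter=d)
        w += step                                 # unit step; a line search
                                                  # would be used in practice
        if np.linalg.norm(g) < 1e-8:
            break
    return w
```

The design point this sketch captures is the cost split the paper exploits: each outer iteration pays for one exact gradient over all n examples, while the Hessian information consumed by CG is priced on the subsample of size m, so the CG accuracy and the sample size can be traded off against each other.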
