Complexity of Inexact Proximal Newton methods

Recently, several methods have been proposed for sparse optimization that make careful use of second-order information [10, 28, 16, 3] to improve local convergence rates. These methods construct a composite quadratic approximation using Hessian information, optimize this approximation with a first-order method such as coordinate descent, and employ a line search to ensure sufficient descent. Here we propose a general framework that includes slightly modified versions of existing algorithms as well as a new algorithm, and we provide a global convergence rate analysis in the spirit of proximal gradient methods, including an analysis of the variant whose subproblems are solved by coordinate descent.
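
To make the scheme above concrete, the following is a minimal Python sketch of one inexact proximal Newton loop for F(x) = f(x) + lambda*||x||_1: build the composite quadratic model from the gradient and Hessian of f, run a few proximal-gradient passes as the inexact inner solver (a coordinate descent inner solver would fill the same slot), and backtrack on F to ensure sufficient descent. All function names, the fixed inner-iteration budget, and the line-search constants are illustrative assumptions, not the paper's exact algorithm or notation.

import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t*||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def inexact_prox_newton(f, grad_f, hess_f, x0, lam,
                        outer_iters=50, inner_iters=20,
                        beta=0.5, sigma=1e-4):
    x = x0.copy()
    for _ in range(outer_iters):
        g, H = grad_f(x), hess_f(x)
        L = max(np.linalg.norm(H, 2), 1e-12)  # inner step size from ||H||_2
        d = np.zeros_like(x)
        # Inexact inner solve: a few proximal-gradient steps on the
        # composite quadratic model
        #   Q(d) = g^T d + 0.5 d^T H d + lam*||x + d||_1.
        for _ in range(inner_iters):
            model_grad = g + H @ d
            d = soft_threshold(x + d - model_grad / L, lam / L) - x
        # Backtracking line search on F(x) = f(x) + lam*||x||_1
        # with a sufficient-descent condition along d.
        F = lambda y: f(y) + lam * np.sum(np.abs(y))
        decrease = g @ d + lam * (np.sum(np.abs(x + d)) - np.sum(np.abs(x)))
        alpha = 1.0
        while F(x + alpha * d) > F(x) + sigma * alpha * decrease:
            alpha *= beta
            if alpha < 1e-10:
                break
        x = x + alpha * d
    return x

In a limited-memory variant, the product H @ d need not form H explicitly; it can be computed from a compact quasi-Newton representation of the kind described in [21].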

[1] J. Nocedal, et al. An inexact successive quadratic approximation method for L-1 regularized optimization, 2013, Mathematical Programming.

[2] Chih-Jen Lin, et al. A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification, 2010, J. Mach. Learn. Res.

[3] Lu Li, et al. An inexact interior point method for L1-regularized sparse covariance selection, 2010, Math. Program. Comput.

[4] Marc Teboulle, et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, 2009, SIAM J. Imaging Sci.

[5] Katya Scheinberg, et al. Efficient Block-coordinate Descent Algorithms for the Group Lasso, 2013, Math. Program. Comput.

[6] Jorge Nocedal, et al. Newton-Like Methods for Sparse Inverse Covariance Estimation, 2012, NIPS.

[7] Stephen J. Wright, et al. Numerical Optimization, 2006, Springer.

[8] Chia-Hua Ho, et al. An improved GLMNET for l1-regularized logistic regression, 2011, J. Mach. Learn. Res.

[9] Mark W. Schmidt, et al. Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization, 2011, NIPS.

[10] Stephen J. Wright, et al. Identifying Activity, 2009, SIAM J. Optim.

[11] R. Tibshirani, et al. Sparse inverse covariance estimation with the graphical lasso, 2008, Biostatistics.

[12] Pradeep Ravikumar, et al. Sparse inverse covariance matrix estimation using quadratic approximation, 2011, NIPS.

[13] Stephen J. Wright, et al. Sparse reconstruction by separable approximation, 2009, IEEE Trans. Signal Process.

[14] Ambuj Tewari, et al. Stochastic methods for l1 regularized loss minimization, 2009, ICML.

[15] Michael A. Saunders, et al. Proximal Newton-type methods for convex optimization, 2012, NIPS.

[16] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996, J. R. Stat. Soc. Ser. B.

[17] Shiqian Ma, et al. Sparse Inverse Covariance Selection via Alternating Linearization Methods, 2010, NIPS.

[18] Y. Nesterov. Gradient methods for minimizing composite objective function, 2007.

[19] J. Zico Kolter, et al. Sparse Gaussian Conditional Random Fields: Algorithms, Theory, and Application to Energy Forecasting, 2013, ICML.

[20] David L. Donoho, et al. De-noising by soft-thresholding, 1995, IEEE Trans. Inf. Theory.

[21] Jorge Nocedal, et al. Representations of quasi-Newton matrices and their use in limited memory methods, 1994, Math. Program.

[22] Peter Richtárik, et al. Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, 2011, Mathematical Programming.

[23] R. Dembo, et al. Inexact Newton Methods, 1982, SIAM J. Numer. Anal.

[24] Trevor Hastie, et al. Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of Statistical Software.

[25] Mark W. Schmidt, et al. Projected Newton-type methods in machine learning, 2011.