A second-order method for strongly convex $$\ell_1$$-re

In this paper, a robust second-order method is developed for the solution of strongly convex $$\ell_1$$-regularized problems. The main aim is to make the proposed method as inexpensive as possible while still solving difficult problems efficiently. The proposed approach is a primal-dual Newton conjugate gradients (pdNCG) method. Convergence properties of pdNCG are studied and its worst-case iteration complexity is established. Numerical results are presented on synthetic sparse least-squares problems and real-world machine learning problems.
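
To make the Newton conjugate gradients idea concrete, the following is a minimal sketch and not the authors' pdNCG. It assumes a least-squares loss, replaces the $$\ell_1$$ norm with a pseudo-Huber smoothing (an assumption made here so that a Hessian exists), and omits the primal-dual reformulation entirely. All function names and the parameters `tau` and `mu` are hypothetical. Each outer iteration solves the Newton system only approximately with conjugate gradients, using Hessian-vector products so the Hessian is never formed, and then takes a backtracking line-search step.

```python
import numpy as np

# Smoothed objective (assumption): tau * sum(sqrt(mu^2 + x_i^2) - mu) + 0.5 * ||Ax - b||^2

def objective(x, A, b, tau, mu):
    r = A @ x - b
    return tau * np.sum(np.sqrt(mu**2 + x**2) - mu) + 0.5 * (r @ r)

def gradient(x, A, b, tau, mu):
    return tau * x / np.sqrt(mu**2 + x**2) + A.T @ (A @ x - b)

def hess_vec(x, v, A, tau, mu):
    # Hessian = diag(tau * mu^2 / (mu^2 + x^2)^(3/2)) + A^T A, applied to v matrix-free.
    diag = tau * mu**2 / (mu**2 + x**2) ** 1.5
    return diag * v + A.T @ (A @ v)

def cg(matvec, rhs, rel_tol, max_iter):
    # Standard conjugate gradients for a symmetric positive definite system,
    # stopped once the residual norm drops below rel_tol * ||rhs||.
    d = np.zeros_like(rhs)
    r = rhs.copy()
    p = r.copy()
    rs = r @ r
    rhs_norm = np.sqrt(rs)
    for _ in range(max_iter):
        if np.sqrt(rs) <= rel_tol * rhs_norm:
            break
        Hp = matvec(p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d

def newton_cg(A, b, tau, mu, tol=1e-8, max_iter=100):
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        g = gradient(x, A, b, tau, mu)
        if np.linalg.norm(g) <= tol:
            break
        # Inexact Newton direction: solve H d = -g to a loose relative tolerance.
        d = cg(lambda v: hess_vec(x, v, A, tau, mu), -g, 1e-2, 50)
        # Backtracking (Armijo) line search along d.
        t, f0, slope = 1.0, objective(x, A, b, tau, mu), g @ d
        while objective(x + t * d, A, b, tau, mu) > f0 + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x = x + t * d
    return x

# Small synthetic sparse least-squares instance (illustrative data only).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
x_true = np.zeros(10)
x_true[[1, 4, 7]] = [2.0, -1.5, 1.0]
b = A @ x_true
tau, mu = 0.1, 1e-2
x_hat = newton_cg(A, b, tau, mu)
```

The loose conjugate gradient tolerance reflects the "as inexpensive as possible" goal stated in the abstract: the linear system is solved only as accurately as needed to obtain a useful descent direction. The tolerance values used here are arbitrary choices for illustration.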
