Paper information - A second-order method for strongly convex $\ell_1$-re

In this paper a robust second-order method is developed for the solution of strongly convex $\ell_1$-regularized problems. The main aim is to make the proposed method as inexpensive as possible while still solving difficult problems efficiently. The proposed approach is a primal-dual Newton conjugate gradients (pdNCG) method. Convergence properties of pdNCG are studied and its worst-case iteration complexity is established. Numerical results are presented on synthetic sparse least-squares problems and real-world machine learning problems.
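To make the idea of a Newton conjugate gradients method for an $\ell_1$-regularized problem concrete, here is a minimal sketch. It is not the authors' pdNCG. It is a simplified primal-only variant, and several choices are assumptions that the abstract does not state: the loss is a least-squares term, the $\ell_1$ norm is replaced by a pseudo-Huber smoothing with parameter `mu`, and the step length comes from Armijo backtracking. The names `pseudo_huber`, `cg`, and `newton_cg_l1` are hypothetical.

```python
import numpy as np


def pseudo_huber(x, mu):
    # Smooth surrogate for ||x||_1 (an assumption, not taken from the abstract).
    return np.sum(np.sqrt(x**2 + mu**2) - mu)


def cg(hess_vec, rhs, rel_tol, max_iter):
    # Plain conjugate gradients for H d = rhs with H symmetric positive definite.
    d = np.zeros_like(rhs)
    r = rhs.copy()
    p = r.copy()
    rs = r @ r
    rhs_norm = np.sqrt(rs)
    for _ in range(max_iter):
        if np.sqrt(rs) <= rel_tol * rhs_norm:
            break
        Hp = hess_vec(p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d


def newton_cg_l1(A, b, tau, mu=1e-2, tol=1e-8, max_newton=100):
    # Minimizes tau * pseudo_huber(x, mu) + 0.5 * ||A x - b||^2
    # with inexact Newton steps computed by CG (primal-only sketch).
    n = A.shape[1]
    x = np.zeros(n)

    def f(z):
        return tau * pseudo_huber(z, mu) + 0.5 * np.sum((A @ z - b) ** 2)

    for _ in range(max_newton):
        s = np.sqrt(x**2 + mu**2)
        grad = tau * x / s + A.T @ (A @ x - b)
        if np.linalg.norm(grad) <= tol:
            break
        w = tau * mu**2 / s**3  # diagonal Hessian of the smoothed l1 term

        def hess_vec(v):
            # Hessian-vector product; the Hessian is never formed explicitly.
            return w * v + A.T @ (A @ v)

        d = cg(hess_vec, -grad, rel_tol=1e-2, max_iter=n)  # loose CG tolerance
        # Armijo backtracking line search along the CG direction.
        t, fx, slope = 1.0, f(x), grad @ d
        while t > 1e-12 and f(x + t * d) > fx + 1e-4 * t * slope:
            t *= 0.5
        x = x + t * d
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 5))
    b = rng.standard_normal(20)
    print(newton_cg_l1(A, b, tau=0.5))
```

The sketch only needs Hessian-vector products, which is what keeps each Newton step inexpensive. A small `mu` makes the smoothed term badly scaled near zero, so this primal variant can slow down there.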

Jacek Gondzio | Kimon Fountoulakis
