Nonmonotone Barzilai–Borwein Gradient Algorithm for $\ell_1$-Regularized Nonsmooth Minimization in Compressive Sensing

This study aims to minimize the sum of a smooth function and a nonsmooth $\ell_1$-regularized term. This problem includes, as a special case, the $\ell_1$-regularized convex minimization problems arising in signal processing, compressive sensing, machine learning, data mining, and so on. The non-differentiability of the $\ell_1$-norm, however, poses challenges, especially for the large-scale problems encountered in many practical applications. This study proposes, analyzes, and tests a Barzilai–Borwein gradient algorithm. At each iteration, the generated search direction is a descent direction and can be derived easily by minimizing a local approximate quadratic model while simultaneously exploiting the favorable structure of the $\ell_1$-norm. A nonmonotone line search technique is incorporated to find a suitable stepsize along this direction. The algorithm is easy to implement: each iteration requires only the value of the objective function and the gradient of the smooth term. Under mild conditions, the proposed algorithm is shown to be globally convergent. Limited experiments on nonconvex unconstrained problems from the CUTEr library with an additive $\ell_1$ regularizer illustrate that the proposed algorithm performs quite satisfactorily. Extensive experiments on $\ell_1$-regularized least squares problems in compressive sensing verify that our algorithm compares favorably with several state-of-the-art algorithms designed specifically for this problem class in recent years.
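To make the mechanics concrete, below is a minimal Python sketch of one plausible realization of such a method: the search direction solves the proximal quadratic subproblem in closed form via componentwise soft-thresholding, the stepsize is chosen by a Grippo-type nonmonotone Armijo backtracking on the composite objective, and a safeguarded Barzilai–Borwein formula sets the next model stepsize. The function names (`soft_threshold`, `nonmonotone_bb_l1`), parameter values, and safeguards are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def soft_threshold(z, tau):
    """Componentwise soft-thresholding: argmin_u 0.5*||u - z||^2 + tau*||u||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def nonmonotone_bb_l1(f, grad, x0, mu, max_iter=500, M=10,
                      rho=0.5, delta=1e-4, tol=1e-8):
    """Sketch: minimize F(x) = f(x) + mu*||x||_1 with a BB gradient method
    and a Grippo-type nonmonotone Armijo line search (illustrative only)."""
    F = lambda x: f(x) + mu * np.sum(np.abs(x))
    x = np.asarray(x0, dtype=float).copy()
    g = grad(x)
    alpha = 1.0                          # initial BB-like stepsize
    f_hist = [F(x)]                      # recent values for the nonmonotone test
    for _ in range(max_iter):
        # Direction: closed-form minimizer of the local quadratic model
        #   d = argmin_d  g.d + ||d||^2/(2*alpha) + mu*||x + d||_1
        d = soft_threshold(x - alpha * g, alpha * mu) - x
        if np.linalg.norm(d) <= tol:
            break
        # Model decrease; negative whenever d != 0
        Delta = g @ d + mu * (np.sum(np.abs(x + d)) - np.sum(np.abs(x)))
        # Nonmonotone Armijo backtracking against the max of the last M values
        f_max, t = max(f_hist[-M:]), 1.0
        for _ in range(50):              # safeguard against endless backtracking
            if F(x + t * d) <= f_max + delta * t * Delta:
                break
            t *= rho
        s = t * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y
        # Barzilai-Borwein stepsize for the next iteration, safeguarded
        alpha = (s @ s) / sy if sy > 1e-12 else 1.0
        alpha = min(max(alpha, 1e-10), 1e10)
        x, g = x_new, g_new
        f_hist.append(F(x))
    return x
```

For the $\ell_1$-regularized least squares problem, one would pass, for example, `f = lambda x: 0.5 * np.sum((A @ x - b)**2)` and `grad = lambda x: A.T @ (A @ x - b)`; only these two callbacks are needed per iteration, which is what keeps the per-iteration cost low.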
