Accelerated Difference of Convex functions Algorithm and its Application to Sparse Binary Logistic Regression

In this work, we present a variant of DCA (Difference of Convex functions Algorithm) designed to improve its performance. The proposed algorithm, named Accelerated DCA (ADCA), incorporates Nesterov's acceleration technique into DCA. We first investigate ADCA for solving the standard DC program and rigorously study its convergence properties and convergence rate. Second, we develop ADCA for a special case of the standard DC program whose objective function is the sum of a differentiable (possibly nonconvex) function with L-Lipschitz continuous gradient and a DC function. We exploit the special structure of this problem to propose an efficient DC decomposition for which the corresponding ADCA scheme is inexpensive. As an application, we consider the sparse binary logistic regression problem. Numerical experiments on several benchmark datasets illustrate the efficiency of our algorithm and its superiority over well-known methods.
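The idea of combining a Nesterov-style extrapolation with a DCA step can be illustrated with a minimal sketch. It is not the authors' exact scheme, and it rests on several assumptions the abstract does not state: the sparsity penalty is taken to be the DC function lam * (||w||_1 - ||w||_2), the extrapolated point is accepted only when it does not increase the objective relative to the current iterate, and the DC decomposition is G(w) = (L/2)||w||^2 + lam * ||w||_1, H(w) = (L/2)||w||^2 - f(w) + lam * ||w||_2, where f is the mean logistic loss. All function names (`soft_threshold`, `make_objective`, `adca_sketch`) are hypothetical.

```python
import numpy as np

def soft_threshold(u, tau):
    """Proximal operator of tau * ||.||_1 (closed-form solution of the convex subproblem)."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def make_objective(A, b, lam):
    """Return (F, grad_f) for F(w) = f(w) + lam * (||w||_1 - ||w||_2).

    f is the mean logistic loss with labels b in {-1, +1}.
    The l1 - l2 penalty is an assumption made for this sketch.
    """
    n = A.shape[0]

    def grad_f(w):
        m = b * (A @ w)
        # 1 / (1 + exp(m)) written in a numerically stable form
        sigma_neg = 0.5 * (1.0 - np.tanh(0.5 * m))
        return A.T @ (-b * sigma_neg) / n

    def F(w):
        loss = np.mean(np.logaddexp(0.0, -b * (A @ w)))
        return loss + lam * (np.abs(w).sum() - np.linalg.norm(w))

    return F, grad_f

def adca_sketch(A, b, lam, iters=200):
    """Extrapolated DCA on sparse logistic regression (illustrative only)."""
    n, d = A.shape
    # Upper bound on the Lipschitz constant of grad f for the mean logistic loss
    L = np.linalg.norm(A, 2) ** 2 / (4.0 * n)
    F, grad_f = make_objective(A, b, lam)

    x_prev = x = np.zeros(d)
    t = 1.0
    history = [F(x)]
    for _ in range(iters):
        # Nesterov-style momentum sequence and extrapolated point
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x + ((t - 1.0) / t_next) * (x - x_prev)
        # Assumed safeguard: keep z only if it does not increase the objective
        v = z if F(z) <= F(x) else x
        # Subgradient of lam * ||.||_2 at v (zero vector is valid at the origin)
        nv = np.linalg.norm(v)
        xi = lam * v / nv if nv > 0 else np.zeros(d)
        # DCA step: minimize G(w) - <y, w> with y = L*v - grad_f(v) + xi
        x_new = soft_threshold(v - (grad_f(v) - xi) / L, lam / L)
        x_prev, x, t = x, x_new, t_next
        history.append(F(x))
    return x, history
```

With this decomposition each iteration costs one gradient evaluation and one soft-thresholding, which is one way to read "inexpensive" in the abstract. Under the assumed safeguard, the objective values stored in `history` are non-increasing, since a DCA step never increases the objective from the point at which H is linearized.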
