A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems

We consider a class of nonconvex nonsmooth optimization problems whose objective is the sum of a smooth function and a finite number of nonnegative proper closed, possibly nonsmooth, functions (whose proximal mappings are easy to compute), some of which are further composed with linear maps. Problems of this kind arise naturally in various applications where different regularizers are introduced to induce simultaneous structures in the solutions. Solving such problems can be challenging, however, because of the coupled nonsmooth functions: the corresponding proximal mapping can be hard to compute, so standard first-order methods such as the proximal gradient algorithm cannot be applied efficiently. In this paper, we propose a successive difference-of-convex approximation method for solving this class of problems. In each iteration of the algorithm, we approximate the nonsmooth functions by their Moreau envelopes. Using the simple observation that the Moreau envelope of a nonnegative proper closed function is a continuous difference-of-convex function, we can then approximately minimize the approximation function by first-order methods with suitable majorization techniques; these first-order methods can be implemented efficiently because the proximal mapping of each nonsmooth function is easy to compute. Under suitable assumptions, we prove that the sequence generated by our method is bounded and that every accumulation point is a stationary point of the objective. We also discuss how our method can be applied to concrete problems such as nonconvex fused regularized optimization and simultaneously structured matrix optimization, and we illustrate the performance of the method numerically on these two applications.
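As a small illustrative sketch (not taken from the paper), the key observation can be checked numerically for the simplest nonnegative nonsmooth function, f(y) = |y|. Its Moreau envelope can be evaluated through its proximal mapping (soft-thresholding), equals the Huber function in closed form, and admits a difference-of-convex decomposition e_λf(x) = x²/(2λ) − h(x) with h convex; all names below (`prox_abs`, `moreau_env_abs`, etc.) are ad hoc for this example:

```python
import numpy as np

def prox_abs(x, lam):
    # Proximal mapping of f(y) = |y| with parameter lam: soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_env_abs(x, lam):
    # Moreau envelope e_lam f(x) = min_y { |y| + (x - y)^2 / (2*lam) },
    # evaluated by plugging the minimizer (the prox) back into the objective.
    p = prox_abs(x, lam)
    return np.abs(p) + (x - p) ** 2 / (2.0 * lam)

def huber(x, lam):
    # Closed form of the envelope of |.|: the Huber function.
    return np.where(np.abs(x) <= lam,
                    x ** 2 / (2.0 * lam),
                    np.abs(x) - lam / 2.0)

def dc_decomposition(x, lam):
    # DC decomposition: e_lam f(x) = x^2/(2*lam) - h(x), where
    # h(x) = x^2/(2*lam) - e_lam f(x) is convex (here h(x) = max(|x|-lam, 0)^2/(2*lam)).
    convex_part = x ** 2 / (2.0 * lam)
    h = convex_part - moreau_env_abs(x, lam)
    return convex_part, h

xs = np.linspace(-3.0, 3.0, 121)
# The envelope computed via the prox matches the Huber closed form ...
assert np.allclose(moreau_env_abs(xs, 1.0), huber(xs, 1.0))
# ... and the DC decomposition reproduces the envelope.
c, h = dc_decomposition(xs, 1.0)
assert np.allclose(c - h, moreau_env_abs(xs, 1.0))
```

The same prox-based evaluation extends to any nonnegative proper closed function with a computable proximal mapping, which is exactly the setting the method exploits.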
