On the Linear Convergence to Weak/Standard d-Stationary Points of DCA-Based Algorithms for Structured Nonsmooth DC Programming

We consider a class of structured nonsmooth difference-of-convex (DC) minimization problems. We allow nonsmoothness in both the convex and concave components of the objective function, with a finite max structure in the concave part. Our focus is on algorithms that compute a (weak or standard) d(irectional)-stationary point, as advocated in a recent work of Pang et al. in 2017. Our linear convergence results are based on direct generalizations of the assumptions of error bounds and separation of isocost surfaces proposed in the seminal work of Luo et al. in 1993, together with one additional assumption of locally linear regularity regarding the intersection of certain stationary sets and dominance regions. As an interesting by-product, we obtain a sharper characterization of the limit set of the basic algorithm proposed by Pang et al., which fits between d-stationarity and global optimality. We also discuss sufficient conditions under which these assumptions hold. Finally, we provide several realistic and nontrivial statistical learning models in which all assumptions hold.
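
For orientation, one common way to write a problem of this type, together with the standard directional-derivative definition of d-stationarity, is sketched below. The symbols f, g_i, I, and A(x) are illustrative and do not appear in the abstract. The sketch assumes an unconstrained problem, a real-valued convex f, and differentiable convex g_i, purely to keep the formulas simple; the setting of the paper may be more general, and the weak variant of d-stationarity is not reproduced here.

```latex
% Illustrative model (assumed notation): convex f, possibly nonsmooth;
% each g_i convex and differentiable; the max makes the concave part nonsmooth.
\[
  \min_{x \in \mathbb{R}^n} \; \zeta(x) := f(x) - \max_{1 \le i \le I} g_i(x),
  \qquad
  \mathcal{A}(x) := \Bigl\{\, i : g_i(x) = \max_{1 \le j \le I} g_j(x) \Bigr\}.
\]
% d-stationarity of \bar{x}: no first-order descent direction exists.
\[
  \zeta'(\bar{x}; d) \;=\; f'(\bar{x}; d) \;-\; \max_{i \in \mathcal{A}(\bar{x})} \nabla g_i(\bar{x})^{\top} d \;\ge\; 0
  \quad \text{for all } d \in \mathbb{R}^n .
\]
% Equivalent subdifferential form under the assumptions above:
\[
  \nabla g_i(\bar{x}) \in \partial f(\bar{x})
  \quad \text{for every } i \in \mathcal{A}(\bar{x}).
\]
```

Under these assumptions the last condition must hold for every active index, not just one of them, which is what makes d-stationarity a stronger requirement than the criticality condition usually associated with the classical DCA.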

[1]  Pham Dinh Tao,et al.  Duality in D.C. (Difference of Convex functions) Optimization. Subgradient Methods , 1988 .

[2]  Hoai An Le Thi,et al.  DC programming and DCA: thirty years of developments , 2018, Mathematical Programming.

[3]  Adrian S. Lewis,et al.  The Łojasiewicz Inequality for Nonsmooth Subanalytic Functions with Applications to Subgradient Dynamical Systems , 2006, SIAM J. Optim..

[4]  Le Thi Hoai An,et al.  DC programming and DCA: thirty years of developments , 2018, Math. Program..

[5]  Zhe Sun,et al.  Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization , 2018, Mathematical Programming.

[6]  Heinz H. Bauschke,et al.  Strong conical hull intersection property, bounded linear regularity, Jameson’s property (G), and error bounds in convex optimization , 1999, Math. Program..

[7]  Hédy Attouch,et al.  On the convergence of the proximal algorithm for nonsmooth functions involving analytic features , 2008, Math. Program..

[8]  Z. Luo,et al.  Error Bounds for Quadratic Systems , 1999 .

[9]  Bo Wen,et al.  A proximal difference-of-convex algorithm with extrapolation , 2016, Computational Optimization and Applications.

[10]  P. Tseng,et al.  On the linear convergence of descent methods for convex essentially smooth minimization , 1992 .

[11]  Toru Maruyama (丸山 徹)  On a Few Developments in Convex Analysis [in Japanese] , 1977 .

[12]  Jieping Ye,et al.  A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems , 2013, ICML.

[13]  Paul Tseng,et al.  Error Bound and Convergence Analysis of Matrix Splitting Algorithms for the Affine Variational Inequality Problem , 1992, SIAM J. Optim..

[14]  Cun-Hui Zhang  Nearly unbiased variable selection under minimax concave penalty , 2010, arXiv:1002.4734.

[15]  J. Hiriart-Urruty,et al.  Convex analysis and minimization algorithms , 1993 .

[16]  Jong-Shi Pang,et al.  Nonconvex Games with Side Constraints , 2011, SIAM J. Optim..

[17]  Bo Wen,et al.  Linear Convergence of Proximal Gradient Algorithm with Extrapolation for a Class of Nonconvex Nonsmooth Minimization Problems , 2015, SIAM J. Optim..

[18]  Le Thi Hoai An,et al.  Convergence Analysis of Difference-of-Convex Algorithm with Subanalytic Data , 2018, Journal of Optimization Theory and Applications.

[19]  Jong-Shi Pang,et al.  Error bounds in mathematical programming , 1997, Math. Program..

[20]  Z.-Q. Luo,et al.  Error bounds and convergence analysis of feasible descent methods: a general approach , 1993, Ann. Oper. Res..

[21]  D. Russell Luke,et al.  Nonconvex Notions of Regularity and Convergence of Fundamental Algorithms for Feasibility Problems , 2012, SIAM J. Optim..

[22]  Jong-Shi Pang,et al.  Composite Difference-Max Programs for Modern Statistical Estimation Problems , 2018, SIAM J. Optim..

[23]  Dimitri P. Bertsekas,et al.  On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators , 1992, Math. Program..

[24]  Le Thi Hoai An,et al.  The DC (Difference of Convex Functions) Programming and DCA Revisited with DC Models of Real World Nonconvex Optimization Problems , 2005, Ann. Oper. Res..

[25]  Dmitriy Drusvyatskiy,et al.  Error Bounds, Quadratic Growth, and Linear Convergence of Proximal Methods , 2016, Math. Oper. Res..

[26]  Akiko Takeda,et al.  A refined convergence analysis of pDCA$_e$ with applications to simultaneous sparse recovery and outlier detection , 2018 .

[27]  Jong-Shi Pang,et al.  Computing B-Stationary Points of Nonsmooth DC Programs , 2015, Math. Oper. Res..

[28]  Akiko Takeda,et al.  DC formulations and algorithms for sparse optimization problems , 2017, Mathematical Programming.

[29]  Paul Tseng,et al.  A coordinate gradient descent method for nonsmooth separable minimization , 2008, Math. Program..

[30]  Patrick T. Harker,et al.  Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications , 1990, Math. Program..

[31]  Akiko Takeda,et al.  A refined convergence analysis of pDCA$_e$ , 2018, Computational Optimization and Applications.

[32]  J.-B. Hiriart-Urruty,et al.  From Convex Optimization to Nonconvex Optimization. Necessary and Sufficient Conditions for Global Optimality , 1989 .

[33]  Tong Zhang,et al.  Analysis of Multi-stage Convex Relaxation for Sparse Regularization , 2010, J. Mach. Learn. Res..

[34]  Zhi-Quan Luo,et al.  Error bounds for analytic systems and their applications , 1994, Math. Program..

[35]  Jong-Shi Pang,et al.  Decomposition Methods for Computing Directional Stationary Solutions of a Class of Nonsmooth Nonconvex Optimization Problems , 2018, SIAM J. Optim..

[36]  Jong-Shi Pang,et al.  On the pervasiveness of difference-convexity in optimization and statistics , 2017, Math. Program..

[37]  Guoyin Li,et al.  Calculus of the Exponent of Kurdyka–Łojasiewicz Inequality and Its Applications to Linear Convergence of First-Order Methods , 2016, Foundations of Computational Mathematics.

[38]  Jack Xin,et al.  Difference-of-Convex Learning: Directional Stationarity, Optimality, and Sparsity , 2017, SIAM J. Optim..

[39]  T. P. Dinh,et al.  Convex analysis approach to d.c. programming: Theory, Algorithm and Applications , 1997 .

[40]  Anthony Man-Cho So,et al.  A unified approach to error bounds for structured convex optimization problems , 2015, Mathematical Programming.

[41]  Jonathan M. Borwein,et al.  On difference convexity of locally Lipschitz functions , 2011 .