From error bounds to the complexity of first-order descent methods for convex functions

This paper shows that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization. In a first stage, this objective leads us to revisit the interplay between error bounds and the Kurdyka-Łojasiewicz (KL) inequality. One can show that the two concepts are equivalent for convex functions having a moderately flat profile near the set of minimizers (such as functions with Hölderian growth). A counterexample shows that the equivalence no longer holds for extremely flat functions, which reveals the relevance of an approach based on the KL inequality. In a second stage, we show how KL inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems. Our approach is completely original and makes use of a one-dimensional worst-case proximal sequence in the spirit of the famous majorant method of Kantorovich. Our result applies to a very simple abstract scheme that covers a wide class of descent methods. As a byproduct of our study, we also provide new results for the globalization of KL inequalities in the convex framework. Our main results inaugurate a simple method: derive an error bound, compute the desingularizing function whenever possible, identify the essential constants in the descent method, and finally compute the complexity using the one-dimensional worst-case proximal sequence. Our method is illustrated through projection methods for feasibility problems, and through the famous iterative shrinkage thresholding algorithm (ISTA), for which we show that the complexity bound is of the form $$O(q^{k})$$, where the constituents of the bound depend only on error bound constants obtained for an arbitrary least squares objective with $$\ell^1$$ regularization.
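For readers unfamiliar with ISTA, the following is a minimal sketch of the iteration on an $$\ell^1$$-regularized least squares objective, written in plain Python. It is not taken from the paper: the function names, the toy data, and the use of the squared Frobenius norm of the matrix as an upper bound on the Lipschitz constant of the gradient are assumptions made for illustration, and the sketch does not compute the constants that enter the complexity bound.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def residual(A, b, x):
    """Return A x - b for a dense matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]


def objective(A, b, x, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    r = residual(A, b, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def ista(A, b, lam, iterations=200):
    """Hypothetical helper: run ISTA from x = 0 and record objective values."""
    m, n = len(A), len(A[0])
    # Assumption: the squared Frobenius norm bounds the largest eigenvalue
    # of A^T A from above, so 1/L is a valid (conservative) step size.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    values = [objective(A, b, x, lam)]
    for _ in range(iterations):
        r = residual(A, b, x)
        # Gradient of the smooth part: A^T (A x - b).
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Forward (gradient) step followed by backward (shrinkage) step.
        x = [soft_threshold(x[j] - grad[j] / L, lam / L) for j in range(n)]
        values.append(objective(A, b, x, lam))
    return x, values


# Toy data (hypothetical): a separable 2-by-2 problem.
A = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, 0.1]
x_hat, values = ista(A, b, lam=0.5)
```

On this toy instance the problem separates by coordinate, so the minimizer can be checked by hand: the first coordinate solves 4x - 4 + 0.5 = 0, and the second is shrunk to exactly zero because its unregularized solution 0.1 is below the threshold 0.5.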
