Convergence Analysis of Deflected Conditional Approximate Subgradient Methods

Subgradient methods for nondifferentiable optimization benefit from deflection, i.e., defining the search direction as a combination of the previous direction and the current subgradient. In the constrained case they also benefit from projection of the search direction onto the feasible set prior to computing the steplength, that is, from the use of conditional subgradient techniques. However, combining the two techniques is not straightforward, especially when only an inexact oracle is available, one that can compute only approximate function values and subgradients. We present a convergence analysis of several variants, both conceptual and implementable, of approximate conditional deflected subgradient methods. Our analysis extends the results available in the literature by covering the main stepsize rules presented so far while allowing deflection in a more flexible way. Furthermore, to allow for (diminishing/square summable) rules where the stepsize is tightly controlled a priori, we propose a new class of deflection-restricted approaches in which the deflection parameter, rather than the stepsize, is dynamically adjusted using the “target value” of the optimization sequence. For both Polyak-type and diminishing/square summable stepsizes, we propose a “correction” of the standard formula which shows that, in the inexact case, knowledge about the error computed by the oracle (which is available in several practical applications) can be exploited to strengthen the convergence properties of the method. The analysis allows for several variants of the algorithm; at least one of them is likely to show numerical performance similar to that of “heavy ball” subgradient methods, popular within backpropagation approaches to train neural networks, while possessing stronger convergence properties.
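To make the notion of deflection concrete, the following is a minimal sketch, not the method analyzed in the paper. It assumes a toy problem (minimizing an l1 distance over a box), a fixed deflection parameter `alpha`, an exact oracle, and a plain diminishing stepsize `1/k`; all function names and parameters are hypothetical. It projects the iterate after each step rather than projecting the direction as conditional subgradient techniques do, and it makes no claim about convergence in general.

```python
import math

def project_box(x, lo, hi):
    """Euclidean projection of x onto the box [lo, hi]^n (assumed feasible set)."""
    return [min(max(xi, lo), hi) for xi in x]

def f_and_subgrad(x, c):
    """Toy exact oracle: f(x) = sum_i |x_i - c_i| and one of its subgradients."""
    value = sum(abs(xi - ci) for xi, ci in zip(x, c))
    g = [1.0 if xi > ci else (-1.0 if xi < ci else 0.0) for xi, ci in zip(x, c)]
    return value, g

def deflected_projected_subgradient(c, x0, alpha=0.7, iters=500, lo=0.0, hi=1.0):
    """Illustrative deflected subgradient loop (hypothetical parameter choices).

    Direction:  d_k = alpha * g_k + (1 - alpha) * d_{k-1}   (deflection)
    Iterate:    x_{k+1} = P_X(x_k - nu_k * d_k), with nu_k = 1/k (diminishing)
    """
    x = project_box(x0, lo, hi)
    d = None
    best_x, best_f = list(x), math.inf
    for k in range(1, iters + 1):
        fx, g = f_and_subgrad(x, c)
        if fx < best_f:  # subgradient methods are not monotone: keep the record
            best_f, best_x = fx, list(x)
        if d is None:
            d = g  # first iteration: no previous direction to combine with
        else:
            d = [alpha * gi + (1.0 - alpha) * di for gi, di in zip(g, d)]
        step = 1.0 / k
        x = project_box([xi - step * di for xi, di in zip(x, d)], lo, hi)
    return best_x, best_f

if __name__ == "__main__":
    best_x, best_f = deflected_projected_subgradient([0.3, 1.5, -0.2], [1.0, 0.0, 1.0])
    print(best_x, best_f)
```

Setting `alpha=1.0` recovers the plain projected subgradient iteration, which makes it easy to compare the two behaviors on the same toy instance.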
