Incremental Subgradient Methods for Nondifferentiable Optimization

We consider a class of subgradient methods for minimizing a convex function that is the sum of a large number of component functions. Such minimization problems arise, in a dual context, from Lagrangian relaxation of the coupling constraints of large-scale separable problems. The idea is to perform the subgradient iteration incrementally, by sequentially taking steps along the subgradients of the component functions, with intermediate adjustment of the variables after processing each component function. This incremental approach has been very successful in solving large differentiable least squares problems, such as those arising in the training of neural networks, where it attains a much better practical rate of convergence than the steepest descent method. In this paper, we establish the convergence properties of a number of variants of incremental subgradient methods, including some that are stochastic. Based on the analysis and computational experiments, the methods appear very promising and effective for important classes of large problems. A particularly interesting finding is that randomizing the order in which the component functions are processed substantially improves the convergence rate. (A sketch of the basic update appears below.)
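To make the update concrete, here is a minimal Python sketch of the incremental subgradient iteration described above, covering both the cyclic and the randomized order of component selection. This is an illustration under stated assumptions, not the paper's implementation: the names `subgrads` and `stepsize`, and the diminishing-stepsize example alpha_k = 1/(k+1), are choices made for this sketch.

```python
import numpy as np

def incremental_subgradient(subgrads, x0, stepsize, num_epochs=100,
                            randomize=False, seed=None):
    # subgrads: list of callables; subgrads[i](x) returns any subgradient of f_i at x
    # stepsize: callable k -> alpha_k, e.g. a diminishing rule alpha_k = c / (k + 1)
    x = np.asarray(x0, dtype=float).copy()
    rng = np.random.default_rng(seed)
    order = np.arange(len(subgrads))
    for k in range(num_epochs):
        if randomize:
            rng.shuffle(order)        # randomized order of component selection
        alpha = stepsize(k)
        for i in order:
            g = subgrads[i](x)        # subgradient of the i-th component at the current iterate
            x = x - alpha * np.asarray(g)  # intermediate adjustment after each component
    return x

# Example: minimize f(x) = sum_i |x - a_i| (a median problem);
# a subgradient of |x - a_i| is sign(x - a_i).
a = np.array([1.0, 2.0, 10.0])
subgrads = [lambda x, ai=ai: np.sign(x - ai) for ai in a]
x_star = incremental_subgradient(subgrads, x0=0.0,
                                 stepsize=lambda k: 1.0 / (k + 1),
                                 num_epochs=200, randomize=True, seed=0)
# x_star approaches the median of a, i.e. 2.0
```

The variable is adjusted after each component subgradient step rather than once per full pass; setting randomize=True shuffles the processing order each cycle, the variant the abstract reports as substantially improving the convergence rate.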
