A Stochastic Subgradient Method for Nonsmooth Nonconvex Multilevel Composition Optimization

We propose a single time-scale stochastic subgradient method for constrained optimization of a composition of several nonsmooth and nonconvex functions. The functions are assumed to be locally Lipschitz and differentiable in a generalized sense. The method uses only stochastic estimates of the values and generalized derivatives of the functions, and it is parameter-free. We prove that the method converges with probability one by associating with it a system of differential inclusions and devising a nondifferentiable Lyapunov function for this system. When the functions have Lipschitz continuous derivatives, the method with a constant stepsize finds, after $N$ iterations, a point that satisfies an optimality measure with error of order $1/\sqrt{N}$.
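To make the setting concrete, the following is a minimal sketch of a generic single time-scale scheme for a two-level problem, min over x in X of f(g(x)). It is not the method of the paper, whose exact update rules are not given in the abstract. Everything in it is an assumption chosen for illustration: the toy functions f(u) = |u_1| + |u_2| and g(x) = x - target, the box X = [-2, 2]^2, the Gaussian noise, the simplified tracking rules for the inner value u and the averaged direction z, and the constants tau, a, b, c.

```python
import random


def project_box(v, lo, hi):
    """Euclidean projection of v onto the box [lo, hi]^n."""
    return [min(max(vi, lo), hi) for vi in v]


def sign(t):
    """A subgradient of |t| (0 is chosen at t == 0)."""
    return (t > 0) - (t < 0)


def run_sketch(n_iter=3000, tau=0.01, a=20.0, b=20.0, c=1.0,
               sigma=0.05, seed=0):
    """Illustrative single time-scale loop for min f(g(x)) over a box.

    Hypothetical toy problem: g(x) = x - target, f(u) = |u_1| + |u_2|,
    so the minimizer is x = target. All three sequences (x, u, z) are
    updated with stepsizes proportional to the same constant tau.
    """
    rng = random.Random(seed)
    target = [1.0, -0.5]
    x = [0.0, 0.0]
    u = [xi - ti for xi, ti in zip(x, target)]  # tracked estimate of g(x)
    z = [0.0, 0.0]                              # averaged direction estimate

    for _ in range(n_iter):
        # Projected step along the averaged direction, then a damped move.
        y = project_box([xi - zi / c for xi, zi in zip(x, z)], -2.0, 2.0)
        x = [xi + tau * (yi - xi) for xi, yi in zip(x, y)]

        # Noisy observation of the inner function value at the new point.
        g_obs = [xi - ti + rng.gauss(0.0, sigma) for xi, ti in zip(x, target)]
        u = [ui + b * tau * (gi - ui) for ui, gi in zip(u, g_obs)]

        # Noisy subgradient of f at the tracked value u; the Jacobian of
        # this toy g is the identity, so the chain rule is trivial here.
        d_obs = [sign(ui) + rng.gauss(0.0, sigma) for ui in u]
        z = [zi + a * tau * (di - zi) for zi, di in zip(z, d_obs)]

    return x
```

Calling `run_sketch()` returns the final iterate. With a constant stepsize the iterate is expected to hover near the assumed minimizer (1, -0.5) rather than converge to it exactly, which is in the spirit of the constant-stepsize error bound stated above.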
