Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

For smooth, strongly convex objectives, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging: even when the objective is smooth at every iterate, the corresponding local models are unstable, and traditional remedies need unpredictably many cutting planes. We instead propose a multipoint generalization of the gradient descent iteration for local optimization. While designed with general objectives in mind, we are motivated by a “max-of-smooth” model that captures subdifferential dimension at optimality. We prove linear convergence when the objective is itself max-of-smooth, and experiments suggest a more general phenomenon.
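The classical smooth baseline mentioned above can be illustrated with a small sketch. It is not the survey descent method itself, only plain gradient descent on a hypothetical two-dimensional quadratic chosen for illustration. The constants `MU` and `L`, the starting point, and all function names are assumptions that do not come from the paper. With step size 1/L on this quadratic, the distance to the minimizer shrinks by a factor of at most 1 - MU/L per gradient evaluation, which is the linear convergence the abstract refers to.

```python
# Illustrative sketch only: plain gradient descent on a smooth, strongly
# convex quadratic f(x) = 0.5 * (MU * x[0]**2 + L * x[1]**2).
# MU, L, and the starting point are assumed values, not taken from the paper.

MU, L = 1.0, 10.0  # strong convexity and smoothness constants (assumed)


def grad_quadratic(x):
    """Gradient of the illustrative quadratic; its minimizer is the origin."""
    return [MU * x[0], L * x[1]]


def norm(x):
    """Euclidean norm of a list of floats."""
    return sum(xi * xi for xi in x) ** 0.5


def gradient_descent(grad, x0, step, iters):
    """Run fixed-step gradient descent and return every iterate."""
    x = list(x0)
    history = [x]
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        history.append(x)
    return history


history = gradient_descent(grad_quadratic, [1.0, 1.0], 1.0 / L, 20)

# Per-step contraction of the distance to the minimizer (the origin).
ratios = [norm(b) / norm(a) for a, b in zip(history, history[1:])]
```

Each entry of `ratios` stays at or below 1 - MU/L, so the error decays geometrically. For a nonsmooth objective such as a pointwise maximum of smooth functions, a single gradient at the current iterate no longer gives such a guarantee, which is the difficulty the abstract describes.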
