Highly smooth minimization of non-smooth problems

We establish improved rates for structured non-smooth optimization problems by means of near-optimal higher-order accelerated methods. In particular, given access to a standard oracle model that provides a $p$-th order Taylor expansion of a smoothed version of the function, we show how to achieve $\varepsilon$-optimality for the original problem in $\tilde{O}_p\left(\varepsilon^{-\frac{2p+2}{3p+1}}\right)$ calls to the oracle. Furthermore, when $p = 3$, we provide an efficient implementation of the near-optimal accelerated scheme that achieves an $O(\varepsilon^{-4/5})$ iteration complexity, where each iteration requires $\tilde{O}(1)$ calls to a linear system solver. Thus, we go beyond the previous $O(\varepsilon^{-1})$ barrier in terms of $\varepsilon$ dependence, and in the case of $\ell_\infty$ regression and $\ell_1$-SVM, we establish overall improvements for some parameter settings in the moderate-accuracy regime. Our results also lead to improved high-accuracy rates for minimizing a large class of convex quartic polynomials.
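As a quick sanity check on the stated rate, the following minimal sketch evaluates the exponent (2p+2)/(3p+1) for a few values of p. The function name `oracle_exponent` is hypothetical and introduced only for illustration; the sketch ignores the p-dependent constants and logarithmic factors hidden by the Õ notation.

```python
from fractions import Fraction


def oracle_exponent(p: int) -> Fraction:
    """Exponent e such that the oracle complexity scales as eps^(-e).

    Uses the formula (2p + 2) / (3p + 1) quoted in the abstract;
    constants and logarithmic factors are deliberately ignored.
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    return Fraction(2 * p + 2, 3 * p + 1)


if __name__ == "__main__":
    # Print the exact exponent for the first few orders p.
    for p in (1, 2, 3):
        print(f"p = {p}: exponent = {oracle_exponent(p)}")
```

Setting p = 1 gives an exponent of 1, which matches the ε^{-1} dependence the abstract describes as the previous barrier, while p = 3 gives 4/5, in agreement with the stated O(ε^{-4/5}) iteration complexity. The exponent decreases toward 2/3 as p grows.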
