Gradient Methods for Problems with Inexact Model of the Objective

We consider optimization methods for convex minimization problems under inexact information on the objective function. We introduce an inexact model of the objective, which includes as particular cases the $(\delta,L)$ inexact oracle and the relative smoothness condition. We analyze a gradient method that uses this inexact model and obtain convergence rates for convex and strongly convex problems. To show potential applications of our general framework, we consider three particular problems. The first is clustering by the electoral model introduced in [Nesterov, 2018]. The second is approximating the optimal transport distance, for which we propose a Proximal Sinkhorn algorithm. The third is approximating the optimal transport barycenter, for which we propose a Proximal Iterative Bregman Projections algorithm. We also illustrate the practical performance of our algorithms with numerical experiments.
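The abstract does not spell out the Proximal Sinkhorn algorithm, so the following is only a minimal sketch of one plausible reading: a proximal-point outer loop with a Kullback-Leibler proximal term, where each subproblem is an entropy-regularized transport problem solved by standard Sinkhorn scaling. The function names, the step parameter `L`, the iteration counts, and the starting point are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sinkhorn(K, p, q, n_iter=200):
    """Scale a positive kernel K so that its rows sum to p and columns to q."""
    u = np.ones_like(p)
    v = np.ones_like(q)
    for _ in range(n_iter):
        u = p / (K @ v)        # match row marginals
        v = q / (K.T @ u)      # match column marginals
    return u[:, None] * K * v[None, :]


def proximal_sinkhorn(C, p, q, L=1.0, n_outer=50, n_inner=200):
    """Hypothetical sketch: KL proximal-point iterations for optimal transport.

    Each outer step solves
        min_x <C, x> + L * KL(x | x_k)  subject to the marginals (p, q),
    whose solution is a diagonal scaling of the kernel x_k * exp(-C / L).
    """
    x = np.outer(p, q)  # assumed starting plan: the independent coupling
    for _ in range(n_outer):
        K = x * np.exp(-C / L)
        x = sinkhorn(K, p, q, n_inner)
    return x


if __name__ == "__main__":
    C = np.array([[0.0, 1.0], [1.0, 0.0]])
    p = np.array([0.5, 0.5])
    q = np.array([0.5, 0.5])
    plan = proximal_sinkhorn(C, p, q)
    print("transport cost:", float((C * plan).sum()))
```

In this sketch the regularization never has to be driven to zero inside a single Sinkhorn call; the outer iterations accumulate the factor `exp(-C / L)` instead, which is the usual motivation for a proximal scheme of this kind.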

[1]  Alexander Gasnikov,et al.  Stochastic Intermediate Gradient Method for Convex Problems with Stochastic Inexact Oracle , 2016, Journal of Optimization Theory and Applications.

[2]  Richard Sinkhorn Diagonal equivalence to matrices with prescribed row and column sums , 1967 .

[3]  Arthur Cayley,et al.  The Collected Mathematical Papers: On Monge's “Mémoire sur la théorie des déblais et des remblais” , 2009 .

[4]  Yurii Nesterov,et al.  Cubic regularization of Newton method and its global performance , 2006, Math. Program..

[5]  Darina Dvinskikh,et al.  Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters , 2018, NeurIPS.

[6]  V. Spokoiny,et al.  Construction of Non-asymptotic Confidence Sets in 2-Wasserstein Space , 2017, 1703.03658.

[8]  Yurii Nesterov Soft clustering by convex electoral model , 2020, Soft Comput..

[9]  Marc Teboulle,et al.  Convergence Analysis of a Proximal-Like Minimization Algorithm Using Bregman Functions , 1993, SIAM J. Optim..

[10]  Yurii Nesterov,et al.  First-order methods of smooth convex optimization with inexact oracle , 2013, Mathematical Programming.

[11]  Alexey Kroshnin,et al.  Statistical inference for Bures–Wasserstein barycenters , 2019, The Annals of Applied Probability.

[12]  P. Dvurechensky,et al.  Universal intermediate gradient method for convex problems with inexact oracle , 2017, Optim. Methods Softw..

[13]  Jérémie Bigot,et al.  Consistent estimation of a population barycenter in the Wasserstein space , 2013 .

[14]  Eduard A. Gorbunov,et al.  An Accelerated Directional Derivative Method for Smooth Stochastic Convex Optimization , 2018, Eur. J. Oper. Res..

[15]  Yann LeCun,et al.  The MNIST database of handwritten digits , 2005 .

[16]  Arnaud Doucet,et al.  Fast Computation of Wasserstein Barycenters , 2013, ICML.

[17]  A. Gasnikov Universal gradient descent , 2017, 1711.00394.

[18]  Alexey Chernov,et al.  Fast Primal-Dual Gradient Method for Strongly Convex Minimization Problems with Linear Constraints , 2016, DOOR.

[19]  Leonidas J. Guibas,et al.  Wasserstein Propagation for Semi-Supervised Learning , 2014, ICML.

[20]  O. Nelles,et al.  An Introduction to Optimization , 1996, IEEE Antennas and Propagation Magazine.

[21]  Aaron Sidford,et al.  Towards Optimal Running Times for Optimal Transport , 2018, ArXiv.

[22]  Marco Cuturi,et al.  Sinkhorn Distances: Lightspeed Computation of Optimal Transport , 2013, NIPS.

[23]  Alexander Gasnikov,et al.  Fast gradient descent method for convex optimization problems with an oracle that generates a $(\delta,L)$-model of a function in a requested point , 2017, 1711.02747.

[24]  E. Barrio,et al.  A statistical analysis of a deformation model with Wasserstein barycenters : estimation procedure and goodness of fit test , 2015, 1508.06465.

[25]  L. Kantorovich On the Translocation of Masses , 2006 .

[26]  Thibaut Le Gouic,et al.  Existence and consistency of Wasserstein barycenters , 2015, Probability Theory and Related Fields.

[27]  Yurii Nesterov,et al.  Implementable tensor methods in unconstrained convex optimization , 2019, Mathematical Programming.

[28]  Gabriel Peyré,et al.  Computational Optimal Transport , 2018, Found. Trends Mach. Learn..

[29]  Alexander Gasnikov,et al.  Universal method with inexact oracle and its applications for searching equillibriums in multistage transport problems , 2015 .

[30]  Richard Sinkhorn Diagonal equivalence to matrices with prescribed row and column sums. II , 1974 .

[31]  Jason Altschuler,et al.  Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration , 2017, NIPS.

[32]  Amir Beck,et al.  On the Convergence of Alternating Minimization for Convex Programming with Applications to Iteratively Reweighted Least Squares and Decomposition Schemes , 2015, SIAM J. Optim..

[33]  Anton Rodomanov,et al.  Primal-Dual Method for Searching Equilibrium in Hierarchical Congestion Population Games , 2016, DOOR.

[34]  Léon Bottou,et al.  Wasserstein GAN , 2017, ArXiv.

[35]  Gabriel Peyré,et al.  Iterative Bregman Projections for Regularized Transportation Problems , 2014, SIAM J. Sci. Comput..

[36]  Alexander Gasnikov,et al.  Randomized Similar Triangles Method: A Unifying Framework for Accelerated Randomized Optimization Methods (Coordinate Descent, Directional Search, Derivative-Free Method) , 2017, ArXiv.

[37]  Alexander Gasnikov,et al.  Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn's Algorithm , 2018, ICML.

[38]  Yurii Nesterov,et al.  Relatively Smooth Convex Optimization by First-Order Methods, and Applications , 2016, SIAM J. Optim..

[39]  Yurii Nesterov,et al.  Universal gradient methods for convex optimization problems , 2015, Math. Program..

[40]  Kent Quanrud,et al.  Approximating optimal transport with linear programs , 2018, SOSA.

[41]  Yurii Nesterov,et al.  Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[42]  P. Dvurechensky,et al.  Dual approaches to the minimization of strongly convex functionals with a simple structure under affine constraints , 2017 .

[43]  Eduard A. Gorbunov,et al.  An Accelerated Method for Derivative-Free Smooth Stochastic Convex Optimization , 2018, SIAM J. Optim..

[44]  P. Dvurechensky Gradient Method With Inexact Oracle for Composite Non-Convex Optimization , 2017, 1703.09180.

[45]  Angelia Nedic,et al.  Distributed Computation of Wasserstein Barycenters Over Networks , 2018, 2018 IEEE Conference on Decision and Control (CDC).

[46]  Dmitriy Drusvyatskiy,et al.  Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria , 2016, Mathematical Programming.

[47]  Alexander Gasnikov,et al.  Inexact model: a framework for optimization and variational inequalities , 2019, Optim. Methods Softw..

[48]  Julien Mairal,et al.  Optimization with First-Order Surrogate Functions , 2013, ICML.

[49]  Gleb Gusev,et al.  Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods , 2016, NIPS.

[50]  Alexander Gasnikov,et al.  Primal–dual accelerated gradient methods with small-dimensional relaxation oracle , 2018, Optim. Methods Softw..

[51]  J. Lorenz,et al.  On the scaling of multidimensional matrices , 1989 .

[52]  Y. Nesterov,et al.  First-order methods with inexact oracle: the strongly convex case , 2013 .

[53]  Alexandre d'Aspremont,et al.  Smooth Optimization with Approximate Gradient , 2005, SIAM J. Optim..

[54]  P. Dvurechensky,et al.  Generalized Mirror Prox: Solving Variational Inequalities with Monotone Operator, Inexact Oracle, and Unknown Hölder Parameters , 2018 .

[55]  Alessandro Rudi,et al.  Approximating the Quadratic Transportation Metric in Near-Linear Time , 2018, ArXiv.

[56]  Mohamed-Jalal Fadili,et al.  Non-smooth Non-convex Bregman Minimization: Unification and New Algorithms , 2017, Journal of Optimization Theory and Applications.

[57]  Nicholas I. M. Gould,et al.  Improved second-order evaluation complexity for unconstrained nonlinear optimization using high-order regularized models , 2017, ArXiv.

[58]  Yin Tat Lee,et al.  Path Finding Methods for Linear Programming: Solving Linear Programs in Õ(√rank) Iterations and Faster Algorithms for Maximum Flow , 2014, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science.

[59]  Michael Werman,et al.  Fast and robust Earth Mover's Distances , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[60]  Peter Richtárik,et al.  Inexact Coordinate Descent: Complexity and Preconditioning , 2013, J. Optim. Theory Appl..

[61]  Coralia Cartis,et al.  A concise second-order complexity analysis for unconstrained optimization using high-order regularized models , 2020, Optim. Methods Softw..

[62]  Bernhard Schmitzer,et al.  Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems , 2016, SIAM J. Sci. Comput..

[63]  Sergey Omelchenko,et al.  A Stable Alternative to Sinkhorn's Algorithm for Regularized Optimal Transport , 2017, MOTOR.

[64]  Michael Cohen,et al.  On Acceleration with Noise-Corrupted Gradients , 2018, ICML.

[65]  Darina Dvinskikh,et al.  On the Complexity of Approximating Wasserstein Barycenter , 2019, ArXiv.