On the Complexity of Approximating Multimarginal Optimal Transport

We study the complexity of approximating the multimarginal optimal transport (MOT) distance, a generalization of the classical optimal transport distance, considered here between $m$ discrete probability distributions, each supported on $n$ points. First, we show that the standard linear programming (LP) representation of the MOT problem is not a minimum-cost flow problem when $m \geq 3$. This negative result implies that some combinatorial algorithms, e.g., the network simplex method, are not suitable for approximating the MOT problem, while the worst-case complexity bound for the deterministic interior-point algorithm remains $\tilde{O}(n^{3m})$. We then propose two simple and \textit{deterministic} algorithms for approximating the MOT problem. The first algorithm, which we refer to as the \textit{multimarginal Sinkhorn} algorithm, is a provably efficient multimarginal generalization of the Sinkhorn algorithm. We show that it achieves a complexity bound of $\tilde{O}(m^3n^m\varepsilon^{-2})$ for a tolerance $\varepsilon \in (0, 1)$. This provides a first \textit{near-linear time} complexity guarantee for approximating the MOT problem and matches the best known complexity bound for the Sinkhorn algorithm in the classical OT setting when $m = 2$. The second algorithm, which we refer to as the \textit{accelerated multimarginal Sinkhorn} algorithm, achieves acceleration by incorporating an estimate sequence and has a complexity bound of $\tilde{O}(m^3n^{m+1/3}\varepsilon^{-4/3})$. This bound is better than that of the first algorithm in terms of $1/\varepsilon$, and better than that of the accelerated alternating minimization algorithm~\citep{Tupitsa-2020-Multimarginal} in terms of $n$. Finally, we compare our new algorithms with the commercial LP solver \textsc{Gurobi}. Preliminary results on synthetic data and real images demonstrate the effectiveness and efficiency of our algorithms.
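To make the idea of a multimarginal Sinkhorn iteration concrete, the following is a minimal sketch of entropic-regularized MOT solved by updating one dual potential at a time. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the function name `multimarginal_sinkhorn`, the regularization parameter `eta`, the cyclic update order, and the absence of any final rounding step are all choices made here for brevity. Each sweep touches the full cost tensor with $n^m$ entries, which is where the $n^m$ factor in the bound comes from.

```python
import numpy as np

def multimarginal_sinkhorn(cost, marginals, eta=0.5, n_iters=1000, tol=1e-10):
    """Entropic MOT by cyclic dual updates (illustrative sketch only).

    cost      : array with m axes, cost[i_1, ..., i_m]
    marginals : list of m strictly positive probability vectors
    eta       : entropic regularization strength (assumed name)
    """
    m = cost.ndim
    # One dual potential vector per marginal.
    betas = [np.zeros(cost.shape[k]) for k in range(m)]

    def plan():
        # Plan entries are exp(sum_k beta_k[i_k] - cost / eta).
        log_plan = -cost / eta
        for k, beta in enumerate(betas):
            shape = [1] * m
            shape[k] = -1
            log_plan = log_plan + beta.reshape(shape)
        return np.exp(log_plan)

    for _ in range(n_iters):
        violation = 0.0
        for k in range(m):
            other_axes = tuple(a for a in range(m) if a != k)
            current = plan().sum(axis=other_axes)  # k-th marginal of the plan
            violation += np.abs(current - marginals[k]).sum()
            # Rescale so that the k-th marginal matches exactly.
            betas[k] += np.log(marginals[k]) - np.log(current)
        if violation < tol:
            break
    return plan()

# Small synthetic instance: m = 3 marginals on n = 4 points each.
rng = np.random.default_rng(0)
n = 4
cost = rng.random((n, n, n))
marginals = [rng.random(n) + 0.1 for _ in range(3)]
marginals = [r / r.sum() for r in marginals]
P = multimarginal_sinkhorn(cost, marginals)
```

The returned tensor `P` is an approximately feasible transport plan for the regularized problem. A smaller `eta` brings the regularized solution closer to the unregularized MOT value but generally needs more sweeps and more care with numerical underflow.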

[1]  Claude Berge,et al.  The Theory Of Graphs , 1962 .

[2]  M. Klein A Primal Method for Minimal Cost Flows with Applications to the Assignment and Transportation Problems , 1966 .

[3]  Richard Sinkhorn Diagonal equivalence to matrices with prescribed row and column sums , 1967 .

[4]  R. Dudley The Speed of Mean Glivenko-Cantelli Convergence , 1969 .

[5]  Richard M. Karp,et al.  Reducibility Among Combinatorial Problems , 1972, 50 Years of Integer Programming.

[6]  Richard M. Karp,et al.  Theoretical Improvements in Algorithmic Efficiency for Network Flow Problems , 1972, Combinatorial Optimization.

[7]  Richard Sinkhorn Diagonal equivalence to matrices with prescribed row and column sums. II , 1974 .

[8]  Refael Hassin,et al.  The minimum cost flow problem: A unifying approach to dual algorithms and a new tree-search algorithm , 1983, Math. Program..

[9]  Éva Tardos,et al.  A strongly polynomial minimum cost circulation algorithm , 1985, Comb..

[10]  James B. Orlin,et al.  A faster strongly polynomial minimum cost flow algorithm , 1993, STOC '88.

[11]  Éva Tardos,et al.  An O(n^2(m + n log n) log n) min-cost flow algorithm , 1988, JACM.

[12]  Andrew V. Goldberg,et al.  Finding minimum-cost circulations by canceling negative cycles , 1989, JACM.

[13]  John N. Tsitsiklis,et al.  Parallel and distributed computation , 1989 .

[14]  Y. Brenier The least action principle and the related concept of generalized flows for incompressible perfect fluids , 1989 .

[15]  Andrew V. Goldberg,et al.  Finding Minimum-Cost Circulations by Successive Approximation , 1990, Math. Oper. Res..

[16]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[17]  Refael Hassin Algorithms for the minimum cost circulation problem based on maximizing the mean improvement , 1992, Oper. Res. Lett..

[18]  S. Thomas McCormick,et al.  Canceling most helpful total cuts for minimum cost network flow , 1993, Networks.

[19]  S. Thomas McCormick,et al.  Two Strongly Polynomial Cut Cancelling Algorithms for Minimum Cost Network Flow , 1993, Discret. Appl. Math..

[20]  James B. Orlin,et al.  A polynomial time primal network simplex algorithm for minimum cost flows , 1996, SODA '96.

[21]  Stephen J. Wright Primal-Dual Interior-Point Methods , 1997, Other Titles in Applied Mathematics.

[22]  Robert E. Tarjan,et al.  Dynamic trees as search trees via euler tours, applied to the network simplex algorithm , 1997, Math. Program..

[23]  W. Gangbo,et al.  Optimal maps for the multidimensional Monge-Kantorovich problem , 1998 .

[24]  Andrew V. Goldberg,et al.  Beyond the flow decomposition barrier , 1998, JACM.

[25]  Y. Brenier Minimal geodesics on groups of volume-preserving maps and generalized solutions of the Euler equations , 1999 .

[26]  C. Villani Topics in Optimal Transportation , 2003 .

[27]  I. Ekeland An optimal matching problem , 2003, math/0308206.

[28]  Alexander Schrijver,et al.  Combinatorial optimization. Polyhedra and efficiency. , 2003 .

[29]  Shang-Hua Teng,et al.  Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems , 2003, STOC '04.

[30]  Yurii Nesterov,et al.  Smooth minimization of non-smooth functions , 2005, Math. Program..

[31]  L. Kantorovich On the Translocation of Masses , 2006 .

[32]  P. Gori-Giorgi,et al.  Strictly correlated electrons in density-functional theory: A general formulation with applications to spherical densities , 2007, cond-mat/0701025.

[33]  P. Chiappori,et al.  Hedonic price equilibria, stable matching, and optimal transport: equivalence, topology, and uniqueness , 2007 .

[34]  Daniel A. Spielman,et al.  Faster approximate lossy generalized flow via interior point algorithms , 2008, STOC.

[35]  Y. Brenier Generalized solutions and hydrostatic approximation of the Euler equations , 2008 .

[36]  Bahman Kalantari,et al.  On the complexity of general matrix scaling and entropy minimization via the RAS algorithm , 2007, Math. Program..

[37]  G. Carlier,et al.  Matching for teams , 2010 .

[38]  Pradeep Ravikumar,et al.  Nearest Neighbor based Greedy Coordinate Descent , 2011, NIPS.

[39]  Guillaume Carlier,et al.  Barycenters in the Wasserstein Space , 2011, SIAM J. Math. Anal..

[40]  Julien Rabin,et al.  Wasserstein Barycenter and Its Application to Texture Mixing , 2011, SSVM.

[41]  Codina Cotar,et al.  Density Functional Theory and Optimal Transportation with Coulomb Cost , 2011, 1104.0603.

[42]  Yurii Nesterov,et al.  Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems , 2012, SIAM J. Optim..

[43]  Tommi S. Jaakkola,et al.  Convergence Rate Analysis of MAP Coordinate Minimization Algorithms , 2012, NIPS.

[44]  H. Soner,et al.  Robust Hedging and Martingale Optimal Transport in Continuous Time , 2012 .

[45]  G. Buttazzo,et al.  Optimal-transport formulation of electronic density-functional theory , 2012, 1205.4514.

[46]  Marco Cuturi,et al.  Sinkhorn Distances: Lightspeed Computation of Optimal Transport , 2013, NIPS.

[47]  A. Guillin,et al.  On the rate of convergence in Wasserstein distance of the empirical measure , 2013, 1312.2128.

[48]  Nizar Touzi,et al.  A Stochastic Control Approach to No-Arbitrage Bounds Given Marginals, with an Application to Lookback Options , 2013, 1401.3921.

[49]  Lin Lin,et al.  Kantorovich dual solution for strictly correlated electrons in atoms and molecules , 2012, 1210.7117.

[50]  A. Galichon,et al.  A stochastic control approach to no-arbitrage bounds given marginals, with an application to lookback options , 2014, 1401.3921.

[51]  Brendan Pass Multi-marginal optimal transport: theory and applications , 2014, 1406.0026.

[52]  Adam M. Oberman,et al.  NUMERICAL METHODS FOR MATCHING FOR TEAMS AND WASSERSTEIN BARYCENTERS , 2014, 1411.3602.

[53]  Arnaud Doucet,et al.  Fast Computation of Wasserstein Barycenters , 2013, ICML.

[54]  Yin Tat Lee,et al.  Path Finding Methods for Linear Programming: Solving Linear Programs in Õ(√rank) Iterations and Faster Algorithms for Maximum Flow , 2014, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science.

[55]  Lin Xiao,et al.  On the complexity analysis of randomized block-coordinate descent methods , 2013, Mathematical Programming.

[56]  Mark W. Schmidt,et al.  Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection , 2015, ICML.

[57]  Peter Richtárik,et al.  Accelerated, Parallel, and Proximal Coordinate Descent , 2013, SIAM J. Optim..

[58]  Lin Xiao,et al.  An Accelerated Randomized Proximal Coordinate Gradient Method and its Application to Regularized Empirical Risk Minimization , 2015, SIAM J. Optim..

[59]  Gabriel Peyré,et al.  Iterative Bregman Projections for Regularized Transportation Problems , 2014, SIAM J. Sci. Comput..

[60]  Steffen Borgwardt,et al.  Discrete Wasserstein barycenters: optimal transport for discrete data , 2015, Mathematical Methods of Operations Research.

[61]  Zeyuan Allen Zhu,et al.  Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling , 2015, ICML.

[62]  Jason Altschuler,et al.  Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration , 2017, NIPS.

[63]  Justin Solomon,et al.  Parallel Streaming Wasserstein Barycenters , 2017, NIPS.

[64]  Yurii Nesterov,et al.  Lectures on Convex Optimization , 2018 .

[65]  Aaron Sidford,et al.  Towards Optimal Running Times for Optimal Transport , 2018, ArXiv.

[66]  David B. Dunson,et al.  Scalable Bayes via Barycenter in Wasserstein Space , 2015, J. Mach. Learn. Res..

[67]  Le Hui,et al.  Unsupervised Multi-Domain Image Translation with Domain-Specific Encoders/Decoders , 2017, 2018 24th International Conference on Pattern Recognition (ICPR).

[68]  Alexander Gasnikov,et al.  Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn's Algorithm , 2018, ICML.

[69]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[70]  Justin Solomon,et al.  Stochastic Wasserstein Barycenters , 2018, ICML.

[71]  Darina Dvinskikh,et al.  Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters , 2018, NeurIPS.

[72]  Jelena Diakonikolas,et al.  Alternating Randomized Block Coordinate Descent , 2018, ICML.

[73]  Gabriel Peyré,et al.  Semi-dual Regularized Optimal Transport , 2018, SIAM Rev..

[74]  Steve Oudot,et al.  Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport , 2018, NeurIPS.

[75]  Vahab S. Mirrokni,et al.  Accelerating Greedy Coordinate Descent Methods , 2018, ICML.

[76]  Michael I. Jordan,et al.  On the Acceleration of the Sinkhorn and Greenkhorn Algorithms for Optimal Transport , 2019, ArXiv.

[77]  Kent Quanrud,et al.  Approximating optimal transport with linear programs , 2018, SOSA.

[78]  F. Bach,et al.  Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance , 2017, Bernoulli.

[79]  Kevin Tian,et al.  A Direct Õ(1/ε) Iteration Parallel Algorithm for Optimal Transport , 2019, ArXiv.

[80]  Knut-Andreas Lie,et al.  Scale Space and Variational Methods in Computer Vision , 2019, Lecture Notes in Computer Science.

[81]  Yinyu Ye,et al.  Interior-Point Methods Strike Back: Solving the Wasserstein Barycenter Problem , 2019, NeurIPS.

[82]  Hongyuan Zha,et al.  A Fast Proximal Point Method for Computing Exact Wasserstein Distance , 2018, UAI.

[83]  Darina Dvinskikh,et al.  On the Complexity of Approximating Wasserstein Barycenters , 2019, ICML.

[84]  Michael I. Jordan,et al.  On the Efficiency of the Sinkhorn and Greenkhorn Algorithms and Their Acceleration for Optimal Transport , 2019 .

[85]  S. Guminov,et al.  Accelerated Alternating Minimization, Accelerated Sinkhorn's Algorithm and Accelerated Iterative Bregman Projections. , 2019 .

[86]  Jean-David Benamou,et al.  Generalized incompressible flows, multi-marginal transport and Sinkhorn algorithm , 2017, Numerische Mathematik.

[87]  Kevin Tian,et al.  A Direct Õ(1/ε) Iteration Parallel Algorithm for Optimal Transport , 2019, NeurIPS.

[88]  Mingkui Tan,et al.  Multi-marginal Wasserstein GAN , 2019, NeurIPS.

[89]  Shiguang Shan,et al.  AttGAN: Facial Attribute Editing by Only Changing What You Want , 2017, IEEE Transactions on Image Processing.

[90]  Nathaniel Lahn,et al.  A Graph Theoretic Additive Approximation of Optimal Transport , 2019, NeurIPS.

[91]  Marco Cuturi,et al.  Computational Optimal Transport: With Applications to Data Science , 2019 .

[92]  Gabriel Peyré,et al.  Sample Complexity of Sinkhorn Divergences , 2018, AISTATS.

[93]  Yin Tat Lee,et al.  Solving linear programs in the current matrix multiplication time , 2018, STOC.

[94]  Jonathan Weed,et al.  Statistical bounds for entropic optimal transport: sample complexity and the central limit theorem , 2019, NeurIPS.

[95]  Michael I. Jordan,et al.  On Efficient Optimal Transport: An Analysis of Greedy and Accelerated Mirror Descent Algorithms , 2019, ICML.

[96]  Liang Mi,et al.  Multi-Marginal Optimal Transport Defines a Generalized Metric , 2020, ArXiv.

[97]  César A. Uribe,et al.  Multimarginal Optimal Transport by Accelerated Gradient Descent , 2020 .

[98]  Nhat Ho,et al.  On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm , 2020, ICML.

[99]  Pavel Dvurechensky,et al.  Multimarginal Optimal Transport by Accelerated Alternating Minimization , 2020, 2020 59th IEEE Conference on Decision and Control (CDC).

[100]  Michael I. Jordan,et al.  Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast Algorithm , 2020, NeurIPS.

[101]  Jing Lei Convergence and concentration of empirical measures under Wasserstein distance in unbounded functional spaces , 2018, Bernoulli.

[102]  Michael I. Jordan,et al.  Revisiting Fixed Support Wasserstein Barycenter: Computational Hardness and Efficient Algorithms , 2020, ArXiv.

[103]  Enric Boix-Adsera,et al.  Wasserstein barycenters can be computed in polynomial time in fixed dimension , 2020, J. Mach. Learn. Res..

[104]  Nhat Ho,et al.  On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity , 2021, AISTATS.

[105]  Enric Boix-Adsera,et al.  Hardness results for Multimarginal Optimal Transport problems , 2020, Discret. Optim..

[106]  Andrew R. Teel,et al.  ESAIM: Control, Optimisation and Calculus of Variations , 2022 .