Distributed Computation of Wasserstein Barycenters Over Networks

We propose a new class-optimal algorithm for the distributed computation of Wasserstein barycenters over networks. Assuming that each node in a graph holds a probability distribution, we prove that every node reaches the barycenter of all distributions held in the network using only local interactions that comply with the graph topology. For fixed undirected networks, we estimate the minimum number of communication rounds the proposed method needs to achieve arbitrary relative precision in both the optimality of the solution and the consensus among all agents.
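To make the setting concrete, here is a minimal sketch of nodes reaching a common barycenter through neighbor-only communication. It is not the algorithm proposed in the paper. It assumes a deliberately simple case: one-dimensional empirical distributions with the same number of equally weighted samples per node, where the 2-Wasserstein barycenter is obtained by averaging the sorted samples. That average is computed by plain consensus averaging with a graph Laplacian. The ring topology, the step size, and the names `ring_laplacian` and `consensus_barycenter_1d` are illustrative assumptions.

```python
import numpy as np


def ring_laplacian(m):
    """Laplacian of an undirected ring with m nodes (hypothetical topology)."""
    lap = 2.0 * np.eye(m)
    for i in range(m):
        lap[i, (i + 1) % m] -= 1.0
        lap[i, (i - 1) % m] -= 1.0
    return lap


def consensus_barycenter_1d(samples, laplacian, rounds=200):
    """Each node repeatedly mixes its estimate with its graph neighbors.

    samples: array of shape (m, n), row i holds the n samples of node i.
    Returns an (m, n) array; row i is node i's estimate of the barycenter atoms.
    Assumption: 1-D data, so the W2 barycenter is the average of sorted samples.
    """
    # Sorted samples act as each node's empirical quantile function.
    x = np.sort(np.asarray(samples, dtype=float), axis=1)
    # Step size 1 / lambda_max keeps the linear iteration non-expansive.
    step = 1.0 / np.max(np.linalg.eigvalsh(laplacian))
    for _ in range(rounds):
        # Row i of (laplacian @ x) only involves node i and its neighbors.
        x = x - step * (laplacian @ x)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n = 5, 50
    # Node i holds samples from a Gaussian centered at i (illustrative data).
    data = rng.normal(loc=np.arange(m)[:, None], scale=1.0, size=(m, n))
    estimates = consensus_barycenter_1d(data, ring_laplacian(m))
    centralized = np.sort(data, axis=1).mean(axis=0)
    print(np.max(np.abs(estimates - centralized)))
```

The printed value is the largest disagreement between any node's estimate and the centrally computed barycenter; under the stated assumptions it should shrink toward zero as the number of rounds grows. For general discrete distributions the barycenter is no longer a simple average, and a different method is needed.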
