Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters

We study the decentralized distributed computation of discrete approximations of the regularized Wasserstein barycenter of a finite set of continuous probability measures stored across a network. We assume a network of agents (machines), each holding a private continuous probability measure; every agent seeks to compute the barycenter of all the measures in the network by drawing samples from its local measure and exchanging information with its neighbors. Motivated by this problem, we develop and analyze a novel accelerated primal-dual stochastic gradient method for general stochastic convex optimization problems with linear equality constraints. We then apply this method to the decentralized distributed optimization setting to obtain a new algorithm for the distributed semi-discrete regularized Wasserstein barycenter problem. Moreover, we derive explicit non-asymptotic complexity bounds for the proposed algorithm.
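To illustrate how a decentralized problem can fit the template of optimization with linear equality constraints, the sketch below writes the requirement that all agents agree on one value as W x = 0, where W is a graph Laplacian. This is a standard construction and is assumed here for illustration only; the abstract does not state which constraint matrix the method uses. The path-graph topology, the scalar value per agent, and the names `path_graph_laplacian`, `matvec`, and `consensus_residual` are all hypothetical.

```python
# Illustrative sketch (assumption, not taken from the abstract): agreement among
# agents expressed as the linear equality constraint W x = 0, with W a graph Laplacian.

def path_graph_laplacian(m):
    """Laplacian W = D - A of a path graph on m nodes (an assumed example topology)."""
    W = [[0.0] * m for _ in range(m)]
    for i in range(m - 1):
        j = i + 1
        W[i][i] += 1.0   # degree contribution of edge (i, j) at node i
        W[j][j] += 1.0   # degree contribution of edge (i, j) at node j
        W[i][j] -= 1.0   # off-diagonal entries carry minus the adjacency
        W[j][i] -= 1.0
    return W

def matvec(W, x):
    """Plain matrix-vector product; row i only touches neighbors of agent i."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def consensus_residual(W, x):
    """Largest absolute entry of W x; for a connected graph it is zero
    exactly when every agent holds the same value."""
    return max(abs(v) for v in matvec(W, x))

W = path_graph_laplacian(4)
agree = [0.25, 0.25, 0.25, 0.25]      # all agents hold the same value
disagree = [0.25, 0.75, 0.25, 0.25]   # agent 1 deviates from the others

print(consensus_residual(W, agree))
print(consensus_residual(W, disagree))
```

Because each row of W has nonzero entries only for an agent and its neighbors, evaluating W x requires only neighbor-to-neighbor exchange, which is why a constraint of this form suits the decentralized setting.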
