Streaming computation of optimal weak transport barycenters

We introduce the weak barycenter of a family of probability distributions, based on the recently developed notion of optimal weak transport of measures [22]. We provide a theoretical analysis of the weak barycenter and its relationship to the classical Wasserstein barycenter, and discuss its meaning in light of the convex ordering between probability measures. In particular, we argue that, rather than averaging the information of the input distributions as the usual optimal transport barycenters do, weak barycenters contain geometric information shared across all input distributions, which can be interpreted as a latent random variable affecting all the measures. We also provide iterative algorithms to compute a weak barycenter for either finite or infinite families of arbitrary measures (with finite moments of order 2). These algorithms are particularly well suited to the streaming setting, i.e., when measures arrive sequentially. In particular, our streaming computation of weak barycenters requires neither smoothing the empirical measures nor defining a common grid for them, unlike some previous approaches to Wasserstein barycenters. The concept of weak barycenter and our computation approaches are illustrated on synthetic examples, validated on 2D real-world data, and compared to the classical Wasserstein barycenters.

[1]  Justin Solomon,et al.  Continuous Regularized Wasserstein Barycenters , 2020, NeurIPS.

[2]  Julien Guyon,et al.  Nonlinear Option Pricing , 2013 .

[3]  Yann LeCun,et al.  The MNIST database of handwritten digits , 2005 .

[4]  Felipe A. Tobar,et al.  Bayesian Learning with Wasserstein Barycenters , 2018, ESAIM: Probability and Statistics.

[5]  D. Dvinskikh Stochastic Approximation versus Sample Average Approximation for population Wasserstein barycenters , 2020, 2001.07697.

[6]  C. Villani Optimal Transport: Old and New , 2008 .

[7]  D. A. Edwards On the existence of probability measures with given marginals , 1978 .

[8]  C. Villani Topics in Optimal Transportation , 2003 .

[9]  James Zijun Wang,et al.  Fast Discrete Distribution Clustering Using Wasserstein Barycenter With Sparse Support , 2015, IEEE Transactions on Signal Processing.

[10]  Nicolas Papadakis,et al.  Regularized Optimal Transport and the Rot Mover's Distance , 2016, J. Mach. Learn. Res..

[11]  Julio D. Backhoff Veraguas,et al.  Existence, duality, and cyclical monotonicity for weak transport costs , 2018, Calculus of Variations and Partial Differential Equations.

[12]  Ryan R Brinkman,et al.  Per‐channel basis normalization methods for flow cytometry data , 2009, Cytometry. Part A : the journal of the International Society for Analytical Cytology.

[13]  Nicolas Juillet,et al.  On a mixture of Brenier and Strassen Theorems , 2018, Proceedings of the London Mathematical Society.

[14]  Justin Solomon,et al.  Parallel Streaming Wasserstein Barycenters , 2017, NIPS.

[15]  Guillaume Carlier,et al.  Barycenters in the Wasserstein Space , 2011, SIAM J. Math. Anal..

[16]  J. A. Cuesta-Albertos,et al.  A fixed-point approach to barycenters in Wasserstein space , 2015, 1511.05355.

[17]  Arnaud Doucet,et al.  Fast Computation of Wasserstein Barycenters , 2013, ICML.

[18]  X. Nguyen Convergence of latent mixing measures in finite and infinite mixture models , 2011, 1109.3250.

[19]  Jérémie Bigot,et al.  Data-driven regularization of Wasserstein barycenters with an application to multivariate density registration , 2018, Information and Inference: A Journal of the IMA.

[20]  R. P. Kertz,et al.  Complete lattices of probability measures with applications to martingale theory , 2000 .

[21]  Jérémie Bigot,et al.  Characterization of barycenters in the Wasserstein space by averaging optimal transport maps , 2012, 1212.2562.

[22]  Thibaut Le Gouic,et al.  Existence and consistency of Wasserstein barycenters , 2015, Probability Theory and Related Fields.

[23]  Nicolas Courty,et al.  Large Scale Optimal Transport and Mapping Estimation , 2017, ICLR.

[24]  Victor M. Panaretos,et al.  Fréchet means and Procrustes analysis in Wasserstein space , 2017, Bernoulli.

[25]  M. Beiglböck,et al.  Model-independent bounds for option prices—a mass transport approach , 2011, Finance and Stochastics.

[26]  E. Feinberg,et al.  Fatou's Lemma in Its Classical Form and Lebesgue's Convergence Theorems for Varying Measures with Applications to Markov Decision Processes , 2020 .

[27]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[28]  Aurélien Alfonsi,et al.  Sampling of Probability Measures in the Convex Order and Approximation of Martingale Optimal Transport Problems , 2017 .

[29]  L. Ambrosio,et al.  Gradient Flows: In Metric Spaces and in the Space of Probability Measures , 2005 .

[30]  Marco Cuturi,et al.  Sinkhorn Distances: Lightspeed Computation of Optimal Transport , 2013, NIPS.

[31]  Jean YH Yang,et al.  Bioconductor: open software development for computational biology and bioinformatics , 2004, Genome Biology.

[32]  Tyler Maunu,et al.  Gradient descent algorithms for Bures-Wasserstein barycenters , 2020, COLT.

[33]  Paul-Marie Samson,et al.  Kantorovich duality for general transport costs and applications , 2014, 1412.7480.

[34]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[35]  Gabriel Peyré,et al.  Computational Optimal Transport , 2018, Found. Trends Mach. Learn..

[36]  M. Beiglböck,et al.  Weak monotone rearrangement on the line , 2019, Electronic Communications in Probability.

[37]  Michael W. Botsko,et al.  An Elementary Proof of Lebesgue's Differentiation Theorem , 2003, Am. Math. Mon..