Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form

Although optimal transport (OT) problems admit closed-form solutions in only a few notable cases, e.g. in 1D or between Gaussians, these closed forms have proved extremely fecund for practitioners, who use them to define tools inspired by the OT geometry. On the other hand, the numerical resolution of OT problems using entropic regularization has given rise to many applications, but because there are no known closed-form solutions for entropy-regularized OT problems, these approaches are mostly algorithmic, not informed by elegant closed forms. In this paper, we propose to fill the void at the intersection between these two schools of thought in OT by proving that the entropy-regularized optimal transport problem between two Gaussian measures admits a closed form. Contrary to the unregularized case, for which the explicit form is given by the Wasserstein-Bures distance, the closed form we obtain is differentiable everywhere, even for Gaussians with degenerate covariance matrices. We obtain this closed-form solution by solving the fixed-point equation behind Sinkhorn's algorithm, the default method for computing entropy-regularized OT. Remarkably, this approach extends to the generalized unbalanced case, where Gaussian measures are scaled by positive constants. This extension leads to a closed-form expression for unbalanced Gaussians as well, and highlights the trade-off between mass transportation and mass destruction seen in unbalanced optimal transport. Moreover, in both settings, we show that the optimal transportation plans are (scaled) Gaussians and provide analytical formulas for their parameters. These formulas constitute the first non-trivial closed forms for entropy-regularized optimal transport, thus providing a ground truth for the analysis of entropic OT and Sinkhorn's algorithm.
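For orientation, the unregularized baseline mentioned above can be sketched in a few lines. The snippet below evaluates the standard squared Wasserstein-Bures distance between N(a, A) and N(b, B), namely ||a - b||^2 + Tr(A) + Tr(B) - 2 Tr((A^{1/2} B A^{1/2})^{1/2}). It is not the entropic closed form derived in the paper, and the function names `psd_sqrt` and `bures_wasserstein_sq` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def psd_sqrt(M):
    """Square root of a symmetric positive semi-definite matrix (hypothetical helper)."""
    # Symmetrize to guard against round-off, then take the root of the eigenvalues.
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    w = np.clip(w, 0.0, None)  # clip tiny negative eigenvalues caused by round-off
    return (V * np.sqrt(w)) @ V.T


def bures_wasserstein_sq(a, A, b, B):
    """Squared 2-Wasserstein distance between N(a, A) and N(b, B), unregularized."""
    A_half = psd_sqrt(A)
    cross = psd_sqrt(A_half @ B @ A_half)
    return float(
        np.sum((a - b) ** 2) + np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    )


if __name__ == "__main__":
    a, b = np.zeros(2), np.array([3.0, 4.0])
    # Identical covariances: only the squared distance between the means remains.
    print(bures_wasserstein_sq(a, np.eye(2), b, np.eye(2)))
```

The value itself remains well defined when a covariance is degenerate (for instance A = 0); it is the differentiability of this expression that fails there, which is the contrast the abstract draws with the entropic closed form.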
