Entropic Approximation of Wasserstein Gradient Flows

This article details a novel numerical scheme to approximate gradient flows for optimal transport (i.e. Wasserstein) metrics. These flows have proved useful for tackling, both theoretically and numerically, non-linear diffusion equations that model, for instance, porous media or crowd evolutions. They define a suitable notion of weak solutions for these evolutions, and they can be approximated in a stable way by discrete flows, which are implicit Euler time steppings with respect to the Wasserstein metric. A bottleneck of these approaches is the high computational load of each step, which requires solving a convex optimization problem involving a Wasserstein distance to the previous iterate. Following several recent works on the approximation of Wasserstein distances, we consider a discrete flow induced by an entropic regularization of the transportation coupling. This entropic regularization allows one to trade the initial Wasserstein fidelity term for a Kullback-Leibler (KL) divergence, which is easier to deal with numerically. We show how KL proximal schemes, and in particular Dykstra's algorithm, can be used to compute each step of the regularized flow. The resulting algorithm is fast, parallelizable and versatile, because it only requires multiplications by a Gibbs kernel. On Euclidean domains discretized on a uniform grid, this multiplication is a linear filtering (for instance a Gaussian filtering when the ground cost $c$ is the squared Euclidean distance), which can be computed in nearly linear time. On more general domains, such as (possibly non-convex) shapes or manifolds discretized by a triangular mesh, this Gibbs kernel multiplication is approximated by a short-time heat diffusion, following a recently proposed numerical scheme for optimal transport.
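To make the structure of one regularized step concrete, the following is a minimal sketch in Python with NumPy, not the article's implementation. It rests on several assumptions that do not come from the text above: the domain is a 1D uniform grid with a dense Gibbs kernel $K = e^{-c/\varepsilon}$, the energy is $f(p) = \sum_i p_i(\log p_i - 1) + \sum_i w_i p_i$ (entropy plus a potential $w$, a Fokker-Planck-type flow), and the step is computed with alternating diagonal scalings, a simplified form that may differ from the Dykstra iterations the article refers to. For this $f$ the KL proximal map has the closed form $\mathrm{prox}^{\mathrm{KL}}_{\sigma f}(s) = (s\, e^{-\sigma w})^{1/(1+\sigma)}$ with $\sigma = \tau/\varepsilon$. The names `gibbs_kernel` and `entropic_jko_step` are hypothetical.

```python
import numpy as np

def gibbs_kernel(x, eps):
    """Dense Gibbs kernel exp(-c/eps) for the squared Euclidean cost on grid x."""
    cost = (x[:, None] - x[None, :]) ** 2
    return np.exp(-cost / eps)

def entropic_jko_step(q, K, w, tau, eps, n_iter=500):
    """One entropy-regularized implicit Euler step starting from density q.

    Assumed energy (illustrative choice): f(p) = sum p (log p - 1) + <w, p>.
    The coupling is kept in the factored form diag(a) K diag(b), so each
    iteration only needs multiplications by K and its transpose.
    """
    sigma = tau / eps
    a = q / (K @ np.ones_like(q))
    for _ in range(n_iter):
        s = K.T @ a
        # KL proximal map of sigma * f at s, divided by s (closed form for this f).
        b = (s * np.exp(-sigma * w)) ** (1.0 / (1.0 + sigma)) / s
        # Enforce the fixed first marginal q of the coupling.
        a = q / (K @ b)
    # Second marginal of the coupling: the next iterate of the flow.
    return b * (K.T @ a)

# Illustrative run: a bump at 0.3 drifting toward the potential minimum at 0.7.
x = np.linspace(0.0, 1.0, 100)
q0 = np.exp(-((x - 0.3) ** 2) / (2 * 0.05 ** 2))
q0 /= q0.sum()
w = 5.0 * (x - 0.7) ** 2
K = gibbs_kernel(x, eps=1e-2)
p1 = entropic_jko_step(q0, K, w, tau=1e-2, eps=1e-2)
```

On a uniform grid the dense products `K @ b` and `K.T @ a` could be replaced by a Gaussian filter, which is where the nearly linear cost per iteration mentioned above would come from; the dense matrix is used here only to keep the sketch short.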
