From Optimal Transport to Discrepancy

A common way to quantify the "distance" between measures is via their discrepancy, also known as maximum mean discrepancy (MMD). Discrepancies are related to Sinkhorn divergences $S_\varepsilon$ with appropriate cost functions as $\varepsilon \to \infty$. In the opposite direction, as $\varepsilon \to 0$, Sinkhorn divergences approach another important distance between measures, namely the Wasserstein distance or, more generally, the optimal transport "distance". In this chapter, we investigate the limiting process for arbitrary measures on compact sets and Lipschitz continuous cost functions. In particular, we are interested in the behavior of the corresponding optimal potentials $\hat \varphi_\varepsilon$, $\hat \psi_\varepsilon$ and $\hat \varphi_K$ appearing in the dual formulations of the Sinkhorn divergences and discrepancies, respectively. While some of the results are known, we provide rigorous proofs for several relations that we have not found in this generality in the literature. Finally, we illustrate the limiting process with numerical examples and show how the distances behave when used to approximate measures by point measures, a process called dithering.
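The two limits can be checked numerically on a small discrete example. The following is only a minimal sketch under stated assumptions, not the method of this chapter: it assumes the entropic regularization $\varepsilon\,\mathrm{KL}(\pi \,|\, \mu \otimes \nu)$, the cost $c(x,y) = |x-y|$ on the real line, the kernel $K = -c$, and the normalization in which the large-$\varepsilon$ limit is $-\tfrac12 \iint c \, d(\mu-\nu)\, d(\mu-\nu)$. The function names `ot_eps`, `sinkhorn_divergence` and `half_mmd_sq` are hypothetical, and the fixed iteration count is an ad hoc choice.

```python
import numpy as np

def logsumexp(z, axis):
    """Numerically stable log(sum(exp(z))) along one axis."""
    m = np.max(z, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(z - m), axis=axis))

def ot_eps(a, x, b, y, eps, iters=2000):
    """Entropic OT value via log-domain Sinkhorn iterations (assumed setup).

    a, b are weight vectors of discrete measures supported on the 1D
    points x, y; the cost is c(x, y) = |x - y|.
    """
    C = np.abs(x[:, None] - y[None, :])
    log_a, log_b = np.log(a), np.log(b)
    f, g = np.zeros(len(a)), np.zeros(len(b))
    for _ in range(iters):
        # Alternating exact maximization of the dual in f, then in g.
        f = -eps * logsumexp(log_b[None, :] + (g[None, :] - C) / eps, axis=1)
        g = -eps * logsumexp(log_a[:, None] + (f[:, None] - C) / eps, axis=0)
    # After the g-update the dual penalty term vanishes, so the dual
    # value reduces to <f, a> + <g, b>.
    return a @ f + b @ g

def sinkhorn_divergence(a, x, b, y, eps):
    """S_eps = OT_eps(mu, nu) - OT_eps(mu, mu) / 2 - OT_eps(nu, nu) / 2."""
    return (ot_eps(a, x, b, y, eps)
            - 0.5 * ot_eps(a, x, a, x, eps)
            - 0.5 * ot_eps(b, y, b, y, eps))

def half_mmd_sq(a, x, b, y):
    """-1/2 * double integral of c against (mu - nu) x (mu - nu)."""
    cost = lambda u, p, v, q: p @ np.abs(u[:, None] - v[None, :]) @ q
    return cost(x, a, y, b) - 0.5 * cost(x, a, x, a) - 0.5 * cost(y, b, y, b)

# mu is uniform on {0, 1}, nu is uniform on {0.5, 1.5}.
a, x = np.array([0.5, 0.5]), np.array([0.0, 1.0])
b, y = np.array([0.5, 0.5]), np.array([0.5, 1.5])

for eps in (0.1, 1.0, 10.0, 1e4):
    print(f"eps = {eps:g}: S_eps = {sinkhorn_divergence(a, x, b, y, eps):.4f}")
print(f"MMD-type limit: {half_mmd_sq(a, x, b, y):.4f}")
```

For these two measures the Wasserstein-1 distance is 0.5 (a shift by 0.5) and the MMD-type limit is 0.25, so the printed values should move from roughly the former to roughly the latter as $\varepsilon$ grows. For much smaller $\varepsilon$ the plain Sinkhorn iteration converges slowly, and more iterations or a different solver would be needed.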