Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics

This paper provides rates of convergence for empirical (generalised) barycenters on compact geodesic metric spaces under general conditions, using techniques from empirical process theory. Our main assumption, termed a variance inequality, provides a strong connection between the usual assumptions in the field of empirical processes and central concepts of metric geometry. We study the validity of variance inequalities in spaces of non-positive and non-negative Aleksandrov curvature. In the latter case, we show that variance inequalities hold provided geodesics emanating from a barycenter can be extended by a constant factor. We also relate variance inequalities to strong geodesic convexity. While not restricted to this setting, our results are largely discussed in the context of the 2-Wasserstein space.
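
For orientation, the objects named in the abstract can be written out as follows. This is a sketch under assumed notation, not the paper's exact statement: the symbols $(M,d)$, $P$, $X_1,\dots,X_n$, $b^*$, $b_n$ and the constant $K$ are introduced here for illustration, only the squared-distance (Fréchet) case is shown, and the paper's own formulation of the variance inequality may be more general (for instance, it may involve an additional exponent).

```latex
% Assumed setting: (M, d) a metric space, P a probability measure on M,
% X_1, ..., X_n an i.i.d. sample from P.

% Population barycenter and empirical barycenter:
\[
  b^* \in \operatorname*{arg\,min}_{b \in M} \int_M d(b,x)^2 \, \mathrm{d}P(x),
  \qquad
  b_n \in \operatorname*{arg\,min}_{b \in M} \frac{1}{n} \sum_{i=1}^{n} d(b, X_i)^2 .
\]

% A variance inequality with constant K > 0 (illustrative form):
\[
  d(b, b^*)^2 \;\le\; K \int_M \bigl( d(b,x)^2 - d(b^*,x)^2 \bigr) \, \mathrm{d}P(x)
  \qquad \text{for all } b \in M .
\]

% Sanity check in Euclidean space, where b^* is the mean of P:
\[
  \int \lVert b - x \rVert^2 \, \mathrm{d}P(x)
  = \lVert b - b^* \rVert^2 + \int \lVert b^* - x \rVert^2 \, \mathrm{d}P(x),
\]
% so the inequality holds with equality and K = 1.
```

Read this way, a variance inequality states that the excess risk of a candidate point $b$ controls its squared distance to the barycenter, which is the mechanism by which bounds on empirical processes become rates of convergence for $b_n$.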
