Means and medians of sets of persistence diagrams

The persistence diagram is the fundamental object in topological data analysis, and it inherits the stochastic variability of the input data. We therefore need to understand how to perform statistics on the space of persistence diagrams. This paper studies the space of persistence diagrams under a family of metrics analogous to the L^p metrics on spaces of functions. Using these metrics we form cost functions that define different central tendencies and their corresponding measures of variability. This gives natural definitions of both the mean and the median of a finite number of persistence diagrams. We give a characterization of the mean and the median of an odd number of persistence diagrams. Although there are examples in which the mean is neither unique nor continuous, we prove that generically the mean of a set of persistence diagrams with finitely many off-diagonal points is unique. In comparison, the sets of persistence diagrams with finitely many off-diagonal points that do not have a unique median form a set of positive measure.
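To make the idea of a cost function over diagrams concrete, here is a minimal sketch, not the paper's own algorithm. It assumes a common convention that the abstract does not spell out: the distance between two diagrams is the cheapest matching of their points, where any point may instead be matched to the diagonal, with point-to-point costs measured in the L^p norm. The mean is then taken to be a minimizer of the sum of squared distances (p = 2) and the median a minimizer of the sum of distances (p = 1). The names `point_cost`, `diagram_distance`, and `central_cost` are hypothetical, and the brute-force search over matchings is only practical for very small diagrams.

```python
import itertools


def point_cost(a, b, p):
    """p-th power of the L^p distance between two points.

    A point is a (birth, death) pair; None stands for the diagonal
    (an assumption of this sketch, not notation from the paper).
    """
    if a is None and b is None:
        return 0.0
    if a is None or b is None:
        birth, death = a if b is None else b
        # Cost of moving (birth, death) to its midpoint on the diagonal.
        return 2 * (abs(death - birth) / 2) ** p
    return abs(a[0] - b[0]) ** p + abs(a[1] - b[1]) ** p


def diagram_distance(X, Y, p=2):
    """Brute-force matching distance between two small diagrams."""
    # Pad each diagram with diagonal slots so every point can go unmatched.
    A = list(X) + [None] * len(Y)
    B = list(Y) + [None] * len(X)
    best = min(
        sum(point_cost(a, b, p) for a, b in zip(A, perm))
        for perm in itertools.permutations(B)
    )
    return best ** (1 / p)


def central_cost(candidate, diagrams, p=2):
    """Sum of p-th powers of distances from a candidate to each diagram."""
    return sum(diagram_distance(candidate, D, p) ** p for D in diagrams)


X = [(0.0, 2.0)]
Y = [(0.0, 4.0)]
midpoint = [(0.0, 3.0)]

print(central_cost(midpoint, [X, Y], p=2), central_cost(X, [X, Y], p=2))
print(central_cost(midpoint, [X, Y], p=1), central_cost(X, [X, Y], p=1))
```

In this toy case the midpoint diagram has squared-distance cost 2 against 4 for X itself, so it is the better candidate for the mean. Under p = 1 both candidates have cost 2, a small illustration of how a median-type cost can fail to single out one minimizer.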
