Probabilistic Fréchet Means and Statistics on Vineyards

To use persistence diagrams as a true statistical tool, it would be very useful to have a good notion of mean and variance for a set of diagrams. In [20], Mileyko and his collaborators made the first study of the properties of the Fréchet mean in $(D_p, W_p)$, the space of persistence diagrams equipped with the $p$-th Wasserstein metric. In particular, they showed that the Fréchet mean of a finite set of diagrams always exists but is not necessarily unique. As an unfortunate consequence, the means of a continuously varying set of diagrams do not themselves vary continuously, which presents obvious problems when trying to extend the Fréchet mean definition to the realm of vineyards. We fix this problem by altering the original definition of the Fréchet mean so that it becomes a probability measure on the set of persistence diagrams; in a nutshell, the mean of a set of diagrams will be a weighted sum of atomic measures, where each atom is itself the (Fréchet mean) persistence diagram of a perturbation of the input diagrams. We show that this new definition defines a (Hölder) continuous map, for each $k$, from $(D_p)^k \to P(D_p)$, and we present several examples to show how it may become a useful statistic on vineyards.
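
The non-uniqueness mentioned in the abstract can be illustrated with a small, hypothetical example. The sketch below is not the authors' code: it assumes p = 2, a Euclidean ground metric, and a Fréchet function defined as the unnormalized sum of squared distances, and it computes the Wasserstein distance by brute force over matchings, which is only feasible for tiny diagrams. The function names and the four-point configuration are illustrative assumptions. Two diagrams occupy opposite corners of a square far from the diagonal, and two different candidate means achieve the same Fréchet function value.

```python
from itertools import permutations

def diag_proj(pt):
    """Orthogonal projection of a (birth, death) point onto the diagonal."""
    m = (pt[0] + pt[1]) / 2.0
    return (m, m)

def sq_dist(u, v):
    """Squared Euclidean distance between two points in the plane."""
    return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2

def wasserstein2_sq(X, Y):
    """Squared 2-Wasserstein distance between two small diagrams.

    Each diagram is padded with diagonal slots (None) so that any point
    may be matched to the diagonal; all matchings are then enumerated.
    """
    A = list(X) + [None] * len(Y)
    B = list(Y) + [None] * len(X)

    def cost(a, b):
        if a is None and b is None:
            return 0.0
        if a is None:
            return sq_dist(b, diag_proj(b))
        if b is None:
            return sq_dist(a, diag_proj(a))
        return sq_dist(a, b)

    return min(
        sum(cost(a, B[j]) for a, j in zip(A, perm))
        for perm in permutations(range(len(B)))
    )

def frechet_value(M, diagrams):
    """Sum of squared distances from candidate mean M to each diagram."""
    return sum(wasserstein2_sq(M, D) for D in diagrams)

# Two diagrams on opposite corners of a square far from the diagonal.
A = [(0.0, 10.0), (2.0, 12.0)]
B = [(2.0, 10.0), (0.0, 12.0)]

# Midpoints under the two tied optimal matchings between A and B.
M1 = [(1.0, 10.0), (1.0, 12.0)]  # corners paired horizontally
M2 = [(0.0, 11.0), (2.0, 11.0)]  # corners paired vertically

print(frechet_value(M1, [A, B]), frechet_value(M2, [A, B]))
```

Both candidates give the same value, so neither is preferred. Moving one input point slightly toward or away from its horizontal neighbor breaks the tie in favor of one pairing or the other, which is the kind of jump that makes the ordinary Fréchet mean discontinuous in its inputs.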

[1]  J. Munkres  Algorithms for the Assignment and Transportation Problems , 1957 .

[2]  R. Selten Reexamination of the perfectness concept for equilibrium points in extensive games , 1975, Classics in Game Theory.

[3]  James R. Munkres,et al.  Elements of algebraic topology , 1984 .

[4]  Herbert Edelsbrunner,et al.  Topological Persistence and Simplification , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[5]  Valerio Pascucci,et al.  Morse-smale complexes for piecewise linear 3-manifolds , 2003, SCG '03.

[6]  Herbert Edelsbrunner,et al.  Extreme Elevation on a 2-Manifold , 2004, SCG '04.

[7]  Herbert Edelsbrunner,et al.  Interface surfaces for protein-protein complexes , 2004, RECOMB.

[8]  David Cohen-Steiner,et al.  Stability of Persistence Diagrams , 2005, Discret. Comput. Geom..

[9]  David Cohen-Steiner,et al.  Vines and vineyards by updating persistence in linear time , 2006, SCG '06.

[10]  Vin de Silva,et al.  On the Local Behavior of Spaces of Natural Images , 2007, International Journal of Computer Vision.

[11]  Herbert Edelsbrunner,et al.  Protein-protein interfaces: properties, preferences, and projections. , 2007, Journal of proteome research.

[12]  M. Kenward,et al.  An Introduction to the Bootstrap , 2007 .

[13]  Earl F. Glynn,et al.  Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock , 2008, PLoS ONE.

[14]  C. Villani Optimal Transport: Old and New , 2008 .

[15]  H. Edelsbrunner,et al.  Homological illusions of persistence and stability , 2008 .

[16]  Kenneth A. Brown,et al.  Nonlinear Statistics of Human Speech Data , 2009, Int. J. Bifurc. Chaos.

[17]  Leonidas J. Guibas,et al.  Proximity of persistence modules and their diagrams , 2009, SCG '09.

[18]  Jennifer Gamble,et al.  Exploring uses of persistent homology for statistical analysis of landmark-based shape data , 2010, J. Multivar. Anal..

[19]  S. Mukherjee,et al.  Probability measures on the space of persistence diagrams , 2011 .

[20]  Paul R Zurek,et al.  GiA Roots: software for the high throughput analysis of plant root system architecture , 2012, BMC Plant Biology.

[21]  Yuri Dabaghian,et al.  A Topological Paradigm for Hippocampal Spatial Map Formation Using Persistent Homology , 2012, PLoS Comput. Biol..

[22]  Peter Bubenik,et al.  Statistical topology using persistence landscapes , 2012, ArXiv.

[23]  Andrew J. Blumberg,et al.  Persistent homology for metric measure spaces, and robust statistics for hypothesis testing and confidence intervals , 2012, ArXiv.

[24]  Frédéric Chazal,et al.  Optimal rates of convergence for persistence diagrams in Topological Data Analysis , 2013, ArXiv.

[25]  Sayan Mukherjee,et al.  Fréchet Means for Distributions of Persistence Diagrams , 2012, Discrete & Computational Geometry.

[26]  Jose A. Perea,et al.  SW1PerS: Sliding windows and 1-persistence scoring; discovering periodicity in gene expression time series data , 2015, BMC Bioinformatics.

[27]  Jose A. Perea,et al.  Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis , 2013, Found. Comput. Math..