Same But Different: Distance Correlations Between Topological Summaries

Persistent homology allows us to create topological summaries of complex data. To analyse these statistically, we need to choose a topological summary and a metric space in which that summary lives. Although different summaries may contain the same information (as they come from the same persistence module), they can lead to different statistical conclusions because they lie in different metric spaces. The best choice of metric will often be application-specific. In this paper we discuss distance correlation, a non-parametric tool for comparing data sets that can lie in completely different metric spaces. In particular, we calculate the distance correlation between different choices of topological summary. We compare several topological summaries for a variety of random models of the underlying data via the distance correlation between the samples. We also give examples of computing the distance correlation between topological summaries and other scalar measures of interest, such as a paired random variable or a parameter of the random model used to generate the underlying data. This article is expository in style and includes the definitions of standard statistical quantities so as to be accessible to non-statisticians.
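
As a minimal sketch of the quantity discussed above, the sample distance correlation can be computed from two pairwise distance matrices alone, which is why the paired samples may lie in different metric spaces. The code below follows the standard double-centring construction of the sample distance covariance; the function names `double_center` and `distance_correlation` are our own, and the clamp at zero is an assumption made for illustration, since for metric spaces that are not of strong negative type the squared sample distance covariance is not guaranteed to be non-negative.

```python
import numpy as np


def double_center(d):
    """Subtract row and column means of a distance matrix and add back the grand mean."""
    d = np.asarray(d, dtype=float)
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def distance_correlation(dx, dy):
    """Sample distance correlation from two n-by-n pairwise distance matrices.

    dx[i, j] is the distance between samples i and j in the first metric
    space, and dy[i, j] is the distance between the paired samples in the
    second metric space.
    """
    a = double_center(dx)
    b = double_center(dy)
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar_x = (a * a).mean()   # squared sample distance variance of the first sample
    dvar_y = (b * b).mean()   # squared sample distance variance of the second sample
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0            # a constant sample carries no dependence information
    # Assumption: clamp small or negative values at zero before the square root.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Illustration with scalar data: y is an exact linear function of x.
x = np.array([0.0, 1.0, 3.0, 6.0, 10.0])
y = 2.0 * x + 1.0
dx = np.abs(x[:, None] - x[None, :])  # pairwise distances on the real line
dy = np.abs(y[:, None] - y[None, :])
dcor_linear = distance_correlation(dx, dy)
```

For topological summaries, `dx` and `dy` would instead hold pairwise distances between the summaries themselves, for example distances between persistence diagrams in one matrix and distances between persistence landscapes in the other; computing those distances is outside the scope of this sketch.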
