The ℓ∞-Cophenetic Metric for Phylogenetic Trees as an Interleaving Distance

Comparing phylogenetic trees is a fundamental task in computational biology, and many metrics are available for it. In this paper, we focus on one such metric, the ℓ∞-cophenetic metric introduced by Cardona et al. This metric represents a phylogenetic tree with n labeled leaves as a point in \(\mathbb{R}^{n(n+1)/2}\), known as the cophenetic vector, and then compares the points obtained from two trees using the ℓ∞ distance. Meanwhile, the interleaving distance is a formal categorical construction generalized from the definition of Chazal et al., which was originally introduced to compare persistence modules arising in topological data analysis. We show that the ℓ∞-cophenetic metric is an example of an interleaving distance. To do this, we define phylogenetic trees as a category of merge trees with additional structure: labelings on the leaves, plus a requirement that morphisms respect these labels. We then use the definition of a flow on this category to obtain an interleaving distance. Finally, we show that, because of this additional structure, the map sending a labeled merge tree to its cophenetic vector is an isometric embedding, which proves that the ℓ∞-cophenetic metric is an interleaving distance.
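
To make the cophenetic vector concrete, the following is a minimal Python sketch, not the construction used in the paper. It assumes a rooted tree given by hypothetical `parent` and `weight` dictionaries, and it takes the entry for a pair of leaves to be the depth of their lowest common ancestor (with the diagonal entry being the depth of the leaf itself), following the usual description of Cardona et al.'s cophenetic values. The labeled merge tree setting may use a different height convention; the function names and the two example trees are illustrative assumptions.

```python
from itertools import combinations_with_replacement


def depths(parent, weight):
    """Depth of every node: sum of edge weights on the path from the root."""
    memo = {}

    def depth(v):
        if v not in memo:
            p = parent[v]
            memo[v] = 0.0 if p is None else depth(p) + weight[v]
        return memo[v]

    return {v: depth(v) for v in parent}


def ancestors(parent, v):
    """Nodes on the path from v up to the root, starting with v itself."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path


def cophenetic_vector(parent, weight, leaves):
    """One entry per unordered pair of leaves (including i == j): n(n+1)/2 in total."""
    d = depths(parent, weight)
    vec = []
    for a, b in combinations_with_replacement(leaves, 2):
        anc_a = set(ancestors(parent, a))
        # First ancestor of b that is also an ancestor of a is the lowest common one.
        lca = next(v for v in ancestors(parent, b) if v in anc_a)
        vec.append(d[lca])
    return vec


def linf_distance(u, v):
    """l-infinity distance between two vectors of equal length."""
    return max(abs(x - y) for x, y in zip(u, v))


# Hypothetical example: two trees on the same leaf labels A, B, C.
leaves = ["A", "B", "C"]

# Tree 1 groups A and B under the internal node x.
parent1 = {"r": None, "x": "r", "A": "x", "B": "x", "C": "r"}
weight1 = {"x": 1.0, "A": 1.0, "B": 1.0, "C": 2.0}

# Tree 2 groups B and C under the internal node y.
parent2 = {"r": None, "y": "r", "B": "y", "C": "y", "A": "r"}
weight2 = {"y": 1.0, "B": 1.0, "C": 1.0, "A": 2.0}

v1 = cophenetic_vector(parent1, weight1, leaves)
v2 = cophenetic_vector(parent2, weight2, leaves)
distance = linf_distance(v1, v2)
```

With n = 3 leaves, each vector has 3·4/2 = 6 entries, ordered as (A,A), (A,B), (A,C), (B,B), (B,C), (C,C). The two trees differ only in which pair of leaves shares the internal node, so the vectors differ in the (A,B) and (B,C) entries.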
