Computing Distances between Evolutionary Trees

Comparing objects to find their similarities or, equivalently, their dissimilarities is a fundamental problem in many fields, including pattern recognition, image analysis, drug design, the study of thermodynamic costs of computing, and cognitive science. Various models have been introduced in the literature to measure the degree of similarity or dissimilarity; in the latter case, the degree of dissimilarity is often referred to as the distance. Some distances are straightforward to compute, e.g. the Hamming distance for binary strings or the Euclidean distance for geometric objects. Others are formulated as combinatorial optimization problems and thus pose nontrivial algorithmic challenges; some are even uncomputable, such as the universal information distance between two objects [4].

[1] K. Wagner. Bemerkungen zum Vierfarbenproblem, 1936.

[2] A. Edwards, et al. The reconstruction of evolution, 1963.

[3] W. Fitch, et al. Construction of phylogenetic trees, 1967, Science.

[4] W. Fitch. Toward Defining the Course of Evolution: Minimum Change for a Specific Tree Topology, 1971.

[5] D. Robinson. Comparison of labeled trees with valency three, 1971.

[6] G. Moore, et al. An iterative approach from the standpoint of the additive hypothesis to the dendrogram problem posed by molecular data sets, 1973, Journal of theoretical biology.

[7] A. K. Dewdney, et al. Wagner's theorem for Torus graphs, 1973, Discret. Math.

[8] W. J. Quesne. The Uniquely Evolved Character Concept and its Cladistic Application, 1974.

[9] D. Sankoff. Minimal Mutation Trees of Sequences, 1975.

[10] Temple F. Smith, et al. On the similarity of dendrograms, 1978, Journal of theoretical biology.

[11] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[12] M. Garey, D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, 1979.

[13] Kuo-Chung Tai, et al. The Tree-to-Tree Correction Problem, 1979, JACM.

[14] Derick Wood, et al. A Note on Some Tree Similarity Measures, 1982, Inf. Process. Lett.

[15] Ralph P. Boland, et al. Approximating minimum-length-sequence metrics: a cautionary note, 1983.

[16] J. P. Jarvis, et al. Counterexamples in measuring the distance between binary trees, 1983.

[17] J. P. Jarvis, et al. Comments on computing the similarity of binary trees, 1983.

[18] W. H. Day. Properties of the nearest neighbor interchange metric for trees of small size, 1983.

[19] M. Krivánek. Computing the nearest neighbor interchange metric for unlabeled binary trees is NP-complete, 1986.

[20] Jean Marcel Pallo, et al. On the Rotation Distance in the Lattice of Binary Trees, 1987, Inf. Process. Lett.

[21] J. Hartigan, et al. Statistical Analysis of Hominoid Molecular Evolution, 1987.

[22] N. Saitou, et al. The neighbor-joining method: a new method for reconstructing phylogenetic trees, 1987, Molecular biology and evolution.

[23] R. Tarjan, et al. Rotation distance, triangulations, and hyperbolic geometry, 1986, STOC '86.

[24] M. A. Armstrong. Groups and symmetry, 1988.

[25] Kaizhong Zhang, et al. Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems, 1989, SIAM J. Comput.

[26] J. Hein. Reconstructing evolution of sequences subject to recombination using parsimony, 1990, Mathematical biosciences.

[27] Robert E. Tarjan, et al. Short Encodings of Evolving Structures, 1992, SIAM J. Discret. Math.

[28] D. Aldous. Triangulating the Circle, at Random, 1994.

[29] John D. Kececioglu, et al. Reconstructing a history of recombinations from a set of sequences, 1994, SODA '94.

[30] Graham A. Stephen. String Searching Algorithms, 1994, Lecture Notes Series on Computing.

[31] J. Felsenstein, et al. A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates, 1994, Molecular biology and evolution.

[32] Leonidas J. Guibas, et al. Morphing Simple Polygons, 1994, SCG '94.

[33] Subhash Suri, et al. Morphing binary trees, 1995, SODA '95.

[34] Michael S. Waterman, et al. Introduction to Computational Biology: Maps, Sequences and Genomes, 1998.

[35] Michael S. Waterman, et al. Introduction to computational biology, 1995.

[36] J. Tromp, et al. On the nearest neighbour interchange distance between evolutionary trees, 1996, Journal of theoretical biology.

[37] J. Collado-Vides. Integrative Approaches to Molecular Biology, 1996.

[38] Kaizhong Zhang, et al. On the Editing Distance Between Undirected Acyclic Graphs, 1996, Int. J. Found. Comput. Sci.

[39] Tao Jiang, et al. On the Complexity of Comparing Evolutionary Trees, 1996, Discret. Appl. Math.

[40] João Meidanis, et al. Introduction to computational molecular biology, 1997.

[41] B. Dasgupta, et al. On distances between phylogenetic trees, 1997, SODA '97.

[42] Péter Gács, et al. Information Distance, 1998, IEEE Trans. Inf. Theory.

[43] Ming Li, et al. Better Approximation of Diagonal-Flip Transformation and Rotation Transformation, 1998, COCOON.

[44] Xin He, et al. On the Linear-Cost Subtree-Transfer Distance between Phylogenetic Trees, 1999, Algorithmica.

[45] Marc Noy, et al. Flipping Edges in Triangulations, 1999, Discret. Comput. Geom.

[46] Xin He, et al. On computing the nearest neighbor interchange distance, 1999, Discrete Mathematical Problems with Medical Applications.

[47] Shane S. Sturrock, et al. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. David Sankoff and Joseph Kruskal. ISBN 1-57586-217-4. Price £13.95 (US$22·95), 2000.