Comparison of phylogenetic trees through alignment of embedded evolutionary distances

Background: The understanding of evolutionary relationships is a fundamental aspect of modern biology, with the phylogenetic tree being a primary tool for describing these associations. However, comparing trees to assess their similarity and to quantify various biological processes remains a significant challenge.

Results: We describe a novel approach for the comparison of phylogenetic distance information based on the alignment of representative high-dimensional embeddings (xCEED: Comparison of Embedded Evolutionary Distances). The xCEED methodology, which utilizes multidimensional scaling and Procrustes-related superimposition approaches, provides the ability to measure the global similarity between trees as well as incongruities between them. We apply this approach to the prediction of coevolving protein interactions and show its improved performance over the mirrortree, tol-mirrortree, phylogenetic vector projection, and partial correlation approaches. Furthermore, we show its applicability to the detection of horizontal gene transfer events and its potential use in predicting interaction specificity between a pair of multigene families.

Conclusions: These approaches provide additional tools for the study of phylogenetic trees and associated evolutionary processes. Source code is available at http://gomezlab.bme.unc.edu/tools.
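
The general idea of embedding two distance matrices and superimposing the embeddings can be illustrated with a minimal sketch. This is not the xCEED source code: it assumes classical (Torgerson) multidimensional scaling and an ordinary least-squares Procrustes fit, whereas the abstract only states that multidimensional scaling and Procrustes-related superimposition are used. The function names `classical_mds` and `procrustes_residuals` are hypothetical, and treating per-taxon residuals as an indicator of incongruity is an assumption made for illustration.

```python
import numpy as np


def classical_mds(dist, n_dims):
    """Embed a symmetric distance matrix into n_dims coordinates (classical MDS)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the n_dims largest eigenvalues; clip small negatives caused by
    # non-Euclidean distances or round-off.
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def procrustes_residuals(x, y):
    """Superimpose y onto x and return one residual per point (taxon).

    Both configurations are centered and scaled to unit norm, then y is
    rotated/reflected and rescaled to minimize the squared distance to x.
    Rows of x and y must refer to the same taxa in the same order.
    """
    x0 = x - x.mean(axis=0)
    y0 = y - y.mean(axis=0)
    x0 = x0 / np.linalg.norm(x0)
    y0 = y0 / np.linalg.norm(y0)
    # Orthogonal Procrustes solution via SVD of the cross-covariance.
    u, s, vt = np.linalg.svd(y0.T @ x0)
    rotation = u @ vt
    scale = s.sum()
    y_fit = scale * (y0 @ rotation)
    return np.linalg.norm(x0 - y_fit, axis=1)
```

In this sketch, two distance matrices over the same taxa would each be passed through `classical_mds`, and the resulting coordinate sets compared with `procrustes_residuals`. A small total residual suggests globally similar distance structure, while large individual residuals point to taxa that fit poorly. A least-squares fit tends to spread a localized discrepancy across all points, so the largest residual does not necessarily identify the discordant taxon.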
