Testing phylogenetic methods to identify horizontal gene transfer.

This chapter describes a methodology for assessing the power of phylogenetic methods for detecting horizontal gene transfer (HGT). Detection power is defined within the framework of hypothesis testing. False-positive and false-negative rates can be estimated by applying an HGT detection method to HGT-free orthologous sets and to the same sets after HGT events have been simulated in silico. The process has three steps: obtaining HGT-free orthologous sets, simulating HGT events in those sets in silico, and submitting both sets for evaluation by each of the tested methods.

Phylogenetic methods of HGT detection can be roughly divided into three types: likelihood-based tests of topologies (the Kishino-Hasegawa (KH), Shimodaira-Hasegawa (SH), and Approximately Unbiased (AU) tests), tree distance methods (the symmetric difference of Robinson and Foulds (RF) and Subtree Pruning and Regrafting (SPR) distances), and genome spectral approaches (bipartition and quartet decomposition analysis). The chapter discusses the restrictions inherent to phylogenetic HGT detection in general and the power and precision of each method, and it provides comparative analyses of the different approaches. Examples of assessing the power of phylogenetic HGT detection methods are drawn from case studies of orthologous sets from gamma-proteobacteria (Poptsova and Gogarten, BMC Evol Biol 7, 45, 2007) and cyanobacteria (Zhaxybayeva et al., Genome Res 16, 1099-108, 2006).
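The evaluation scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's implementation: trees are given as nested tuples of taxon labels, the function names (`bipartitions`, `rf_distance`, `error_rates`) are hypothetical, and the example flags are made-up values used only to show the arithmetic. The RF distance is computed as the number of non-trivial bipartitions present in one tree but not in the other.

```python
def clades(tree):
    """Return (leaf set of this subtree, leaf sets of all multi-leaf subtrees below it)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
        if len(child_leaves) > 1:
            found.append(child_leaves)
    return leaves, found


def bipartitions(tree):
    """Non-trivial bipartitions of the tree, viewed as unrooted.

    Each split is stored as the side that contains the smallest taxon label,
    so the same split always gets the same representation.
    """
    all_leaves, found = clades(tree)
    reference = min(all_leaves)
    splits = set()
    for side in found:
        other = all_leaves - side
        if len(side) > 1 and len(other) > 1:
            splits.add(side if reference in side else other)
    return splits


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds symmetric difference between two trees on the same taxa."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


def error_rates(flags_hgt_free, flags_simulated):
    """Estimate error rates from per-gene-family detection flags.

    A flag is True when the tested method reported HGT. Reports on the
    HGT-free sets are false positives; misses on the sets with simulated
    HGT are false negatives.
    """
    fp = sum(flags_hgt_free) / len(flags_hgt_free)
    fn = sum(not f for f in flags_simulated) / len(flags_simulated)
    return {"false_positive_rate": fp, "false_negative_rate": fn, "power": 1 - fn}


# Hypothetical five-taxon example.
reference_tree = (("A", "B"), ("C", "D"), "E")
conflicting_tree = (("A", "C"), ("B", "D"), "E")
print(rf_distance(reference_tree, conflicting_tree))

# Made-up detection flags for four gene families in each set.
print(error_rates([False, False, True, False], [True, True, False, True]))
```

In practice a detection decision would rest on a statistical test or a support threshold rather than on a raw nonzero distance; the sketch only shows how the two kinds of test sets map onto the two error rates.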
