An Enhanced Algorithm for Reconstructing a Phylogenetic Tree Based on the Tree Rearrangement and Maximum Likelihood Method

Phylogeny reconstruction is a fundamental problem in computational molecular biology and biochemical physics. As the number of data sets has grown substantially in recent years, the accuracy and speed of phylogeny construction have become increasingly critical. Numerous studies have shown that the maximum likelihood (ML) method is the most effective approach for reconstructing a phylogenetic tree from sequence data. Tree bisection and reconnection (TBR), in turn, is a tree topology rearrangement method that can generate an extensive tree space. In this paper, we propose an enhanced method for reconstructing phylogenetic trees in which the TBR operation is modified and combined with the minimum evolution principle to filter out unnecessary reconnection positions, thereby reducing the search time. The experimental results demonstrate that the proposed method can assist other algorithms in constructing more accurate trees within a reasonable time.
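The filtering idea can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give; it only assumes that a TBR bisection yields two subtrees, that every pair of edges (one from each subtree) is a candidate reconnection position, and that candidates with a smaller estimated total tree length are preferred under the minimum evolution principle. The function name `filter_reconnections`, the `estimated_length` callback, the `keep` parameter, and the toy lengths are all hypothetical.

```python
from itertools import product
from typing import Callable, Hashable, List, Sequence, Tuple

Edge = Hashable  # hypothetical edge identifier, e.g. a label or a (node, node) tuple


def filter_reconnections(
    edges_a: Sequence[Edge],
    edges_b: Sequence[Edge],
    estimated_length: Callable[[Edge, Edge], float],
    keep: int,
) -> List[Tuple[Edge, Edge]]:
    """Return the `keep` reconnection pairs with the smallest estimated tree length."""
    # After bisection, the two subtrees can be rejoined by a new edge between
    # any edge of one subtree and any edge of the other.
    candidates = list(product(edges_a, edges_b))
    # Minimum evolution prefers the topology with the smallest total branch
    # length, so rank candidates by that estimate (ascending).
    candidates.sort(key=lambda pair: estimated_length(*pair))
    # Only the survivors would be passed on to the costly ML evaluation.
    return candidates[:keep]


# Toy data: made-up length estimates for each reconnection pair.
toy_lengths = {
    ("a1", "b1"): 0.92, ("a1", "b2"): 0.75,
    ("a2", "b1"): 0.81, ("a2", "b2"): 1.10,
    ("a3", "b1"): 0.78, ("a3", "b2"): 0.99,
}

survivors = filter_reconnections(
    ["a1", "a2", "a3"],
    ["b1", "b2"],
    lambda ea, eb: toy_lengths[(ea, eb)],
    keep=2,
)
print(survivors)
```

In this toy case, six candidate positions are reduced to two before any likelihood computation. How the length estimate is obtained and how many candidates are kept are design choices left open here.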
