Parallel Inference of a 10.000-Taxon Phylogeny with Maximum Likelihood

Inference of large phylogenetic trees with statistical methods is computationally intensive. We recently introduced simple heuristics that yield accurate trees for synthetic as well as real data and are implemented in a sequential program called RAxML. We have demonstrated that RAxML outperforms the currently fastest statistical phylogeny programs (MrBayes, PHYML) in terms of speed and likelihood values on real data. In this paper we present a non-deterministic parallel implementation of our algorithm which, in some cases, yields super-linear speedups for an analysis of 1,000 organisms on a Linux cluster. In addition, we use RAxML to infer a 10,000-taxon phylogenetic tree containing representative organisms from the three domains: Eukarya, Bacteria, and Archaea. Finally, we compare the sequential speed and accuracy of RAxML and PHYML on 8 synthetic alignments comprising 4,000 sequences.
