RAxML-OMP: An Efficient Program for Phylogenetic Inference on SMPs

Inference of phylogenetic trees comprising hundreds or even thousands of organisms with the Maximum Likelihood (ML) method is extremely compute-intensive. To accelerate these computations we implemented RAxML-OMP, an efficient OpenMP parallelization for Symmetric Multi-Processing machines (SMPs) based on the sequential program RAxML-V (Randomized Axelerated Maximum Likelihood). RAxML-V infers evolutionary trees with the ML method and incorporates several advanced search algorithms, such as fast hill-climbing and simulated annealing. We assess the performance of RAxML-OMP on the widely used Intel Xeon, Intel Itanium, and AMD Opteron architectures. RAxML-OMP scales particularly well on the AMD Opteron architecture and even achieves super-linear speedups for large datasets (alignment length ≥ 5,000 base pairs) due to improved cache efficiency and data locality. RAxML-OMP is freely available as open source code.
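As a rough illustration of what an OpenMP parallelization of a likelihood computation can look like, the following minimal sketch sums per-column log-likelihoods with a parallel reduction. It is not taken from the RAxML-OMP source: the function name `total_log_likelihood`, the array `site_lnl`, and the assumption that work is split across alignment columns are all hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not part of RAxML-OMP.
 * Assumes site_lnl[i] already holds the log-likelihood of alignment
 * column i, and returns the sum over all n_sites columns. */
double total_log_likelihood(const double *site_lnl, size_t n_sites)
{
    double sum = 0.0;
    long i;

    /* Static scheduling gives each thread one contiguous chunk of columns;
     * the reduction clause combines the per-thread partial sums.
     * If the compiler is run without OpenMP support, the pragma is ignored
     * and the loop simply runs sequentially. */
    #pragma omp parallel for reduction(+:sum) schedule(static)
    for (i = 0; i < (long)n_sites; i++) {
        sum += site_lnl[i];
    }
    return sum;
}
```

With contiguous chunks, each thread touches only its own slice of the column data, so the per-thread working set shrinks as threads are added. That is one plausible way for cache efficiency to improve on long alignments, in line with the super-linear speedups reported above.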
