Parsimony accelerated Maximum Likelihood searches

Phylogenetic search is a key tool in many areas of biological research. The search problem is known to be computationally difficult because the space of candidate trees is astronomically large, which makes heuristic methods necessary. The performance of heuristic methods for finding Maximum Likelihood (ML) trees can be improved by using parsimony as an initial estimator for ML. The time spent on the parsimony search is insignificant compared to the time spent in the ML search, so the total search time decreases. These parsimony-boosted ML searches find topologies whose scores are statistically similar to those from unboosted searches, but in less time.
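To give a sense of how large the search space is, the short sketch below counts the distinct unrooted binary tree topologies on n labelled taxa using the standard double-factorial formula (2n-5)!!. It is only an illustration: the abstract does not say which tree space or counting convention the authors have in mind, and the function name `num_unrooted_topologies` is a hypothetical helper, not part of any phylogenetics package.

```python
def num_unrooted_topologies(n_taxa: int) -> int:
    """Count unrooted binary tree topologies on n_taxa labelled leaves.

    Uses the standard result (2n - 5)!! = 3 * 5 * ... * (2n - 5), valid for n >= 3.
    Assumption: strictly bifurcating, unrooted trees with labelled leaves.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 5 (empty product when n == 3).
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20, 50):
        print(f"{n:>3} taxa: {num_unrooted_topologies(n):.3e} topologies")
```

Ten taxa already give about two million topologies, and the count grows super-exponentially from there, which is why exhaustive ML evaluation is impractical and a cheap starting estimate is attractive.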
