Quartet-based phylogenetic inference: improvements and limits.

We analyze the performance of quartet methods in phylogenetic reconstruction. These methods first compute four-taxon trees (4-trees) and then use a combinatorial algorithm to infer a phylogeny that respects the inferred 4-trees as much as possible. Quartet puzzling (QP) is one of the few methods able to take into account the weights of the 4-trees, which are inferred by maximum likelihood. QP seems to be widely used. We present weight optimization (WO), a new algorithm that is also based on weighted 4-trees. WO is faster and offers better theoretical guarantees than QP. Moreover, computer simulations indicate that the topological accuracy of WO depends less on the shape of the correct tree. However, although WO performs better overall than QP, it is still less efficient than traditional phylogenetic reconstruction approaches based on pairwise evolutionary distances or maximum likelihood. This is likely related to long-branch attraction, a phenomenon to which quartet methods are very sensitive, and to inappropriate use of the initial results (weights) obtained by maximum likelihood for every quartet.
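To make the idea of a phylogeny that "respects the inferred 4-trees as much as possible" concrete, the following is a minimal sketch of one possible objective: the total weight of the 4-trees that a candidate tree agrees with. It is an illustration only, not the WO or QP algorithm. The representation of a tree as one side of each internal split, the function names, and the example weights are all assumptions made here for illustration.

```python
def topo(a, b, c, d):
    """A resolved 4-tree ab|cd, stored as an unordered pair of unordered pairs."""
    return frozenset({frozenset({a, b}), frozenset({c, d})})


def induced_topology(splits, quartet):
    """Return the 4-tree that a tree induces on four taxa, or None if unresolved.

    `splits` holds one side of each internal edge of the tree (hypothetical
    representation). An edge separates the quartet two-against-two exactly
    when its side contains two of the four taxa.
    """
    q = frozenset(quartet)
    for side in splits:
        inside = q & side
        if len(inside) == 2:
            return frozenset({inside, q - inside})
    return None


def quartet_score(splits, weights):
    """Sum the weights of all weighted 4-trees that the tree agrees with."""
    total = 0.0
    for four_tree, weight in weights.items():
        taxa = frozenset().union(*four_tree)
        if induced_topology(splits, taxa) == four_tree:
            total += weight
    return total


# Made-up weights for illustration; in the paper's setting they would be
# derived from maximum likelihood on each quartet.
weights = {
    topo("A", "B", "C", "D"): 0.75,
    topo("A", "C", "B", "D"): 0.25,
    topo("D", "E", "A", "C"): 0.5,
}

# Two candidate five-taxon trees: ((A,B),C,(D,E)) and ((A,C),B,(D,E)).
tree_1 = [frozenset({"A", "B"}), frozenset({"D", "E"})]
tree_2 = [frozenset({"A", "C"}), frozenset({"D", "E"})]

score_1 = quartet_score(tree_1, weights)
score_2 = quartet_score(tree_2, weights)
```

Under these assumed weights, the first tree agrees with more total weight than the second, so a method that maximizes this score would prefer it. How the search over candidate trees is carried out is the part where methods such as QP and WO differ, and it is not shown here.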
