Why Neighbor-Joining Works

Abstract. We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson’s optimal radius bound as a special case and explains many cases where neighbor-joining is successful even when Atteson’s criterion is not satisfied. We also provide a proof for Atteson’s conjecture on the optimal edge radius of the neighbor-joining algorithm. The strong performance guarantees we provide also hold for the quadratic-time fast neighbor-joining algorithm, thus providing a theoretical basis for inferring very large phylogenies with neighbor-joining.
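As a rough illustration of the objects the abstract refers to, the sketch below computes the neighbor-joining selection criterion in its commonly stated form, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and checks the usual statement of Atteson's criterion (the input distances lie within half the shortest edge length of the tree metric, in the max norm). Neither formula is spelled out in the abstract; both are assumed here from the standard literature, and the function names and the four-taxon example are hypothetical.

```python
def q_values(d):
    """Q-criterion for every pair i < j of a symmetric distance matrix d."""
    n = len(d)
    row = [sum(r) for r in d]  # total distance from each taxon to all others
    return {(i, j): (n - 2) * d[i][j] - row[i] - row[j]
            for i in range(n) for j in range(i + 1, n)}


def pick_pair(d):
    """Pair that neighbor-joining would agglomerate first (minimum Q)."""
    q = q_values(d)
    return min(q, key=q.get)


def within_atteson_radius(d, tree_d, shortest_edge):
    """True if max |d - tree_d| is below half the shortest edge length."""
    n = len(d)
    err = max(abs(d[i][j] - tree_d[i][j]) for i in range(n) for j in range(n))
    return err < shortest_edge / 2


# Hypothetical quartet tree 01|23: leaf edges 1, 2, 3, 4 and internal edge 1.
TREE = [[0, 3, 5, 6],
        [3, 0, 6, 7],
        [5, 6, 0, 7],
        [6, 7, 7, 0]]

# Same distances with one entry perturbed by 0.4 (less than 1/2 of the
# shortest edge, which has length 1 in this example).
NOISY = [row[:] for row in TREE]
NOISY[0][2] = NOISY[2][0] = 5.4
```

On both matrices the minimum of Q falls on one of the two true cherries, (0, 1) or (2, 3), which is the behavior Atteson's criterion guarantees for the perturbed input. The sketch only covers the pair-selection step, not the full agglomeration or the quartet analysis developed in the paper.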
