Neighbor-joining revealed.

It is nearly 20 years since the landmark paper of Saitou and Nei (1987) in Molecular Biology and Evolution introduced neighbor-joining (NJ). It has become the most widely used method for building phylogenetic trees from distances, and the original paper has been cited about 13,000 times (Science Citation Index). Yet a precise answer to the question "what does the NJ method seek to do?" has until recently proved elusive, leading to some imprecise claims and misunderstanding. Further mathematical investigation has now provided a rigorous answer, and the purpose of this note is to highlight these results and their significance for interpreting NJ. The story originates in a paper by Pauplin (2000), though its continuation has unfolded in the more mathematically inclined literature. Our aim here is to make these findings more widely accessible.

[1] W. Fitch, et al. Construction of phylogenetic trees, 1967, Science.

[2] N. Saitou, et al. The neighbor-joining method: a new method for reconstructing phylogenetic trees, 1987, Molecular biology and evolution.

[3] J. A. Studier, et al. A note on the neighbor-joining algorithm of Saitou and Nei, 1988, Molecular biology and evolution.

[4] N. Saitou, et al. Relative Efficiencies of the Fitch-Margoliash, Maximum-Parsimony, Maximum-Likelihood, Minimum-Evolution, and Neighbor-joining Methods of Phylogenetic Tree Construction in Obtaining the Correct Tree, 1989.

[5] M. Nei, et al. Theoretical foundation of the minimum-evolution method of phylogenetic inference, 1993, Molecular biology and evolution.

[6] O. Gascuel. A note on Sattath and Tversky's, Saitou and Nei's, and Studier and Keppler's algorithms for inferring phylogenies from evolutionary distances, 1994, Molecular biology and evolution.

[7] J. Felsenstein, et al. A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates, 1994, Molecular biology and evolution.

[8] Olivier Gascuel, et al. Concerning the NJ algorithm and its unweighted version, UNJ, 1996, Mathematical Hierarchies and Biology.

[9] Sudhir Kumar, et al. A stepwise algorithm for finding minimum evolution trees, 1996, Molecular biology and evolution.

[10] Vladimir Makarenkov, et al. Circular orders of tree metrics, and their uses for the reconstruction and fitting of phylogenetic trees, 1996, Mathematical Hierarchies and Biology.

[11] Boris Mirkin, et al. Mathematical Classification and Clustering, 1996.

[12] N. Saitou, et al. Reconstruction of gene trees from sequence data, 1996, Methods in enzymology.

[13] F. McMorris, et al. Mathematical Hierarchies and Biology, 1997.

[14] M. Nei, et al. The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small, 1998, Proceedings of the National Academy of Sciences of the United States of America.

[15] Kevin Atteson, et al. The Performance of Neighbor-Joining Methods of Phylogenetic Reconstruction, 1999, Algorithmica.

[16] O. Gascuel. On the optimization principle in phylogenetic analysis and the minimum-evolution criterion, 2000, Molecular biology and evolution.

[17] Y. Pauplin. Direct Calculation of a Tree Length Using a Distance Matrix, 2000, Journal of Molecular Evolution.

[18] F. Ruddle, et al. An efficient cis-element discovery method using multiple sequence comparisons based on evolutionary relationships, 2001, Genomics.

[19] Olivier Gascuel, et al. Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle, 2002, WABI.

[20] Arndt von Haeseler, et al. Shortest triplet clustering: reconstructing large phylogenies using representative sets, 2005, BMC Bioinformatics.

[21] Charles Semple, et al. Cyclic permutations and evolutionary trees, 2004, Adv. Appl. Math.

[22] O. Gascuel, et al. Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting, 2003, Molecular biology and evolution.

[23] O. Gascuel, et al. The Minimum-Evolution Distance-Based Approach to Phylogeny Inference, 2005.

[24] O. Gascuel. Mathematics of Evolution & Phylogeny, 2005.

[25] David Bryant, et al. On the Uniqueness of the Selection Criterion in Neighbor-Joining, 2005, J. Classif.

[26] M. Kimura. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences, 1980, Journal of Molecular Evolution.

[27] Lior Pachter, et al. Beyond pairwise distances: neighbor-joining with phylogenetic diversity estimates, 2006, Molecular biology and evolution.

[28] Olivier Gascuel, et al. The minimum evolution distance-based approach of phylogenetic inference, 2007, Mathematics of Evolution and Phylogeny.

[29] A. Oskooi. Molecular Evolution and Phylogenetics, 2008.