A note on the neighbor-joining algorithm of Saitou and Nei.

Saitou and Nei (1987) present an algorithm, which they call the neighbor-joining (NJ) method, for estimating an additive tree from a distance matrix D. If D is treelike (i.e., if the distances in D correspond exactly to those in an actual tree), then the NJ method correctly reconstructs the tree from D. If D is not treelike (i.e., contains some noise), then there can be ambiguities in the estimated tree. Saitou and Nei simulate such data and verify that the accuracy of the NJ method is roughly equivalent to that of the Sattath and Tversky (1977) method. The minimum running time of the algorithm as formulated by Saitou and Nei is unclear. We present an alternative formulation that runs in time O(N^3), where N is the number of operational taxonomic units (OTUs). We consider the O(N^3) running time to be useful in studies that involve a large number of OTUs, possibly in connection with reconstruction experiments using simulated or resampled (bootstrap, etc.) data. The proof given by Saitou and Nei that the correct tree is recovered if D is treelike is incorrect. We describe the error and supply a correct proof below.

The modified algorithm is as follows:

A. For each pair i, j of OTUs, compute