Assessing molecular phylogenies

David Hillis et al. attempt (1) to assess the performance of methods of phylogenetic analysis by numerical simulation studies. They compare five methods using a simple criterion: the probability of obtaining the correct evolutionary tree with a given amount of simulated data. In a related paper (2) Hillis and colleagues increased to ten the number of methods compared. This approach is known to be unsatisfactory. It is easy to construct elementary examples in which such a "confidence" method, though giving the correct answer with a specified probability (usually 95%), is wrong in a particular case. Coining some memorable terms for the purpose, Hacking (3) contrasted good "before-trial evaluation" methods that are "usually right" with good "after-trial evaluation" methods that are "credible on each occasion of use." The estimation of evolutionary trees, like most estimation problems in biology, is an after-trial evaluation problem, and the use of before-trial criteria is an inappropriate way of judging the various methods. For more than 70 years the backbone of the appropriate after-trial statistical estimation theory has been known to be Fisher's method of maximum likelihood (4). The sound logical basis of maximum likelihood (5) sets it apart from the other algorithmic methods tested. If the probability model is specified, maximum likelihood would be the preferred method in principle even if it did not lead to the correct answer with the highest probability (though there are theoretical reasons for not being surprised that it scores well on this criterion). It was for these reasons that Cavalli-Sforza and I strove from the beginning to improve on our least-squares distance-matrix approach (6) by applying maximum likelihood to a well-defined evolutionary model for continuous characters (7), and why I suggested using maximum likelihood for the discrete-character case as well (8).
Moreover, when we discussed the justification of our "method of minimum evolution" or "parsimony" (9), we did so not on the grounds which have since been advanced (10), but because we correctly expected it to be a good approximation to the maximum-likelihood solution (7). The message from statistical theory is simple: You cannot simulate evolutionary trees without a probability model, and if you possess a probability model you should be performing efficient after-trial evaluation using the method of maximum likelihood. Simulation is valuable in assessing the robustness of a model, but not the suitability of an estimation procedure; modern statistical theory has solved that problem already.

A. W. F. Edwards
Gonville and Caius College, Cambridge, CB2 1TA, United Kingdom