Comparing Numerical Taxonomic Studies

Rohlf, F. J.¹ (IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598) and R. R. Sokal (Department of Ecology and Evolution, State University of New York, Stony Brook, New York 11794) 1981. Comparing numerical taxonomic studies. Syst. Zool., 30:459-490.

Recent proposals to measure the degree to which given taxometric methods meet goals defined by the three current schools of classification have led to quantitative comparisons of the methods. To aid in understanding such comparisons, a flow chart of taxonomic procedures is presented. Optimality tests are reviewed for each type of procedure. Possibly desirable properties of classifications include: the fit of a summary representation to a similarity matrix, stability, general utility, fit to a known cladistic relationship, and optimality criteria of numerical phylogenetic methods. We examine how they relate to the professed goals of the taxonomic schools and whether they can be used for comparative evaluations between these schools. Previous attempts at comparing numerical classifications are reexamined. Such comparisons have largely been made improperly. Published comparative tests of taxonomic congruence are based on inappropriate comparisons or were improperly executed and cannot furnish evidence on relative stability of phenetic, evolutionary, and phylogenetic classifications. Reports which claim to show that numerical phylogenetic classifications result in better fits to original similarity matrices than phenetic methods and therefore retain distance information better than phenetic classifications are shown to be misleading. In the first such study, the comparison was not relevant to the question asked. In all of these studies the results were biased in favor of phylogenetic methods by retaining redundant information during the computation of matrix correlations for the phylogenetic methods. In two later studies based on ten taxonomic data sets, the comparisons for the phylogenetic methods were in terms of unrooted trees rather than hierarchic classifications. By limiting the reference OTU to OTU 1 in each data set, results were obtained in these studies that tended to favor the phylogenetic methods considerably more than if some other reference OTUs had been employed. Only in a few cases is there a significant increase in fit with the phylogenetic methods. Interpreted as classifications, UPGMA clustering of the original dissimilarity matrix gives the best fit in the majority of cases when compared with rooted trees (minimum length and least squares fitted). For these data, there is no evidence that classifications by any "phylogenetic" technique yield better summaries of phenetic information than UPGMA. A recent study of predictivity, while correctly designed, yielded complex results with no clear preference for any one school of taxonomy. Thus there is no current acceptable evidence that numerical phylogenetic methods yield classifications which contain more information than either phenetic or evolutionary ones. [Numerical taxonomy; classification; phenetics; phylogenetics; cladistics.]

In this paper, we examine the methods proposed for measuring the degree to which various numerical methods meet the differing goals proposed by the various classificatory schools. Are given numerical methods consonant with the stated goals of a given school of taxonomy, and can one develop criteria which measure how well these methods meet such goals? We also discuss the validity of using the same criteria to compare phenetic and cladistic approaches to taxonomy. We first describe the normal flow of procedures in a numerical taxonomic study. Next we enumerate the types of criteria that have been used to evaluate the optimality of such procedures and discuss appropriate ways by which comparisons can be made between alternative taxonomic procedures. We suggest methods for improving the evaluation of numerical techniques and outline some principles for studies comparing numerical phenetic and phylogenetic techniques. Then we examine several pub-

¹ Present address: Department of Ecology and Evolution, State University of New York, Stony Brook, New York 11794.
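As an illustration of the kind of fit measure the abstract refers to, the sketch below clusters a small dissimilarity matrix by UPGMA and then computes the matrix (cophenetic) correlation between the original dissimilarities and those implied by the resulting hierarchy. It is a minimal sketch, not the authors' procedure: the function names `upgma_cophenetic` and `matrix_correlation` and both 4-OTU matrices are hypothetical, and each pair of OTUs is counted once (upper triangle only) as an assumption of this example.

```python
from itertools import combinations
from math import sqrt


def upgma_cophenetic(d):
    """Cluster the symmetric dissimilarity matrix d by UPGMA and return
    the cophenetic matrix (the level at which each pair of OTUs joins)."""
    n = len(d)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member OTUs
    # Current between-cluster average dissimilarities, keyed by id pair.
    dist = {(i, j): d[i][j] for i, j in combinations(range(n), 2)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        (a, b), level = min(dist.items(), key=lambda kv: kv[1])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = level
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        # Unweighted average over all original between-cluster pairs.
        for c, members in clusters.items():
            total = sum(d[i][j] for i in merged for j in members)
            dist[(c, next_id)] = total / (len(merged) * len(members))
        clusters[next_id] = merged
        next_id += 1
    return coph


def matrix_correlation(x, y):
    """Pearson correlation between corresponding off-diagonal entries
    of two symmetric matrices, counting each pair of OTUs once."""
    pairs = list(combinations(range(len(x)), 2))
    xs = [x[i][j] for i, j in pairs]
    ys = [y[i][j] for i, j in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    sxx = sum((p - mx) ** 2 for p in xs)
    syy = sum((q - my) ** 2 for q in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one exactly ultrametric matrix, one perturbed copy.
d_ultra = [[0, 2, 6, 10], [2, 0, 6, 10], [6, 6, 0, 10], [10, 10, 10, 0]]
d_noisy = [[0, 2, 5, 9], [2, 0, 7, 11], [5, 7, 0, 10], [9, 11, 10, 0]]

r_ultra = matrix_correlation(d_ultra, upgma_cophenetic(d_ultra))
r_noisy = matrix_correlation(d_noisy, upgma_cophenetic(d_noisy))
print(r_ultra, r_noisy)
```

An ultrametric matrix is reproduced exactly by the hierarchy, so its correlation is 1 up to rounding; the perturbed matrix gives a value below 1. A comparison between methods would evaluate each method's summary against the same original matrix and over the same set of OTU pairs.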