Clustering Seven Data Sets by Means of Some or All of Seven Clustering Methods.

Seven data sets, one artificially contrived and the others real data from different areas of psychology and sociology, mainly concerned with children, were clustered by seven different algorithms. Most of these methods are rather well known: Holzinger and Harman's (1941) B-coefficient, Overall and Klett's (1972) Linear Typal Analysis, McQuitty and Koch's (1974a, b) elementary linkage analysis, the Numerical Taxonomy System of Rohlf and his colleagues (Rohlf, Kishpaugh, & Kirk, 1971), the hierarchical clustering (Cluster) method of the Statistical Analysis System (SAS Institute, 1982), and Cattell and Coulter's (1966) Taxonome; one, Bolz's (1978) Type Analysis, is not very well known. Insofar as possible, all methods were run by computer on all sets of data, with appropriate adjustments to the respective programs. With the B-coefficient, both hand and machine calculations were carried out. Statistical and logical comparisons were made among the different methods used on the data sets. All methods had their strengths and weaknesses, some being more adequate with some data sets and others with others. Surprisingly, the B-coefficient, at least with smaller sets, compared favorably with other methods, even though it is scarcely known in the modern clustering literature.
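
The abstract does not say which statistic was used to compare the partitions produced by different methods. The reference list includes Rand (1971), so one plausible candidate is the Rand index, the proportion of object pairs on which two partitions agree (both place the pair together, or both place it apart). The following is a minimal sketch under that assumption; the function name `rand_index` and the example label vectors are hypothetical and are not taken from the study.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Return the (unadjusted) Rand index between two partitions.

    Each argument assigns a cluster label to the same objects, in the
    same order. Label values themselves are arbitrary; only the
    together/apart pattern of each pair matters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same objects")

    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("at least two objects are required")

    # A pair counts as an agreement when both partitions put the two
    # objects in the same cluster, or both put them in different clusters.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical cluster assignments of six objects by two methods.
method_1 = [0, 0, 1, 1, 2, 2]
method_2 = [1, 1, 0, 0, 0, 2]
print(rand_index(method_1, method_2))
```

The index ranges from 0 to 1, with 1 indicating identical partitions. It is not corrected for chance agreement, which is the adjustment that later proposals address.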

[1]  L. McQuitty Elementary Linkage Analysis for Isolating Orthogonal and Oblique Types and Typal Relevancies , 1957 .

[2]  Louis L. McQuitty,et al.  Highest Entry Hierarchical Clustering , 1975 .

[3]  Leslie C. Morey,et al.  A Comparison of Four Clustering Methods Using MMPI Monte Carlo Data , 1980 .

[4]  Louis L. McQuitty,et al.  A Method for Hierarchical Clustering of a Matrix of a Thousand By a Thousand , 1975 .

[5]  Roger K. Blashfield,et al.  Mixture model tests of cluster analysis: Accuracy of four agglomerative hierarchical methods. , 1976 .

[6]  Roger N. Shepard,et al.  Additive clustering: Representation of similarities as combinations of discrete overlapping properties. , 1979 .

[7]  Taxonomic Congruence: A Brief Discussion , 1983 .

[8]  R. M. Dreger The Children's Behavioral Classification Project: An interim report , 1977, Journal of abnormal child psychology.

[9]  Raymond B. Cattell,et al.  rp and other coefficients of pattern similarity , 1949, Psychometrika.

[10]  R. M. Dreger Use of Absolute Values in Estimating Reliability From the Inter-Item Correlations , 1973 .

[11]  C. Edelbrock,et al.  A typology of child behavior profile patterns: distribution and correlates for disturbed children aged 6–16 , 1980, Journal of abnormal child psychology.

[12]  R. M. Dreger,et al.  Behavioral Classification Project. , 1964, Journal of consulting psychology.

[13]  William M. Rand,et al.  Objective Criteria for the Evaluation of Clustering Methods , 1971 .

[14]  Ralph Mason Dreger,et al.  Microcomputer Programs for the Rand Index of Cluster Similarity , 1986 .

[15]  S. C. Johnson Hierarchical clustering schemes , 1967, Psychometrika.

[16]  R. M. Dreger The classification of children and their emotional problems , 1981 .

[17]  R. M. Dreger The classification of children and their emotional problems: An overview—II , 1982 .

[18]  Harry H. Harman,et al.  Factor analysis : a synthesis of factorial methods , 1941 .

[19]  J. H. Ward Hierarchical Grouping to Optimize an Objective Function , 1963 .

[20]  Peter H. A. Sneath,et al.  Numerical Taxonomy: The Principles and Practice of Numerical Classification , 1973 .

[21]  Jacob Cohen A Coefficient of Agreement for Nominal Scales , 1960 .

[22]  C. Mallows,et al.  A Method for Comparing Two Hierarchical Clusterings , 1983 .

[23]  Alan Agresti,et al.  The Measurement of Classification Agreement: An Adjustment to the Rand Statistic for Chance Agreement , 1984 .

[24]  An Extension of Intersection Methods From Trees to Dendrograms , 1984 .

[25]  Michael R. Anderberg,et al.  Cluster Analysis for Applications , 1973 .

[26]  R K Blashfield,et al.  The Literature On Cluster Analysis. , 1978, Multivariate behavioral research.

[27]  Jacob Cohen,et al.  Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. , 1968 .

[28]  R K Blashfield,et al.  The Growth Of Cluster Analysis: Tryon, Ward, And Johnson. , 1980, Multivariate behavioral research.

[29]  C. Edelbrock Mixture Model Tests Of Hierarchical Clustering Algorithms: The Problem Of Classifying Everybody. , 1979, Multivariate behavioral research.

[30]  A. D. Gordon,et al.  Classification : Methods for the Exploratory Analysis of Multivariate Data , 1981 .