Clustering Methodologies in Exploratory Data Analysis

Publisher Summary: This chapter reviews cluster analysis, the formal study of classification schemata in which objects are grouped, or clustered, according to measured or perceived intrinsic characteristics, along with related topics. The objective of a cluster analysis is to uncover natural groupings, or types, to prod one's creativity and ingenuity, and to initiate hypotheses about the phenomenon being studied. Cluster analysis has a heuristic nature that encourages the exploration of data. Taxonomists, social scientists, psychologists, biologists, statisticians, mathematicians, engineers, computer scientists, medical researchers, and others who handle real data have all contributed to clustering methodology. The chapter aims to foster cross-disciplinary communication so that one application area can profit from the experiences of others. The literature of cluster analysis straddles all quantitative scientific disciplines and is remarkably varied.

Emphasis is on new developments, especially in the verification and validation of clustering results. The intent is to provide an applications-oriented treatment of cluster analysis in the spirit of exploratory data analysis. The chapter focuses on four operations: representing the data, assessing the tendency of the data to cluster, performing the clustering itself, and evaluating the validity of the results. Data representation includes recognition of data type and scale, measures of proximity and affinity, normalization, various two-dimensional projections of data, methods of visualizing multidimensional data, techniques for creating scales to describe data, and related matters. The chapter also introduces the concept of intrinsic dimensionality, which helps determine an appropriate number of factors for representing data.

The chapter recommends that a serious data analyst be conversant with as broad a range of data analysis techniques and programs as possible and be aware of the assumptions on which the techniques are based.
