Investigation of proportional link linkage clustering methods

Proportional link linkage (PLL) clustering methods are a parametric family of monotone invariant agglomerative hierarchical clustering methods. This family includes the single, minimedian, and complete linkage clustering methods as special cases; its members are used in psychological and ecological applications. Since the literature on clustering space distortion is oriented to quantitative input data, we adapt its basic concepts to input data with only ordinal significance and analyze the space distortion properties of PLL methods. To enable PLL methods to be used when the number n of objects being clustered is large, we describe an efficient PLL algorithm that operates in O(n^2 log n) time and O(n^2) space.
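To make the family concrete, the following is a minimal sketch, assuming the usual reading of proportional link linkage in which two clusters merge at the lowest dissimilarity level at which at least a proportion p of the between-cluster pairs are linked. Under that assumption the merge level is an order statistic of the between-cluster dissimilarities, so only their rank order matters. The names `pll_cluster`, `d`, and `p` are hypothetical, and this naive version is not the efficient algorithm the abstract refers to; it recomputes every cluster pair at each step and is far slower than O(n^2 log n).

```python
import math
from itertools import combinations

def pll_cluster(d, p):
    """Naive proportional link linkage sketch (illustrative, not the paper's algorithm).

    d: symmetric dissimilarity matrix as a list of lists.
    p: proportion in (0, 1] of between-cluster links required for a merge.
    Returns a list of merges as (cluster_a, cluster_b, level).
    """
    clusters = [(i,) for i in range(len(d))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # All between-cluster dissimilarities, in increasing order.
            links = sorted(d[i][j] for i in a for j in b)
            # Assumed rule: the k-th smallest link, k = ceil(p * |a| * |b|), at least 1.
            k = max(1, math.ceil(p * len(links)))
            level = links[k - 1]
            if best is None or level < best[0]:
                best = (level, a, b)  # ties go to the first pair found
        level, a, b = best
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, level))
    return merges

# Hypothetical 4-object example with rank-like dissimilarities.
d = [
    [0, 1, 4, 5],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [5, 6, 3, 0],
]
single = [m[2] for m in pll_cluster(d, 0.01)]   # [1, 2, 3]
median = [m[2] for m in pll_cluster(d, 0.5)]    # [1, 2, 5]
complete = [m[2] for m in pll_cluster(d, 1.0)]  # [1, 3, 6]
```

In this sketch a very small p always selects the smallest between-cluster link (single linkage), p = 1 selects the largest (complete linkage), and p = 0.5 selects a median-like link. The product `p * len(links)` is computed in floating point, so a value of p that is not exactly representable may round k up by one at a boundary; exact fractions would avoid that.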
