Bicriterion methods for partitioning dissimilarity matrices.

Partitioning indices based on within-cluster sums of pairwise dissimilarities often exhibit a systematic bias toward clusters of a particular size, whereas minimizing the partition diameter (i.e., the maximum dissimilarity between any two objects assigned to the same cluster) does not typically suffer from this problem. However, the partition-diameter criterion often admits a myriad of alternative optimal solutions that can differ substantially in their substantive interpretation. We propose a bicriterion partitioning approach that incorporates both the diameter and the within-cluster sums into the optimization problem, thereby facilitating selection from among the alternative optima. We also developed several MATLAB-based exchange algorithms that rapidly provide excellent solutions to bicriterion partitioning problems; these algorithms were evaluated using synthetic data sets as well as an empirical dissimilarity matrix.

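To make the bicriterion idea concrete, the following is a minimal sketch (written in Python for illustration, not the authors' MATLAB implementation) of a single-object relocation/exchange heuristic: candidate moves are accepted only if they improve the pair (partition diameter, within-cluster sum of dissimilarities) in lexicographic order. The function names and parameters are hypothetical.

import numpy as np

def diameter_and_sum(D, labels, k):
    """Partition diameter and total within-cluster sum of pairwise dissimilarities."""
    diam, total = 0.0, 0.0
    for c in range(k):
        idx = np.where(labels == c)[0]
        if idx.size < 2:
            continue
        block = D[np.ix_(idx, idx)]
        diam = max(diam, block.max())
        total += block.sum() / 2.0            # each within-cluster pair counted once
    return diam, total

def bicriterion_exchange(D, k, labels=None, max_passes=100, seed=None):
    """Relocate objects one at a time, keeping moves that improve (diameter, sum)."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    labels = rng.integers(0, k, size=n) if labels is None else labels.copy()
    best = diameter_and_sum(D, labels, k)
    for _ in range(max_passes):
        improved = False
        for i in range(n):
            home = labels[i]
            if np.sum(labels == home) == 1:   # never empty a cluster
                continue
            for c in range(k):
                if c == home:
                    continue
                labels[i] = c
                cand = diameter_and_sum(D, labels, k)
                if cand < best:               # tuple comparison: diameter first, then sum
                    best, home = cand, c
                    improved = True
                else:
                    labels[i] = home
        if not improved:
            break
    return labels, best

As a usage sketch, for a symmetric dissimilarity matrix D (a NumPy array) one might call bicriterion_exchange(D, k=3, seed=0) from several random restarts and retain the partition with the best (diameter, sum) pair; minimizing a weighted combination of the two criteria would be an equally natural variant of the same exchange scheme.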