Improving Dynamic Programming Strategies for Partitioning

Abstract

Improvements to the dynamic programming (DP) strategy for partitioning (nonhierarchical classification) discussed in Hubert, Arabie, and Meulman (2001) are proposed. First, it is shown how the number of evaluations in the DP process can be decreased without affecting generality. Both a completely nonredundant and a quasi-nonredundant method are proposed. Second, an efficient implementation of both approaches is discussed. This implementation is shown to be dramatically faster than the original program. The flexibility of the approach is illustrated by analyzing three data sets.

[1] Pierre Hansen, et al. Cluster analysis and mathematical programming, 1997, Math. Program.

[2] John A. Hartigan, et al. Clustering Algorithms, 1975.

[3] Gary Klein, et al. Optimal clustering: A model and method, 1991.

[4] M. J. Brusco. Seriation of asymmetric matrices using integer linear programming, 2001, The British journal of mathematical and statistical psychology.

[5] M. Brusco. An enhanced branch-and-bound algorithm for a partitioning problem, 2003, The British journal of mathematical and statistical psychology.

[6] Andrew B. Kahng, et al. Multiway partitioning via geometric embeddings, orderings, and dynamic programming, 1995, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.

[7] David J. Hand, et al. A Handbook of Small Data Sets, 1993.

[8] Euromonitor Plc. European Marketing Data and Statistics, 1980.

[9] G. W. Milligan, et al. Clustering Validation: Results and Implications for Applied Analyses, 1996.

[10] Phipps Arabie, et al. Combinatorial Data Analysis: Optimization by Dynamic Programming, 1987.

[11] Keinosuke Fukunaga, et al. A Branch and Bound Clustering Algorithm, 1975, IEEE Transactions on Computers.

[12] Euromonitor Publications Limited. European marketing data and statistics 1979/80, 1979.

[13] André Hardy, et al. An examination of procedures for determining the number of clusters in a data set, 1994.

[14] Pierre Hansen, et al. J-MEANS: a new local search heuristic for minimum sum of squares clustering, 1999, Pattern Recognit.

[15] Fionn Murtagh, et al. Cluster Dissection and Analysis: Theory, Fortran Programs, Examples, 1986.

[16] Walter D. Fisher. On Grouping for Maximum Homogeneity, 1958.

[17] C. Alpert, et al. Splitting an Ordering into a Partition to Minimize Diameter, 1997.

[18] Bjørn Olstad, et al. Efficient Partitioning of Sequences, 1995, IEEE Trans. Computers.

[19] L. Hubert, et al. Measuring the Power of Hierarchical Cluster Analysis, 1975.

[20] L. Hubert, et al. A general statistical framework for assessing categorical clustering in free recall, 1976.

[21] G. De Soete, et al. Clustering and Classification, 2019, Data-Driven Science and Engineering.

[22] Kyungshik Lim, et al. Optimal Partitioning of Heterogeneous Traffic Sources in Mobile Communications Networks, 1997, IEEE Trans. Computers.

[23] Yadolah Dodge, et al. Complexity relaxation of dynamic programming for cluster analysis, 1994.

[24] P. Hansen, et al. Complete-Link Cluster Analysis by Graph Coloring, 1978.

[25] Robert E. Jensen, et al. A Dynamic Programming Algorithm for Cluster Analysis, 1969, Oper. Res.

[26] B. Jaumard, et al. Minimum Sum of Squares Clustering in a Low Dimensional Space, 1996.

[27] D. Hand. Cluster dissection and analysis: Helmuth SPATH Wiley, Chichester, 1985, 226 pages, £25.00, 1986.

[28] Thomas J. Smith. Constructing Ultrametric and Additive Trees Based on the L1 Norm, 2001, J. Classif.

[29] T. C. Hu, et al. Combinatorial algorithms, 1982.

[30] William H. E. Day, et al. Complexity Theory: An Introduction for Practitioners of Classification, 1996.

[31] A. Nijenhuis. Combinatorial algorithms, 1975.

[32] G. Diehr. Evaluation of a Branch and Bound Algorithm for Clustering, 1985.

[33] G. W. Milligan, et al. An examination of procedures for determining the number of clusters in a data set, 1985.