A new nonsmooth optimization algorithm for minimum sum-of-squares clustering problems

The minimum sum-of-squares clustering problem is formulated as a nonsmooth, nonconvex optimization problem, and an algorithm for solving it, based on nonsmooth optimization techniques, is developed. The application of this algorithm to large data sets is discussed. Results of numerical experiments are presented that demonstrate the effectiveness of the proposed algorithm.
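The abstract does not spell out the objective function, so the following is only a minimal sketch of one common way to write minimum sum-of-squares clustering as a nonsmooth problem: the cluster centers are the variables, and the objective is the average, over all data points, of the squared Euclidean distance to the nearest center. The inner minimum over centers is what makes the function nonsmooth and nonconvex. The function name `mssc_objective`, the averaging by the number of points, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
def mssc_objective(centers, points):
    """Average squared Euclidean distance from each point to its nearest center.

    Illustrative only: the name and the 1/m scaling are assumptions,
    not the paper's notation.
    """
    total = 0.0
    for a in points:
        # The min over centers is the source of nonsmoothness: the objective
        # has kinks wherever a point is equidistant from two centers.
        total += min(
            sum((x_j - a_j) ** 2 for x_j, a_j in zip(x, a))
            for x in centers
        )
    return total / len(points)


# Toy data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
value = mssc_objective(centers, points)
```

In this form the number of variables is the number of clusters times the data dimension, independent of the number of data points, which is one reason such formulations are considered for large data sets.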
