Variable neighborhood search for minimum sum-of-squares clustering on networks

Euclidean Minimum Sum-of-Squares Clustering amounts to finding p prototypes that minimize the sum of squared Euclidean distances from a set of points to their closest prototype. In recent years, related clustering problems have been extensively analyzed under the assumption that the underlying space is a network rather than Euclidean space. This setting makes it possible to properly address community detection problems, which are of significant relevance to diverse phenomena in biological, technological and social systems. However, the problem of minimizing the sum of squared distances on networks has not yet been addressed. Two versions of the problem are possible: either the p prototypes are sought among the nodes of the network, or points along edges are also admitted as possible prototypes. While the first version reduces to a classical discrete p-median problem, the second is new in the literature and is solved in this paper with the Variable Neighborhood Search heuristic. The solutions of the two problems are compared on a series of test examples.
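
To make the node-restricted version concrete, the following Python sketch evaluates the sum-of-squares objective on a small weighted network and finds the best p node prototypes by exhaustive enumeration. It is an illustration under stated assumptions, not the paper's method: the function names (`shortest_path_lengths`, `sum_of_squares`, `best_vertex_prototypes`), the adjacency-dictionary representation, and the toy network are all hypothetical, the network is assumed to be connected and undirected, and brute force stands in for a p-median solver or the Variable Neighborhood Search heuristic. The version in which prototypes may lie along edges is not covered here.

```python
import heapq
from itertools import combinations


def shortest_path_lengths(adj, source):
    """Dijkstra on an adjacency dict {node: {neighbor: edge_length}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def sum_of_squares(dist, prototypes):
    """Sum over all nodes of the squared distance to the closest prototype."""
    return sum(min(dist[v][c] for c in prototypes) ** 2 for v in dist)


def best_vertex_prototypes(adj, p):
    """Exhaustive search over all p-subsets of nodes (tiny networks only)."""
    # Assumes a connected network, so every pairwise distance exists.
    dist = {u: shortest_path_lengths(adj, u) for u in adj}
    best = min(combinations(sorted(adj), p),
               key=lambda c: sum_of_squares(dist, c))
    return best, sum_of_squares(dist, best)


# Hypothetical toy network: two tight groups joined by one long edge.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 10.0),
         ("d", "e", 1.0), ("e", "f", 1.0)]
adj = {}
for u, v, w in edges:
    adj.setdefault(u, {})[v] = w
    adj.setdefault(v, {})[u] = w

prototypes, cost = best_vertex_prototypes(adj, p=2)
print(prototypes, cost)
```

Squaring the shortest-path distances before summing is what turns the node-restricted problem into a p-median instance: any p-median method can be applied to the squared distance matrix in place of the enumeration used here.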
