A Polynomial Algorithm for Balanced Clustering via Graph Partitioning

The objective of clustering is to discover natural groups in a dataset and to identify any geometric structure it contains, without assuming prior knowledge of the characteristics of the data. The problem can be seen as detecting the inherent separations between groups of a given point set in a metric space governed by a similarity function. The pairwise similarities between all data objects form the weighted adjacency matrix of a graph, which contains all the information needed for clustering; the task can therefore be formulated as a graph partitioning problem. In this context, we propose a new cluster quality measure based on the maximum spanning tree, which allows the optimal clustering under the min-max principle to be computed in polynomial time. Our algorithm can be applied when a load-balanced clustering is required.
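
To make the graph formulation concrete, the following is a minimal sketch of the classical spanning-tree approach to clustering, not the quality measure or the algorithm proposed here, which the abstract does not specify. It builds a similarity matrix, extracts a maximum spanning tree, and removes the k-1 weakest tree edges. The Gaussian similarity, the `sigma` parameter, and all function names are illustrative assumptions, and the sketch makes no attempt to balance cluster sizes.

```python
import math


def similarity_matrix(points, sigma=1.0):
    """Pairwise Gaussian similarities (an assumed choice of similarity function)."""
    n = len(points)
    return [
        [
            0.0 if i == j
            else math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


def maximum_spanning_tree(w):
    """Prim's algorithm on a dense weight matrix; returns (weight, u, v) tree edges."""
    n = len(w)
    in_tree = [False] * n
    best = [-math.inf] * n   # strongest known link from each vertex into the tree
    parent = [-1] * n
    best[0] = 0.0            # start growing the tree from vertex 0
    edges = []
    for _ in range(n):
        u = max((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((w[parent[u]][u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and w[u][v] > best[v]:
                best[v] = w[u][v]
                parent[v] = u
    return edges


def mst_clusters(points, k, sigma=1.0):
    """Cut the k-1 weakest maximum-spanning-tree edges and label the components."""
    n = len(points)
    tree = sorted(maximum_spanning_tree(similarity_matrix(points, sigma)), reverse=True)
    kept = tree[: n - k]     # keep the n-k strongest tree edges

    # Union-find over the kept edges yields the connected components.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in kept:
        parent[find(u)] = find(v)

    # Relabel component roots as 0, 1, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(mst_clusters(points, k=2))
```

Because every removed edge is a weakest link of the tree, this heuristic tends to split off outliers as singleton clusters, which is one reason a balance-aware quality measure is of interest.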
