Clustering datasets by complex networks analysis

This paper proposes a method based on complex networks analysis, devised to perform clustering on multidimensional datasets. In particular, the method maps the elements of the dataset in hand to a weighted network according to the similarity that holds among data. Network weights are computed by transforming the Euclidean distances measured between data according to a Gaussian model. Notably, this model depends on a parameter that controls the shape of the actual functions. Running the Gaussian transformation with different values of the parameter allows to perform multiresolution analysis, which gives important information about the number of clusters expected to be optimal or suboptimal.Solutions obtained running the proposed method on simple synthetic datasets allowed to identify a recurrent pattern, which has been found in more complex, synthetic and real, datasets.

[1]  Ulrike von Luxburg,et al.  A tutorial on spectral clustering , 2007, Stat. Comput..

[2]  J. Kumpula,et al.  Detecting modules in dense weighted networks with the Potts method , 2008, 0804.3457.

[3]  Alex Arenas,et al.  Analysis of the structure of complex networks at different resolution levels , 2007, physics/0703218.

[4]  Albert-László Barabási,et al.  Statistical mechanics of complex networks , 2001, ArXiv.

[5]  Bin Yu,et al.  Model Selection and the Principle of Minimum Description Length , 2001 .

[6]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.

[7]  Anil K. Jain Data clustering: 50 years beyond K-means , 2008, Pattern Recognit. Lett..

[8]  Vladimir Gudkov,et al.  Community Detection in Complex Networks by Dynamical Simplex Evolution , 2007, Physical review. E, Statistical, nonlinear, and soft matter physics.

[9]  K. alik An efficient k'-means clustering algorithm , 2008 .

[10]  Mark Newman,et al.  Networks: An Introduction , 2010 .

[11]  A Díaz-Guilera,et al.  Self-similar community structure in a network of human interactions. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[12]  Jukka-Pekka Onnela,et al.  Community Structure in Time-Dependent, Multiscale, and Multiplex Networks , 2009, Science.

[13]  Christoph F. Eick,et al.  Supervised clustering - algorithms and benefits , 2004, 16th IEEE International Conference on Tools with Artificial Intelligence.

[14]  Ying Fan,et al.  Detecting the optimal number of communities in complex networks , 2011, ArXiv.

[15]  Robert Tibshirani,et al.  Estimating the number of clusters in a data set via the gap statistic , 2000 .

[16]  M E J Newman,et al.  Finding and evaluating community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[17]  Jari Saramäki,et al.  Networks of Emotion Concepts , 2012, PloS one.

[18]  O. Sporns,et al.  Organization, development and function of complex brain networks , 2004, Trends in Cognitive Sciences.