Community Detection to Invariant Pattern Clustering in Images

Community detection is a clustering task that aims to find groups of vertices that are densely connected internally but sparsely connected to other groups. Compared with conventional clustering methods, community detection methods can examine structural, functional and dynamical properties of networked data, beyond its physical attributes. However, such techniques have barely been explored in the literature for machine learning data, since most data sets are represented as non-graph data (e.g., feature vectors, images, and texts). In this work, we propose a simple community detection framework, based on the literature, that covers the whole pipeline: from the construction of a graph from feature vectors generated from non-graph data to the application and evaluation of community detection methods on that graph. The framework is further evaluated on the problem of invariant pattern clustering of images: given a set of images of objects taken from different angles, positions or rotations, cluster the images that correspond to each object. Experiments were conducted with three community detection methods (fast greedy, walktrap and label propagation) and two relevant clustering methods (k-means and HDBSCAN). The results indicate fast greedy (FG) as the best choice among those algorithms, as it usually approximates the ground-truth groups closely while also keeping a reasonable number of communities. Moreover, our results suggest that community detection may be an effective approach for clustering not only graph data, but also domain applications represented by non-graph data.
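The pipeline described above (feature vectors, then a graph, then community detection, then evaluation) can be illustrated with a minimal sketch. It is not the implementation used in this work: the k-nearest-neighbour graph construction, the value of k, the toy 2-D vectors and the purity score are all assumptions made for illustration, and label propagation is written out in plain Python rather than taken from a graph library.

```python
import math
import random
from collections import Counter


def knn_graph(vectors, k):
    """Build an undirected k-nearest-neighbour graph (assumed construction)."""
    n = len(vectors)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(vectors[i], vectors[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrise so the graph is undirected
    return adj


def label_propagation(adj, seed=0, max_iter=100):
    """Asynchronous label propagation; ties are broken at random."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # every vertex starts in its own community
    nodes = list(adj)
    for _ in range(max_iter):
        rng.shuffle(nodes)
        changed = False
        for v in nodes:
            if not adj[v]:
                continue  # isolated vertex keeps its own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[v] in candidates:
                continue  # current label is already a most frequent one
            labels[v] = rng.choice(candidates)
            changed = True
        if not changed:
            break
    return labels


def purity(labels, truth):
    """Fraction of vertices whose community's majority class matches theirs."""
    groups = {}
    for v, lab in labels.items():
        groups.setdefault(lab, []).append(truth[v])
    hits = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return hits / len(truth)


# Toy feature vectors standing in for image descriptors of two objects.
vectors = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4),
           (10.0, 10.0), (10.1, 10.2), (10.2, 10.1), (10.3, 10.3), (10.1, 10.4)]
truth = [0] * 5 + [1] * 5  # hypothetical ground-truth object ids

adj = knn_graph(vectors, k=3)
labels = label_propagation(adj, seed=0)
print("communities found:", len(set(labels.values())))
print("purity:", purity(labels, truth))
```

Because the two toy groups are far apart, no edge of the k-nearest-neighbour graph crosses between them, so every detected community contains images of a single object. On real image descriptors the choice of k and of the distance measure would strongly affect the communities found.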
