Active Semi-supervised Community Detection Algorithm with Label Propagation

Community detection is a fundamental problem in the analysis and understanding of complex networks and has attracted considerable attention over the last decade. Active learning aims to achieve high accuracy using as few labeled data as possible. However, to the best of our knowledge, active learning has not yet been applied to community detection to improve the discovery of community structure in complex networks. In this paper, we propose an active semi-supervised community detection algorithm with label propagation. First, we transform a given complex network into a weighted network, select informative nodes using a weighted shortest path method, and label those nodes for community detection. Second, we expand the set of labeled nodes by propagating their labels according to an adaptive threshold. Third, we handle the remaining unlabeled nodes. Finally, we evaluate the algorithm on three real networks and one synthetic network. Experimental results show that our active semi-supervised method achieves better performance than several other community detection algorithms.
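The label-expansion step can be illustrated with a minimal sketch. The abstract does not specify the propagation rule or how the threshold adapts, so everything below is an assumption made for illustration: the function name `propagate_labels`, the parameters `threshold`, `step`, and `floor`, and the rule itself (an unlabeled node adopts the dominant label among its neighbors once the fraction of neighbors carrying that label reaches the threshold, and the threshold is lowered whenever a full pass labels nothing). The graph is treated as unweighted, and the seed labels are assumed to be given rather than selected by the weighted shortest path method.

```python
from collections import Counter


def propagate_labels(adj, seeds, threshold=0.8, step=0.1, floor=0.0):
    """Spread seed labels over an undirected graph given as {node: [neighbors]}.

    Hypothetical rule (not taken from the paper): a node takes the most common
    label among its neighbors if that label covers at least `threshold` of them.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    labels = dict(seeds)
    while len(labels) < len(adj) and threshold >= floor:
        updates = {}
        for node in sorted(adj):
            if node in labels or not adj[node]:
                continue
            votes = Counter(labels[n] for n in adj[node] if n in labels)
            if not votes:
                continue
            # Highest count wins; ties go to the smallest label for determinism.
            label, count = min(votes.items(), key=lambda kv: (-kv[1], kv[0]))
            if count / len(adj[node]) >= threshold:
                updates[node] = label
        if updates:
            labels.update(updates)  # apply the whole pass at once
        else:
            # No node qualified: relax the threshold (rounded to avoid float drift).
            threshold = round(threshold - step, 6)
    return labels


# Two triangles joined by the edge 2-3, with one labeled seed in each triangle.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
result = propagate_labels(adj, {0: "A", 5: "B"})
print(result)
```

Nodes that are never reached from a seed stay unlabeled in this sketch, so a separate pass over the remaining nodes would still be needed.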
