Parallelizing and optimizing overlapping community detection with speaker-listener Label Propagation Algorithm on multi-core architecture

Scaling data mining algorithms and applications to massive datasets has become crucial in the Big Data era. Community detection is one of the most important topics in data mining: numerous algorithms have been proposed to explore real-world complex networks, and researchers continue to improve their computing performance through parallel programming. The Speaker-listener Label Propagation Algorithm (SLPA) is a sequential, linear-time algorithm for overlapping community detection. In this work, we propose a new approach to parallelizing and optimizing SLPA. We make it more efficient by reducing the computational complexity of one of its computing kernels and by providing a better way to access memory during parallel execution. Extensive experiments demonstrate that our implementation scales better on multi-core CPUs than prior work.
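For readers unfamiliar with SLPA, the following is a minimal sketch of the sequential algorithm as it is commonly described in the literature. It is not the parallel, optimized implementation this work proposes. The function name `slpa`, its parameters (`iterations`, `threshold`, `seed`), and the default values are assumptions chosen for illustration. Each node keeps a memory of labels; in every iteration each listener receives one label from each neighbor (drawn in proportion to how often that neighbor has stored it) and stores the most popular one. Afterwards, labels whose relative frequency in a node's memory falls below the threshold are discarded, and nodes sharing a surviving label form a community.

```python
import random
from collections import Counter, defaultdict


def slpa(adj, iterations=20, threshold=0.1, seed=0):
    """Sequential SLPA sketch. `adj` maps each node to a list of neighbors.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Every node starts with a memory holding only its own id as a label.
    memory = {v: Counter({v: 1}) for v in adj}
    nodes = list(adj)

    for _ in range(iterations):
        rng.shuffle(nodes)  # visit listeners in random order
        for listener in nodes:
            if not adj[listener]:
                continue  # isolated node: nothing to listen to
            received = Counter()
            for speaker in adj[listener]:
                # Speaker rule: pick a label with probability proportional
                # to its frequency in the speaker's memory.
                labels = list(memory[speaker].keys())
                weights = list(memory[speaker].values())
                received[rng.choices(labels, weights=weights)[0]] += 1
            # Listener rule: store the most popular received label,
            # breaking ties at random.
            top = max(received.values())
            winner = rng.choice(sorted(l for l, c in received.items() if c == top))
            memory[listener][winner] += 1

    # Post-processing: keep labels whose relative frequency reaches the
    # threshold; a node with several surviving labels is in several communities.
    communities = defaultdict(set)
    for v, mem in memory.items():
        total = sum(mem.values())
        for label, count in mem.items():
            if count / total >= threshold:
                communities[label].add(v)
    return dict(communities)
```

The inner loop over each listener's neighbors dominates the running time, which is why the cost is linear in the number of edges per iteration. A parallel version has to decide how threads share the per-node memories, which is the kind of memory access concern the abstract refers to.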
