A Fast Parallel Community Discovery Model on Complex Networks Through Approximate Optimization

Community discovery plays an essential role in analyzing the structural features of complex networks. As online networks grow larger and more complex over time, traditional community discovery methods can no longer handle large-scale network data efficiently. This raises the important problem of how to discover communities in large complex networks both effectively and efficiently. In this study, we propose a fast parallel community discovery model called picaso (a parallel community discovery algorithm based on approximate optimization), which integrates two new techniques: (1) the Mountain model, which uses graph theory to approximate the selection of nodes to be merged, and (2) the Landslide algorithm, which updates the modularity increment based on the approximate optimization. In addition, the GraphX distributed computing framework is employed to achieve parallel community detection over complex networks. In the proposed model, modularity-based clustering is used to initialize the Mountain model and to compute the weight of each edge in the network. The relationships among the communities are then simplified by applying the Landslide algorithm, which yields the community structure of the complex network. Extensive experiments were conducted on real and synthetic complex network datasets, and the results demonstrate that the proposed algorithm outperforms state-of-the-art methods in both effectiveness and efficiency on the community detection problem. Moreover, we show empirically that the proposed approach runs approximately four times faster overall than similar approaches. Overall, our results suggest a new paradigm for large-scale community discovery in complex networks.
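For readers unfamiliar with the modularity increment mentioned in the abstract, the following is a minimal sketch of the standard greedy-merge quantity from Newman's fast algorithm, dQ = 2 (e_ij - a_i a_j), on an undirected, unweighted graph. It is not the Mountain model or the Landslide algorithm, whose details the abstract does not give; the function names `modularity` and `merge_gain` and the edge-list representation are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted edge list.

    `community` maps each node to a community label (hypothetical layout).
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def merge_gain(edges, community, ci, cj):
    """Modularity increment dQ = 2 * (e_ij - a_i * a_j) for merging ci != cj."""
    m = len(edges)
    between = 0          # edges with one endpoint in ci and the other in cj
    deg_i = deg_j = 0    # total degree of each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        if {cu, cv} == {ci, cj}:
            between += 1
        deg_i += (cu == ci) + (cv == ci)
        deg_j += (cu == cj) + (cv == cj)
    e_ij = between / (2 * m)  # fraction of edge ends linking ci to cj
    a_i = deg_i / (2 * m)     # fraction of edge ends attached to ci
    a_j = deg_j / (2 * m)     # fraction of edge ends attached to cj
    return 2 * (e_ij - a_i * a_j)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
singletons = {n: n for n in range(6)}
# Merging two adjacent singleton nodes yields a positive increment.
print(merge_gain(edges, singletons, 0, 1) > 0)
```

Computing the increment directly, rather than recomputing Q after every candidate merge, is what makes greedy modularity methods practical; the abstract indicates that picaso additionally approximates which merges to consider and how the increment is updated.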
