Synergistic partitioning in multiple large scale social networks

Social networks have become part of people's daily lives, and many users hold accounts in multiple social networks. Interconnections among multiple social networks have a multiplier effect on social applications when fully exploited. As network sizes grow sharply, traditional standalone algorithms can no longer support computation on large scale networks; distributed and parallel computing offer a way to exploit the data-intensive information hidden in multiple social networks. This motivates synergistic partitioning, which takes the relationships among different networks into consideration and focuses on placing the same nodes of different networks into the same partitions. The partitions containing the same nodes can then be assigned to the same server, which improves data locality and reduces communication overhead among servers, both of which are very important for distributed applications. To date, there have been limited studies on partitioning multiple large scale networks, owing to three major challenges: 1) the need to consider relationships across multiple networks given their intricate interactions, 2) the difficulty for standalone programs of applying traditional partitioning methods, and 3) the fact that generating balanced partitions is NP-complete. In this paper, we propose a novel framework to partition multiple social networks synergistically. In particular, we apply a distributed multilevel k-way partitioning method to divide the first network into k partitions. Then, based on the given anchor nodes, which exist in all the social networks, and the partition results of the first network, we develop a modified distributed multilevel partitioning method using MapReduce to divide the other networks. Extensive experiments on two real data sets demonstrate that our method significantly outperforms the baseline independent-partitioning method in accuracy and scalability.
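The core idea of reusing the first network's partitions through anchor nodes can be illustrated with a small single-machine sketch. This is not the modified distributed multilevel method or the MapReduce implementation described above; it is a simplified greedy stand-in, and every name in it (`seed_from_anchors`, `synergistic_assign`, the `slack` balance parameter, and the toy graphs) is a hypothetical choice made for illustration. It assumes the first network has already been divided into k partitions, places each anchor node of the second network in its counterpart's partition, and then grows the partitions outward under a size cap.

```python
from collections import Counter, deque
from math import ceil


def seed_from_anchors(first_partition, anchors):
    """Place each anchor node of the second network in the partition
    that its counterpart received in the first network.

    anchors maps a node id in network 1 to the same user's id in network 2.
    """
    return {b: first_partition[a] for a, b in anchors.items()
            if a in first_partition}


def synergistic_assign(adj, seeds, k, slack=1.1):
    """Greedy illustration (assumption, not the paper's algorithm):
    grow partitions outward from the seeded anchors, giving each new node
    the partition most common among its assigned neighbours, subject to a
    capacity of ceil(slack * n / k) nodes per partition."""
    cap = ceil(slack * len(adj) / k)
    part = dict(seeds)
    sizes = Counter(part.values())
    queue = deque(part)

    def smallest():
        return min(range(k), key=lambda p: sizes[p])

    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in part:
                continue
            votes = Counter(part[w] for w in adj[v] if w in part)
            # Most-voted partition that still has room, else the smallest one.
            choice = next((p for p, _ in votes.most_common()
                           if sizes[p] < cap), None)
            if choice is None:
                choice = smallest()
            part[v] = choice
            sizes[choice] += 1
            queue.append(v)

    # Nodes unreachable from any anchor go to the currently smallest partition.
    for v in adj:
        if v not in part:
            choice = smallest()
            part[v] = choice
            sizes[choice] += 1
    return part


# Toy data: the first network is assumed to be already split into k = 2 parts.
first_partition = {"a1": 0, "a2": 0, "a3": 1, "a4": 1}
anchors = {"a1": "b1", "a3": "b3"}          # same users in both networks
second_adj = {
    "b1": ["b2", "b3"],
    "b2": ["b1"],
    "b3": ["b1", "b4"],
    "b4": ["b3"],
}

seeds = seed_from_anchors(first_partition, anchors)
second_partition = synergistic_assign(second_adj, seeds, k=2)
print(second_partition)
```

In this toy run each anchor lands in the same partition index in both networks, so the partitions holding a given user could be co-located on one server, which is the data-locality benefit the abstract describes.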
