A Two-Tier Partition Algorithm for the Optimization of the Large-Scale Simulation of Information Diffusion in Social Networks

As online social networks play an increasingly important role in shaping public opinion, the large-scale simulation of social networks has attracted attention from researchers in sociology, communication, informatics, and related fields. Agent-based modeling and simulation (ABMS), which scholars in computational sociology consider an effective approach, makes it possible to study real information diffusion in a symmetrical simulated world. However, classical ABMS tools such as NetLogo cannot support simulations beyond the scale of thousands of agents, while big data platforms such as Hadoop and Spark, which are designed for processing large datasets, provide no optimization for the simulation of large-scale social networks. This paper proposes a two-tier partition algorithm for optimizing the large-scale simulation of social networks. First, the ABMS simulation kernel for information diffusion is implemented on the Spark platform; both the data structure and the scheduling mechanism are built on Resilient Distributed Datasets (RDDs) so that millions of agents can be simulated. Second, the two-tier partition algorithm combines community detection and graph cut: community detection identifies groups of agents with dense interactions in the social network, and graph cut is used to balance the load. Finally, a series of experiments on a dataset recorded from Twitter evaluates the performance of the two-tier partition algorithm in terms of both communication cost and load balance.
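The two-tier idea can be illustrated with a minimal, single-machine sketch. It is not the paper's Spark implementation, and every function name in it is hypothetical. Tier 1 uses a simple deterministic label propagation as a stand-in for community detection; tier 2 replaces the paper's graph cut with a greedy largest-first packing of whole communities into partitions, which is only a rough proxy for load balancing. The two helper metrics correspond to the quantities the abstract names: communication cost (edges crossing partitions) and load balance (largest partition relative to the mean).

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def detect_communities(adj, max_rounds=20):
    """Tier 1 (assumed stand-in): deterministic label propagation.

    Each node adopts the most frequent label among its neighbors.
    Ties keep the current label if possible, else take the largest label.
    """
    labels = {node: node for node in adj}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):
            counts = Counter(labels[nbr] for nbr in adj[node])
            top = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == top]
            if labels[node] in candidates:
                continue
            labels[node] = max(candidates)
            changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for node, lab in labels.items():
        groups[lab].append(node)
    return list(groups.values())


def assign_partitions(communities, k):
    """Tier 2 (assumed stand-in for graph cut): greedy largest-first packing.

    Whole communities go to the currently lightest partition, so dense
    groups stay together while partition sizes stay roughly even.
    """
    loads = [0] * k
    assignment = {}
    for members in sorted(communities, key=len, reverse=True):
        target = loads.index(min(loads))
        for node in members:
            assignment[node] = target
        loads[target] += len(members)
    return assignment, loads


def edge_cut(edges, assignment):
    """Communication cost proxy: number of edges crossing partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def load_imbalance(loads):
    """Load balance proxy: largest partition size divided by the mean size."""
    return max(loads) / (sum(loads) / len(loads))


# Toy network: one 4-clique and two triangles joined by two bridge edges.
edges = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (4, 5), (4, 6), (5, 6),
    (7, 8), (7, 9), (8, 9),
    (3, 4), (6, 7),
]
adj = build_adjacency(edges)
communities = detect_communities(adj)
assignment, loads = assign_partitions(communities, k=2)
print("communities:", communities)
print("loads:", loads)
print("edge cut:", edge_cut(edges, assignment))
print("imbalance:", load_imbalance(loads))
```

On this toy graph the sketch keeps each clique intact and cuts only one bridge edge, at the price of unequal partition sizes. A real graph-cut step could also split oversized communities, which this packing heuristic never does.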
