A Survey on the Role of Centrality as Seed Nodes for Information Propagation in Large Scale Network

Following the popular concept of six degrees of separation, social networks are generally analyzed as small-world networks, in which the centrality of nodes plays a pivotal role in information propagation. However, a large dataset drawn from a scale-free network (one whose degree distribution follows a power law) may behave differently because of the nature of the social graph. Moreover, deriving centrality may be difficult because of the computational complexity of the centrality measures. This study provides a comprehensive review and comparison of seven centrality measures (clustering coefficient, node degree, k-core, betweenness, closeness, eigenvector, PageRank) using four information propagation methods (Breadth First Search, Random Walk, Susceptible-Infected-Removed, Forest Fire). Five benchmark similarity measures (Tanimoto, Hamming, Dice, Sorensen, Jaccard) are used to measure the similarity between the seed nodes identified by the centrality measures and the actual source seeds derived through Google's LargeStar-SmallStar algorithm on Twitter stream data. MapReduce is used both to identify the seed nodes based on centrality measures and to simulate information propagation. It is observed that most of the centrality measures perform well compared to the actual source in the initial stage but saturate after a certain level of influence maximization, in terms of both affected nodes and propagation level.
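The comparison pipeline described above (rank nodes by a centrality measure, take the top-ranked nodes as seeds, propagate from them, and score the seed set against a reference set) can be illustrated with a minimal single-machine sketch. It is not the study's MapReduce implementation: it uses only node degree, Breadth First Search, and the Jaccard and Dice measures, and the toy graph, the function names, and the `reference_seeds` set are all hypothetical values chosen for illustration.

```python
from collections import deque

def degree_seeds(adj, k):
    """Return the k highest-degree nodes (ties broken by smaller node id)."""
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    return set(ranked[:k])

def bfs_spread(adj, seeds, max_level):
    """Count nodes reached from the seeds within max_level BFS hops."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, level = frontier.popleft()
        if level == max_level:
            continue  # do not expand beyond the requested propagation level
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, level + 1))
    return len(seen)

def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def dice(a, b):
    """Dice similarity of two sets: 2|A & B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

# Hypothetical undirected toy graph as an adjacency map (assumption, not study data).
adj = {
    1: {2, 3, 4},
    2: {1, 3},
    3: {1, 2, 5},
    4: {1},
    5: {3, 6},
    6: {5},
}

# Hypothetical "actual source" seeds standing in for the reference seed set.
reference_seeds = {3, 5}

seeds = degree_seeds(adj, k=2)
affected_level_1 = bfs_spread(adj, seeds, max_level=1)
affected_level_2 = bfs_spread(adj, seeds, max_level=2)
similarity_jaccard = jaccard(seeds, reference_seeds)
similarity_dice = dice(seeds, reference_seeds)

print(seeds, affected_level_1, affected_level_2)
print(similarity_jaccard, similarity_dice)
```

On this toy graph the spread already covers every node at level 2, a small-scale picture of the saturation effect the abstract reports. Swapping `degree_seeds` for another ranking function would let the same harness compare other centrality measures.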
