CSI: Community-Level Social Influence Analysis

Modeling how information propagates in social networks under peer influence is a fundamental research question, both for understanding the structure and dynamics of these complex networks and for developing viral marketing applications. Existing literature studies influence at the level of individuals, mostly ignoring the existence of a community structure in which multiple nodes may exhibit a common influence pattern. In this paper we introduce CSI, a model for analyzing information propagation and social influence at the granularity of communities. CSI builds on a novel propagation model that generalizes the classic Independent Cascade model to capture influence among groups of nodes rather than among single nodes. Given a social network and a database of past information propagation, we propose a hierarchical approach to detect a set of communities and their reciprocal influence strength. CSI provides a higher-level and more intuitive description of the influence dynamics, and thus represents a powerful tool to summarize and investigate patterns of influence in large social networks. The evaluation on various datasets suggests that the proposed approach is effective in modeling information propagation at the level of communities. It further makes it possible to detect interesting patterns of influence, such as the communities that play a key role in the overall diffusion process or that are likely to start information cascades.
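To make the idea of community-level propagation concrete, the following is a minimal sketch of an Independent Cascade simulation in which the activation probability of an edge depends only on the communities of its endpoints. It is an illustration under stated assumptions, not the CSI model as defined in the paper: the function name `simulate_community_ic`, the `community` mapping, and the `influence` table keyed by community pairs are all hypothetical.

```python
import random
from collections import deque


def simulate_community_ic(edges, community, influence, seeds, rng):
    """Run one Independent Cascade where edge (u, v) succeeds with
    probability influence[(community[u], community[v])].

    Assumption (not from the paper): influence is shared by all node
    pairs that belong to the same pair of communities.
    """
    # Build adjacency lists for the directed social graph.
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)

    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        # Each newly active node gets a single chance per inactive neighbor.
        for v in successors.get(u, []):
            if v in active:
                continue
            p = influence.get((community[u], community[v]), 0.0)
            if rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active


# Toy example: two communities, with no influence from community 0 to 1.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
community = {"a": 0, "b": 0, "c": 1, "d": 1}
influence = {(0, 0): 1.0, (0, 1): 0.0, (1, 1): 1.0}
reached = simulate_community_ic(edges, community, influence, ["a"], random.Random(0))
print(sorted(reached))
```

In this toy setting the cascade started at `a` stays inside community 0, because the cross-community probability is set to zero. With a table of this kind, the number of parameters grows with the number of community pairs rather than with the number of edges, which is the sense in which a community-level description is more compact.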
