Mining structural hole spanners through information diffusion in social networks

The theory of structural holes suggests that individuals benefit from filling the "holes" between people or groups that are otherwise disconnected; such individuals are called structural hole spanners. A few empirical studies have verified that structural hole spanners play a key role in information diffusion. However, there is still a lack of a principled methodology for detecting structural hole spanners in a given social network. In this work, we precisely define the problem of mining top-k structural hole spanners in large-scale social networks and provide an objective (quality) function to formalize the problem. We develop two models that instantiate the objective function. For the first model, we present an exact algorithm and prove its convergence. For the second model, we prove that the optimization is NP-hard and design an efficient algorithm with provable approximation guarantees. We test the proposed models on three different networks: Coauthor, Twitter, and Inventor. Our study provides evidence for the theory of structural holes; for example, the 1% of Twitter users who span structural holes control 25% of the information diffusion on Twitter. We compare the proposed models with several alternative methods, and the results show that our models clearly outperform them. Our experiments also demonstrate that the detected structural hole spanners can help other social network applications, such as community kernel detection and link prediction. To the best of our knowledge, this is the first attempt to address the problem of mining structural hole spanners in large social networks.
