Chapter 16 GRAPH MINING APPLICATIONS TO SOCIAL NETWORK ANALYSIS

The rise of Web 2.0 and social media has produced many diverse social networks of unprecedented scale, which call for more effective graph-mining techniques. In this chapter, we present some graph patterns that are commonly observed in large-scale social networks. Because most networks exhibit strong community structure, one basic task in social network analysis is community detection, which uncovers the group membership of actors in a network. We categorize and survey representative graph-mining approaches and evaluation strategies for community detection. We then discuss some research issues for future exploration.
