The rise of Web 2.0 and social media has produced many diverse social networks of unprecedented scale, which pose new challenges and call for more effective graph-mining techniques. In this chapter, we present some graph patterns that are commonly observed in large-scale social networks. Because most networks exhibit strong community structure, one basic task in social network analysis is community detection, which uncovers the group membership of actors in a network. We categorize and survey representative graph-mining approaches and evaluation strategies for community detection. We then discuss some research issues for future exploration.
