Supplementary Material: Large-scale community structure in social and information networks

• We use several classes of graph partitioning algorithms to probe the networks for sets of nodes that could plausibly be interpreted as communities. These algorithms, including flow-based methods, spectral methods, and hierarchical methods, have complementary strengths and weaknesses that are well understood both in theory and in practice. For example, flow-based methods are known to have difficulties with expanders (42, 43), and flow-based post-processing of other methods is known in practice to yield cuts with extremely good conductance values (39, 41). On the other hand, spectral methods are known to have difficulties when they confuse long paths with deep
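
To make the notion of a cut's conductance value concrete, here is a minimal Python sketch. It assumes the standard definition, which the excerpt above does not spell out: the number of edges leaving a node set S divided by the smaller of the total degree of S and the total degree of its complement. The function names `conductance` and `best_sweep_cut`, the adjacency-dict representation, and the toy graph (two triangles joined by one edge) are illustrative assumptions, not the authors' code or data.

```python
from fractions import Fraction

def conductance(adj, S):
    """Conductance of node set S in an undirected, unweighted graph.

    adj maps each node to a list of its neighbors (assumed symmetric,
    with no isolated nodes). Uses cut(S) / min(vol(S), vol(complement)).
    """
    S = set(S)
    rest = set(adj) - S
    if not S or not rest:
        raise ValueError("S must be a nonempty proper subset of the nodes")
    # Edges with exactly one endpoint in S.
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    # Volume = sum of degrees on each side of the cut.
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in rest)
    return Fraction(cut, min(vol_S, vol_rest))

def best_sweep_cut(adj, order):
    """Return (conductance, node set) of the best prefix of a node ordering.

    The ordering is supplied by the caller; a spectral method would
    typically sort nodes by an eigenvector or random-walk score.
    """
    best = None
    for k in range(1, len(order)):
        prefix = set(order[:k])
        phi = conductance(adj, prefix)
        if best is None or phi < best[0]:
            best = (phi, prefix)
    return best

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

phi, community = best_sweep_cut(adj, [0, 1, 2, 3, 4, 5])
print(phi, sorted(community))
```

On this toy graph the sweep picks out one triangle, whose single outgoing edge against a volume of 7 gives a conductance of 1/7. Lower values indicate a set that is better separated from the rest of the graph, which is the sense in which a cut can have a "good" conductance value.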
