Structural Diversity and Homophily: A Study Across More than One Hundred Large-Scale Networks

Understanding the ways in which local network structures are formed and organized is a fundamental problem in network science. A widely recognized organizing principle is structural homophily, which suggests that people with more common neighbors are more likely to connect with each other. However, the influence that the diverse structures formed by common neighbors have on link formation is much less well understood. To explore this problem, we begin by formally defining the structural diversity of common neighborhoods. Using a collection of 116 large-scale networks (the largest of which has over 60 million nodes and 1.8 billion edges), we then leverage this definition to develop a unique network signature, which we use to uncover several distinct network superfamilies not discoverable by conventional methods. We demonstrate that structural diversity has a significant impact on link existence, and we discover striking cases where it violates the principle of homophily. Our findings suggest that structural diversity is an intrinsic network property, giving rise to potential advances in the pursuit of theories of link formation and network evolution.
